University of Zagreb Faculty of Electrical Engineering and Computing PROGRAMMING IN HASKELL Academic Year 2013/2014 LECTURE 7: Custom data types 1 v1.0 (c) 2013 Jan Šnajder ============================================================================== > import Data.List === DATA TYPES =============================================================== > data Tricolor = Red | Green | Blue 'Tricolor' is a new type. 'Red', 'Green', and 'Blue' are DATA CONSTRUCTORS The names of types and data constructors are capitalized. Let's try this out: ghci> :t Red Data types like this one, in which we explicitly enumerate the possible values, are called ALGEBRAIC DATA TYPES. This is different from defining an alias for an existing type, which we do as follows: > type Word = [Char] > type Coord = (Int,Int) Up until now we've already seen a couple of data types: data Bool = True | False data Ordering = LT | EQ | GT Let's recall: compare :: Ord a => a -> a -> Ordering We can of course use our custom data types when we define new functions: > warmColor :: Tricolor -> Bool > warmColor Red = True > warmColor _ = False > myColor = Red What happens if we try to evaluate the value of 'myColor' in ghci? The value cannot be shown because its type is not a member of the 'Show' type class. We can fix this easily: data Tricolor = Red | Green | Blue deriving Show This makes the 'Tricolor' type a member of the 'Show' type class (more precisely, it automatically derives an instance of the 'Show' class for this type). IMPORTANT: Data constructors must be unique! The same data constructor cannot be used to define different types. This won't work: data Tricolor = Red | Green | Blue data WarmColors = Red | Orange | Yellow The constructors 'Red', 'Green', and 'Blue' are actually values. We call such constructors NULLARY CONSTRUCTORS. A type that only contains nullary constructors is similar to ENUMERATIONS in other programming languages. But constructors can also be binary, ternary, etc. E.g., (example borrowed from LYAHFGG): > data Shape = > Circle Double Double Double > | Rectangle Double Double Double Double > deriving Show 'Circle' is a ternary constructor: it takes three real numbers (the coordinates and the radius). 'Rectangle' is a quaternary constructor: it takes four numbers (two pairs of coordinates). This reveals that data constructors are actually functions. What is the type of the 'Circle' constructor? Circle :: Float -> Float -> Float -> Shape We can also pattern match against constructors with multiple arguments: > isCircle :: Shape -> Bool > isCircle (Circle _ _ _) = True > isCircle _ = False Btw., we could also have defined: data Shape = Circle (Float, Float) Float | Rectangle (Float, Float) (Float, Float) Let's look at some examples of functions (and values) that use 'Circle' and 'Rectangle' types: > myCircle = Circle 5 5 10 > myRectangle = Rectangle 10 10 100 200 > unitCircle x y = Circle x y 1 A function to compute the area of a shape: > area :: Shape -> Double > area (Circle _ _ r) = r ^ 2 * pi > area (Rectangle x1 y1 x2 y2) = (abs $ x1 - x2) * (abs $ y1 - y2) Because data constructors 'Circle' and 'Rectangle' give values of the same type, those values can be combined in a single list: > myShapes = [myCircle,myRectangle,unitCircle 0 0,Rectangle 0 0 5 5] This is actually how one circumvents the problem of not being able to construct heterogeneous lists in Haskell. We can now define: > totalArea :: [Shape] -> Double > totalArea = sum . map area A bit better approach to define the above types: > data Point = Point Double Double > deriving Show > data Shape2 = Circle2 Point Double | Rectangle2 Point Point > deriving Show Note that we use 'Point' as both a type name and a data constructor. This is totally OK (and in fact common for types with only one data constructor). > myCircle2 = Circle2 (Point 5 5) 10 > myRectangle2 = Rectangle2 (Point 10 10) (Point 10 10) > area2 :: Shape2 -> Double > area2 (Circle2 _ r) = r ^ 2 * pi > area2 (Rectangle2 (Point x1 y1) (Point x2 y2)) = > (abs $ x1 - x2) * (abs $ y1 - y2) === EXERCISE 1 =============================================================== 1.1. - Define a 'Date' structure with the appropriate fields. - Define a function that shows a date in the DD.MM.YYYY format (without leading zeroes). showDate :: Date -> String 1.2. - Define a function translate :: Point -> Shape2 -> Shape2 that translates a shape into the direction of vector (x,y). 1.3. - Write a function 'inShape' that tests whether a point is contained within a given shape (or on its border). inShape :: Shape2 -> Point -> Bool - Write a function 'inShapes' that tests if the point is within any shape from the list of shapes. inShapes :: [Shape] -> Point -> Bool 1.4. - Define your type 'Vehicle' that can be a 'Car', 'Truck', 'Motorcycle', or 'Bicycle'. The first two store a name of the manufacturer (String) and horsepower (Int). - Write a function 'totalHorsepower' that adds up the horsepower of the vehicles, assuming that bicycle's horsepower is 0.3 === RECORDS ================================================================== > data Level = Bachelor | Master | PhD deriving (Show,Eq) > data Student2 = Student2 String String String Level Double deriving Show > firstName2 :: Student2 -> String > firstName2 (Student2 f _ _ _ _) = f > lastName2 :: Student2 -> String > lastName2 (Student2 _ l _ _ _) = l > studentId2 :: Student2 -> String > studentId2 (Student2 _ _ i _ _) = i This is somewhat tedious and long-winded. It is better to use RECORDS: > data Student = Student { > firstName :: String, > lastName :: String, > studentId :: String, > level :: Level, > avgGrade :: Double } deriving Show This automatically gives us: firstName :: Student -> String lastName :: Student -> String studentId :: Student -> String level :: Student -> Level avgGrade :: Student -> Double We can now define one record: > bestStudent = Student { > studentId = "0036491215", > firstName = "John", lastName = "Doe", > level = Master, avgGrade = 5.0 } Let's define a function to show some data from the record: > showStudent :: Student -> String > showStudent s = studentId s ++ " " ++ firstName s ++ " " ++ lastName s or > showStudent2 :: Student -> String > showStudent2 s = intercalate " " [studentId s,firstName s,lastName s] We can also define it like this: > showStudent3 :: Student -> String > showStudent3 (Student {studentId=id,firstName=f,lastName=l}) = > intercalate " " [id,f,l] Let's write a function to select students whose average grade is above a given threshold: > aboveStudents :: Double -> [Student] -> [Student] > aboveStudents x = filter ((>=x) . avgGrade) When we define a record, we need not define all fields. Those that we don't define will be 'undefined'. The code will compile, but we'll get a warning. > someStudent = Student { firstName = "Marko", avgGrade = 4.3 } Will 'showStudent someStudent' work? Will 'map firstName $ aboveStudents 4.0 [bestStudent, someStudent]' work? We can modify field values in a record: > bestStudent2 = bestStudent { avgGrade = 4.9 } > someStudent2 = someStudent { lastName = "Markov", studentId = "0036365438" } This is useful when we want to define default values: > bachelorStudent = Student { level = Bachelor } > masterStudent = Student { level = Master } > phdStudent = Student { level = PhD } > newStudent = bachelorStudent { > firstName = "Zoran", lastName = "Zoki", > studentId = "00364532350", avgGrade = 4.5 } We can define a record in a shorter way, respecting the order of the fields: > newStudent2 = Student "Petar" "Perić" "00364542345" Master 3.5 === EXERCISE 2 =============================================================== 2.1. - Define a function that increases the average grade of the student by 1.0, but not above 5.0. improveStudent :: Student -> Student 2.2. - Write a function to compute the average grade of students for the different study levels. avgGradePerLevels :: [Student] -> (Double,Double,Double) 2.3. - Write a function that returns a list of matriculation numbers for a given study level, sorted by average grade in descending order. rankedStudents :: Level -> [Students] -> [String] 2.4. - Write a function addStudent :: Student -> [Student] -> [Student] that adds a student to a list of students. If a student with an identical matriculation number already exists in the list, the function should return an error. === PARAMETRIZED TYPES ======================================================= > data OldLevels = Graduate | Doctorate deriving Show > data GeneralStudent a = Student3 String String String a Double deriving Show 'GeneralStudent' has a type parameter 'a'. Depending on what type we choose for 'a', we will get different types: > type BolognaStudent = GeneralStudent Level > type FER1Student = GeneralStudent OldLevels We call such types, which take parameters as input, TYPE CONSTRUCTORS. Parametrized data types are typically data containers of some sort. E.g.: > data MyBox a = InBox a So, 'MyBox' is a type constructor that we can use to define different types. E.g.: > type StringBox = MyBox String > type IntBox = MyBox Int 'InBox' is a data constructor that we can use to construct different values of different types. Haskell will automatically determine the correct type: > b1 = InBox 1.2 > b2 = InBox "Haskell" > b3 = InBox (1,3) What are the types of these expressions? A better way to accomplish the same: > data Box a = Box { unbox :: a } deriving Show A parametrized type can have multiple parameters: > data MyPair a b = MyPair (a,b) deriving Show So we can have: MyPair 1 1 :: MyPair Int Int MyPair "bla" 1.2 :: MyPair String Double We can, for example, define: > type IntMyPair = MyPair Int Int > type MyPairType a = MyPair a String A function to take the first element from 'MyPair': > fstMyPair :: MyPair a b -> a > fstMyPair (MyPair (x,_)) = x Now, there is one very important parametrized type (defined in 'Data.Maybe'): data Maybe a = Nothing | Just a For instance, we can have: Just 5 :: Maybe Int Just "bla" :: Maybe String Just (1,1) :: Maybe (Int,Int) The 'Maybe' data type is used for situations in which (1) a value is optional, (2) an error can occur. CASE 1: Optionality > data Employee = Employee { > name :: String, > salary :: Maybe Double } deriving Show We can now define: > showSalary :: Employee -> String > showSalary e = case salary e of > Nothing -> "unknown" > Just n -> show n ++ " kn" A function to concatenate two 'Maybe String': > concatMaybeStrings :: Maybe String -> Maybe String -> Maybe String > concatMaybeStrings (Just s1) (Just s2) = Just $ s1 ++ s2 > concatMaybeStrings s@(Just _) Nothing = s > concatMaybeStrings Nothing s@(Just _) = s > concatMaybeStrings _ _ = Nothing CASE 2: Error handling > safeHead :: [a] -> Maybe a > safeHead [] = Nothing > safeHead (x:_) = Just x Also useful is the 'Either' type: data Either a b = Left a | Right b It is commonly used when we also want to return an error message. We return 'Right b' is there was no error, otherwise we return 'Left a', where 'a' is typically a 'String' (the error message). > safeHead2 :: [b] -> Either String b > safeHead2 [] = Left "empty list" > safeHead2 (x:_) = Right x === EXERCISE 3 =============================================================== 3.1. - Define your own parametrized type 'MyTriplet' that contains the values of three different types. Do this using a record. - Define a function toTriplet :: MyTriplet a b c -> (a,b,c) that converts a 'MyTriplet' value into an ordinary triplet. 3.2. - Define a function totalSalaries :: [Employee] -> Double that sums the known salaries of employees (salaries that are not 'Nothing'). 3.3. - Write a function 'addStudent2' that works like 'addStudent' from problem 2.4 but returns a 'Maybe' type instead of an error. addStudent2 :: Student -> [Student] -> Maybe [Student] - Write 'addStudent3' that returns an 'Either'. === FMAP ===================================================================== Consider the following data structures: > data Customer = Customer { > customerName :: String, > customerAge :: Int, > customerAddress :: Maybe Address } > deriving (Eq,Show) > data Address = Address { > streetName :: String, > streetNumber :: Int, > zipCode :: String, > city :: String } > deriving (Eq,Show) > c1 = Customer "Ivo" 22 (Just $ Address "Bauerova" 10 "10000" "Zagreb") > c2 = Customer "Ana" 30 Nothing Let's write a function that returns the city in which a customer lives, provided the address is known: > customerCity :: Customer -> Maybe String > customerCity c = case customerAddress c of > Just a -> Just $ city a > Nothing -> Nothing Similarly, a function that returns a street name and number: > customerStreet :: Customer -> Maybe (String,Int) > customerStreet c = case customerAddress c of > Just (Address s n _ _) -> Just (s,n) > Nothing -> Nothing There's a recurring pattern in the above functions: we want to apply some function 'f' to a datum wrapped with 'Just' and return 'Just (f x)' or 'Nothing', if there's no datum. If there is a datum, we need to unwrap it, apply a function, and then wrap it up again into 'Just'. There is a function that does exactly this: fmap :: (a -> b) -> Maybe a -> Maybe b fmap f (Just x) = Just $ f x fmap _ Nothing = Nothing (Actually, 'fmap' is not really defined like this, rather a bit more generic, but more on this later.) We now can define: > customerCity2 :: Customer -> Maybe String > customerCity2 = fmap city . customerAddress > customerStreet2 :: Customer -> Maybe (String,Int) > customerStreet2 = fmap (\(Address s n _ _) -> (s,n)) . customerAddress