Difference between revisions of "Quick impression"

From Clean
Jump to navigationJump to search
Line 142: Line 142:
 
Note that we have used the pattern ("Amsterdam",to,dist) in the generator of this list expression. Only elements of the list that match this pattern contribute to the result of this list comprehension.
 
Note that we have used the pattern ("Amsterdam",to,dist) in the generator of this list expression. Only elements of the list that match this pattern contribute to the result of this list comprehension.
  
As a matter of fact you might not be satisfied with this program. You will usually read a distance table like in two directions. The fact that the distance form Paris to Amsterdam is 490 Km usually implies that the distance between Amsterdam and Paris is also 490 Km. Nevertheless the tuple ("Paris","Amsterdam",490) does not match the pattern. We can solve this by considering also the tuples from the table where the towns from and to are flipped:
+
As a matter of fact you might not be satisfied with this program. You will usually read a distance table like in two directions. The fact that the distance form Paris to Amsterdam is 490 km usually implies that the distance between Amsterdam and Paris is also 490 km. Nevertheless the tuple ("Paris","Amsterdam",490) does not match the pattern. We can solve this by considering also the tuples from the table where the towns from and to are flipped:
  
 
  Start =  
 
  Start =  
Line 175: Line 175:
 
           ]
 
           ]
  
Since we have no idea what the upper bound of the number of towns reachable from Amsterdam is, we have omitted the upper bound of the dot dot expression [1..]. This dot dot expression denotes an infinite list of integers. Thanks to the lazy evaluation strategy of Clean we can handle such potential infinite datastructures. Lazy evaluation implies that an expression is only evaluated as far as needed to produce the result of the program.
+
Since we have no idea what the upper bound of the number of towns reachable from Amsterdam is, we have omitted the upper bound of the dot dot expression [1..]. This dot dot expression denotes an infinite list of integers. Thanks to the lazy evaluation strategy of Clean we can handle such potentially infinite datastructures. Lazy evaluation implies that an expression is only evaluated as far as needed to produce the result of the program.
  
 
We will illustrate the use of list comprehensions by a number of additional examples below.
 
We will illustrate the use of list comprehensions by a number of additional examples below.

Revision as of 07:46, 21 July 2010

We show here some very elementary examples of Clean for those people who have never seen a functional programming language before or for those of you who just want to have an impression how Clean code looks like in order to compare it with other functional languages.

Although the real power of the language only pops up in serious applications, these tiny examples do give an impression of the compact and elegant language constructs. For a complete language description we refer to the Clean Documentation. A thorough introduction to functional programming in Clean can be found in the Documentation.

The text of this overview is copied from the Hilt web pages. The following language constructs are briefly discussed: Like any new language, Clean contains special syntax and language constructs that may appear unfamiliar to you. Please do take the time to read the explanations when you are puzzled by the program code in the examples. The nature of this language is somewhat different than the character of C, C++ and Java. Do not blame the language when you do not understand its constructs at first glance. It is much easier to learn functional programming in Clean than to become a equally qualified programmer in C++ or Java.

We assume that you are familiar with the basic notions of programming.

Basic function construction

Function definitions in functional languages like Clean look very similar to function definitions in mathematics. Much of the syntactical ballast that is customary in many programming languages can be avoided. Even the parenthesis surrounding the arguments may be avoided. The function to square numbers can be defined as:

square :: Int -> Int // The type of the function: from Int to Int
square n = n * n     // The value is the argument multiplied by itself.

The first line gives the type of this function: one integer as argument and an integers as result. The second line defines how the result of the application of the function square applied to an arbitrary argument n should be computed: multiply that arguments by itself.

To express choice one can define a function in several function alternatives. For instance the factorial function can be defined as:

fac :: Int -> Int 
fac 0 = 1
fac n = n * fac (n-1)

In such a definition, the first alternative that is applicable to the actual arguments will be used. This strategy forces that the first alternative to be applied to the expression fac 0, although also the second alternative matches this expression. Each program in Clean computes the value of the expression Start. By providing an appropriate definition for this function, the value of any expression can be computed. The factorial of six is computed by the program:

Start :: Int 
Start = fac 6

The value of the Start expression is shown to the user. For this example the program will write 720 to the console. Since the value of this expression is shown to the user, the famous "Hello world"-program is just:

Start :: String 
Start = "Hello world"

There are of course many functions that require more than one argument. A simple example is the function that raises x to the power n for some integers x and n.

power :: Int Int -> Int 
power x 0 = 1
power x n = x * power x (n-1)

This function is usually indicated by the infix operator ^ in programming languages. An infix operator is in fact just an ordinary function of two arguments that is written between its arguments: x^n. In Clean you can define your own infix operators. An infix operator is defined just like every other function. There are two slight differences: in the type of the function we indicate that it is an infix operator and the operator is written as an ordinary function between parenthesis in the left-hand side of the definition.

(^) infixr 8 :: Int Int -> Int 
(^) x 0 = 1
(^) x n = x * x ^ (n-1)

The keyword infixr in the type of declaration indicates that this function is a right associative infix operator. The number 8 indicates the binding power. This binding power is chosen in accordance to the mathematical habits. This implies that there are no parenthesis needed in the body of the second alternative of the operator to distinguish (x*x)^(n-1) and x*(x^(n-1)). In fact this operator is defined as a type class in the standard environment of Clean.

Guards

Although distinguishing various cases in the definition of a function by patterns is very powerful, it is not enough. For instance, using patterns it is impossible to check whether an integer argument is positive, or larger than another argument. This can be solved using guards. A guard is a boolean expression that can be inserted between the patterns of a function alternative and the symbol =. The symbol | separates the patterns and the guard. The alternative is only applied when the guards yields True. Each function alternative can have a sequence of guarded right-hand sides. Consider the function maximum as example, this function yields the largest of its two arguments.

maximum :: Int Int -> Int
maximum n m
  | n < m  = m
  | n >= m = n

When you know that the last guard inside such a function alternative will always yield True, as in the given example, it can be omitted or replaced by the keyword otherwise. The type of the function maximum indicates that it can be used for any two elements of the same type t provided that arguments of this type can be compared.

Higher order functions

Functions can be the arguments and results of other functions. As a somewhat extreme example we consider the function twice. The functions twice takes an other function, f, and an argument, x, for f as arguments. The function f is applied twice to the arguments x:

twice f x = f (f x)

Using this function we can compute the square of the square of 2 by the program:

Start = twice square 2

This program yields 16. In the same style we can compute the value of square (square (square (square 2))) by:

Start = twice twice square 2

This program yields 65536.

Just like polymorphism and type classes, higher order functions makes programs easier to understand, modify and reuse.

Polymorphism

For many functions the type of the argument is irrelevant. The simplest example is the identity function I. The argument of this function is its result. The type of this argument is irrelevant. To indicate this we use a type variable in the type definition of the type of such a polymorphic (having many forms) function.

I :: t -> t  // Type: from type t to type t.
I x = x

Also the function twice defined above is polymorphic. In our examples we used it with an x of type Int as well as an x of type Int -> Int. In effect the type of the function square defined above is more restrictive than necessary. This function can be used for any type t, provided that the multiplication is defined for that type t. This is indicated by a type restriction of the form | * t. The more general type is used in the next definition for the function square:

square :: t -> t | * t // Type: from t to t provided that there is a * for t.
square n = n*n

Also the function maximum defined above can have a more general type:

maximum :: t t -> t | < t
maximum n m
  | n < m  = m
  | n >= m = n

The Clean System knows how to compute greater or equal, >=, when less than, <, is defined. Functions that can process arguments of various types are called polymorphic. When there is a type restriction, like * t, on one or more of the type variables involved we have an implicit type class rather than a polymorphic function.

Lists

Lists are one of the basic compound data structures of Clean. A list can have any number of elements. All elements of a list should have the same type. Since lists are so commonly used in functional programming, there is some special syntax for lists. A list of integers can be written as an enumeration: [7,12] or [1,2,3,4,5]. It is also possible to use the list generator like [1..5] in expressions. Such a phrase is called a dot dot expression The empty list is written as []. Finally, a list with head element x and tail xs is written as [x:xs] (note the difference between [x,xs] (a list of two elements) and [x:xs]).

This list notation can be used in the patterns of a function. For instance, the function product that computes the product of the elements of a list. The type declaration states that this function takes a list of an arbitrary type t as argument and produces an value of type t, provided that type t is member of type class one and type class *. This just implies that multiplication and its unit element, one, should be defined for type t.

product :: [t] -> t | one, * t 
product []    = one 
product [x:r] = x * product r

The first alternative of this function states that the product of a empty list is one. The second alternative states that the product of a list consisting of element x and tail r is equal to the multiplication of the first element and the product of the tail of the list. The compiler selects the appropriate instances of the classes one and * for each application.

This can be used to give an alternative definition of the factorial function. The factorial of some number n is equal to the product of the list of numbers from 1 to n.

fac :: Int -> Int 
fac n = product [1..n]

The same function product can be used for many other types, for instance to compute the product of a list of reals.

List comprehensions

The list comprehensions in the functional programming language Clean are a very compact and powerful way to express list manipulation. For example the square of all integers from 1 to 10 that are not dividable by three are computed by the program:

Start = [n*n \\ n <- [1..10] | n rem 3 <> 0]

As desired this program yields [1,4,16,25,49,64,100]. In general a list comprehension yields the list of values indicated between [ and \\. These values are computed for each element of the generators (between // and |), that obeys the condition between | and ]. For more realistic program we define a distance table as a list of tuples:

:: From :== String
:: To   :== String
:: Km   :== Int
table :: [(From, To, KM)]
table = [ ("Amsterdam", "Nijmegen",122 )
        , ("Paris"    , "Amsterdam" ,490 )
        , ("Paris"    , "Rome"      ,1140)
        , ("Berlin"   , "Amsterdam" ,705 )
        , ("Amsterdam", "Kobenhaven",764 )
        , ("Amsterdam", "Rome"      ,1640)
        , ("Moscow"   , "Amsterdam" ,2523)
        ]

Using a list comprehension we can select all towns closer than 1000 km to Amsterdam, with the distance, by the program:

Start = 
  [  (to, dist)
  \\ ("Amsterdam", to, dist) <- table
  |  dist < 1000
  ]

Note that we have used the pattern ("Amsterdam",to,dist) in the generator of this list expression. Only elements of the list that match this pattern contribute to the result of this list comprehension.

As a matter of fact you might not be satisfied with this program. You will usually read a distance table like in two directions. The fact that the distance form Paris to Amsterdam is 490 km usually implies that the distance between Amsterdam and Paris is also 490 km. Nevertheless the tuple ("Paris","Amsterdam",490) does not match the pattern. We can solve this by considering also the tuples from the table where the towns from and to are flipped:

Start = 
  [  (to, dist)
  \\ ("Amsterdam", to, dist) <- table ++ [(t2, t1, d) \\ (t1, t2, d) <- table]
  |  dist < 1000
  ]

These examples have a single generator, a source of values to compute the list elements. In general there can be many generators. For instance we can compute the towns that can be reached from Amsterdam via some other town.

Start =
  [  (to, dist1 + dist2)
  \\ ("Amsterdam", t2, dist1) <- connections
  ,  (t3         , to, dist2) <- connections
  |  t2 == t3
  && to <> "Amsterdam"
  ]
  where connections = table ++ [(t2, t1, d) \\ (t1, t2, d) <- table] 

Here we have used a local definition to share the definition of the connections to be considered. The scope of local definitions is the function alternative where they are defined. Local definitions are used to limit the scope of definitions, to share a value or to name an expression. The local definitions used in this text are preceded by the keyword where. The language Clean has a rich palette of local definitions.

The last condition in the list comprehension excludes routes that ends in Amsterdam. In this example the generators are separated by a semicolon, this implies that all possible combinations of elements from the generators are considered. It is also possible to separate the generators by a &-symbol. This implies that the generators are used synchronously. This is for instance useful to label the towns reachable from Amsterdam with consecutive numbers.

Start =
  [  (n, to, dist)
  \\ (to, dist) <- fromAmsterdam
  &  n <- [1..]
  ]
  where fromAmsterdam = 
          [  (to, dist)
          \\ ("Amsterdam", to, dist) <- table ++ [(t2, t1, d) \\ (t1, t2, d) <- table]
          ]

Since we have no idea what the upper bound of the number of towns reachable from Amsterdam is, we have omitted the upper bound of the dot dot expression [1..]. This dot dot expression denotes an infinite list of integers. Thanks to the lazy evaluation strategy of Clean we can handle such potentially infinite datastructures. Lazy evaluation implies that an expression is only evaluated as far as needed to produce the result of the program.

We will illustrate the use of list comprehensions by a number of additional examples below.

Records

Clean has a notion of records similar to relational database systems and many modern programming languages. A simple example is the record type Product. Records of this type contain three fields: a product name of type String, a price of type Real, and a supplier of type String:

:: Product = { name     :: String 
             , price    :: Real 
             , supplier :: String 
             }

A concrete record of this type can be defined by specifying values for all fields. For example:

beer :: Product 
beer = { name     = "Grolsch Beer" 
       , price    = 0.80 
       , supplier = "Groenlo Brewery" 
       }

Of course it is allowed to use any type you want for the fields of a record. The compiler will always check the type consistency of all fields of any record that will occur in your program. A simple example of records with compound types is Order. The type of the field contents of the record type Order is a list of records of type Item. In database terms this is non-first normal form (NFNF).

Order = { customer :: String 
        , date     :: Date 
        , contents :: [Item] 
        } 
:: Item = { product  :: String 
          , quantity :: Int 
          }

An example of a small order of this type is:

myOrder :: Order 
myOrder = { customer = "Pieter" 
          , date     = "30 jan '98" 
          , contents = [ { product  = "Grolsch Beer" 
                         , quantity = 5 
                         } 
                       , { product  = "Pizza" 
                         , quantity = 1 
                         } 
                       ] 
          }

List comprehensions are very useful to handle lists of these records. This is illustrated by the function total. This function takes an order and a list of products as arguments and computes the total price of the ordered items.

total :: Order [Product] -> Real 
total order products =
  sum [   toReal i.quantity * p.price 
      \\  i <- order.contents 
      ,   p <- products 
      |   p.name == i.product 
      ]

This is similar to the SQL query:

SELECT sum (i.quantity * p.price) 
FROM   myOrder.contents i, Product p 
WHERE  p.name = i.product

In contrast with embedded SQL in any host language, list comprehensions are completely integrated in the language Clean. The examples of this text show that list comprehensions can be part of recursive functions. The result of a list comprehension is an ordinary list. Such a list can be processed by any list processing function. This implies that one is not restricted to the famous five aggregate functions of SQL.

On the other hand it is important to realize that Clean is not a database system. For instance there are no primitives for transaction management, nor to handle extraordinary large lists efficiently.

Classes and overloading

Clean has a very powerful notion of classes. It is important to understand that this concept of classes is different from the classes you might know from object oriented languages. In Clean a class is a family of functions with the same name. The difference between these family members is the type processed. As a very simple example consider the class of increment functions.

class inc t :: t -> t

This says that the class inc has type variable t. There is only a single manipulation function in this class, which is also named inc. The type of this increment function is t -> t. Instances of this class for integers and reals are defined by:

instance inc Int where
  inc i = i+1 
instance inc Real where
  inc r = r+1.0

Even the record Item, an element of an order, can be incremented:

instance inc Item where 
  inc i = {i & quantity = inc i.quantity}

This reads the increment of item i, is that item with the field quantity set to the increment of that field from the argument record.

Also basic operations like +, == and < are defined as a type class, in fact we have used this to define the functions square and product shown above. This implies that it is possible to define these basic operators for your own datatypes. For instance we can define the comparison of products (the record Product form above) as the comparison of their names:

instance < Product where 
  (<) p1 p2 = p1.name < p2.name

Using list comprehensions we can easily define the famous quick sort algorithm for lists of elements of the class <. The quick sort algorithm states that an empty list is always sorted. A non empty list is split in a fraction of elements smaller than the first element and a fraction of elements larger than or equal to the first element. These sub-lists are sorted separately and the resulting lists are glued together by the append operator ++:

qsort :: [t] -> [t] | < t 
qsort []    = [] 
qsort [e:r] = qsort [ x \\ x <- r | x<e ] 
              ++ [e] ++ 
              qsort [ x \\ x <- r | x>=e ]

The operator >= is derived from the operator <:

class >= a where
  (>=) infix 4 :: a a -> Bool | < a
  (>=) x y :== not (x<y)

This implies that >= is defined for a class as soon as the operator < is defined. This function qsort is able to sort lists of integers as well as list of products, and any other member of the class <. To be member of the same class, datatypes need not have any relation. It is sufficient that the appropriate functions are defined for that datatype. As you will understand now, the classes in Clean are more general than the notion of classes in most object oriented languages. It is easy to emulate the notion of classes from those object oriented languages using Clean's concept of classes. implicit classes

In fact you define a class each time you define a polymorphic function with a type restriction. Consider the function to add VAT to prices as integers as well as real numbers. This function converts its argument to a real number, computes the price with VAT, and converts the real number back to the original type.

addVAT :: t -> t | toReal, fromReal t
addVAT p = fromReal ((1.0 + VAT) * toReal p)
VAT :== 0.175

The exact type conversions needed are determined by the use of the function. In the program below we use this function to add VAT to an integers and a real number.

Start = (addVAT 100, addVAT 100.0)

This program yields (118, 117.5).

Algebraic Datatypes

In its simplest version an algebraic datatype is just an enumeration of the possible values. For instance the algebraic datatype Day contains the seven obvious constructors:

:: Day = Sun | Mon | Tue | Wed | Tur | Fri | Sat

The advantage of such an algebraic datatype above representing days by numbers or strings is that the compiler is able to guarantee that only valid values will occur during program execution.

Algebraic datatypes are much more general than only enumeration types. The constructors of such an algebraic type can take any number of arguments. These arguments can be any type, including the type of the constructor itself. So, algebraic types can be recursive. By providing one or more type variables as argument to the algebraic type, this type becomes polymorphic. The obvious example of an polymorph algebraic datatype is the type list:

:: List t = Nil | Cons t (List t)

Since lists are very heavily used in functional programming languages this type is predefined, as well as special syntax and language constructs for lists (like list comprehensions). Another example is the type

Tree to represent trees of any type. Such a tree is either Empty, or it is a Node containing a left sub-tree of type Tree a, an element of type a and a right sub-tree.

:: Tree a = Empty | Node (Tree a) a (Tree a)

Since Tree is a polymorph datatype any type of elements can be stored in a tree. However, the type system guarantees that each instance of tree contains only elements of a given type. For example a tree containing lists of integers is:

a_tree :: Tree [Int] 
a_tree = Node (Node Empty [] Empty) [1..3] Empty

The top node of this tree contain the list with the integers 1, 2 and 3, its left sub-tree contains only one node with an empty list of integers and the right sub-tree of the top node is empty.

Trees can be manipulated by ordinary functions. Often it is convenient to use different function alternatives for the various constructors and to use a pattern match to select the arguments of a constructor. For instance the function to flip a tree is defined as:

Flip :: (Tree a) -> Tree a 
Flip Empty = Empty 
Flip (Node left elem right) = Node (Flip right) elem (Flip left)

Flipping an empty tree is the empty tree. Flipping a tree that consists of a node with left sub-tree left, element elem and right sub-tree right, is another node. The left sub-tree of the new node is obtained by flipping the sub-tree right. The element in the node is the original element, and the right sub-tree is the flipped version of the original left sub-tree.

It is simple to define a function that transforms a tree to a list by a preorder traversal (left to right, depth first) of the tree.

toList :: (Tree t) -> [t] 
toList Empty           = [] 
toList (Node l elem r) = toList l ++ [elem] ++ toList r

In order to transform lists to sorted trees, search trees, we first define a function to insert a single element in a sorted tree. Of course it is requires that the elements of the tree can be compared. Hence, we have the type context < a.

insertTree :: a (Tree a) -> Tree a | < a 
insertTree x Empty = Node Empty x Empty 
insertTree x (Node l y r) 
  | x < y     = Node (insertTree x l) y r 
  | otherwise = Node l y (insertTree x r)

A list of elements can be transformed to a sorted tree by the function toTree. An empty list becomes an empty tree. A list with head hd and tail tl, is the tree that is obtained by inserting hd in the tree that is obtained by transforming tl to a tree.

toTree :: [a] -> Tree a | < a 
toTree []      = Empty 
toTree [hd:tl] = insertTree hd (toTree tl)

The function tree sort is the function composition (indicated by the infix operator o) of the functions toList and toTree. Function composition is defined identical to mathematics: (f o g) x = f (g x).

tsort :: ([a] -> [a]) | < a 
tsort = toList o toTree

The power of such a polymorph algebraic datatypes can be approximated by templates in languages like C++. However, algebraic datatypes in functional languages are much easier to understand and use (no pointer handling and dangling pointers), more general and completely type save.

Final remarks

Using a number of toy examples we have given you an impression of functional programming in Clean . Although the clear and compact notation might look a bit unfamiliar to you, it is much more than a curious toy. The terse syntax makes that you have a clear view over what is actually happening in this function.

As you might have noticed there are no side-effects. This referential transparency is a general property of the language: the value of an expression is only depended of the functions and their arguments that forms this term. This makes it possible to argue about programs by equational reasoning.

The static type system of Clean guarantees that runtime type errors cannot occur. This eliminates the need of extensive tests whether the program is free of this kind of errors. Verification of the type consistency by the compiler is faster and more secure. You can spent your time on validating or improving the behavior of the program.

Together these properties makes it much easier to write good algorithms, to understand what others have written, and to change existing functions. This makes functional languages like Clean very well suited for incremental or evolutionary development. Maintenance becomes much easier an reuse is finally feasible.