University of Zagreb
Faculty of Electrical Engineering and Computing

PROGRAMMING IN HASKELL

Academic Year 2016/2017

LECTURE 5: Recursive functions 1

v1.1

(c) 2016 Jan Snajder

==============================================================================

> import Data.Char
> import Data.List

== RECAP =====================================================================

Last week we covered the full syntax of Haskell functions. You now know how
to write arbitrarily complex functions, at least in terms of syntax. However,
to really unleash the full computational power of Haskell, you need to know
how to write recursive functions.

== INTRO =====================================================================

Recursion: a function calls itself. In FP, many problems are solved using
recursion. The main idea: divide the problem into smaller subproblems, and
keep reducing these until they become simple enough to solve directly. The
simplest case is called "the base case".

A common example is the factorial function:

> fact :: (Eq a, Num a) => a -> a
> fact x = if x==0 then 1 else x * fact (x-1)

Or, better:

> fact' :: (Eq a, Num a) => a -> a
> fact' 0 = 1
> fact' x = x * fact' (x-1)

Another typical example are the Fibonacci numbers:

> fib 0 = 0
> fib 1 = 1
> fib n = fib (n-1) + fib (n-2)

(This definition is not the best one. Do you know why?)

Haskellers are quite proud of the quicksort definition:

> quicksort :: (Ord a) => [a] -> [a]
> quicksort []     = []
> quicksort (x:xs) = quicksort ys ++ [x] ++ quicksort zs
>   where ys = [y | y <- xs, y <= x]
>         zs = [z | z <- xs, z > x]

In what follows, we focus on recursion over a data structure. This is
typically a LIST or a TREE, but more generally it can be any recursive
structure. We call such a recursion STRUCTURAL RECURSION.

We use structural recursion to process a data structure (something we would
do using a loop in an imperative language). The main idea: recurse down the
structure by gradually decomposing it using pattern matching, and combining
the results:

> sum' :: Num a => [a] -> a
> sum' []     = 0
> sum' (x:xs) = x + sum' xs

> length' :: [a] -> Int
> length' []     = 0
> length' (_:xs) = 1 + length' xs

> incList :: Num a => [a] -> [a]
> incList []     = []
> incList (x:xs) = (x + 1) : incList xs

(The last function can be defined via list comprehension. How?)

> concat' :: [[a]] -> [a]
> concat' []       = []
> concat' (xs:xss) = xs ++ concat' xss

> maximum' :: Ord a => [a] -> a
> maximum' [x]    = x
> maximum' (x:xs) = x `max` maximum' xs

(What would happen if we were to apply this function to an empty list?)

What is the time complexity of the above functions (excluding 'fib' and
'quicksort')? For lists of length 'n', the time complexity is O(n).

Notice that there is a recurring pattern in the above functions:

  foo ...                             <-- base case
  foo (x:xs) = f x `operator` foo xs  <-- general case
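To see the pattern instantiated once more, here is a minimal sketch of our
own (the function 'countEven' is not part of the lecture material): counting
the even numbers in a list uses a conditional in place of 'f' and addition
as the combining operator:

> countEven :: Integral a => [a] -> Int
> countEven []     = 0                                         -- base case
> countEven (x:xs) = (if even x then 1 else 0) + countEven xs  -- general case

  countEven [1,2,3,4] => 2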
== EXERCISE 1 ================================================================

1.1.
- Define a recursive function to compute the product of a list of elements.

  product' :: Num a => [a] -> a

1.2.
- Define a recursive function 'headsOf' that takes a list of lists and
  returns a list of their heads.

  headsOf :: [[a]] -> [a]
  headsOf [[1,2,3],[4,5],[6]] => [1,4,6]

==============================================================================

A recursive function can of course have many arguments. Arguments that remain
unchanged throughout the recursive calls serve merely to store the state. We
call such arguments CONTEXT VARIABLES. For example:

> addToList :: Num a => a -> [a] -> [a]
> addToList _ []     = []
> addToList n (x:xs) = x + n : addToList n xs

Of course, if required, we can change the variables in each recursive call:

> incIncList :: Num a => a -> [a] -> [a]
> incIncList _ []     = []
> incIncList n (x:xs) = x + n : incIncList (n+1) xs

What if we wanted to define a function that increments the first element by
0, the second by 1, etc.?

  incIncList' [3,2,1] => [3,3,3]

We don't want the user to always have to provide 0 as the initial argument.
Instead, we define a WRAPPER FUNCTION:

> incIncList' :: Num a => [a] -> [a]
> incIncList' xs = inc 0 xs
>   where inc _ []     = []
>         inc n (x:xs) = x + n : inc (n+1) xs

== EXERCISE 2 ================================================================

2.1.
- Define a recursive function 'modMult n m xs' that multiplies each element
  of a list 'xs' by 'n', modulo 'm'.

2.2.
- Define a function 'addPredecessor' that adds to each element of a list the
  value of the preceding element. The first element gets no value added.

  addPredecessor :: Num a => [a] -> [a]
  addPredecessor [3,2,1] => [3,5,3]

==============================================================================

In the recursive case we can test additional conditions and act based on
these. This is how we can implement the filtering of a list:

> numPositives :: (Num a, Ord a) => [a] -> Int
> numPositives []     = 0
> numPositives (x:xs) | x > 0     = 1 + numPositives xs
>                     | otherwise = numPositives xs

== EXERCISE 3 ================================================================

3.1.
- Define 'equalTriplets' that filters from a list of triplets (x,y,z) all
  triplets for which x==y==z.

  equalTriplets [(1,2,3),(2,2,2),(4,5,6)] => [(2,2,2)]

3.2.
- Define your own version of the replicate function:

  replicate' :: Int -> a -> [a]

==============================================================================

Let's define 'take':

> take' :: Int -> [a] -> [a]
> take' 0 _      = []
> take' _ []     = []
> take' n (x:xs) = x : take' (n-1) xs

Does this work as expected if n < 0 (does it return an unaltered list)?

How can we extend the above definition so that, if n > length xs, the last
element of the list gets repeated?

  take'' 5 [1,2,3] => [1,2,3,3,3]

How would you define the same function using standard functions from Prelude?

== EXERCISE 4 ================================================================

4.1.
- Define your own recursive version of the drop function:

  drop' :: Int -> [a] -> [a]

- Define drop'' (a wrapper function) so that for n < 0 the function drops
  the elements from the end of the list. You can use 'reverse'.

4.2.
- Define a recursive function 'takeFromTo n1 n2 xs'.

  takeFromTo :: Int -> Int -> [a] -> [a]

==============================================================================

Here's how the 'zip' function is defined:

> zip' :: [a] -> [b] -> [(a,b)]
> zip' []     _      = []
> zip' _      []     = []
> zip' (x:xs) (y:ys) = (x, y) : zip' xs ys

How can we extend this so that it only pairs up (x,y) for which x==y?

We don't always need to process the elements one by one. For example, a
function that takes a list and pairs up the consecutive elements would be
defined like this:

> pairUp :: [a] -> [(a,a)]
> pairUp (x:y:xs) = (x, y) : pairUp xs
> pairUp _        = []
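A related sketch of our own (the function 'pairUpOverlap' is not part of the
lecture material): by putting one of the matched elements back into the
recursive call, we can pair up the elements with overlap, so that every
element except the first and the last appears in two pairs:

> pairUpOverlap :: [a] -> [(a,a)]
> pairUpOverlap (x:y:xs) = (x, y) : pairUpOverlap (y:xs)
> pairUpOverlap _        = []

  pairUpOverlap [1,2,3,4] => [(1,2),(2,3),(3,4)]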
== EXERCISE 5 ================================================================

5.1.
- Define a recursive function 'eachThird' that retains every third element
  in a list.

  eachThird :: [a] -> [a]
  eachThird "zagreb" => "gb"

5.2.
- Define a recursive function 'crossZip' that zips two lists in a "crossing"
  manner:

  crossZip [1,2,3,4,5] [4,5,6,7,8] => [(1,5),(2,4),(3,7),(4,6)]

== ACCUMULATORS ==============================================================

Let's look at the definition of the factorial function again:

> fact1 :: (Eq a, Num a) => a -> a
> fact1 0 = 1
> fact1 x = x * fact1 (x-1)

This function is executed as follows: we recurse down until we "hit" the
base case, and then build up the result incrementally as we return from the
recursive calls. In other words, the result is built up on our way back.

But another solution is possible, one in which we recurse down and
"accumulate" the solution incrementally as we descend. When we "hit" the
base case, we simply return the solution accumulated thus far. There is no
need to go back, because there is nothing left to be done on the way back.
We can simply "jump out" of the recursion.

> fact2 :: (Eq a, Num a) => a -> a -> a  -- the second arg. is the accumulator
> fact2 0 n = n
> fact2 x n = fact2 (x-1) (x*n)

We also need a wrapper function that will set the initial value of the
accumulator (which equals one):

> fact3 :: (Eq a, Num a) => a -> a
> fact3 x = fact2 x 1

All recursive functions defined above were defined using "standard" recursion
(without an accumulator). Many of them can be redefined using an accumulator.
For instance, instead of:

> sum1 :: Num a => [a] -> a
> sum1 []     = 0
> sum1 (x:xs) = x + sum1 xs

we can write:

> sum2 :: Num a => [a] -> a
> sum2 xs = sum xs 0
>   where sum []     s = s  -- 's' is the accumulator
>         sum (x:xs) s = sum xs (x + s)

But why? Isn't it all the same? It's not. In principle, the accumulator-style
definitions have lower complexity (space complexity, but sometimes also time
complexity).

A somewhat lengthy explanation: When a recursive call is a part of a larger
expression, to be able to compute the value of the expression we first have
to invoke the recursive function. It is only after the recursive call returns
that we can compute the value of the whole expression. Conceptually, this
means that we gradually build up a larger and larger expression, and we can
start reducing this expression only after we've reached the base case.
(Technically, this is accomplished by storing the context and the return
address on the stack for each recursive call.) The intermediate structure
grows with each recursive call. If there is only one recursive call, and it's
invoked 'n' times, then the space complexity will be O(n).

On the other hand, if the recursive call is NOT a part of a larger
expression, then no structure is built up as we recurse down. In this case,
we don't even need to store the return address on the stack. Instead, we can
simply make the recursive call with new parameters (as if we were using a
GOTO with parameters), because there is nothing waiting to be computed AFTER
the recursive call completes. We call such recursive functions TAIL
RECURSIVE. The compiler will detect that a function is tail recursive and
optimize the code accordingly (TAIL CALL OPTIMIZATION). Since we don't have
to store anything on the stack, the space complexity is constant, O(1).

Brent Yorgey: "A recursive function is tail recursive if the final result of
the recursive call is the final result of the function itself. If the result
of the recursive call must be further processed (say, by adding 1 to it, or
consing another element onto the beginning of it), it is not tail recursive."

http://www.haskell.org/pipermail/haskell-cafe/2009-March/058607.html
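To make the difference concrete, here is an informal hand evaluation of the
two definitions (our own sketch, where 'sum' is the local helper of 'sum2';
we ignore laziness for the moment, but see the caveat on strict accumulators
at the end of this section):

  sum1 [1,2,3] = 1 + sum1 [2,3]
               = 1 + (2 + sum1 [3])
               = 1 + (2 + (3 + sum1 []))
               = 1 + (2 + (3 + 0)) = 6   -- pending expression grows: O(n)

  sum2 [1,2,3] = sum [1,2,3] 0
               = sum [2,3] 1
               = sum [3] 3
               = sum [] 6 = 6            -- nothing pending: O(1)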
Thus: 'sum1' has a space complexity of O(n), while 'sum2' has a space
complexity of O(1).

Let's define an accumulator-style version of 'maximum'. The standard version
is:

> maximum1 :: Ord a => [a] -> a
> maximum1 []     = error "empty list"
> maximum1 [x]    = x
> maximum1 (x:xs) = x `max` maximum1 xs

Accumulator-style version:

> maximum2 :: (Ord a, Num a) => [a] -> a
> maximum2 [] = error "empty list"
> maximum2 xs = maximum xs 0
>   where maximum []     m = m
>         maximum (x:xs) m = maximum xs (max x m)

There's actually no need to limit ourselves to the 'Num' typeclass here
(what is more, starting from 0 gives a wrong result for lists that contain
only negative numbers), so let's give a slightly more generic definition:

> maximum3 :: (Ord a, Bounded a) => [a] -> a
> maximum3 [] = error "empty list"
> maximum3 xs = maximum xs minBound
>   where maximum []     m = m
>         maximum (x:xs) m = maximum xs (max x m)

Actually, a more clever definition avoids using a bound altogether:

> maximum4 :: (Ord a) => [a] -> a
> maximum4 []     = error "empty list"
> maximum4 (x:xs) = maximum xs x
>   where maximum []     m = m
>         maximum (x:xs) m = maximum xs (max x m)

The difference between standard and accumulator-style recursion is quite
drastic in the case of the 'reverse' function. The standard definition is as
follows:

> reverse1 :: [a] -> [a]
> reverse1 []     = []
> reverse1 (x:xs) = reverse1 xs ++ [x]

Space complexity is O(n), but time complexity is as much as O(n^2). Can you
say why? (Hint: '++' takes time proportional to the length of its first
argument.)

Accumulator-style version:

> reverse2 :: [a] -> [a]
> reverse2 xs = rev xs []
>   where rev []     ys = ys
>         rev (x:xs) ys = rev xs (x:ys)

What is the complexity of this function?

Another advantage of accumulator-style definitions is that they are sometimes
easier to comprehend (they are closer to imperative-style programming).

So it seems that using accumulator-style recursion is a good idea. But there
is a caveat. Because Haskell is lazy, the accumulator will not be fully
evaluated along the way; instead, it will gradually grow in memory as an
unevaluated expression, causing a MEMORY LEAK. Thus, using accumulators only
makes sense if they are STRICT. More on this in later lectures.

== EXERCISE 6 ================================================================

6.1.
- Write an accumulator-style recursive definition of

  length' :: [a] -> Int

6.2.
- Write an accumulator-style recursive definition of

  maxUnzip :: [(Int, Int)] -> (Int, Int)

  that returns the maximum element at the first position and the maximum
  element at the second position in a pair, i.e., it's equivalent to:

  maxUnzip zs = (maximum xs, maximum ys)
    where (xs,ys) = unzip zs

  If the list is empty, return an "empty list" error.

- Now write a standard recursive definition (without an accumulator).

== NEXT ======================================================================

Next week, we'll continue talking about recursion, and also look into how to
make Haskell less lazy and more strict.
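As a small teaser for that discussion, here is a sketch of our own (the
function 'sumStrict' is not part of the lecture material; it uses the
standard function 'seq' to force evaluation): an accumulator-style sum whose
accumulator is evaluated at every step, so that no unevaluated expressions
pile up in memory:

> sumStrict :: Num a => [a] -> a
> sumStrict xs = go 0 xs
>   where go s []     = s                     -- 's' is the accumulator
>         go s (x:xs) = let s' = s + x
>                       in s' `seq` go s' xs  -- force s' before recursing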