Notions of Category Theory i n Functional P r o g r a m m i n g by Shauna C. A . Gammon B.Sc, Memorial University of Newfoundland, 2005 A THESIS S U B M I T T E D IN P A R T I A L F U L F I L M E N T O F T H E R E Q U I R E M E N T S F O R T H E D E G R E E OF Master of Science in The Faculty of Graduate Studies (Mathematics) The University Of British Columbia August, 2007 © Shauna C. A . Gammon 2007 Abstract We present a detailed examination of applications of category theory to functional programming languages, with a primary focus on monads in Haskell. First, we explore E. Moggi's work in categorical semantics, which provides the theoretical foundation for employing monads in functional languages. In particular, we examine his use of Kleisli triples to model notions of computation. We then study P. Wadler's implementation of Moggi's ideas as a means to mimic side-effects in the purely functional language Haskell. We explicitly demonstrate the connections between Kleisli triples, categorytheoretic monads, and Haskell monads. It is our principal aim to provide a coherent translation between the abstracted concept of monads that exists in category theory, and the formulation of monads as type-constructors that is implemented in Haskell. n Table of Contents Abstract ii Table of Contents iii Acknowledgements v Dedication vi 1 Introduction 1.1 Background 1.2 Composition of Natural Transformations 1.2.1 Vertical Composition 1.2.2 Horizontal Composition 1.3 Monads in Category Theory 1.3.1 Definition 1.3.2 Examples 1.4 Category Theory in Haskell 1.4.1 Functional Languages and the A-Calculus 1.4.2 Cartesian Closed Categories 1.4.3 Types in Haskell 1.5 Haskell 1.5.1 Haskell's Type System 1 1 3 4 4 7 7 7 11 11 14 15 20 20 2 Moggi's Monads 2.1 Motivation for Monads in Haskell 2.1.1 Heuristics 2.2 Kleisli constructions 2.2.1 Example: State Updates 2.3 Triples 2.3.1 Example Revisited: State Updates 2.4 Strong Monads 23 23 23 25 28 30 33 36 iii Table of Contents 3 Wadler's Monads 3.0.1 The M a y b e M o n a d 3.0.2 Definition of a Monad 3.1 Examples 39 39 40 42 4 Equivalence 4.1 Preliminary Connections 4.1.1 Comparative Motivations for » = , j o i n , and .* . . . . 4.2 M o n a d Laws 4.2.1 Alternate Laws 4.2.2 Law Equivalence 4.3 Concluding Connections 4.3.1 List in Translation 4.3.2 Distributive Laws 51 51 54 55 55 58 61 61 62 Bibliography 69 iv Acknowledgements I would like to acknowledge my supervisor, John MacDonald, for his guidance, encouragement, and kindness. In addition, I am grateful to my second thesis examiner, Kee Lam, for his helpful suggestions. Finally, I would like to acknowledge Steve Wolfman and the ELMers for their enthusiastic and enlightening dinnertime discussions. Dedication To Leonard, Art & Adam. Chapter 1 Introduction We begin with a brief overview of the key concepts from category theory that will be needed to investigate monads. Notably, we will not provide an independent section pertaining to background Haskell material, choosing instead to offer relevant introductory concepts and syntax within the contexts they are needed. We refer the Haskell-interested reader to [4] for an excellent introduction. 1.1 Background A monoid (M, -,e) is a set M together w i t h an associative binary operation • : M x M —> M and identity element e £ M, such that V m 6 M,e • m = m = m • e. T h e properties of this familiar algebraic construct can be expressed v i a commutative diagrams—which abound i n category theory—as follows. L e t p\ : M x M —> M and p2 : M x M be projections defined by: V m , m ' € M, p\(m,m') = m = p2{m',m). We define a monoid M to be a set M together with functions fi : M x M —> M and n : 1 —> M, where 1 = {e} w i t h e € M, such that the following diagrams commute: MxMxM-^^MxM i x M - ^ M x M ^ M x l lixl M x M pi M M M M The theoretical settings i n which we work are categories: the structures central to the study of category theory. A category C is a 6-tuple (OQ, AC, dom, cod, o, id), where: •. Oc and Ac are sets of objects and morphisms (or arrows), respectively; • dom and cod are functions-that assign to each morphism f € A objects d o m / and cod/, the domain and codomain of /, respectively. For an arrow / with d o m / = a, c o d / = b, we write / : a —> b; 1 Chapter 1. Introduction • id is a function that assigns to each object c G Oc an identity morphism i d G Ac, w i t h i d : c —> c satisfying the (standard algebraic) identity law; • o is an associative binary operation that assigns to a pair of morphisms (/ : a —+ b, g : b —> c) G Ac x Ac a morphism 5 o / : a —> c. c c The associativity of morphism composition can be used to prove the following useful (albeit, simple) property of commutative digrams. Theorem 1.1.1. Let A,B,C,D,E,F G Oc and f,g,h,q,r,s,t G Ac, with domains and codomains as shown below. If the squares ABED and BCFE are commutative, then the outer rectangle ACFD, is also commutative. B D C E Proof. The commutativity of the left-hand square implies g°q = sof, and the commutativity of the right-hand square implies h o r = t o g. Thus, by the associativity of o, we have h o (r o q) = [h o r ) o q = (t o g) o q = t o (g o q) = t o (s o / ) = (t o s) o / . • M a n y constructions i n category theory capture set-theoretic ideas with the language of objects and arrows. A n example that we will encounter i n the context of computational semantics is that of a terminal object, which is the category-theoretic generalization of a singleton set. A n object 1 i n a category C is a terminal object if for every C-object c there is exactly one arrow from c to 1 i n C. A n y two terminal objects i n a category C must be isomorphic because if 1 and 1' are both terminal objects then there must exist C-arrows / : 1 —> l ' and g : l —> 1. Since 1 and l ' are terminal, idi and idj/ are the only arrows from 1 —> 1 and 1' —> 1', respectively. Thus, fog = idi, 0 o / = idi' and so / : 1 = 1'. 1 A functor is a structure-preserving map between categories. Explicitly, a (covariant) functor F : C —» V is a function that satsfies: • Vc G Oc, 3d G O , with F ( c ) = d; v • V / : c i -> c G A c 3 F ( / ) : F ( c i ) -> F ( c ) G A , 2 2 i. Vc G Oc, F ( i d ) = i d c F ( c ) v such that: ; ii. F(g o /) = F ( g ) o F(f), whenever g o f e Ac is defined. 2 Chapter 1. Introduction T h e r e are s e v e r a l f u n c t o r s t h a t w i l l p r o v i d e ' n i c e e x a m p l e s i n o u r s t u d y of m o n a d s . 1. F i r s t , let F : Set —> Grp m a p a set X g e n e r a t e d b y t h e e l e m e n t s o f X. X and Y G Set t o t h e free g r o u p F r e e ( X ) N o t e t h a t a n y f u n c t i o n b e t w e e n sets c a n b e v i e w e d as a m a p b e t w e e n t h e g e n e r a t o r s o f F r e e ( X ) a n d t h o s e o f F r e e ( y ) , w h i c h e x t e n d s t o w o r d s i n X, preserves c o m p o s i - t i o n a n d e q u i v a l e n c e s , a n d is t h e u n i q u e l y d e t e r m i n e d h o m o m o r p h i s m b e t w e e n these g r o u p s w i t h t h e s e e x t e n s i o n define F to be the free group functor properties. by setting is a n a r b i t r a r y f u n c t i o n b e t w e e n o b j e c t s i n Set T h u s , we c a n F(f) = hf, a n d hf where / is t h e c o r r e - s p o n d i n g g r o u p h o m o m o r p h i s m as d e s c r i b e d . 2. W e define a n o t h e r f u n c t o r U : Grp — > Set to be the forgetful functor, w h i c h m a p s a g r o u p (G, •) t o its u n d e r l y i n g set o f e l e m e n t s a n d a g r o u p h o m o m o r p h i s m to the c o r r e s p o n d i n g f u n c t i o n between the u n d e r l y i n g sets. 3. The power set functor V : Set — > Set its p o w e r set a n d a f u n c t i o n / V(f) : V{X) G i v e n functors -> V(Y) : X is d e f i n e d b y m a p p i n g a set —> Y i n d u c e d b y t a k i n g / - i m a g e s o f sets i n F,G : C —» T>, a we h a v e T(, O F(f) — G(f) V(X). natural transformation r : F —> G a m a p p i n g w i t h t h e p r o p e r t y t h a t V a G Oc, 3T b G Ac, to b e t w e e n sets t o t h e f u n c t i o n q G A-r> s u c h t h a t V / : a —+ o r . I n t u i t i v e l y , r c a n b e t h o u g h t o f as a a m a p t h a t uses t h e s t r u c t u r e o f T> t o s l i d e F - d i a g r a m s o n t o G - d i a g r a m s . T h e s i t u a t i o n is d e s c r i b e d b y t h e f o l l o w i n g d i a g r a m , i n w h i c h t h e s q u a r e is commutative: V Fa Gf Ff Fb 1.2 Ga Gb. Composition of Natural Transformations T h e d e f i n i t i o n of a m o n a d i n c l u d e s c o m p o s i t i o n s o f n a t u r a l t r a n s f o r m a tions. T h e r e are t w o w a y s t h a t n a t u r a l t r a n s f o r m a t i o n s c a n b e c o m b i n e d 3 is Chapter 1. Introduction ( d e p e n d i n g o n the d o m a i n a n d range functors of the n a t u r a l t r a n s f o r m a t i o n s i n v o l v e d ) , w h i c h we n o w i n t r o d u c e . 1.2.1 Vertical Composition G i v e n a n y t h r e e f u n c t o r s F,G, a : F —> G, T : G —> H, H :C T> a n d n a t u r a l t r a n s f o r m a t i o n s v i e w e d v e r t i c a l l y as V, H it is e a s y t o see h o w t o define a c o m p o s i t e F T —> H. C For any object : Gc —> He define ( r • a) c natural transformation r • a : c £ C, t h e r e e x i s t "D-arrows a c : Fc —> Gc and (the c c o m p o n e n t s o f t h e g i v e n n a t u r a l t r a n s f o r m a t i o n s ) . = r c o a c We to be the c o m p o n e n t s of the vertical c o m p o s i t i o n of n a t u r a l t r a n s f o r m a t i o n s a a n d r . T o s h o w t h a t s u c h p r o p o s e d c o m p o n e n t s for r • a i n d e e d c o m p r i s e a n a t u r a l t r a n s f o r m a t i o n , we m u s t s h o w c o m m u t i v i t y o f t h e o u t e r r e c t a n g l e i n f o l l o w i n g d i a g r a m , for / Fc Gc 4 Gf Ff He Hf Gd Fd : b —> c i n C: + Hd. T h e respective naturalities of a a n d T i m p l y the c o m m u t a t i v i t y of the left a n d r i g h t s q u a r e s , a n d t h e c o m m u t a t i v i t y of t h e o u t e r r e c t a n g l e i m m e d i a t e l y follows f r o m T h e o r e m 1.1.1. 1.2.2, Horizontal Composition W e n o w define a n o t h e r t y p e o f c o m p o s i t i o n for n a t u r a l t r a n s f o r m a t i o n s . Consider functors F,G : B —> C, H,J:C—^T>, and natural transformations a : F —> G, r : H - A J , v i e w e d h o r i z o n t a l l y as F H > G D e t a i l s are as follows: Gb, and C-arrow : Fb > J for a n y S - o b j e c t b, t h e r e e x i s t C - o b j e c t s Fb —> Gb. and W e h a v e t w o w a y s b y w h i c h we c a n m a p 4 Chapter 1. Introduction t h e s e i t e m s t o V; i n V, n a m e l y , v i a H o r J. T h u s , we h a v e t h e f o l l o w i n g d i a g r a m w h i c h is c o m m u t a t i v e b y t h e n a t u r a l i t y o f r : HFb JFb H<7 Jcr b b HGb -^*-> JGb T h a t is, H B i" C T> followed by B followed by B {« C C H I'D equals H C I' D B W e define t h e c o m p o n e n t s be the composition b components of t h e h o r i z o n t a l c o m p o s i t i o n of a a n d r o f a r r o w s f r o m HFb d i a g r a m ; e x p l i c i t l y , ( r o cr) V. = TQb to JGb H&b = J^b 0 as s h o w n in the TFb- T o v e r i f y t h a t these 0 d o c o m p r i s e a n a t u r a l t r a n s f o r m a t i o n , we m u s t s h o w t h a t f o l l o w i n g d i a g r a m c o m m u t e s for / to above the : b —> c i n B: i l ^ K job H F b HFf JGf HFc JGc. B u t given the definition of the c o m p o n e n t s for ( r o o~) , t h i s d i a g r a m c a n c be b r o k e n into two squares: HFb -^_» H F f Hcr c S i n c e a is n a t u r a l , we h a v e Gf a n d arrows to D w i t h the JGb HGf \ HFc HGb JGf TGc HGc o Cb = o~ ° Ff. f u n c t o r H, c JGc. B y m a p p i n g these objects we e s t a b l i s h the c o m m u t a t i v i t y of 5 Chapter 1. Introduction the left-hand square. S i n c e Gf : Gb —> G c is a C - a r r o w , t h e of r implies the c o m m u t a t i v i t y of the r i g h t - h a n d square. rectangle naturality T h u s , the above commutes. A Note on Notation W e take this o p p o r t u n i t y to emphasize a n o t a t i o n a l usage of key i m p o r t in the s t u d y of m o n a d s . C o n s i d e r the following arrangement: F > F' Let b £ TF'b 0 B - Then, p b : Fb is a n a r r o w i n V. -> F'b is a n a r r o w i n C a n d T(p ) : TF -+ TF' Tp, = IT 0 -+ denote the n a t u r a l {Tp) = T(p ). t r a n s f o r m a t i o n w i t h s u c h P - a r r o w s as its c o m p o n e n t s : t h u s r e c o g n i z e t h a t Tp : TFb b W e t h u s let Tp, b is a n e x a m p l e o f h o r i z o n t a l c o m p o s i t i o n . b We Specifically, p, w h e r e l r is t h e i d e n t i t y n a t u r a l t r a n s f o r m a t i o n o n t h e f u n c t o r 1 T: F T > > F' T N e x t , consider the arrangement: T B -^-> C~[^V. > T> A g a i n , let b £ 0 W e let vF : TF given precisely B - T h e n , Fb —+ T'F by the £ C and v b • TFb F -* T'Fb is a n a r r o w i n denote the n a t u r a l t r a n s f o r m a t i o n w i t h F - i m a g e components of v : {vF)b a n o t h e r e x a m p l e o f h o r i z o n t a l c o m p o s i t i o n , w h e r e vF F F This is ; T > F — vp^. = v o\ V. components > T 1 T h u s , w h i l e t h e s y m m e t r y o f s u c h n o t a t i o n s as 'pT' a n d 'Tp' reflects the c o n c e p t u a l s y m m e t r y of the h o r i z o n t a l c o m p o s i t i o n s discussed, care m u s t be t a k e n w h e n a p p l y i n g these m a p s d u e to the relatively a s y m m e t r i c n a t u r e of the c o m p o n e n t - w i s e formulations of the n a t u r a l t r a n s f o r m a t i o n s . 6 Chapter 1. Introduction 1.3 Monads in Category Theory 1.3.1 Definition W e n o w give t h e first f o r m u l a t i o n o f t h e c o n c e p t t h a t is o f p r i m a r y i n terest t o us. I n c a t e g o r y t h e o r y , a m o n a d i n a c a t e g o r y C is a m o n o i d i n the category of endofunctors of C , w i t h the b i n a r y o p e r a t i o n b e i n g functor c o m p o s i t i o n a n d the unit element b e i n g the identity endofunctor. W e can present the definition of a m o n a d d i a g r a m m a t i c a l l y using c o m m u t a t i v e d i a g r a m s like t h o s e g i v e n p r e v i o u s l y i n t h e d i a g r a m m a t i c d e f i n i t i o n of a m o n o i d . T h a t is, g i v e n f u n c t o r s IQ,T,T , 2 r\ : IQ —+ T, fi : T 3 : C —> C a n d n a t u r a l t r a n s f o r m a t i o n s —> T , t h e 3-tuple (T, T 2 rj, fi) is monad i n C if a the following diagrams commute: T 3 f l T T 2 IT >T T > ^ > T 2 < T TI v ll I'- T 2 M ' T = T. F o r n o t a t i o n a l b r e v i t y , we w i l l o f t e n d e n o t e a m o n a d (T, n, //,) b y its u n d e r l y ( i n g e n d o f u n c t o r T . I n a d d i t i o n , we w i l l refer t o t h e e q u a t i o n s c o r r e s p o n d i n g t o t h e left a n d r i g h t d i a g r a m s a b o v e as m o n a d l a w s 1 a n d 2, r e s p e c t i v e l y . 1.3.2 Examples 1. W e b e g i n b y w o r k i n g t h r o u g h t h e d e t a i l s o f a s i m p l e e x a m p l e . (M, •, e) b e a m o n o i d a n d T : Set M x X for a n y X £ Set. — > Set Let a f u n c t o r d e f i n e d b y TX D e f i n e t h e n a t u r a l t r a n s f o r m a t i o n s fix '• M = x M x I - > M x I b y ^ ( ( m , n,x)) = (m-n,x) a n d rjx : X —> M x X b y Vx{x) = (e,a;). T h i s gives t h e f o l l o w i n g s i t u a t i o n : r 2 Set- X ' Set > T(T X) (l,m,n,x) TX T(TX) 2 T (TX) (l,m,n,x) 2 i—> (l,m • n,x) » T(TX) t—» (I • m, n, x). i l r x 7 Chapter 1. Introduction We can thus demonstrate the commutativity of the first defining diagram for a monad by showing that elements are mapped via pertinent paths in the same way.: M x M x M x X M xM x X > illF)x {TlAx f-X M xM tX l xX (l,m,n,x) M x X > >lTX (l-m,n,x) MX T(Wf) (l,m-n,x) ^ x > (l-m-n,x). Similarly, for monad law 2 we have the situation: Set Set 1" X TX T(IX) (m,x) i—> TiTX) I(TX) (m,x) T{TX) <—> (e,m,x). (m, e,x) These mappings give rise to the following verification of commutativity: TX TX TX 2 t*x TX (m,x) TX : TX > (e,m,x) I'X (e -m,x) (m,e,x) < (m,x) fix = (m • e,: 8 Chapter 1. Introduction T h e r e f o r e , ( T , n, JJL) is a m o n a d . Let (P, < ) b e a p a r t i a l l y o r d e r e d set. of P t o b e o b j e c t s , If we c o n s i d e r t h e elements a n d t h e r e l a t i o n s h i p x < y w h e r e x, y G P t o b e e q u i v a l e n t t o a n a r r o w f r o m x t o y, t h e n P c a n b e v i e w e d as a category. B y the definition of a poset a n d t h a t of arrows i n the category P, there c a n b e at most one arrow between any two objects. a n y e n d o f u n c t o r d e f i n e d o n P is a m o n o t o n i c f u n c t i o n . Thus, Let T be s u c h a functor a n d observe t h a t the c o n d i t i o n s d e s c r i b i n g the n a t u r a l t r a n s f o r m a t i o n s rj a n d fi for a m o n a d are e q u i v a l e n t t o t h e c o n d i t i o n s x < Tx a n d T(T(x)) < T ( x ) , V x G P. B y a p p l y i n g T t o t h e first o f t h e s e i n e q u a l i t i e s , we h a v e T(x) < T(T(x)). C o m b i n e d w i t h t h e s e c o n d T(x) i n e q u a l i t y a n d t h e d e f i n i t i o n o f a p o s e t , we h a v e — T{T(x))\ t h a t is, a m o n a d o n a p o s e t P is a c l o s u r e o p e r a t i o n . R e c a l l t h e d e f i n i t i o n s o f t h e free g r o u p f u n c t o r F : Set —> G r p a n d t h e f o r g e t f u l f u n c t o r U : G r p —» Set. B y c o m p o s i n g t h e s e f u n c t o r s , we h a v e a n e n d o f u n c t o r T = U o F : Set —> Set, w h i c h m a p s a set X t o t h e u n d e r l y i n g set o f F r e e ( X ) . W e define a of finite l e n g t h o f e l e m e n t s f r o m XuX , -1 is a set o f f o r m a l inverses o f e l e m e n t s word i n where X~ x X - 1 i n X. T h e n TX under deletion of string segments ' r a - 1 X} is t h e set o f e q u i v a l e n c e classes o f w o r d s m a d e u p of s y m b o l s x a n d x X—equivalent to be a string = {x |a; G ' for a l l x G _ 1 a n d x~ x,' i 1 for a n y x G X. If w is a w o r d i n X, we d e n o t e its e q u i v a l e n c e class b y [w] a n d t y p i c a l l y refer t o t h e e n t i r e "equivalence class o f t h e w o r d w" as s i m p l y t h e " w o r d w." If we define rjx b y s e n d i n g x i—> [x] a n d fix b y t h e c o n c a t e n a t i o n of w o r d s i n TX, t h e n T is a m o n a d . N o t e t h a t a w o r d i n T X 2 of w o r d s f r o m TX, brackets, w i t h our notation. then [[2 yx][a; y]] _ 1 _ 1 is c o m p o s e d a n d concatenating words amounts to removing GTX 2 F o r example, if [z j/a;], G TX, _1 a n d ^ j v ( [ [ z y a ; ] [ a : j / ] ] ) = [z~ yxx~ y] = _1 _1 l 1 [z~ yy]l W e p r o v i d e illustrations of the c o m m u t a t i v i t y of the m o n a d d i a g r a m s . F o r m o n a d l a w 1: [[[xy][zxx}[y]][[x-'}[y]]} J^U [{xy\[zxx][y}[x-'}[y}\ PX [[xyzxxy}[x y}] 1 > l x > [xyzxxyx y] 1 9 Chapter 1. Introduction N o t i c e that the c o m m u t a t i v i t y of this d i a g r a m c a n be s u m m a r i z e d by s t a t i n g t h a t t h e r e m o v a l of i n n e r b r a c k e t s c a n b e d o n e i n e i t h e r o r d e r . F o r m o n a d l a w 2: TX * TX TX 2 P-x -. TX TX [xyz' -li VTX [[xyz' ]] [[xyz- ]] 1 [xyz- ] 1 v-x [xyz- H TX 1 n x - [xyz- H W h i l e the above d i a g r a m appears rather unenlightening, it is i n t e r - e s t i n g t o n o t e t h a t t h e m a p p i n g a l o n g t h e u p p e r left a r r o w a d d s the o u t e r set o f b r a c k e t s , w h i l e m a p p i n g v i a t h e u p p e r r i g h t a r r o w a d d s t h e i n n e r set. The power set m o n a d is d e s c r i b e d with natural transformations x € X,T]x{x) = {x}. D e f i n e \xx of L. Let L by the rj a n d fi, power defined set functor as follows. along For any b e a set w h o s e e l e m e n t s are s u b s e t s o f X . to be the m a p w h i c h takes L to the u n i o n of the elements Note that fix{L) C X, as is necessary. T h i s e x a m p l e l o o k s s i m i - lar t o t h a t p r e c e d i n g , b u t t h e r e are k e y differences w h i c h we i l l u s t r a t e in the e x a m p l e below. S p e c i f i c a l l y , ^z's d e f i n i t i o n i n v o l v e s t a k i n g u n i o n o f s u b s e t s ; t h e r e f o r e , e l e m e n t s t h a t are m e m b e r s o f m o r e the than o n e s u b s e t w i l l s h o w u p as o n l y a s i n g l e e l e m e n t i n t h e u n i o n . I n t h e case of s t r i n g c o n c a t e n a t i o n , a n y ' r e p e a t e d ' e l e m e n t is n o t i n t o a s i n g l e c o p y of t h a t e l e m e n t , b u t c a n c e l l a t i o n s condensed c a n o c c u r i n ac- c o r d w i t h the given equivalence r e l a t i o n of w o r d r e d u c t i o n . Further, u n d e r the o p e r a t i o n of u n i o n , the order of elements in the resultant set is u n i m p o r t a n t ; b y c o n t r a s t , t h e o r d e r is i m p o r t a n t i n s t r i n g c o n catenation. C o m p a r e the d i a g r a m s b e l o w to those of the preceding example. 10 Chapter 1. Introduction {{{x,y}{z,x}{y}}{{x-i}{y}}} J^U {{x,y}{z,x}{y}{x^}} T(Mx) MX {{x,y,z}{x~ ,y}} l {x,y,z^} ™U >XX { { W ^ } } 1 I jy 1.4 { { W MX {x ,z^} » {x,x~ ,y,z} l 1 } } { W 1 } iix = { ,y,z- }. 1 x Category Theory i n Haskell T h e m a i n focus o f o u r s t u d y is t h e a p p l i c a t i o n o f c a t e g o r y t h e o r y t o t h e p r o g r a m m i n g l a n g u a g e H a s k e l l . H a s k e l l is a p o l y m o r p h i c a l l y t y p e d , p u r e l y functional language. In the appropriate contexts, we w i l l define each of these terms. W e b e g i n w i t h a brief discussion of the f u n d a m e n t a l c o n n e c t i o n between category theory a n d the functional p r o g r a m i n g p a r a d i g m . 1.4.1 Ftinctional Languages and the A-Calculus T r a d i t i o n a l approaches to the design a n d i m p l e m e n t a t i o n of p r o g r a m m i n g languages have been centred a r o u n d the i d e a of a a s s o c i a t i o n b e t w e e n a n a m e a n d v a l u e s [9]. perative variable: S u c h languages a modifiable are c a l l e d of c o m m a n d s . E a c h c o m m a n d u s u a l l y c o n s i s t s o f t h e e v a l u a t i n g o f a n ex- pression a n d assigning the r e t u r n value to a variable's n a m e . Moreover, a c o m m a n d f o u n d at a n y s t e p o f t h e c o m p u t a t i o n s e q u e n c e c a n b e on im- b e c a u s e p r o g r a m s b a s e d o n t h i s c o n c e p t are c o m p r i s e d o f s e q u e n c e s dependent variables u s e d — a n d potentially m o d i f i e d — i n preceding c o m m a n d s . In t h i s way, v a l u e s c a n b e p a s s e d d o w n t h r o u g h a c o m m a n d s e q u e n c e , a n d t h e v a l u e a s s i g n e d to a n a m e i n o n e c o m m a n d c a n b e r e p l a c e d b y a n e w v a l u e in a subsequent command. T h e s e ideas i m p l y t h a t t h e o r d e r o f c o m m a n d e x e c u t i o n t y p i c a l l y affects a c o m p u t a t i o n r e s u l t . In contrast, the functional p r o g r a m m i n g p a r a d i g m is f o u n d e d o n s t r u c - t u r e d f u n c t i o n calls, a f u n c t i o n call b e i n g a f u n c t i o n n a m e followed by a list o f a r g u m e n t s i n p a r e n t h e s e s . of n e s t e d P r o g r a m s i n functional languages consist f u n c t i o n c a l l s o r , i n o t h e r w o r d s , are c o m p o s i t i o n s o f f u n c t i o n s . 11 Chapter 1. Introduction As such, function names behave as they do in the context of mathematical functions: they are formal parameters of other functions and are associated with values through calls to functions that have been given actual values. Once an association between a function name and a value has been made, it cannot be modified. In summary, a functional program is a single expression and executing such a program corresponds to evaluating the expression. This mathematical approach offers the following advantages to programming with functional languages [14]: • Declarativeness — The order of function execution need not be specified by the programmer but rather can be inferred by the system based on relations that must be satisfied by evaluated functions. • Referential Transparency — If two functions f and g produce the same values when given the same arguments, then any occurence of f can be replaced by g without changing the meaning of the program. • Higher-Order Functions — Functions can accept functions as arguments and can output new functions. • Polymorphism — Individual algorithms can be written in such a way that they apply to several types of input (in certain cases, this is also referred to as function overloading). This is somewhat akin to saying that we can define the "same function" on different domains: an idea which makes mathematical sense in the context of a function being defined on a set and certain supersets. The most obvious candidate for a polymorphic function is the identity function. We can define the identity function for any input a by returning a unchanged. It is often more convenient to think of a single, "umbrella" identity function than to define a new identity function for each type of input being considered. Thus, in Haskell for example, the identity function i d is a polymorphic function that accepts an input of any type and returns a copy of this input. Many approaches to the semantics of polymorphic type systems are founded on the typed X-calculi; in particular, Haskell is based on A-calculus. The two core concepts of the A-calculus are those of function abstraction and application. The former refers to the generalization of expressions via the introduction of names, and the latter refers to the assignment of particular values to names as a means to evaluate generalized expressions [14]. In this work, we will not provide a comprehensive introduction to or detailed explanations of the A-calculus. We will, however, include some references to the A-calculus on occassion, with aims to benefit those already familiar and 12 Chapter 1. Introduction entice unfamiliar readers to learn more independently. Formally, a typed A-calculus is an axiomatic system consisting of terms, types, variables and equations. For every term a in a typed A-calculus, there is a corresponding primitive type A called the type of a, with the correspondence denoted by a 6 A. Since Haskell's type system will be of recurring importance throughout our study, we cite the A-calculus axioms pertaining to term- and type-formation here as general background information [2] (for the 11 other rules necessary to define a typed A-calculus, also see [2]): There is a unit type 1. There is a term * of type 1. If A and B are types, then there exists a product type A x B and a functional type [A —> B], where [A —> B] denotes the set of functions from A to B. • If a and b are terms of type A and 73, respectively, then there is a term (a, b) of type Ax B. • If c is a term of type Ax B, there are terms proj (c) and proj (c) of type A and B, respectively. • If a is a term of type A and / is a term of type [A —> B\, then there is a term / a of type B. • For each type A, there is a countable set of terms xf of type A called variables of type A. • If x is a variable of type A and 4>(x) is a term of type B, then ^x€A<t>{%) € [A —• B\. Here, cfi(x) denotes a term < / > that might contain the variable x. • • • . 1 2 In addition, we define a free variable to have the following properties: If x is a variable, then x is free in the term x. If an occurrence of x is free in either of the terms a or b, then that occurrence of x is free in (a, b). If x occurs freely in either / or a, then that occurrence of x is free in / a. Finally, there exists at least one occurrence of x that is not free—called bound—in X 4>{x). Note that the last item in the above list offers a notation for defining functions in the A-calculus. Another notation that is commonly found in the literature is Ax : <$>{x), which would be equivalent to X cf)(x) in the above notation. It can be shown that it is possible to define every computable function using' function abstraction and application of the A-calculus. This means that the A-calculus is a theoretical, universal machine code for programming languages. It is further suited for describing programming languages in that its purely mathematical nature facilitates constructing proofs about x x 13 Chapter 1. Introduction t h e l a n g u a g e b e i n g d e s c r i b e d . L a s t l y , t h e s i m p l i c i t y of t h e A - c a l c u l u s m a k e s its i m p l e m e n t a t i o n i n l a n g u a g e s r e l a t i v e l y e a s y 1.4.2 [9]. Cartesian Closed Categories T h e A - c a l c u l u s p r o v i d e s t h e first l a r g e l i n k b e t w e e n f u n c t i o n a l p r o g r a m m i n g languages a n d category theory. T h e c o n n e c t i o n is i n t h e c o r r e s p o n - dence between t y p e d A-calculi a n d cartesian closed categories. A category C is a cartesian closed category if it satisfies t h e f o l l o w i n g a x i o m s [2]: ' • T h e r e is a t e r m i n a l o b j e c t 1. • F o r e v e r y p a i r o f C - o b j e c t s A a n d B, t h e p r o d u c t o f A a n d B, Ax B there exists a C - o b j e c t A x w i t h p r o j e c t i o n s p\ : A x B B, —> A a n d p2 • -> B. • F o r every pair of objects A a n d B, there exists a n object [A —> B] a n d a n a r r o w e v a l : [A —> B] x A —+ B w i t h t h e p r o p e r t y t h a t for a n y arrow / : C x A —> B, t h e r e is a u n i q u e a r r o w A / : C —> [A —> B] s u c h that the composite arrow C x A [A -> B] x A^ B is e q u a l t o / . W i t h r e s p e c t t o t h e t h i r d p r o p e r t y , we n o t e t h a t t h e e x p o n e n t i a l [A —> — ] is a r i g h t a d j o i n t t o adjunction. — x A : C —-> C a n d t h a t e v a l is t h e c o u n i t o f t h i s N o t e also t h a t t h e c o m p o s i t i o n o f a f u n c t o r F a d j o i n t U d e t e r m i n e s a m o n a d ( T , r/, fj.) w i t h T — w i t h its r i g h t UF. A A - t h e o r y is a t y p e d A - c a l c u l u s t h a t is e x t e n d e d w i t h s o m e a d d i t i o n a l t e r m s , t y p e s , or e q u i v a l e n c e s b e t w e e n e x p r e s s i o n s [13]. It c a n b e s h o w n t h a t t h e c o n c e p t s o f t y p e d A - c a l c u l u s a n d c a r t e s i a n c l o s e d c a t e g o r y are e q u i v a l e n t i n t h e f o l l o w i n g sense: (i) G i v e n a A - t h e o r y L, we c a n c o n s t r u c t a c a r t e s i a n closed category i n w h i c h each object corresponds to a t y p e i n L a n d i n w h i c h a n a r r o w b e t w e e n o b j e c t s A a n d B is a n e q u i v a l e n c e class of t e r m s o f t y p e B w i t h o n e free v a r i a b l e o f t y p e A; (ii) g i v e n a c a r t e s i a n c l o s e d c a t e g o r y C t h a t is a s s u m e d t o h a v e finite p r o d u c t s , w e c a n define a t y p e d A - c a l c u l u s w h i c h h a s as t y p e s t h e o b j e c t s o f C. In particular, the types required by t h e first a n d s e c o n d t y p e d A - c a l c u l u s a x i o m s g i v e n a b o v e are t h e o b j e c t s A x B and 1, [A —> B). N o t e t h a t the equivalence p r o o f shows the c o n n e c t i o n between the n o t a t i o n u s e d for f u n c t i o n s i n t h e A - c a l c u l u s a n d t h e c h o i c e o f t h e f u n c t i o n n a m e 14 Chapter 1. Introduction Xf i n t h e last l i s t e d i t e m . I n fact, / under the a d j u n c t i o n ( - x a n d Xf are s i m p l y c o r r e s p o n d i n g m a p s A, -) = (-, [A -»-]): C x C —> Set. W e refer t h e r e a d e r t o , for e x a m p l e , [2] for a v e r s i o n o f t h e e q u i v a l e n c e p r o o f . 1.4.3 Types i n Haskell A n y p r o g r a m m i n g language has a m e a n s b y w h i c h to g r o u p s i m i l a r k i n d s of values a n d expressions. A s y n t a c t i c m e t h o d for o r g a n i z i n g a n d i m p l e - type system, w i t h t h e types. T y p i c a l e x a m p l e s m e n t i n g levels o f a b s t r a c t i o n i n p r o g r a m s is c a l l e d a c o n s t r a i n e d d a t a v a l u e s r e f e r r e d t o as d a t a t y p e s or of types i n p r o g r a m m i n g languages gers, i n c l u d e p r i m i t i v e t y p e s ( s u c h as floating point numbers a n d characters), tuples, lists, inte- function types a n d abstract d a t a types. H a s k e l l is a s t a t i c a l l y t y p e d p r o g r a m m i n g l a n g u a g e . T h i s means that e v e r y e x p r e s s i o n i n H a s k e l l is a s s i g n e d a t y p e , a n d t y p e c h e c k i n g is c o m p l e t e d as p a r t of t h e c o m p i l e - t i m e , r a t h e r t h a n r u n - t i m e , p r o c e s s . A s a result, a c o m p i l e - t i m e e r r o r w i l l b e g e n e r a t e d if a n y f u n c t i o n i n t h e p r o g r a m is g i v e n a n a r g u m e n t o f t h e w r o n g t y p e . T h i s is h e l p f u l i n l o c a t i n g a n d r e d u c i n g t h e n u m b e r of p r o g r a m m i n g errors (bugs) that c a n o c c u r w h e n p r o g r a m m i n g . T h e t y p e s i g n a t u r e of a f u n c t i o n is a s p e c i f i c a t i o n o f its a r g u m e n t a n d r e t u r n types. F o r e x a m p l e , t h e f u n c t i o n t o U p p e r is a H a s k e l l f u n c t i o n t h a t t a k e s a c h a r a c t e r as i n p u t a n d r e t u r n s t h e c o r r e s p o n d i n g u p p e r case c h a r a c t e r as o u t p u t . W e denote (Prelude) notation t o U p p e r the t y p e signature of this f u n c t i o n w i t h :: Char -> f u n c t i o n f o f n v a r i a b l e s is d e n o t e d b y f r , where t j , . . . t n Char. :: ti the T h e t y p e signature of a -> t-2 - > • • • -> t n -> are t h e ( p o s s i b l y d i s t i n c t ) i n p u t t y p e s a n d r is t h e r e t u r n t y p e . E q u i v a l e n t l y , we c a n t h i n k o f f as a f u n c t i o n t h a t a c c e p t s k a r g u m e n t s t j , . . . tfc a n d r e t u r n s a f u n c t i o n o f n - fc a r g u m e n t s t ^ + i , . . . t , n with the o u t p u t t y p e of the r e t u r n f u n c t i o n of course b e i n g r . A f u n c t i o n a l p r o g r a m m i n g l a n g u a g e c a n b e c o n s t r u e d as a c a t e g o r y if we v i e w t h e t y p e s o f t h e l a n g u a g e t o b e t h e o b j e c t s a n d t h e f u n c t i o n s de- f i n e d b e t w e e n t y p e s t o be t h e m o r p h i s m s . A s a s i m p l e e x a m p l e , c o n s i d e r a functional language with primitive types Bool h a v i n g o n e o f t h e v a l u e s t r u e o r false Char characters Float floating-point numbers: real numbers w i t h no fixed n u m b e r o f d i g i t s b e f o r e or after t h e i r d e c i m a l p o i n t s Int (fixed-precision) integers, a n d primitive functions 15 Chapter 1. f l o o r :: F l o a t i d :: * - • inc-i n t greatest integer less than or equal to input identity (returns an identical copy of input) increment an integer by 1 increment a floating-point number by 1 test for zero square a number convert an integer to a floating-point number convert a lower-case character to upper-case. Int * :: I n t —> I n t incpioat :: F l o a t —> F l o a t i s z e r o : : I n t —> B o o l V- S : Float Int Introduction Float Int t o F l o a t :: I n t -> F l o a t toUpper :: Char —•> Char Clearly, the polymorphic function i d will correspond to the identity morphism for any object in the category. Excluding identity and composition arrows, the category corresponding to the above data would look like: Int mc I n t , sqr iszero Char floor toFloat Bool toUpper Float incpioat, s q r We can neatly display the commutativity of certain operations in a language using commutative diagrams. For example, Int > toFloat Float Int toFloat inCFloat > Float summarizes the equivalence between incrementing an integer and then converting it to a floating-point number with converting an integer to a floatingpoint number and incrementing the result. 16 Chapter 1. Introduction Overloaded Functions and Implicit Conversions There is an interesting connection between the transformations whose commutativity is considered semantically valid i n a given language and the use of overloading i n that language. Overloading is a case of type polymorphism in which an operator has a different implementation depending on the type of its arguments. T h e main purpose of overloading is to i n crease the intuitiveness or convenience of a language's syntax. For example, it would be disheartening if programmers had to distinguish between two operations ( + i t ) and (+Fioat) when trying to add two "numbers." T h e same idea is true for the s q r function shown i n the category above (note the overloading implemented there). Yet, the action that a compiler should take when faced w i t h an expression such as l e t x i n F l o a t = i + j , where i and j are integers is not obvious, the two possibilities being x = t o F l o a t ( i ) + F i o a t ' t o F l o a t ( j ) and x = t o F l o a t ( i + i j ) . Of course, from a mathematical standpoint these options are equivalent, which suggests that perhaps a compiler should be indifferent to the interpretation of these expressions as well. T h i s has motivated approaches to formalizing proper designs of overloaded functions and implicit conversions, so that their commutivity upon insertion into a compiler is semantically irrelevant. In the case of i n c , s q r or (+) discussed above, the 'semantic commut i v i t y ' that arises does so because Z C K. In our programming context, this means that data of type I n t can be implicitly converted to that of type F l o a t . In many cases, an implicit conversion function corresponds to the inclusion function between appropriate types. This mathematical analogy does not nicely extend to all cases of implicit conversions, however. For example, it may be useful to consider integer expressions as a type; that is, a mathematical expression i n which all terms are integers (for example, x + y, where x and y are of type Int). Here, the integer variables (for example, x, where x is of type I n t ) can be viewed as a subtype of the integer expressions, when viewing an integer variable as a subtype of integer expressions. T h e difficulty in defining a "set of integer expressions" prohibits a natural extension of the inclusion analogy for these types. n n t To generalize the notion, we briefly introduce Reynold's approach of considering an alternate construction of a category representative of a functional language [15]. We again let the objects of the category be the types of the language, but we use them to form a partial order i n which the arrows between types correspond to valid implicit conversions. A s an example, we use the same objects as in the type category figure above, along w i t h a 'universal' object Value. T h i s object is a contrived m a x i m u m for the type partial 17 Chapter 1. order, Introduction it b e i n g d e n n e d as a s u p e r t y p e of a l l o t h e r t y p e s i n t h e c a t e g o r y . W i t h t h i s , t h e p a r t i a l o r d e r 0, w o u l d l o o k like: Value Char Float Bool Int W e define a carrier C : Cl —> Set t o b e a f u n c t o r that maps a type t£ to t h e set o f v a l u e s o f t y p e t. T h e m o r p h i s m s t < t' i n $1 a r e m a p p e d t o i m p l i c i t c o n v e r s i o n f u n c t i o n s b e t w e e n sets o f t y p e s , w h o s e b e h a v i o u r is r e s t r i c t e d b y t h e d e n n i n g p r o p e r t i e s o f t h e f u n c t o r C. E x p l i c i t l y , t h e d e f i n i t i o n o f a f u n c t o r C : il, —•> Set o n a p a r t i a l o r d e r f i is a m a p t h a t satisfies: (a) C{t < t') e H o m ( C ( t ) , C ( t ' ) ) (b) C(t <t) = (c) If t < t! a n d *' < t", t h e n C(t Let I c{t) C : fl —> Set 2 2 2 < t') = C{t' 2 2 o n a r r o w s t\ < t[, t 2 2 : Set —> Set 2 t\, ^2 b y (C(ti),C(*2)) < t' ) = (C(h < * i ) , C ( t 2 2 < t' )). 2 d e n o t e t h e f u n c t o r d e n n e d o n sets S i , ^ b y x {Si,S ) 2 2 and 1 2 2 x t ). < t' b y C\h < t[,t Let < denote the functor denned o n types C (ti,i ) = and < t") o C(t o n set f u n c t i o n s /1, / 2 =Si x 5 2 • by x\h,f2) =h X/2. In t h e c o n t e x t o f i m p l i c i t c o n v e r s i o n s , we a r e i n t e r e s t e d i n c o n v e r t i n g a t y p e t\ i n t o a n o t h e r t y p e t , w h e r e t\ < i 2 2 . F o r e x a m p l e , we m i g h t c o n s i d e r t h e c o n v e r s i o n f r o m integers t o r e a l n u m b e r s o r f r o m a n i n t e g e r t o a n integer 18 Chapter 1. Introduction expression (for the latter, consider converting the integer 1 to the expression '1' that evaluates to the integer 1). Therefore, from a categorical standpoint, we always assume that the partial order fi has a largest (catchall) element as described above-call it Value-that semantically encompasses all other types. This is the terminal object of the partial order. We thus let T$ • fi2 — > fi denote the functor denned on any two types h,t by 2 r ( i i , t ) = m&x{ti,t } d 2 2 and on arrows t\ < t[,t < t' by 2 2 t[,t r*(*i < 2 < t' ) = (r (h,t ) 2 s < r^ti.ta)). 2 The compiler-recognized commutivity of overloaded functions and implicit conversions can then be expressed by stating that for any types t\,t with t\ < t an overloaded function 7,5 is a natural transformation between 9 c 2 £ 9 r<5 c the composite functors fi ^—> Set —> Set and fi —>fi—> Set: 2 2 2 2 x oc (t t ) 2 ^ C o r ^ , ^ ) 2 1: 2 Cor x oC (t <t[,t <t' ) 2 2 1 2 s 2 x oC (t' ,t' ) 2 2 1 2 ——* t <t' ) 2 2 CoTs^tti 7*(ti.*2) The preceding discussion regarding the overloaded function (+)—or, in our new notation, 7+—can now be understood as an instance of the above diagram. If we let t = t = Int, then C {t t ) = (C{t ),C(t )) = (C(lnt),C(Int)) and x o (C(lnt), C(Int)) = C(Int) x C(lnt). Further, in the partial order category there is an identity arrow I n t < Int, which corresponds intuitively to the idea that any value of type Int can be implicitly converted (trivially) to a value of type Int. Thus, T ^ I n t , Int) = Int, and C o r<5(Int, Int) = C(lnt). Similar derivations are achieved by setting t' — t' = F l o a t , and defining the conversion function C ( I n t < F l o a t ) to be t o F l o a t gives rise to the following diagram: 2 x 2 u 2 x 2 2 x 2 C(Int) x C(Int) — — • — C ( I n t ) toFloatxtoFloat C(Float) x C(Float) toFloat 7+'(Float, Float) > C(Float). 19 Chapter 1. Introduction 1.5 Haskell 1.5.1 Haskell's Type System Type Classes A s p r e v i o u s l y m e n t i o n e d , a n o v e r l o a d e d f u n c t i o n is a f u n c t i o n t h a t c a n accept p a r a m e t e r s of m o r e t h a n one type. for T h i s is s o m e w h a t a g i v e n o p e r a t i o n is d e f i n e d for. C o n s i d e r first t h e i d e n t i t y f u n c t i o n . f u n c t i o n c a n a c c e p t a r g u m e n t s of a l l t y p e s . id problematic a s t a t i c t y p e c h e c k e r , s i n c e t h e r e c a n b e a m b i g u i t y as t o w h i c h t y p e s : : type. a -> a, w h e r e a is a This I n H a s k e l l , its t y p e s i g n a t u r e is type variable t h a t c a n s t a n d for a n y d e f i n e d A s s u c h , a m b i g u i t y is a v o i d e d a n d c o m p i l e r c o m p l i c a t i o n s n e e d not arise. H o w e v e r , c o n s i d e r t h e a d d i t i o n f u n c t i o n . W h i l e it is d e s i r a b l e for t h e f u n c t i o n (+) t o b e d e f i n e d for t y p e s s u c h as I n t o r F l o a t , it s h o u l d n o t b e d e f i n e d for, say, B o o l or Char. T h u s , a t y p e s i g n a t u r e s u c h as -> A :: a a is t o o g e n e r a l for t h e o v e r l o a d e d o p e r a t i o n we h a v e i n m i n d . a -> The (+) s y s t e m u s e d i n H a s k e l l t o a d d r e s s t h i s issue is t h a t o f t y p e classes. type class is a set o f f u n c t i o n (or c o n s t a n t ) names, together w i t h their r e s p e c t i v e t y p e s , w h i c h m u s t b e g i v e n d e f i n i t i o n s for e v e r y t y p e t h a t is t o b e l o n g t o t h e class. I n o t h e r w o r d s , it is a set o f c o n s t r a i n t s for a type v a r i a b l e . S p e c i f i c d e f i n i t i o n s of t h e f u n c t i o n n a m e s o f a t y p e class are c a l l e d methods. O n c e e a c h of t h e m e t h o d s c o r r e s p o n d i n g t o t h e f u n c t i o n n a m e s o f a t y p e class has b e e n g i v e n , a n is, t h e t y p e of a n i n s t a n c e instance of t h e class h a s b e e n d e f i n e d ; t h a t i n s t a n c e o f t y p e class C l a s s is i n s t a n c e : : Class. F o r e x a m p l e , t h e class Eq is i n t e n d e d t o b e a class for t y p e s for w h i c h it m a k e s sense t o test for e q u a l i t y o r i n e q u a l i t y . T h e class d e c l a r a t i o n is g i v e n by c l a s s E q a where (==), (/=) a -> a -> B o o l T h e t y p e s Char, Double, F l o a t , I n t a n d I n t e g e r are i n s t a n c e s o f Eq, e a c h o f t h e m p r o v i d e s d e f i n i t i o n s for t h e m e t h o d s ( = ) and (/=). where N o w , the d o m a i n o r r a n g e t y p e s o f a f u n c t i o n c a n b e c o n s t r a i n e d t o a t y p e class t h a t suits the desired b o u n d e d n e s s of the p o l y m o r p h i s m . F o r e x a m p l e , the of (+) i n H a s k e l l is c o r r e c t l y w r i t t e n as (+) w h e r e Num : : Num a => type a ->. a -> a, is a class w i t h i n s t a n c e s Double, F l o a t , I n t a n d I n t e g e r , a n d => d e n o t e s t h e r e s t r i c t i o n o f t h e t y p e v a r i a b l e a t o t h e Num class. 20 Chapter 1. The Introduction Functor Type Class Let [x] d e n o t e a list of e l e m e n t s o f t y p e x. is d e f i n e d w i t h s i g n a t u r e map :: I n t h e P r e l u d e , a f u n c t i o n map (x -> y) -> [ y ] . Its a c t i o n is [x] -> t o a c c e p t a f u n c t i o n a n d a list, a n d o u t p u t t h e list o f i m a g e e l e m e n t s u n d e r t h e g i v e n f u n c t i o n ; t h a t is map For e x a m p l e , map negate f [xl, x2, . . . ] = [f ( x l ) , f (x2) ,...]. [-1,2,-3] . T h e map f u n c t i o n , t h e n , [1,-2,3] = is a m e a n s t o t r a n s f o r m a f u n c t i o n t h a t n o r m a l l y a c t s o n e l e m e n t s o f t y p e a i n t o a f u n c t i o n t h a t a c t s o n lists o f e l e m e n t s o f t y p e a w i t h o u t c h a n g i n g t h e essence o f t h e f u n c t i o n ' s b e h a v i o u r . O f course, this idea alludes to a n i m p l e m e n t a t i o n of the category-theoretic concept of a functor. As a n t i c i p a t e d , we define Hask to be a category w i t h Haskell types as o b j e c t s a n d H a s k e l l f u n c t i o n s as m o r p h i s m s . N o t e t h a t t h e ( a s s o c i a t i v e ) o p e r a t i o n of f u n c t i o n c o m p o s i t i o n i n H a s k e l l is d e n o t e d . a n d t h a t H a s k e l l h a s a p a r a m e t r i c p o l y m o r p h i c identity function. W e c a n conceive a n endofunctor on Hask by considering ( [ ] ) : : a -> [a] t o b e t h e "object p a r t " o f t h e f u n c t o r a n d map t o b e t h e "arrow p a r t " o f t h e f u n c t o r . The i d e a is g e n e r a l i z e d t o fmap i n H a s k e l l ' s b u i l t - i n Functor class. It has the following d e c l a r a t i o n : c l a s s Functor fmap :: ( f :: * -> *) where ( a -> b) -> ( f a -> f b ) . T h a t is, to define a n i n s t a n c e of that accepts any function g type signature f g : : function that accepts Functor, f a -> a m e t h o d fmap m u s t b e d e f i n e d a -> b a n d r e t u r n s a f u n c t i o n f g w i t h :: f b. E q u i v a l e n t l y , we c a n v i e w fmap as a a function g :: a n d a functor element f a ->b a, a n d r e t u r n s a n o t h e r f u n c t o r e l e m e n t f b. T h e n a m e o f t h e Functor class is m o t i v a t e d b y t h e a n a l o g y b e t w e e n t h e d e f i n i t i o n of t h e class a n d t h e d e f i n i t i o n of a f u n c t o r in c a t e g o r y t h e o r y . A n y instance of Functor is Hask), c o m p r i s e d of a f u n c t i o n f t h a t is d e f i n e d o n H a s k e l l t y p e s ( o b j e c t s i n and f a means to m a p a n arrow g a -> f b t h a t is also i n Hask. :: a -> b i n Hask to an arrow f g : : T y p i c a l l y , category theorists w o u l d label t h e m e t h o d fmap d e f i n e d o n a r r o w s as f , t h e s a m e n a m e as t h e p a r t o f t h e f u n c t o r t h a t a c t s o n o b j e c t s . I n H a s k e l l , h o w e v e r , f a n d fmap t a k e n t o g e t h e r are w h a t c o r r e s p o n d t o a c a t e g o r y - t h e o r e t i c f u n c t o r , w h e r e f is t h e p a r t o f t h e f u n c t o r t h a t a c t s o n o b j e c t s a n d fmap is t h e p a r t t h a t a c t s o n m o r p h i s m s . Of c o u r s e , t h e d e f i n i t i o n of a n y f u n c t o r / two e q u a t i o n s t h a t / m u s t satsify. in category theory includes F o r a n instance of the Functor class t o behave predictably a n d consistently in a Haskell p r o g r a m , analogous laws m u s t h o l d . W e d e m o n s t r a t e the consistency of ideas b y g i v i n g the following 21 Chapter 1. Introduction functor laws, which must hold for any instance of Functor. Recall that i d is a polymorphic function and so can act on objects both of type a and of type f a (where, analogous to lists, f a denotes a datatype in which the type of the contained data is a). Then, for f and fmap as given above, as well as functions g,h : : * -> *, the Haskell functor laws are given as: (Fl) fmap i d (F2) fmap g . h f a = f a = f a (fmap g . fmap h) f a. As a demonstration of these laws, recall the list example. This is really just an instance of the Functor class with [] being the 'f : : * -> *' from the class declaration, and map fulfilling the definition for the method fmap. In this case, • map • map map i d [ a ] ; = [ i d a] = [a] = i d [a] g . h [a] = C(g . h) a] = map g [h a] g (map h [ a ] ) = (map g . map h) [a] . = Intuitively, we can think of fmap as a function that applies a given operation to objects inside a "container" and then returns a container of the same shape. The shape of the container is left unchanged by fmap (Fl) and the contents are not shuffled relative to one another under the mapping operation (F2). That is, fmap preserves the structure of the category H a s k . 22 Chapter 2 Moggi's Monads 2.1 Motivation for Monads in Haskell Haskell is a purely functional language. This means that whenever a function is executed, no effects other than the computation of the function output will occur. This is a product of the fact that Haskell abides by the principle of referential transparency (see Section 1.4.1), and that all dataflow is made explicit. Since a program written in a functional language consists of a set of equations, the latter guarantees that the value of a function is only dependent on the function's free variables (Section 1.4.1). It also ensures that the order in which computations are made does not affect a program's meaning [14]. This is not the general case for impure functional or imperative languages, and any state changes unrelated to the output of a function are called side-effects. Printing to the screen, reading a file, or updating a datum assignment to a variable are examples of side-effects. Of course, it is hard to imagine doing much useful programming without such capabilities! Haskell employs monads to achieve the usefulness and convenience of impure effects without compromising the purity of the language. While these facts motivate our study, the introduction of monads to computer science actually began with Eugenio Moggi's work in formalizing programming language semantics using category-theoretic concepts, which he called a "categorical semantics of computations" [12]. Moggi's work provides the foundational theory for monads in Haskell, making a detailed examination of it both interesting and important in our context. His approach began with the adoption of the view taken by the denotational theory of semantics, in which programs are thought of as morphisms in a category with the types of a given language as objects. 2.1.1 Heuristics A classical approach to computational semantics is to view a program as a function that maps an input value to an output value. Heuristically, 23 Chapter 2. Moggi's Monads a p r o g r a m (or a p r o g r a m i n a n e n v i r o n m e n t ) c o m p o s i t i o n of functions / A -\ n —> 4 n = / a n d V i £ { 1 . . . n) n / c o u l d b e d e s c r i b e d as _ i ° • • • ° / i , w h e r e ,f\ the : A\ —> A2, • • •, f -i are t y p e s o f a g i v e n l a n g u a g e . t h e p r o g r a m w o u l d a c c e p t a n i n p u t v a l u e a\ £ A\ a '• n Thus, a n d o u t p u t a value a n £ A. n M o g g i ' s t w i s t o n t h e c l a s s i c m o d e l was t o c o n s i d e r p r o g r a m s n o t as m a p s f r o m v a l u e s t o v a l u e s , b u t r a t h e r as m a p s f r o m v a l u e s t o computations. Here, a c o m p u t a t i o n is t o b e t h o u g h t o f as t h e d e n o t a t i o n o r m e a n i n g o f a p r o g r a m a n d t h e t e r m " n o t i o n o f c o m p u t a t i o n " w i l l refer t o a q u a l i t a t i v e d e s c r i p t i o n of the d e n o t a t i o n of a p r o g r a m . In accord w i t h denotational semantics, we c a n r e p h r a s e t h i s i d e a as: a c o m p u t a t i o n is t o b e t h o u g h t o f as t h e m e a n i n g of a n arrow i n a category of types. t h e n TA If T is a g i v e n n o t i o n o f c o m p u t a t i o n , w i l l d e n o t e a c o m p u t a t i o n w i t h r e t u r n t y p e A, t o as a c o m p u t a t i o n o f t y p e a n d will be referred A. B e a r i n g i n m i n d t h a t t h e g o a l is t o c o n s t r u c t a c a t e g o r y w h o s e m o r p h i s m s are p r o g r a m s , we n o w h a v e t h e c h o i c e t o m o d e l p r o g r a m s as m a p s o f t h e form f : Ai —> TA n or g : TA\ —* TA . W e choose the former because n s u b s u m e s t h e l a t t e r : a n y c o m p u t a t i o n o f t y p e A\ is a v a l u e of t y p e TA\. s u m m a r y , t h e c a t e g o r i c a l c o n s t r u c t i o n u s e d b y M o g g i is as follows In [11]: • W e consider a category C w i t h a n operation T that carries a C-object A, r e p r e s e n t a t i v e o f t h e set o f v a l u e s of t y p e A, o f t h e set o f c o m p u t a t i o n s o f t y p e A. t o TA, representative T c a n b e t h o u g h t o f as a u n a r y t y p e - c o n s t r u c t o r t h a t is s o m e n o t i o n o f c o m p u t a t i o n , w i t h t h e e l e m e n t s o f TA b e i n g v i e w e d as t h e d e n o t a t i o n s o f p r o g r a m s o f t y p e A. • A p r o g r a m t h a t t a k e s a v a l u e o f t y p e A as i n p u t , p e r f o r m s a c o m p u t a t i o n , a n d r e t u r n s a v a l u e o f t y p e B, C-arrow f r o m A to can then be identified w i t h a TB. • A m i n i m u m set of r e q u i r e m e n t s for v a l u e s a n d c o m p u t a t i o n s is f o u n d so t h a t p r o g r a m s i n d e e d c o r r e s p o n d to m o r p h i s m s of a suitable category. B u t w h a t sorts of p r o g r a m m i n g concepts c a n notions of c o m p u t a t i o n s c a p t u r e ? W e c a n t h i n k o f t h e m as a d d i t i o n a l effects t h a t c a n o c c u r d u r i n g the e v a l u a t i o n of a f u n c t i o n . C o m p u t a t i o n s w i t h exceptions and computa- t i o n s w i t h s t a t e m o d i f i c a t i o n s are b o t h e x a m p l e s o f n o t i o n s of c o m p u t a t i o n s . I n t h e f o r m e r , t h e d e n o t a t i o n of a p r o g r a m ( t h a t is, its s e m a n t i c i n t e r p r e t a t i o n ) is e i t h e r a v a l u e or a n e x c e p t i o n , d e p e n d i n g o n w h e t h e r t h e e x e c u t i o n o f the p r o g r a m involves the o r d i n a r y c a l c u l a t i o n of a value or the c o n s i d e r a t i o n o f a u s e r - d e f i n e d s p e c i a l case (for e x a m p l e , a d i v i s i o n b y z e r o ) , r e s p e c t i v e l y ; i n t h e l a t t e r , a p r o g r a m d e n o t e s a m a p f r o m a store t o a p a i r c o m p r i s e d o f a 24 it Chapter 2. Moggi's Monads v a l u e a n d a m o d i f i e d s t o r e . T h e s e n o t i o n s of c o m p u t a t i o n s are p r e c i s e l y t h e k i n d s of effects t h a t w e r e p r e v i o u s l y p a r t i c u l a r t o i m p e r a t i v e l a n g u a g e s a n d t h u s l a y t h e c o n c e p t u a l f o u n d a t i o n s for i m p l e m e n t a t i o n o f s u c h features i n specific p u r e l a n g u a g e s , s u c h as H a s k e l l . N o w r e c a l l t h a t , i n g e n e r a l , we c o n s i d e r a p r o g r a m t o b e a f u n c t i o n t h a t is t h e c o m p o s i t i o n o f s e v e r a l f u n c t i o n s , a n d c o n s i d e r t h e t a s k of m o d i f y i n g s u c h a p r o g r a m so t h a t at s o m e s t a g e o f c a l c u l a t i o n ( t h a t is, at s o m e level o f c o m p o s i t i o n ) , a n o t i o n o f c o m p u t a t i o n is i n t r o d u c e d . F o r e x a m p l e , let us i n i t i a l l y a s s u m e t h e classic m o d e l , w i t h a p r o g r a m g i v e n b y ALB^C^D. S a y t h a t we w a n t t o i n t r o d u c e a side-effecting c o m p u t a t i o n T t o t h e f u n c t i o n g, b u t o t h e r w i s e leave t h e p r o g r a m u n c h a n g e d . o u t p u t t y p e o f g f r o m C t o TC W e thus want to alter the a n d t h e o u t p u t t y p e o f h f r o m D t o TD (the l a t t e r so t h a t t h e side-effect is c a r r i e d t h r o u g h t o t h e e n d o f t h e p r o g r a m ' s evaluation). This alone cannot be done, however, because e a s i l y e n v i s i o n r e p l a c i n g g b y a f u n c t i o n g : B —> TC, a n d not w h i l e we can C t h e d o m a i n o f h is TC. AU B ^TC TD. W e t h u s n e e d a m e a n s t o (i) c o m p u t e t h e d e s i r e d side-effect; (ii) a l t e r t h e f u n c t i o n h so t h a t it c a n b e c o m p o s e d w i t h g; (iii) leave t h e b e h a v i o u r o f the p r o g r a m otherwise unchanged. T h e s e guidelines lead to the category-theoretic concepts of K l e i s l i triples a n d K l e i s l i c a t e g o r i e s for use i n p r o g r a m m i n g s e m a n t i c s . 2.2 Kleisli constructions A Kleisli triple o v e r a c a t e g o r y C is a t r i p l e (T,n,.*), w h e r e T : Oc —» Oc, VA- A-+TA a n d / * : TA —> TB, for A,B,C £ O , a n d f : A -* TB, c a n d the following equations hold: (KTI) (KT2) rf = i&TA f*°VA = f (KT3) g* o / * = (g* o / ) * A w i t h g : B —> TC £ A K l e i s l i triple models a n o t i o n of c o m p u t a t i o n a n d the above properties c a n b e i n t u i t i v e l y u n d e r s t o o d i n t h e f o l l o w i n g m a n n e r . T h e m a p p i n g TJA is t h e i n c l u s i o n m a p p i n g of values into c o m p u t a t i o n s (any value c a n be t h o u g h t of 25 Ac- Chapter 2. Moggi's Monads as a "constant" c o m p u t a t i o n ) a n d / * is t h e e x t e n s i o n o f a f u n c t i o n / values to c o m p u t a t i o n s to one f r o m c o m p u t a t i o n s from to c o m p u t a t i o n s . Intu- itively, this extension c a n b e t h o u g h t o f as first e v a l u a t i n g a c o m p u t a t i o n and then applying / to the r e t u r n value. T h e formal function definition for / * c a n b e g i v e n i n t e r m s o f t h e l e t - c o n s t r u c t o r , w h i c h is i n t r o d u c e d i n Section 2.4. A n y K l e i s l i t r i p l e g i v e s rise t o a c o r r e s p o n d i n g K l e i s l i c a t e g o r y : Kleisli triple ( T , 77, Kleisli category CT _*) o v e r C , t h e given a is d e f i n e d i n t e r m s o f o b j e c t s a n d a r r o w s i n C . S i n c e we a r e c o n s t r u c t i n g CT f r o m C, we m a y use t h e s a m e l a b e l t o refer t o a n o b j e c t or a r r o w i n CT as o n e i n C. I n t h e case of possible a m b i g u i t y , the category of context w i l l be m a d e explicit. The c o n s t r u c t i o n of t h e K l e i s l i c a t e g o r y is d e f i n e d i n t e r m s o f C - o b j e c t s A, B a n d C-arrows / : A -> TB,g : B —> TC is as follows: • 0 = Oc Ct • Uom (A,B) = Homc(A,TB), CT where V 4 , B £ C an a r r o w A -> B £ CT is a n a r r o w A —> TB. • \/A 3 id A £ AQ £ 0c , s u c h t h a t i d ^ = T)A, w h e r e VA '• A —• TA T T • T h e composition A —> TC go f CT of T a r r o w s is g i v e n b y t h e c o m p o s i t i o n € Ac- g*of: i n C. W i t h r e g a r d t o t h e s e a x i o m s , n o t e t h a t t h e d e f i n e d c o m p o s i t i o n g* o f be thought o f as: c o m p u t a t i o n fa; (i) t a k i n g a value a £ A; (ii) (iii) e x e c u t i n g or e v a l u a t i n g fa applying / can to o b t a i n a to o b t a i n a value b £ B; (iv) a p p l y i n g g t o b t o p r o d u c e t h e r e s u l t a n t c o m p u t a t i o n . W e c a n use t h e c o n s t r u c t i o n a x i o m s for CT t o e x p r e s s t h e left u n i t , r i g h t u n i t , a n d a s s o c i a t i v i t y a x i o m s for CT solely i n t e r m s o f o b j e c t s a n d a r r o w s f r o m C as follows. L e t A,B,C c o r r e s p o n d i n g t o i d s , let / f £ b e o b j e c t s i n CT- F o r t h e left u n i t axiom : A —> B b e a n a r r o w i n CT, SO t h a t i d s o T CT- W e c a n e x p r e s s t h e l e f t - h a n d side o f t h e p r e c e d i n g e q u a t i o n f = in terms of C-objects a n d C - a r r o w s using the t h i r d a n d f o u r t h c o n s t r u c t i o n axioms listed above: ids o f T = rj A c o r r e s p o n d i n g t o id/3, let g '• B m o v i n g f r o m CT to C , we have: g o T F i n a l l y , the associativity o f. S i m i l a r l y , for t h e r i g h t u n i t a x i o m —» C £ CT, SO t h a t g OT id/5 = 9 G CT, a n d id/3 — 9* a x i o m for CT 0 VB- is e a s i l y e x p r e s s e d C-objects a n d C-arrows using the fourth construction a x i o m . B, g : B C, h : C - D £ C T a n d (h o T g) o T f = (h o T Then, ho (go f) T T i n t e r m s of Let / : A —> = h*o{go f) = h*o(g*of) T g)* o f — (h* o g)* o / . T h u s , CT is a c a t e g o r y . W e s u m m a r i z e entirely i n t e r m s of objects a n d arrows i n C , v i a the correspondence between Homc {A, B) T and E.omc(A,TB) given i n the second 26 Chapter 2. Moggi's Monads g : B —> TC c o n s t r u c t i o n a x i o m . T h a t is, g i v e n C - a r r o w s / : A —> TB, h : C —> TD, and we i n t e r p r e t t h e CT a x i o m s i n C as: (KC1) n* of (KC2) f* on = f (KC3) h* o (g* o f) = (h* o g)* o f B , left u n i t = f A right unit associativity. N o w , we c a n d i r e c t l y p r o v e t h e e q u i v a l e n c e b e t w e e n t h e K l e i s l i t r i p l e axioms ( K T 1 ) - ( K T 3 ) a n d the Kleisli category axioms ( K C l ) - ( K C 3 ) . assume ( K T 1 ) - ( K T 3 ) . First, By (KT1), => V*B° f = TB => V*B°f ° / id = f- T h u s , ( K C 1 ) holds. T r i v i a l l y , we h a v e ( K C 2 ) = (KT2). ( K T 3 ) states t h a t h* o g* = (h* o g)*. h*o(g*of) Thus, = (h*og*)of which implies ( K C 3 ) . Next, assume ( K C 1 ) - ( K C 3 ) . codomain TA. By (KCl), In p a r t i c u l a r , take / = VA I idrA- = 0 /» f o r a n y / t h a t h a s Then, rf o id A = ICITA A T r)*A = TA, id a n d so ( K T 1 ) h o l d s . A g a i n , we i m m e d i a t e l y h a v e ( K T 2 ) = . ( K C 2 ) . Finally, g : B ( K C 3 ) states that —> TC, so t h a t 5* : TB h* o (g* o / ) a n d ( K C 3 ) implies ( K T 3 ) . (h* o g)* o / . 5 ( K C 3 ) t o b e t h e i d e n t i t y a r r o w o n TB, h*og* = -> T C a n d (/i* o ) * : T S = Thus, f — idrs- Recall that T D . Take / Then, {h*og)*oid B T (KT1)-(KT3) are equivalent to (KC1)- ( K C 3 ) , as d e s i r e d . 27 in Chapter 2. Moggi's Monads 2.2.1 Example: State Updates We begin to concretize our work with triples by examining a side-effecting computation; namely, modifications to (variable) state. To show that state updates can be modelled using a suitable Kleisli triple, we must propose definitions for the three maps (T, 77,-*), and show that they satisfy the Kleisli triple axioms. As briefly mentioned above, a computation w i t h a state update can be thought of as a computation that takes an initial store or state variable, and returns a value (the output of the computation's underlying function) along w i t h the modified store. For example, say that we have a simple program that adds several numbers together, and that we want to include a sideeffecting computation i n which the number of additions that the function performs is kept track of as the function is evaluated. This computation could accept s = 0 as the initial store, and return the pair (sum, s'), where sum is the return value of the function, and s' is the number of addition operations that were performed to obtain sum. W i t h this motivation, we define the first component of the state update triple to be the functor T(_) = (_ x S) , where 5 is a nonempty set of stores [10]. We offer the corresponding maps n and _* to form a Kleisli triple (T, r/,_*) that models state updates: s • n : A -> (A x S) ~* VA( ) • S —> A x S s *—* (a, s) s A a h a • for / : A -> (B x S) , s f*:{Ax S) -> (B x S) , a -> f*{a) : S -» B x S s s where (b, s) = a(s). A t a glance, these maps appear a bit ghastly to deal with, so we introduce a small amount of notation that will make calculations easier. A n y function /:!?—> R maps an element d of its domain to a map f(d) : S —> R. We can, of course, evaluate this function on an element s G S to obtain a value r G R\ that is, f(d)(s) = r. Thus, / can be interpreted as a function that accepts a pair of values and returns a single image value. It will be convenient to emphasize this using the following notation: given a map / as above, we define a (redundant) map /1 : D —•> R by f\ = /, and a corresponding map / : Dx S -> R, such that Vd e D, Vs G S, f ({d, s)) = fi{d){s). s s 2 2 A s a word of caution, while this notation will prove to be expedient i n the calculations below, there is one case i n which it is somewhat counterintuitive; namely, w i t h regard to the identity map. For example, if D is 28 Chapter 2. Moggi's Monads t a k e n to b e a f u n c t i o n s p a c e , say D s £ S, = t h e n we c o u l d w r i t e i d i ( a ) ( s ) (A = x S) , with a s i d 2 ( ( a , s)). e (Ax S) and s Notice that while the r i g h t - h a n d side o f t h i s e q u a l i t y is s u g g e s t i v e o f a n e v a l u a t i o n t o (a, s), the a c t u a l e v a l u a t i o n s h o u l d b e i d 2 ( ( a , s ) ) = ( i d i ( o j ) ) ( s ) = a(s) ^ ( a , s). W i t h this setup, the p r o p o s e d K l e i s l i triple c a n be presented i n a new form. W e c l a i m t h a t t h e 3 - t u p l e ( T , 77, _*) d e f i n e d b y : (S ^ 0) r) ((a,s)) = (a, s) • TA = (A x S) • V a £ A, Vs £ S, s A2 • W l , / 3 6 O c with / : A -> (5 / ( a ( s ) ) , w h e r e / * : ( A x S) 2 x 5) a n d a £ ( A x S) , 5 s x S - s 2 / *((a,s)) 2 = x 5 a n d /j*(a)(s) = / » ( s ) = is a K l e i s l i t r i p l e . W e n o t e here t h a t a c o m m o n w a y t o define t h i s t r i p l e is t o use A - c a l c u l u s notation. W e h a v e a v o i d e d i n t r o d u c i n g t h a t n o t a t i o n here, b u t g i v e the f o r m u l a t i o n for r e f e r e n c e [12]: • r]A{a) = As : 5.(a, s) • for / : A -» (B x S) a n d a £ (A x S) , s a(s) The f*(a) s = A s : S.(let (a, s') = in /(a)(s')). n o t a t i o n A s : d e n o t e s t h a t t h e f u n c t i o n f o l l o w i n g t h e ':' is g i v e n i n t e r m s o f t h e i n d e p e n d e n t v a r i a b l e s, a n d t h e n o t a t i o n S. d e n o t e s t h a t the i n d e p e n d e n t variable i n the f u n c t i o n following the ' . ' i s restricted to S (in t h i s case, we h a v e s £ 5). W e n o w s h o w t h a t t h e g i v e n 3 - t u p l e satisfies ( K T 1 ) - ( K T 3 ) For ( K T 1 ) , we h a v e id idxAict) • {A x S) (Ax S) ', s TA = a s where V a £ Vs £ 5,idjvi(a:)(s) = as follows. (A x S) s a(s). T h u s , 77^ = i d y / i t r i v i a l l y b y t h e s e c o n d f o r m u l a t i o n for 77 g i v e n a b o v e . For ( K T 2 ) , let / : A -> (B x S) , s (r°7M).i(a)(s) T h u s , f* or] A = a £ A, s £ S. Then, = / *(7 = /2(?Mi(a)(s)) = / ((a,s» = /i(a)(s). 1 ? / U (a))(s) 2 f. T o p r o v e t h a t ( K T 3 ) h o l d s , we first p r o v e t h e f o l l o w i n g p r o p o s i t i o n : 29 Chapter 2. Moggi's Monads P r o p o s i t i o n 2.2.1. Given maps f :A-+(B x S) andg:B^{Cx s S)' we have: 5i Proof. Let / and g be 0 / l = 92 ° as g i v e n , a n d let a h£ A, s £ 5. T h e n , = 52(/i(a)(«)) = 22(/2«a, ») = (52 ° / ) ( ( a , s ) ) = (52 ° / 2 ) i ( a ) ( s ) , S 2 2 as d e s i r e d . N o w , t o s h o w ( K T 3 ) , let a £ {Ax (5* S) s a n d s £ 5 . W e have: = ^(^(a))^) o/)!(£*)(*) = 52(/i(a)(s)) = 52(/ (a(s))) = (520/2)2(^(5)) = (5i / i ) ( a ( s ) ) 2 0 2 proposition T h u s , 5* o / * = (#* o / ) * a n d ( T , 7?, _*) is a K l e i s l i t r i p l e . 2.3 Triples W h i l e K l e i s l i t r i p l e s a r e i n t u i t i v e l y j u s t i f i a b l e as a m e a n s t o t h e o r e t i cally m o d e l c o m p u t a t i o n s , they are equivalent to another very interesting category-theoretic construction; n a m e l y . . . monads! Incidently, the equiva- l e n c e e x p l a i n s t h e o t h e r n a m e "triples" for m o n a d s t h a t a p p e a r s i n m u c h o f the literature. T h e o r e m 2.3.1. There is a one-to-one correspondence between Kleisli triples and monads. Proof. Given a Kleisli triple ( T , 77, _*), we c l a i m t h a t the corresponding m o n a d is ( T , 77, fi), w h e r e : (i) T is t h e e x t e n s i o n o f t h e f u n c t i o n T : OQ — * OQ to t h e endofunctor T T(f) : C —> C , d e f i n e d o n a n y C - a r r o w / — (ilB o / ) * a n d (ii) fj, A : A —> B b y = i d f ^ . T o p r o v e t h i s , we first s h o w t h a t T is i n d e e d a f u n c t o r . L e t / : A —> B,g : B —> C. T h e n , 30 Chapter 2. Moggi's Monads T(gof) (vco(gof))* = = ((rjc = (((r?C°2)*°??B)°/)* o g) o / ) * associativity in C = = = (ivc ° g)* o (IB o f))* (rjc o g)* o (n o f)* TgoTf. (KT2) (KT3) B Further, T(id ) = (r) oid )* A A = rf A - A \d . TA T h u s , T is a n e n d o f u n c t o r o n C. N e x t , we s h o w t h a t ry a n d p are n a t u r a l t r a n s f o r m a t i o n s . L e t / : A —> B b e a n a r r o w i n C. F o r ry, we h a v e Tf or) A = (rjQ o /)* = VB°f. o r) d e f n of e x t e n s i o n o f T A (KT2) T h u s , ry is a n a t u r a l t r a n s f o r m a t i o n . T o s h o w p is n a t u r a l , we h a v e Tf OPA = {VB ° / ) * = {VB ° / ) * ° 0 fJ-A d e f n of T id UvB°fyo\d y = defn of TA = p A (KT3) TA ((vBojyy = ( i d r B ° {VB o / ) * ) * )o(ryW)T (KT2) = ((idf ory = (id^B = id^B 0 = MB (VTB ° (VB = = HB°T(riB°fy p oT{Tf). T B B 0 0 (VTB o {r)B ° /)*))* (.VTB {j]B 0 0 /)*)*) (KT3) /)*)* 0 defn of p B T of T defn of defn B T h u s , p is a n a t u r a l t r a n s f o r m a t i o n . W e n o w s h o w t h a t t h e m o n a d laws are s a t i s f i e d . N o t e t h a t i d ^ : T A 2 TA a n d \d*TiA :T M/l 3 0 A -> T A. 2 f-TA = = —> T h u s , we h a v e id o id 2 (id* oid 2 y TA T T TA A (KT3) A - (id^)* = {id = ((id^or/TvOoid^,)* = = (i<*TA ( lTA o id* )y id* o (m-A o id* )* ( K T 3 ) TA o id* )* TA 0 (KT2) r TA TA TA 31 Chapter 2. Moggi's Monads = id^oT(id^) = fj, oTu. . A A To show that the proposed 3-tuple satisfies the second monad law, we compute: HA 0 VTA = id o rjTA T A = id™ = V* (KT2) (KT1) = {id A = ((^TA = (idf/i = \d o{n o A °VA)* T TA VTA) 0 (VTA 0 o VA)* TJA)*. TA (KT2) VA)T 0 (KT3) Since TJA is an arrow from A to TA, by the definition of the extension of T from a function to a functor given i n Theorem 2.3.1, we have T(T]A) = {VTA ° VA)*- Thus, . = i d ^ o Tr] = \I OTT) . defn of extension of T A A A Thus, the Kleisli triple gives rise to a monad. To show the converse, if we are given a monad (T, v,f), then we claim the corresponding Kleisli triple is (T, 77, _*), where: (i) T is the restriction of the endofunctor T to the objects of C and (ii) for / : A —> T73 E Ac, f* = HB (T /)- We show that this 3-tuple is indeed a Kleisli triple as follows. First, 0 1 rf A = HA o TT]A = id A, T by the second monad law. Thus, ( K T 1 ) holds. Next, we compute /* ° VA = {HB Tf) o TJA HB VTB f = id^B ° / 0 = 0 0 monad law 2 - /• Thus, we have ( K T 2 ) . Finally, we show that ( K T 3 ) holds by 9* °f* = = = (fic ° Tg) o ( / j o Tf) Mc ° HTC ° T g o Tf B 2 HC THC 0 0 T g o Tf 2 = = He o T{HC oTgof) {HC °Tgo /)* = (5*0/)*, monad law 1 32 Chapter 2. Moggi's Monads which implies that (T, TJ, _*) is a K l e i s l i triple. Therefore, there is a one-toone correspondence between Kleisli triples and monads. • This correspondence immediately gives us an alternative description for Kleisli categories, which is stated thus [8]: given a monad (T, n, n) i n a category C, assign to each object A 6 C a new object Ax and to each Carrow / : A —>'TB a new arrow f : AT —> Bj-. These new objects and arrows constitute the K l e i s l i category of the monad Cx when the composition of C^-arrows f : A —> B , g • B —> C is defined by g o f = (fie ° Tg o f) . Proofs of the right and left unit laws and the associative law can be summarized w i t h the following commutative diagrams, where f : A -* TB, g : B -> TC, h : C TD G A . Notice that the identity arrow for Cj- is {T]A) '• AT —* AxK K K T K T T K T K c k f K o (n ) f A Va = K A f K TB TB VTB TA (vc) K og K + TB h« °(g =g • TC B T Tvc T 2.3.1 C J^- of«) K K 2 2 °g )offK K K s T C TD 2 D PD P-TD PC TC = (h TD 2 TC -^-> TD Example Revisited: State Updates Using the correspondence given in Theorem 2.3.1, we can now reformulate our earlier example i n terms of monadic, rather than Kleisli, maps. Recall that for state updates, we define the endofunctor T to act on objects by TA = (A x S) . Thus, from the theorem we have s HA = id* : {(A x S) s TA x S) s -> {A x S) . s 33 Chapter 2. Moggi's Monads We can semantically interpret this map i n terms of state modifications by observing how this function behaves on input values. L e t / £ ((A x S) x S) ,f £ (Ax S) ,a £ A and s,s',s" £ S. Further, let f(s) = (/',s'} and f'(s') = (a,s"). Then, s s s p (f)(s) A = id (f)(s) = id A2(f(s)) TA1 T = id r A 2«/V» = i d ™ (/')(*') = / V ) = (a, a")T h a t is, p (f) is a computation which, on input s (a store), first computes a pair comprised of a new computation and a modified store, f(s) = {/', s'} and then outputs the value-store pair f'(s') = (a,s"). We note that i n the A-calculus, this would typically be denoted by id (f) = As : 5.eval(/s). The theorem likewise supplies definitions for r\ and for the action of T on a given function. Using these, we can verify that the monad laws hold. First, we show equality of the outputs returned by each of the functions \± ° f-TA and p o Tp on input ( f , s ) , for / £ T A = (((A x S) x S) x S) and s £ S. T o facilitate this, let / ' £ T A, f" £ TA, a £ A and s', s" £ S, w i t h A TA A 3 A s s s A 2 f(s) = (f',s') f'(s') = (f",s») f"(s") = (ay). We have: (AM ° W r v i ) i ( / ) ( * ) = = °id* 2 )i(f)(s) ( *TA [d i d T A T/li(idT2^i(/))(s) = id (id 2 (f)(s)) = id ^2(id 2^ (/(s))) = id ^ (id 2 = T TA2 T T 2 T 2 r A1 ((/ ,s'))) / j 4 2 i d ™ ( i d r ^ i (/')(*')) = id (f'(s')) = \d (f")(s") TA2 TA1 = (a,s"') 34 Chapter 2. Moggi's Monads and {fi oTfiA)(f)(s) A = id T A o (nTA = o\d )*)(f){s) TA id A2{{r)TA°id Y {f)(s)) T TA 1 = id A2{{VTA ° id )i{f'){s')) = idT^(r?^i(id^ (/'))(5 )) T TA / 1 = id ((id (f),s')) TA2 TM = id^ (/')(*') = i<W/V)) = id A2((f",s")) T = f"(s") = (a,s">). Thus, the first monad diagram commutes. Similarly, we compute the return values for [/.A ° VTA and \XA ° TTJA on (/", s") G (A x S) x 5 as follows: s (HA°rrrA)i(f")(s") (id^o^)^/")^") = = id ( (f"))(s") TA1 = VTA1 i d i M 2 ( r 7 i v i i (/")(*")) • id 2((f",s")) = TA = f(s") id T (f")(s") A and (HA°Tr, )(f")(s") A = (id* o Tr Hf")(s") lA TA (id TA o {ryiA ° VAY)\U"){ ") S ^TAI{{VTAOVAYIU")){S") id A2((VTA°AY(f")(s")) T id A2((VTA°VAh(f"(s"))) T id A2{(VTA T or ) (a)(s"')) M 1 ^TA2{VTA\{VA\{a)){s'")) ^TA2((VAl(a),s"')) VAi(a)(s"') 35 Chapter 2. Moggi's Monads = /"(«")• Thus, we have verified that the 3-tuple (T, n, fi) for state updates is a monad. 2.4 Strong Monads We are about to investigate the connections between our work thus far with the so-called "monads" of Haskell. T h e implementation of monads in Haskell was led by P h i l i p Wadler, whose work on the subject uses the theory of Moggi's notions of computations. We conclude this chapter by examining another category-theoretic concept used i n Moggi's work, which is technically required for Wadler's implementation. In Section 1.4.1, we commented that the A-calculus served as a foundation for many functional languages, such as Haskell. Since Moggi's a i m was to provide a categorical semantics of computations for studying languages based on typed A-calculi, his theory had to be rich enough to be able to interpret A-terms. We have already noted the equivalence between the A-calculus and cartesian closed categories, but a monad over a cartesian closed category does not have sufficient structure to interpret A-terms in a computational model [10]. One problem that arises involves a form of local definition that is used in functional languages. A local definition is the introduction of an association between a name and value for use within an expression, and the form we are interested in here is called the let-constructor. T h e let-constructor is equivalent to function application, and we briefly motivate its definition before explaining its relevance i n Moggi's computational A-calculus. A typical A-function applied to an argument can be represented as A (name), (body) (argument), where (name) is a bound variable, and (body) and (argument) are A-expressions. The function evaluation requires that we replace a l l free occurrences of (name) in (body) with (argument), prior to evaluating (body). Equivalently, we can think of associating (name) and (argument) throughout the evaluation of (body). We notationally capture this equivalent conception of function evaluation using the let-constructor [9]: let (name) = (argument) in (body). 36 Chapter 2. Moggi's Monads As referred to in Section 2.2, the let-constructor is used to define the extension of a function / to /*; specifically, given a notion of computation T, types A, B, a program / : A —> T73, and a denotation of a program c £ we define the extension of / by /*(c) =' (let a; = c in f(x)). Despite its importance, the interpretation of an expression such as (let x = e in e') is problematic in a category with finite products and a monad, when e' has free variables aside from x, as is shown in [10]. Essentially, the problem is that even in a category with cartesian products and a monad, there is not enough structure to transform a value-computation or computationcomputation pair into a computation of a pair. A similar problem arises in the interpretation of functional types, in which it is necessary to obtain a computation that computes a pair from a given pair of computations. We can resolve these problems by requiring our monad to be a strong monad—a monad along with a natural transformation that is defined as follows. A strong monad over a category C with finite products is a monad (T, rj, p) together with a natural transformation t with value t ,B from A x TB to T(A x B), such that the following diagrams are commutative A [11]: 1 x TA - ^ - > T ( l x A) TA TA tAxB.C {A x B) x TC T((A xB)xC) <*A,B,TC I Ax(Bx TC) i d Ax B id A X " x t B - , A x T(B x C) id Ta ,B,C A T(A x (B x C)) c Ax B A x l VAxB T) B tA,B AxTB T{A x B) id VAxB A TB 2 x T(A x TB) ^ > T (A x B) 2 37 Chapter 2. Moggi's Monads where r • 1 x A —> A and CXA,B,C '• (A x B) x C —> A x (B x C) are natural transformations. The given natural transformation t B acts to transform a value-computation pair into a computation of a pair of values i n the following manner: A A l t ,B • AxTB A (a,c) -> T(A x 73) ' ^ (let y = c in (a,y)). For example, i n the case of state updates, the natural transformation t ,B ( ! c) is a function which, given a store s returns the computation (let (b, s') = Q A c(s) in ((a,b),s')) [12]. Since Haskell is based on the A-calculus, the applied category-theoretic concepts we explore must be compatible w i t h the interpretation of A-terms. Hence, the "monads" that we will examine in Haskell are actually strong monads. 38 Chapter 3 Wadler's Monads B y now, it should not be a surprise that in Haskell, monads are data types that are used to represent computations. If m is a monad, then an object of type m a represents a computation w i t h return type a. T h e definition of a particular monad will reflect the sort of side effect that the monad is trying to computationally mimic. Its use will ensure that type checker can verify that the side-effects are managed consistently throughout the program. In this way, monads are a means to encode the behaviour of computations, as opposed to values, in Haskell's type system. Before we examine the Monad class declaration, let us informally consider a motivating example. 3.0.1 The Maybe Monad Consider a function or operation that is undefined at one or more values: for example, division. In a composition of functions in which at least one of the functions involves division, an attempt to divide by zero means that the computation can not be completed. For instance, consider the simple function f :: I n t -> I n t -> I n t -> F l o a t f x y z = (1 / x) + 1 / (y + z) O n initial inspection, it appears that before this function can be evaluated we must check to see if both of the terms are actually integers. In general terms, to add error handling to a program written in a purely functional language, the result type of the program would have to be modified to include error values, and failures would have to be checked for and handled appropriately at each recursive call in the program. We can avoid such failure checking, however, by carrying out such an operation in the Maybe monad. In Haskell, Maybe is the name of both a data type and a monad, w i t h the declaration of the former given as follows: d a t a Maybe a = J u s t a I N o t h i n g 39 Chapter 3. Wadler's Monads A s a technical aside, we note the difference between 'data' declarations and 'type' declarations. The former introduces a new data type (for example, Maybe) and new constructors for values of that type ( J u s t and Nothing), while the latter defines a synonym for a pre-existing type. Either kind of declaration can be used in the definition of a monad. In the Maybe monad, the validity of an expression such as that given above can be implicitly checked and used to determine the value that is returned: if there are no undefined terms in the expression, the function will return a Maybe value of type J u s t , whereas an erroneous evaluation will return N o t h i n g . For example, if our function above is evaluated using an implementation of the Maybe monad, then f 1 0 1 would return J u s t 2 and f 1 -1 1 would return N o t h i n g . The benefit is that this function can now be composed w i t h other functions and, in the case of failure, the "error" will cascade through the program in a safe and predictable way. Note that i n a series of computations that included the latter evaluation of f above, any computations that did not involve f would evaluate as normal, and computations that d i d involve f would now return N o t h i n g . In general, the Maybe monad provides a means to combine a chain of computations, each of which may return N o t h i n g , by terminating the chain early if any step does in fact return N o t h i n g . 3.0.2 Definition of a Monad In the context of Haskell, we define a monad to be a 3-tuple (M, r e t u r n , >>=), where M is a type constructor and r e t u r n and » = (read " b i n d " ) are polymorphic functions w i t h type signatures as shown in the (minimal) class declaration below: c l a s s Monad m where r e t u r n :: a -> m a ( » = ) : : m a -> (a -> m b) -> m b Let us examine these methods in detail. For use as a model monad by which to exemplify the concepts discussed, we provide the instance declaration of the Maybe monad: i n s t a n c e Monad Maybe where return a = Just a N o t h i n g >>= f = Nothing J u s t x >>= f = f x 40 Chapter 3. Wadler's Monads The return Operator A n y compositions w i t h i n a monad must involve monadic-typed arguments. Informally, the purpose of the r e t u r n operator is to get us into the monad. More specifically, it converts a value into a computation that returns a value of the same t y p e — t h a t is, into a corresponding monadic type. T h e technical means by which this is done will depend on the particular monad under consideration, but i n a general sense, r e t u r n carries a value to the "simplest" corresponding monadic-typed value, while leaving the original value otherwise unaltered. A s shown in the instance declaration, r e t u r n is defined to be J u s t i n the Maybe monad. The ( » = ) Operator For a monad M, T h e infix operator (>>=) provides a way to apply a function of type a -> M b to a computation of type M a. Intuitively, we can think of the operation (M a) >>= f i n the following way. First, the monadic structure of the argument M a is removed to unveil any number of values of type a inside. T h e actual means of doing this is specific to the definition of ( » = ) for the particular monad under consideration. These non-monadic values can then be passed to f one at a time, with each application of f to a value of type a producing a corresponding monadic element of type M b. Next, the monadic structure of each of these f-output elements is removed, yielding a collection of elements of type b. Finally, the monadic structure is reassembled over this collection (in a way that must be defined specifically for the particular monad M under consideration), producing a single monadic element of the correct type, M b. We may refer to the process in its entirety as "binding M a to f." A s an example of this process, consider the Maybe monad, i n which we have ( J u s t x) » = f = f x, for x : : t and f : : a -> Maybe a. T h a t is, the monadic layer of J u s t x is "removed" and just the value x is unveiled. The value x is then passed to f : : a -> Maybe a, which returns a Maybetyped value (either J u s t y for some y, or Nothing). A s is also shown i n the instance declaration, N o t h i n g >>= k = N o t h i n g . This is consistent with the intuitive idea that no value is extracted from N o t h i n g , so nothing is passed to k and no output value is produced. Each instance of the monad class must satisfy 3 laws i n order to behave properly in the program. Note that for user-defined monads, it is up to the user to ensure that the laws are satisfied, since it is impossible for the type 41 Chapter 3. Wadler's Monads checker to do so [6]. T h e statement of the laws makes use of Haskell syntax that is based on shorthand A-calculus notation; namely, an expression of the form \ x —•> e(x) explicitly defines the function that maps a variable x to an expression e(x), w i t h e(x) dependent on x. T h a t is, we code the A-calculus expression Ax : e(x) i n Haskell as \ x -> e ( x ) . For example, the expression \ x -> s i n x is similar to defining the function e(x) = s i n x i n standard mathematical notation, except that we do not need to explicitly give the function defined by the former expression a name (such as 'e' i n the latter). T h e Haskell notation is helpful to use i n conjunction With expressions involving more than one occurrences of the (infix) operator ( » = ) , which accepts a monadic element on the left and a function on the right. W i t h this, the Haskell monad laws are: (Ml) (M2) (M3) return x » = f = f x n » = return = n (n » = f ) » = g = n » = (\x -> (f x » = g ) ) , where x : : a, n : : m a, f : : a -> m b and g : : b -> m c. T h e first two monad laws ensure that r e t u r n preserves all information about its argument. T h a t is, for ( M l ) , if we define a trivial computation r e t u r n that accepts a value a and returns a, and bind this computation w i t h another computation f, then the result is the same as if we had applied f to a directly. For (M2), if we perform the computation n and then pass the result to the trivial computation r e t u r n , the result is the same as if we had just performed the computation n. T h e third law (M3) is essentially an associativity law for (>>=), which accounts for the type signature of the operator. Given this definition of a monad and the associated laws, we raise our primary question of interest: What is the precise connection between monads as defined and used in Haskell, and the category theoretic definition of monads discussed in Section 1.3? 3.1 Examples We present further examples of monads in Haskell, so as to build an intuition of how they work. 42 Chapter 3. Wadler's Monads 1. The simplest monad is the identity monad, w i t h the triple defined as: type Id a return return a (»=) = :: = :: x »= f a a -> I d a a I d a -> (a -> I d b) -> Id b f x. Here, I d is the identity function restricted to types (implicitly), r e t u r n is the identity function, and ( » = ) simply evaluates f x. Were this monad implemented, no non-trivial computation would take place along w i t h the regular evaluation of the function. 2. We have already investigated in detail the theory behind state updates as monads, and we continue this example w i t h a Haskell implementation. To begin, we give a simple Haskell evaluator whose task is to compute a quotient of two integers [17]. We define a data type Term by: d a t a Term = Con I n t I D i v Term Term, where a term is to be either a constant Con a for an integer a, or a quotient D i v t u of terms t and u. The evaluator is then simply: eval e v a l (Con a) e v a l ( D i v t u) :: = Term -> I n t a eval t 'quot' e v a l u. A s indicated by the type signature, the function e v a l accepts a term and returns an integer. If the term is a constant (that is, an integer rather than an expression involving a division), then that constant is returned. If the term is a quotient, then the expressions comprising the numerator and denominator are: (i) evaluated as subterms; (ii) the division is computed once the subterms are reduced to constants; (iii) the quotient is returned (the function ' q u o t ' is integer division). For example, we have eval(Div(Div(Con 11904)(Con 2 ) ) ( C o n 3)) = 1984, W i t h this as an evaluator template, we can consider making modifications that mimic side-effects. Say we want to keep track of the number of divisions performed during a calculation made by e v a l using state updates. We begin w i t h Moggi's notion of computation and declare two data types: 43 Chapter 3. Wadler's Monads t y p e M a = S t a t e -> t y p e S t a t e = Int. (a, State) Certainly, the evaluator can be adapted to the task of counting operations as it calculates, without the use of monads. However, the alteration is tedious because for each call of the evaluator, the current state must be passed in, and the updated state must be extracted from the result and passed on appropriately. The modified evaluator can be coded to require an additional input integer that is to be thought of as a counter variable, w i t h sO being the initial state. Divisions are carried out using recursive substitutions (until a constant is reached), where each recursive step also involves substituting an incremented state variable value for the previous state variable value. This can be done using the let-constructor syntax in Haskell. Haskell code involving l e t statements is to be interpreted as the obvious substitution operation: l e t x — x' in E(x) means substitute x' for x in expression E. Such a modified evaluator could look like: eval e v a l (Con a) sO e v a l ( D i v t u) sO :: = = Term -> I n t -> M I n t (a,sO) l e t ( a , s i ) = e v a l t sO l e t (b,s2) = e v a l u s i (a ' q u o t ' b, s 2 + l ) in in To demonstrate its operation, an input of eval(Div(Div(Con 11904)(Con 2 ) ) ( C o n 3)) 0 corresponds to t = D i v ( C o n 11904) (Con 2 ) , u = Con 3 and sO = 0. Bearing in mind Haskell's lazy evaluation, this computation proceeds as: - - let (a, s i ) = eval D i v ( C o n 11904)(Con 2) 0 in let (b, s2) = eval (Con 3) s i in (a 'quot' b,s2 + 1) let (a, s i ) = eval D i v ( C o n 11904)(Con 2) 0 in let (b,s2) = (3, s i ) in (a 'quot' b, s2 + 1) - l e t (a, s i ) = eval D i v ( C o n 11904)(Con 2) 0 in (a 'quot' 3, s i + 1) 44 Chapter 3. Wadler's Monads - (a', si') = eval (Con 11904) 0 in let (b',s2') = eval (Con 2) s i ' in (a''quot' 6',s2' + l)) in (a 'quot' 3, s i + 1) let et (a, s i ) = (let (a, s i ) = (let (a', si') = eval (Con 11904) 0 in let (b',s2') = {2, si') in (a' 'quot' b',s2' + 1)) in a 'quot' 3, s i + 1) (a', si') = eval (Con 11904) 0 in (a' 'quot' 2, s i ' + 1)) in a 'quot' 3, s i + 1) et (a, s i ) = (let et (a, s i ) = (let (a', si') = (11904,0) in (a' 'quot' 2, s i ' + 1)) in a 'quot' 3, s i + 1) (a, s i ) = (11904 'quot' a 'quot' 3, s i + 1) et 5952 'quot' 2,0 + 1) in 3,1 + 1) 1984,2) As desired, the return of the computation is the ordered pair that has the result of the division calculation as its first component and the number of divisions performed as its second. To implement monadic abstractions in the evaluator, we re-define the evaluation function in terms of the r e t u r n and >>= operations. As offered in [17], we have: eval : : Term -> M I n t e v a l (Con a) = e v a l (Div t u) = return a e v a l t >>= \ a -> eval u » = \b -> r e t u r n a 'quot' b, noting that the compiler implicitly inserts brackets into the above expression as follows (that is, the compiler automatically interprets the expression with the following precedence):' eval (Div t u) = ((eval t) »= (\a —> ((eval u) »= (\b —+ (return (a 'quot'.6)))))). 45 Chapter 3: Wadler's Monads T h i s monadic evaluator can be verbally described as follows: the type signature indicates that the evaluator accepts a term and performs a computation w i t h return type I n t . To evaluate a constant term, simply return it. A computation of the form (Div t u) is performed by first computing t (which may be a constant or an expression involving the D i v operator) and binding a to the result, then computing u and binding b to the result, and lastly returning a 'quot' b. Informally, the expression t is simplified to a constant, this result being a monadic type M Int. The integral value contained in the monadic-typed term is extracted from the monad and passed as an argument to the function \ a -> e v a l u >>= \ b -> r e t u r n a ' q u o t ' b. In standard mathematical notation, if we call the extracted integral value c and label the preceding function as /(a), then the last step amounts to removing the monadic layer from the output of eval t to obtain c, and then computing /(c). Similarly, the expression e v a l u >>= \ b -> r e t u r n a ' q u o t ' b simplifies u to a constant element in M I n t and passes it as an argument to a function that maps input b to the (monadic-typed) constant a ' q u o t ' b. Essentially, then, all that is happening is that the given expressions t and u are simplified to constants and the quotient of these constants is returned. The >>= operation allows us to easily pass along the simplified results, accounting for the monadic types as required. Having offered a general monadic (division) evaluator, we now define the triple specific to the case of state updates. In the following code, r e t u r n a returns the computation that accepts an initial state x and returns the pair comprised of the value a and the unchanged state variable x. The call m » = f first performs the computation m in the initial state sO. T h i s computation yields the value a and updated state s i . Then, the computation f a is performed in state s i , which returns the pair comprised of the value b and the final state s2. The triple is given as: type M a type State return return a (»=) m >>= f = = :: = :: = S t a t e -> ( a , S t a t e ) Int a -> M a \ s 0 -> ( a , sO) M a -> (a -> M b) -> M b \ s 0 -> l e t ( a , s i ) = m sO i n l e t (b,s2) = f a s i i n (b,s2) 46 Chapter 3. Wadler's Monads To complete the task of counting the number of divisions that are performed during a calculation, we define a method count by: count count :: = M() \ s -> (0,s+l) This function increments the state by 1 and returns the empty value (). To count the number of divisions as they occur i n a calculation, we simply replace the call r e t u r n a ' q u o t ' b by count >>= \ _ -> r e t u r n a ' q u o t ' b i n the original monadic evaluator [17]. 3. T h e monadic evaluator from the preceding example can also be employed to demonstrate use of the monad laws ( M 1 ) - ( M 3 ) . We modify it slightly so that it performs a multiplication, rather than division operation: d a t a Term = Con I n t I M u l t Term Term eval : : Term -> M I n t e v a l (Con a) = return a e v a l (Mult t u) = e v a l t >>= \ a -> e v a l u >>= \ b -> r e t u r n a * b and use the monad laws to show that multiplication as performed by the evaluator is associative; that is, eval M u l t t ( M u l t u v) = eval M u l t (Mult t u) v. We begin by expanding the left-hand side using the definition of e v a l : = = eval (Mult t (Mult u v)) eval t »= (\a —> eval (Mult u v) >>= (\x —> return a * x)) eval t >>= (\a —* (eval u >>= (\6 —•» eval v »= (\c —> return b * c))) >>= (\x —> return a * x)). Law (M3) implies that the order of parentheses does not affect the output of the computation: — eval t »= (\a —•> eval u >>= (\b —> eval v »= (\c —> return 6 * c >>= (\x —» return a * x)))), and by ( M l ) , we have the return of the evaluator on input Add t (Add u v): = eval t >>= (\a —> eval u >>= (\6 —> eval v >>= (\c —> return a * (b * c)))). 47 Chapter 3. Wadler's Monads Similarly, for the right-hand term we have: = = = = eval (Mult (Mult t u) v) eval M u l t t u »= (\x —> eval v »= (\c —* return x * c)) (eval t >>= (\a —* eval u >>= (\b —> return a * b))) >>= (\x —> eval u >>= (\c —•> return x * c)) eval t »= (\a —> eval u »= (\b —> return a* b >>= (\x —> eval i> >>= (\c —> return x * c)))) eval t »= (\a —* eval u »= (\b —> eval v >>= (\c —> return (a * 6) * c))). Thus, associativity of integer multiplication implies associativity of the evaluator multiplication. 4. Lists were previously introduced in the discussion on the F u n c t o r type class, where we mentioned that [a] denotes a list of elements of type a. Some additional pertinent syntax includes the expression of a list in the form x : [ x s ] , where x is the first element of the list (called the 'head'), and xs is the remainder of the list (called the 'tail'). For example, [1,2,3] = 1:[2,3] = 1:2:3: [ ] . T h i s notation is tailored to facilitate recursive list definitions and computations. A s well, the operation (++) : : [a] -> [a] -> [a] is defined as the concatenation of the two input lists. W i t h this, we claim that the list type constructor, which we denote as L, forms a monad when combined w i t h the following maps: return return a (»=) [] » = f x:xs » = f :: = :: = = a -> [a] [a] [a] -> (a -> [b]) -> [] f x ++ (xs » = f ) [b] Here, r e t u r n maps an element x of type a to the list that contains the (single) element x. Note that the instance definition of ( » = ) for this proposed monad in terms of concatenation leads us to assume that f x is finite. W i t h this definition, the computational interpretation of the expression x : xs >>= f begins by passing the first element of the list x : x s (namely, x) to the function f : : a -> [b], which outputs a list of elements of type b. T h i s list f x is to be concatenated w i t h the list x s » = f, which must be recursively simplified according to the previous description until the final concatenation of some list w i t h the empty list occurs (that is, once [] » = f = [] is computed i n the recursion). 48 Chapter 3. Wadler's Monads Let us prove that lists do i n fact satisfy ( M 1 ) - ( M 3 ) . For ( M l ) , let x : : a and f : : a -> [b]. Then, return x >>= f = [x] » = f = = = = x: f f f [] » = f x ++ ([] » = x ++ [] x, f) defn defn defn defn defn of return for lists of (: ) of ( » = ) for lists of ( » = ) for lists of (++) as required. For (M2), we let m = x:xs be a list with n + 1 elements and induct over the length of m. Taking m to be the empty list as the base case, we have [] » = return = []. Now, assume that (M2) is true for a list xs of length n; that is, assume xs >>= return = xs. Then, x:xs >>= return = = = (return x) ++ (xs >>= return) [x] ++ xs x:xs. Thus, by induction (M2) holds for lists. Before proving (M3), we first prove the following small proposition, which states that ( » = ) is right distributive over list concatenation (++). Proposition 3.1.1. Given a finite list 1 : : and a function f : : a -> [b], we have (1 [a], a list j : : [a] ++ j) » = f = (1 » = f) ++ (j » = f). Proof. To show that the proposition holds for 1 = x: xs of arbitrary, finite length, we induct over the length of 1 and let j be arbitrary. In the base case, we have ( • ++ j) » = f = = = j »= f • ++ (j » = f) ( • » = f) ++ (j » = f), as desired. Next, assume as the induction hypothesis that (xs ++ j) » = f = (xs » = f) ++ (j » = f), where xs is a list of length n. Then for 1 = x:xs, we have 49 Chapter 3. Wadler's Monads (x:xs ++ j ) » = f = = = f x ++ ( ( x s ++ j ) » = f ) f x ++ (xs » = f ) ++ ( j » = f ) ( x : x s » = f ) ++ ( j » = f ) , by the associativity of (++). Thus, the proposition holds by mathematical induction. • Finally, to prove (M3), let g :: We have = = = = (x:xs (f x (fx (fx x:xs b -> [ c ] , and m be as i n (M2) above. » = f) » = g ++ (xs » = f ) ) » = g >>= g) ++ ( ( x s » = f ) » = g) proposition » = g) ++ (xs » = (\a -> ( f a » = g ) ) ) » = (\a -> ( f a » = g ) ) , defn of ( » = ) for lists as desired. We have thus shown that ( L , r e t u r n , ( » = ) ) is a monad. This proof is also a nice demonstration of the ease of synatactic manipulation and equational reasoning offered by purely functional languages! Thus far, each monad we have considered has corresponded to a computation type, so it is natural to consider if the same is true for lists. Indeed, we can think of lists as a means to model relations, where a computation of type [a] offers one choice for each element in the list. Thus, modifying a function f : : a -> b to its monadic equivalent f : : a -> [b], entails offering a choice of results for each argument [17]. 50 Chapter 4 Equivalence Thus far, we have examined: i) Moggi's incorporation of category theory into programming semantics via his category-theoretic semantics of computations, and ii) Wadler's implementation of Moggi's theoretical work in structuring the particular functional language Haskell. W h i l e the theory of i) was motivated using K l e i s l i triples, Theorem 2.3.1 gave the explicit connection between K l e i s l i triples and the "purer" category-theoretic concept of monads. We are now ready to give a similarly explicit connection between ii) , which is centred around the maps r e t u r n and » = , and the standard formulation of monads in category theory. 4.1 Preliminary Connections The first connection is easy to identify. Recall the type signature of r e t u r n for any (Haskell) monad: return :: Monad m => a -> m a, where a is any object in H a s k . This is exactly analogous to the domain and codomain of the map 77 that must be defined for any (category-theoretic) monad M in a category C: i] :A-4 A MA, where A is an object in C. We denote this analogy by r e t u r n ~ 77. Unfortunately, no such obvious comparison between » = and yx exists. There is, however, a Haskell function called j o i n that is defined in the Monad module. It has the following type declaration: join :: Monad m => m (m a) -> m a, which is indeed analogous to the domain and codomain of H : HA :MA 2 -> MA. T h a t is, j o i n ~ a.51 Chapter 4, Equivalence The action of j o i n is to remove one layer of nesting of a monadic value. A s is the case for r e t u r n and » = , the definition of j o i n is particular to the monad under consideration. In the case of Maybe, for example, the definition of j o i n is given by: join j o i n Nothing j o i n (Just Nothing) j o i n (Just (Just x)) :: = = = Maybe (Maybe a) -> Maybe a Nothing Nothing Just x Since j o i n (along with r e t u r n ) is another direct link between Haskell functions and category-theoretic monad maps, if we can display an explicit connection between j o i n and >>=, then we will have the desired translation between the Haskell and category-theoretic formulations of monads! Continuing w i t h the Maybe example, recall that one equation i n the original instance declaration of the Maybe monad is: J u s t xO » = f = f xO Consider the substitutions XQ = Just x and / = id in the above. We have: Just ( J u s t x) >>= i d = = = i d ( J u s t x) Just x j o i n (Just (Just x)). This gives a first relationship between >>= and j o i n i n the case of the Maybe monad. In fact, this relationship is the Maybe-specific case of the following general relation i n Haskell. Given a monad m and a monadic element m (m x ) , we have: j o i n m (m x) = (m (m x ) ) » = i d . T h a t is, removing one monadic layer using the j o i n operation is exactly the same as extracting the element m x from the (outer level of the)monad and leaving it unchanged. Therefore, for any Haskell monad m defined using the >>= function, we can obtain the definition for j o i n corresponding to m v i a the above equality. To establish an equivalence between these functions, it remains to provide a definition for >>= i n terms of j o i n . There is a small and obvious catch, however. The function >>= accepts two arguments, one of which is a function, whereas j o i n is a function from a double-layered monadic object to a monadic object. Also note that every (category-theoretic) monadic map 52 Chapter 4. Equivalence \i is a natural transformation, which means it is defined on objects and returns a function. Clearly, if j o i n is to be used to define a monad, we must integrate functions into its defined behaviour. To do this, we recall the method fmap that was introduced in Section 1.5.1 i n conjunction w i t h the Functor type class; specifically fmap is a function that must be defined for any instance of Functor, and it provides a means for functions defined outside a set of functor-type elements to act on functor-type elements. L e t us assume a (presently unjustified) consistency between Haskell's class names Functor and Monad, and the category-theoretic definitions of those two terms: namely, that any instance of Monad must also be an instance of Functor. Then, fmap :: a -> b -> f a -> f b can be defined for both functors and monads f. Using fmap w i t h this additional context, we have the following Haskell relation between » = and j o i n : x >>= f = j o i n (fmap f x ) . T h a t is, for a particular monad m, if we are given a definition for j o i n and fmap, then we can obtain the definition for >>=. For example, assume the three equations constituting the Maybe definition of j o i n given above. In addition, assume the following Maybe-definition for fmap: fmap fmap _ N o t h i n g fmap f ( J u s t x) :: = = a -> b -> Maybe a -> Maybe b Nothing J u s t f x. We show that these five equations imply those two involving » = given in the original instance declaration of the Maybe monad as follows. Let f : : a -> Maybe b. To show the first equation: N o t h i n g >>= f = = = j o i n (fmap f N o t h i n g ) j o i n Nothing Nothing. For the second equation, since J u s t x >>= f = j o i n (fmap f ( J u s t x ) ) , 53 Chapter 4. Equivalence we consider the following two cases: Case 1: f x = N o t h i n g We have, join (fmap f ( J u s t x ) ) = j o i n (Just = = Nothing f x. (Nothing)) Case 2: f x = J u s t y In this case, join (fmap f ( J u s t x ) ) j o i n (Just Just y f x. (Just y)) Therefore, j o i n (fmap f ( J u s t x ) ) — f x, and the definitions for fmap and j o i n are sufficient to provide an instance declaration for Maybe. 4.1.1 Comparative Motivations for >>=, join, and _* We have now given the equivalence between >>= and j o i n . Using this equivalence along w i t h Theorem 2.3.1, we can easily see the relationship between >>= and _*. Let m be a monad, x be a monadic element of type x : : m a, and f be a function of type f : : a -> m b. Then, as we have seen, x » = f = j o i n (fmap f x ) . Recall that a category-theoretic functor is defined on both objects and morphisms, and that Haskell's fmap can be thought of as corresponding to the part of a functor that is defined on morphisms. Therefore, if we think of m as a monad M i n the category-theoretic sense, we have an analogy between fmap f x and (M(f))(x). Furthermore, given the type signatures of x and f, we see that fmap f x has return type m (m b ) . Thus, the map j o i n is analogous to p and we have: B join (fmap f x) ~ iM (Mf(x)) B = = (p oMf)(x) B /*(*), where the last equality follows from Theorem 2.3.1. We thus have the following relationship between >>= and _*: X » = f ~ f*{x). 54 Chapter 4. Equivalence Now that we have explicitly established equivalancy relationships between » = , j o i n (or fi), and _*, the question remains as to why these different operations were chosen i n their respective contexts. We have already seen that K l e i s l i triples, w i t h _* as a means of composing operations, were nicely justifiable from a computational standpoint. In fact, the aforementioned let-constructor (Section 2.4) corresponds directly to composition i n a given K l e i s l i category [10]. T h e let-constructor similarly provides a motivation for the characterization of >>=. A n expression of the form x >>= \ a —> y, where x and y are expressions, is computationally described by the sequence of actions: perform computation x, bind a to the resulting value, and then perform computation y. This is analogous to the computational interpretation of the expression let a = x in y [17]. T h e » = function thus provides a means to implement impure/imperative sequential-style reasoning in a pure functional language. Finally, understanding the connections between these functions w i t h /v, is useful, because the category-theoretic formulation of monads is given solely in terms of functors and natural transformations. T h i s formulation is easier to reason about, and integrate into the established abstract setting of category theory. 4.2 Monad Laws To complete our exploration of the equivalence between monads i n Haskell and category theory, it remains to justify the use of fmap as a component in the characterization of » = in terms of j o i n . To do this, we propose a new set of monad laws as well as a definition for fmap i n terms of >>=. We then prove the equivalence between the new laws and the original laws (M1)-(M3). 4.2.1 Alternate Laws We offer and discuss the following new set of monad laws involving the polymorphic Haskell functions fmap, i d , r e t u r n , and j o i n , and arbitrary functions f : : a -> b, g : : b -> c [17]: (LI) (L2) (L3) (L4) (L5) fmap fmap fmap fmap join id = id (f . g) = fmap f . fmap g f . return = return . f f . j o i n = j o i n . fmap (fmap f ) . fmap j o i n = j o i n . j o i n 55 Chapter 4. Equivalence (L6) (L7) j o i n . fmap r e t u r n = i d j o i n . return = i d The astute reader may immediately notice similarities between these laws and the mathematical relationships pertinent to category-theoretic monads. We have already introduced the functor laws ( L I ) and (L2) as being analogous to the defining equations for a category-theoretic functor in Section 1.5.1. To make the analogy explicit, let F : C —> C be a functor, M : C —* C be a monad, x 6 Oc, and / : a —> b, g : b —> c € Ac• We have: fmap i d = i d fmap ( f . g) = fmap f . fmap g ~ ~ F(id ) = idp F{f o g) = F(f) x x o F(g). It is similarly easy to see that (L3) and (L4) correspond to the naturality conditions for rj : I —* M and p, : M —* M, respectively; that is, for any objects a, b € Oc, and any map between those objects / : a —> b £ Ac, we have: 2 fmap f . r e t u r n = r e t u r n . f fmap f . j o i n = j o i n . fmap (fmap f ) ~ ~ Mf o rj = r/b (Mf) ° Ma = Mb ° a Mf 2 We expect the last three laws to correspond to the category-theoretic monad laws given by the standard commutative diagrams for monads. However, on examining (L5), it looks as though we should have: ? j o i n . fmap j o i n = j o i n . j o i n ~ p o Mp = p o p. Not only does the right-hand side of the similarity fail to correspond to the first category-theoretic monad law, but the composition p o p is not even defined! The apparent inconsistency is actually caused by the polymorphism of j o i n . T h e function j o i n can accept any object of the form m ( m a ) , i n cluding the case when a itself is a monadic type (for example, if a = m b). Thus, while it is important to mathematically distinguish between the functions p and pM, Haskell's polymorphic type system allows the name j o i n to stand for more than one function, each being technically distingusihed by the domain it is defined on. A n exemplary demonstration of (L5) using the list monad L helps to clarify the issue. In Section 1.5.1 we noted that, i n the case of lists, fmap is equivalent to map. In Section 3.1 we gave the definition of » = for lists, which we can now use (informally) to recover j o i n as follows. Let 10, 11, 12,... , l n be lists of the same type, and form the list of lists [10, 1 1 , . . . , l n ] £ L . Then, 2 56 Chapter 4. Equivalence join [10, 1 1 , . . . , In] = = [10, 1 1 , . . . , In] » = i d i d 10 ++ [11,12,..., In] = 10 ++ (11 ++ (12 ++ »= (••'• ++ id In))---). A Haskell expression involving the concatenation of more than two lists is typically written in terms of the operation cone a t :: [ [ a ] ] -> [a]. The last equation in the above derivation is equivalent to 10 ++ (11 ++ (• • • ++ I n ) ) - • •) = c o n c a t ( 1 0 , 1 2 , . . . , l n ) . Thus, the instance of j o i n defined for lists is equivalent to concat. G i v i n g the instance (L5) in terms of fmap and j o i n for lists, we have: c o n c a t . map c o n c a t = c o n c a t . concat. Recall that the type signature of map is map : : (a -> b) -> [a] -> [b]. Taken with the type signature of concat, this means that the type signature of the inner function on the left-hand side of the preceding equation is map c o n c a t : : [ [ [a] ] ] -> [ [a] ] . The return type of this function matches the input type of c o n c a t , so that the type signature of the composed function is c o n c a t . map c o n c a t :: [ [ [ a ] ] ] -> [a]. For example, the action of the left-hand side function can be demonstrated on the following "list of lists of lists": = = = c o n c a t . map c o n c a t [ [ [ 1 , 2 ] , [ 1 , 2 ] , [ 3 , 4 , 5 ] ] , [ [ 1 , 3 ] , [ ] ] ] c o n c a t [concat [ [ 1 , 2 ] , [ 1 , 2 ] , [ 3 , 4 , 5 ] ] , c o n c a t [ [ 1 , 3 ] , [ ] ] ] concat [ [ 1 , 2 , 1 , 2 , 3 , 4 , 5 ] , [ 1 , 3 ] ] [1,2,1,2,3,4,5,1,3] The apparent problem w i t h the similarity for (L5) arose w i t h respect to the composition j o i n , join </> u, o \±M, but we can see in the list example that the " M " — o r L in the case of l i s t s — i s accounted for i n the polymorphism of c o n c a t . Explicitly, a list of lists [ [a] ] 6 L is equivalently a list [b] G L, where b has type [a]. A s such, the compiler would recognize an element of [ [ [ a ] ] ] as simply a list and allow the polymorphic function concat to accept such an element as an input w i t h the correct type. Thus, the map c o n c a t . c o n c a t also has type c o n c a t . c o n c a t : : [ [ [ a ] ] ] -> [a] and acts on the previously given member of L as follows: 2 3 = = concat . concat [ [ [ 1 , 2 ] , [ 1 , 2 ] , [ 3 , 4 , 5 ] ] , [ [ 1 , 3 ] , [ ] ] ] concat [ [ 1 , 2 ] , [ 1 , 2 ] , [ 3 , 4 , 5 ] , [ 1 , 3 ] , [ ] ] [1,2,1,2,3,4,5,1,3], 57 Chapter 4. Equivalence which matches the evaluation of concat . map c o n c a t on the same list. This reasoning regarding polymorphism for c o n c a t generalizes to j o i n and similar reasoning applies to r e t u r n . Thus, the analogies between ( L 5 ) (L7) and the category-theoretic monad laws do, i n fact, hold: j o i n . fmap j o i n = j o i n . j o i n j o i n . fmap r e t u r n = i d join . return = i d 4.2.2 ~ ~ ~ fx o Mp — fj, o fiM fi o Mr) = i d /i o 7 7 M = id. Law Equivalence We now show the equivalence of laws (L1)-(L7) w i t h laws ( M 1 ) - ( M 3 ) , given in Section 3.0.2. First, we assume that laws (L1)-(L7) hold, as well as the previously given definition of » = i n terms of j o i n . To prove the laws imply ( M l ) , we have: r e t u r n x >>= f = = = = = j o i n (fmap f ( r e t u r n x ) ) j o i n ( r e t u r n (f x)) ( j o i n , r e t u r n ) (f x) i d (f x) f x. (L3) (L7) For (M2), we have: n >>= r e t u r n = = = = j o i n (fmap r e t u r n n) ( j o i n . fmap r e t u r n ) n id n n. (L6) Finally, we prove (M3), recalling that any single-variable function h can equivalently be expressed w i t h Haskell syntax as either h or \x-> h x. We have: = = = = = = — = = = (n » = f ) » = g ( j o i n (fmap f n ) ) >>= g j o i n (fmap g ( j o i n (fmap f n ) ) ( j o i n , fmap g . j o i n ) (fmap f n) ( j o i n . j o i n , fmap (fmap g ) ) ( f m a p f n) ( j o i n . j o i n . fmap (fmap g) . fmap f ) n ( j o i n . j o i n . fmap ((fmap g) . f ) ) n ( j o i n . j o i n , fmap (\x -> fmap g (f x ) ) ) n ( j o i n . fmap j o i n . fmap (\x -> fmap g ( f x ) ) ) n ( j o i n , fmap ( j o i n . (\x -> fmap g (f x ) ) ) ) n ( j o i n , fmap (\x -> j o i n (fmap g ( f x ) ) ) ) n (L4) (L2) (L5) (L2) 58 Chapter 4. Equivalence = = = ( j o i n , fmap (\x -> f £d&fcef g>^= i n terms of j o i n ) ( j o i n (fmap (\x -> f x » = g) n) n » = (\x -> f x » = g). Thus, monad.laws (L1)-(L7) imply monad laws ( M 1 ) - ( M 3 ) . For the converse, we assume laws ( M 1 ) - ( M 3 ) , along with the definition for j o i n in terms of » = . In addition, we assume the following definition for fmap, given a function f : : a -> b and monadic element n : : m a : fmap f n = n >>= return . f The proofs of the laws are then as follows: (LI): fmap i d n = = = = = = = = = = = = n >>= n >>= n return . id return (M2) fmap (f . g) n n >>= r e t u r n . f .g n » = (\x -> r e t u r n f (g x ) ) n >>= (\x -> r e t u r n . f g x) n » = (\x -> ( r e t u r n g x » = r e t u r n , f ) ) n >>= (\x -> ( r e t u r n . g x >>= r e t u r n . f ) ) (n » = r e t u r n . g) » = r e t u r n . f (fmap g n) >>= r e t u r n . f fmap (fmap g n) (fmap . fmap g) n (Ml) (M3) (L3): fmap f . r e t u r n n = = = fmap f r e t u r n n r e t u r n n >>= r e t u r n . f return . f n (Ml) 59 Chapter 4. Equivalence (L4): fmap . join n fmap f (join n) join n » = return . f (n » = id) » = return . f n » = (\x -> (id x » = return, f)) n » = (\x -> (x » = return, f)) n >>= (\x -> fmap f x) n » = (\x -> i d (fmap f x)) n » = (\x -> (return (fmap f x) » = id)) (n >>= return . fmap f) >>= i d join (n >>= return . fmap f) join (fmap (fmap f) n) join . fmap (fmap f) n = = = = = = = = = = = = = (join . fmap join) n join (fmap join n) join (n » = return . join) (n >>= return . join) >>= i d n » = (\x -> (return . join x » = id)) n >>= (\x -> (return (join x) » = id)) n » = (\x -> i d (join x)) n >>= (\x -> join x) n » = (\x -> (x » = id)) n » = (\x -> (id x » = id)) (n » = id) » = i d join (n » = id) join (join n) join.join n (M3) (Ml) (M3) (Ml) (L6): (join . fmap return) n join (fmap return n) join (n >>= return . return) (n » = return . return) (n >>= return . return) >>= i d 60 Chapter 4. Equivalence n n n n id (\x (\x (\x » » » » n -> -> -> ( r e t u r n . r e t u r n x >>= i d ) ) ( r e t u r n ( r e t u r n x) » = i d ) ) i d (return x)) (M3) (Ml) return (M2) (L7): (join return) n = = = j o i n ( r e t u r n n) return n » = id id n (Ml) Therefore, laws ( M 1 ) - ( M 3 ) imply laws (L1)-(L7) and the two sets of monad laws are equivalent. Previously, we assumed a consistency between the Haskell class names Functor and Monad w i t h part of the category-theoretic definition of a monad; namely, that every monad is a functor. Since we have shown the monad laws ( M 1 ) - ( M 3 ) to be equivalent to laws (L1)-(L7), with the latter set including the functor laws (LI) and (L2), we have also shown that our assumption is justified. In fact, we can use laws ( M 1 ) - ( M 3 ) to prove the definition of >>= in terms of j o i n as follows: n »= f = = = = = = n » = (\x -> i d (f x ) ) n » = (\x -> ( r e t u r n (f x) » = i d ) ) n » = (\x -> ( r e t u r n . f x » = i d ) ) (n » = r e t u r n . f ) » = i d j o i n (n >>= r e t u r n . f ) j o i n (fmap f n) Therefore—at last!—we can provide an alternate class declaration for Monad that looks very similar to the category-theoretic formulation of a monad, w i t h r e t u r n ~ rj and j o i n ~ fi: c l a s s F u n c t o r m => Monad m where r e t u r n :: a -> m a j o i n : : m (m a) -> m a. 4.3 4.3.1 Concluding Connections List in Translation We conclude by using our work w i t h equivalences to tie together our knowledge of monads in Haskell w i t h those i n category theory. In Section 61 Chapter 4. Equivalence 3.1, Example 4, we showed that ( M 1 ) - ( M 3 ) hold for the list type constructor. In examining the alternate law formulations i n Section 4.2.1, we saw that for the list monad L, fmap is given by map and j o i n by concat. Notably, the maps of the latter formulation seemed easier to understand and manipulate than the recursive definition of » = for lists. If we begin to translate to category-theoretic terms, we see that L is a functor which, given a Haskell type a, constructs the set of all lists of type a (denoted [a]). In other words, given a set X £ H a s k , L X is the set of all possible lists that can be formed using elements i n X, L X is the set of all lists of lists, and so on. Further, we recall that r e t u r n for the List monad is given by r e t u r n x = [ x ] , where x 6 X. 2 Thus, given a set X G L, we define rjx by sending x i - > [x] and px by the concatenation of lists i n LX. Note that the concatenation of lists does not entail 'discarding' repeated elements (as with, say, a union operation) or 'cancelling' elements i n accord with some given relations (as for, say, invertible elements). Comparing this to Example 3, Section 1.3.2, we see that the List monad is none other than the free-forgetful semigroup monad! 4.3.2 Distributive Laws In many applications of monads i n Haskell to structuring functional languages, the desired monad is most easily conceived as a combination of simpler monads. For example, we considered using monads to facilitate error handling (the Maybe monad) and to count the number of operations performed during a calculation (state updates). It is easy to see the desirability of being able to combine these monads so that if an error occurred during a calculation in which a counter was running, then the error would be handled properly. In general, it is not the case that when endofunctors comprising monads are composed, the result is another monad. However, there is a categorytheoretic concept that formalizes the cases when this does happen; the concept being that of the distributive laws for monads. A distributive law for monads is a means by which the functorial composition of two monads can be made into a more complex, combined monad. T h e idea is a similar to viewing the ordinary distributive law of multiplication over addition as a means by which abelian groups and monoids are combined into the more complex structure of rings. Let M i — ( M i , 771, p-i) and M 2 = (M ,772, ^2) be monads on a category C. A distributive law of M i over M 2 is a natural transformation A : M i o M 2 —> M 2 o M i for which the following diagrams commute [1]: 2 62 Chapter 4. M M x Mirj Equivalence M x 2 T]lM 2 2 M Mi MM X M 2 MM 2 2 X MM MMM 2 X 2 X M Mi 2 2 -^U 2 MM 2 X 112 Mi M i 112 MM MM X 2 2 2 ,„ M(M A X MMM X 2 2 M M\ X 2 M21H MM MM X 2 2 A distributive law A of M over M bined monad defined by A, given by x MM 2 X = where the composite maps r / Mi 2 X gives rise to the composite or com- (M M ,r) \H ), M2M 2 M 2 M l M2Ml x and u. M2Ml are defined by: M =-> M M i 2 2 2 Mi M Mi 2 77 1 M Ic > \ r D M M2M MM 2 X 2 X 2 M 2 A M l X > M 2 2 M 2 M Mj 2 p, 2M M 2 M MMMM M 2 ? ? M l 2 1 M Mi 2 2 M M\ 2 M Mi 2 The natural question for us to ask is: C a n we implement this theoretical definition to combine certain monads in Haskell? T h e answer is "Yes!" and we show how the distributive laws can be used to define a monad ML, where M is an arbitrary monad and L is the list monad. 63 Chapter 4. Equivalence We begin by using the same essential idea that we explored i n Section 4.2.1; namely, we draw analogies between the category-theoretic distributive laws and Haskell code. T h e difference here, is that while in Section 4.2.1 we had sets of laws from category theory and Haskell, and we justified an analogy between them, i n this case we use a set of category-theoretic laws to motivate, v i a analogy, a set of relations between Haskell functions. A t this stage, we feel confident enough to blend notations for convenience. Thus, we let A be a Haskell function with type signature A : : L (M a) -> M ( L a ) , as well as if : : a -> M a, /x : : M (M a) -> M a and m : : (a -> b) -> M a -> M b be the monad maps for M, and rf', / j and m denote the monad maps for L. T h e n the first analogous Haskell relation corresponds to the naturality of A : M M L L (DO) A . m L (m f ) M = m M (m f ) . A, L where f is an arbitrary function w i t h type signature f : : a -> b. B y letting M\ and M correspond to L and M, respectively, i n the diagrams above, we can infer an analogous set of Haskell distributive laws. For example, the equation corresponding to the second diagram given i n the definition of a distributive law is A o J]\M = M r)\. TO come up with a corresponding Haskell law, we: (i) let the fmap methods defined for the instances L and M—namely, m and m , respectively—replace the functor maps M\ and M , respectively; (ii) let the r e t u r n methods for L and M—namely, r? and rf , respectively—replace the monad units rji and n , respectively; (iii) account for Haskell's polymorphic type system (see Section 4.2.1 for a detailed introduction of the manifestation of polymorphism in this sort of context). The Haskell function m can be thought of as accepting two arguments: a function of type a -> b and a monadic element of type M a. Thus, we convert the expression M rj\ to the corresponding Haskell function m n , where the type signatures of m : : (a -> b) -> M a -> M b and rf' : : a -> L a imply that m rf' accepts an argument of type M a and has return type M ( L a ) . For the left-hand side of the distributive law equation under consideration, we convert A o r\\M to A . r? . In the category-theoretic case, we have A o r ] \ M as a map from M to M M\. In the Haskell version, the polymorphism of rf' means that we do not need to explicitly include the inner monad map M i n the law's formulation; rather, we just note that rf' can accept monadic-typed arguments. Moreover, the type signature of A : : L (M a) -> M (L a) guarantees that we must give n an argument of type M a, so that its return type w i l l be L (M a) (which corresponds to the input type of A). Thus, the Haskell version of this distributive law equation is A . n = m rf" . 2 2 L 2 M L 2 1 2 M M L 2 M M L 2 2 2 2 2 l L M 64 Chapter 4. Equivalence We can similarly translate the remaining distributive law equations. In summary, the category-theoretic distributive laws give rise to the following relations i n Haskell: (DI) (D2) (D3) (D4) - m L M r? = rf L // = M m n A A A A h m m M m A.A u . A.m M = L M M 1 M L It is important to note that to assume that these analogous Haskell laws hold, we must first assume that the Haskell function A exists. W h a t makes the case of L over an arbitrary M special is that such a distributive law function A does indeed exist. Showing this involves working w i t h a so-called Haskell " f o l d " function, and defining a sort of Cartesian product for lists. Interested readers are referred to [7] for details. B y continuing w i t h our stated assumptions, we propose that the following 3-tuple defines a combined monad defined by A. P r o p o s i t i o n 4.3.1. The triple defined by: unif fmap join = = = L f KL ML rf . n m (m f ) m u . // m A L H L H L H is a combined monad, ML, in Haskell. Proof. To prove the 3-tuple (ML, e t a , j o i n ) is a monad, we must show that the maps u n i t , f m a p , and j o i n satisfy (L1)-(L7), given that this same set of laws holds for each of M and L, and that (D0)-(D4) also hold. We have: ML ML ML ML M L (LI): fmap ML 1 id m m id ( m id ( id ) M L M (L2): fmap ML (f .g) m ( m f . g) m (m i . m g) M T m m f m m g fmap f . fmap g ML M L M L M = L L ML ML 65 Chapter 4. Equivalence (L3): fmap ( f . unit ML ) ML = > m ( f L r? )) 77" L n .f L unit ML = 1 Note that our particular interest i n combining the list monad w i t h an arbitrary monad is highlighted i n the remaining laws, where we assume the existence of A for L over M. (L4): = fmap . join m ( m f ) . m p . p m A m ( m f . fi ) . j . m A m ( m (m f . p ) ) . m M m ( m ( m f . ph ) . A ) A* m ( m ( p . m ( m f ))• M m ( m p . m ( m ( rn f M m ( m p ) . m (m (m M m A* - p . m ( m ( m ( r n f DO) m ML = M L M L M L L M L K M K M f = = = = M H M M = M L M M L M M M M L H h L H L L M M M = M M L L M L M M L M M L L L A A) ) ) . A) (m f )) ) ) . A) L A) M - /J . m ( A . m ( m ( m f ) ) ) M p" . m A . m ( m ( m ( m f ) ) ) jmo i n . m a p ( map f ) M L m M L M M L M ML M L L M L M L (L6): j o i n ML unit ML _ m m m M m m m id M = ( ) = = D1 M M m A . r/ p . p H n . A T? p . p" p, . id . A n M M L L L L p . m M ,„L ( M •V (id) L L 66 Chapter 4. Equivalence (L7): join ML ->ML map"" unitf ML m A . m ( m ( rf . r, M m ( A . m ( rf . n ) ) M m ( A . m 77'M m M M M L L M = u. m M = m — m (D2) • = = = = = = (D4) = • „ = = = = L M m id) L M M M H M M L L M M M H K M M M M A* • M . m . /i . m M • M . m . m M M M M M M M L M ( m ( m ( m ( m M M M ( m ( m ( m ( m M M M L A M A A • M ) • A ) L m m M L m m M M M M M L L h M L M M L M L u. . A . m A ) . A ) u . A ) . m ( m A ) . A L M L L M M L M • A ) A . m ( m A)) u. ) . m A . A . m ( m M L L L M ) M M U M K u ) ) . m ) . /i . m L M • // . m m ( m . m ( A* . m M M M M M M ) . /z • m ( A . . Ai ) • M • m ( A . ) )• m m ( m ( u. . m ( m ( • M ) . A . m ( m ( /U . m . (M M . M . M . M M A M A . /x ) ) • m L M . m K // . m M M M ). fj L fJ L M M M M M . , L A A M M M M m Ai . m M M L M A)) M M M . /L L L M L L L A . m / ( A . fi ) ) . m (A . A . u. ) • ( m /t . ( m . M • M m ( m ( m u . m . m ( m u, . m u. . m ( m ( . ,, ) ) M . H ) • M • M . m m ( / / . /v ) . //" . m m U L . LI ) . LI . m ( M m m 1- M I M M (D3) M m id Ml L m ( 77" . m ry ) m 77" . m ( m n ( m r? ) L LI ) ) L L M m M = = = = = T \x ur join . join m /v m m M - H . m . m ( m" M m ( m /i . M» m ( m i< M« . fi m ( m (DO) = = M L . L LI L L M M M M M M M M M M L L M L M L L m m M M M ( A m A ( m A ( m A m A . . A . m A . A . A . m M M L M L . A . . A . A . m ( m . m ( ( m L M L M m m" A ) ) m m A) ) ( m A )) A ) ) m A) ) A ) ) L L M M M M • m ( m A ) ) ( LI . m A ) ) ( A . m ( /J* . m A ) ) m ( // . m A ) ) L L M L M M K L M L A . m M L ( M M • m M A ) ) 67 Chapter 4. Equivalence u. . m ( m M H =° } L M L fj, ) . A . m ( /a . m L L M M A)) . m ( m u, • A . m ( m ) . m ( . m A)) /x" . m ( m fi • A • m ( m u. . u" . m A ) ) /a . m ( m n . A ) . m ( m ( m / i . /x . m A ) ) // . m ( m fi ) • m " A . m ( m ( j o i n ) ) m u, . / i m A . f m a p ( j o i n ) join . fmap ( join ) M M M L L M M L L M M M M M = /x . m ( m M L M L M M M M M L L M M l M M ML ML M L M M L M M L L L M M M L M L M L Thus, the 3-tuple (ML, e t a , j o i n ) is a monad. ML ML • 68 Bibliography M . B a r r and C. Wells, Toposes, Triples, and Theories. Springer-Verlag, 1985. M . B a r r and C. Wells, Category Theory for Computing Science. Prentice H a l l , 1990. J. Beck, Distributive laws. Lecture Notes in Mathematics, Springer, 1969. 80:119-140, H. Daume III, Yet Another Haskell Tutorial. H a l Daume, 2002-2006. R. Goldblatt, Topoi: T h e Categorical Analysis of Logic, Studies in Logic and the Foundations of Mathematics, Volume 98. North-Holland Publishing Company, 1979. P. Hudak, J . Peterson, and J . H. Fasel, A Gentle Introduction to Haskell 98. P a u l Hudak, John Peterson and Joseph Fasel, 1999. D. K i n g and P. Wadler, Combining Monads. In Glasgow Workshop on Functional Programming, A y r , July 1992. S. M a c Lane, Categories for the Working Mathematician. Verlag, 1971. Springer- G. Michaelson, An Introduction to Functional Programming Through Lambda Calculus. Addison-Wesley Publishers L t d . , 1989. E. Moggi, Computational lambda-calculus and monads. In Sympasium on Logic in Computer Science, Asilomar, California; I E E E , June 1989. E. Moggi, A n abstract view of programming languages. Course notes, University of Edinburgh, 1989. E. Moggi, Notions of computations and monads. Information and Computation, 93:55-92, 1989. 69 Bibliography [13] M . Pierce, Basic Category Theory for Computer Scientists. T h e M I T Press, 1991. [14] F. R a b h i and G. L&p&lme, Algorithms: A Functional Programming Approach. Pearson Education L i m i t e d , 1999. [15] J . C. Reynolds, Using category theory to design implicit' conversions and generic operators. In N. D. Jones, editor, Proceedings of the Aarhus Workshop on Semantics-Directed Compiler Generation, number 97 i n Lecture Notes i n Computer Science. Springer-Verlag, January 1980. [16] P. Wadler, Comprehending Monads. In Conference on Lisp and Functional Programming, Nice, France; A C M , June 1990. [17] P: Wadler, Monads for functional programming. In M. Broy, editor, Marktoberdorf Summer School on Program Design Calculi, Springer Verlag, N A T O A S I Series F: Computer and systems sciences, Volume 118, August 1992. [18] P. Wadler, The essence of functional programing (invited talk). In 19'th Symposium on Principles of Programming Languages, Albuquerque, New Mexico; A C M , January 1992. 70
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Notions of category theory in functional programming
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Notions of category theory in functional programming Gammon, Shauna C. A. 2007
pdf
Page Metadata
Item Metadata
Title | Notions of category theory in functional programming |
Creator |
Gammon, Shauna C. A. |
Publisher | University of British Columbia |
Date Issued | 2007 |
Description | We present a detailed examination of applications of category theory to functional programming languages, with a primary focus on monads in Haskell. First, we explore E. Moggi's work in categorical semantics, which provides the theoretical foundation for employing monads in functional languages. In particular, we examine his use of Kleisli triples to model notions of computation. We then study P. Wadler's implementation of Moggi's ideas as a means to mimic side-effects in the purely functional language Haskell. We explicitly demonstrate the connections between Kleisli triples, categorytheoretic monads, and Haskell monads. It is our principal aim to provide a coherent translation between the abstracted concept of monads that exists in category theory, and the formulation of monads as type-constructors that is implemented in Haskell. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2011-02-24 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0080357 |
URI | http://hdl.handle.net/2429/31756 |
Degree |
Master of Science - MSc |
Program |
Mathematics |
Affiliation |
Science, Faculty of Mathematics, Department of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- 831-ubc_2007-0409.pdf [ 2.96MB ]
- Metadata
- JSON: 831-1.0080357.json
- JSON-LD: 831-1.0080357-ld.json
- RDF/XML (Pretty): 831-1.0080357-rdf.xml
- RDF/JSON: 831-1.0080357-rdf.json
- Turtle: 831-1.0080357-turtle.txt
- N-Triples: 831-1.0080357-rdf-ntriples.txt
- Original Record: 831-1.0080357-source.json
- Full Text
- 831-1.0080357-fulltext.txt
- Citation
- 831-1.0080357.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0080357/manifest