Haskell's Monad is a very, very special case of the mathematical monad. The problem with Stephen Diehl's approach is that it treats the topic in full generality, which makes it much harder to see that the special case is actually much simpler.
As a counterpoint, I'm fluent in mathematics but know virtually no Haskell. Your write-up presumes the exact opposite and is therefore totally unsuitable for someone like me. :)
I mean that quite literally: I can make neither heads nor tails of your write-up. It looks like a bunch of text with funny characters to me. This write-up, OTOH, is written in my language using idioms I understand, and relates it to Haskell without involving too much of the machinery of the language.
Personally, I found the Haskell in this article far harder than the math, which was more or less trivial to anyone who knows some category theory. For instance, the following
newtype (FComp g f) x = FComp { unCompose :: g (f x) }
instance (Functor b c f, Functor a b g) => Functor a c (FComp g f) where
  fmap f (FComp xs) = FComp $ fmap (fmap f) xs
just does not parse easily at all to my mind. In fact, I don't see how someone who doesn't know Haskell could truly understand the above. unCompose is never even used.
And for me, as someone experienced with programming languages but only casually poking at the math, it didn't take that long to learn Haskell's syntax. But syntax that gratuitously uses η and μ (which look vaguely like mirror images of each other) is confusing, and I expect I would have a hard time remembering which is which (of course, I would probably learn to read Greek letters instinctively after spending some time reading them). And after at least 10 minutes, I cannot figure out what "Tμ" in "μ∘Tμ" means, syntactically, while no matter how confusing the Haskell code is, it can at least be parsed unambiguously.
It's really not the syntax that's the issue. It's the fact that the author of the article is embedding the category-theoretic concepts into the language at the type level, so wrapper types like Id and FComp have to be introduced as "hacks" to represent functors.
My main takeaway from this article is that at the end of the day Haskell is a programming language and not the most natural setting for abstract mathematics, even if there are many connections between the two areas.
On the other hand, embedding mathematics in the types of a language corresponds to stating theorems (and writing values of those types corresponds to proving them), so it's not such an unnatural setting in at least one way.
Funnily, η and μ are chosen precisely because they are mirror images of one another—their functionalities are in some sense mirror images of one another as well. η approaches a single-layer of T from the bottom and μ approaches it from the top!
(Edit: extra funnily, I actually flipped these two in my first writeup. This one is corrected, but I wanted to leave the note!)
The real trouble with parsing mathematics is that the syntax is overloaded at every opportunity. This is nice once you're in the same "mathematical mindset" as the author, because it's a way for the author to focus your attention purely on the most important concepts being discussed (and everyone expects that there will be a laborious translation step once you start trying to prove things with more rigor). That said, until you are, it's a bear and a half.
For instance, T is a functor which means it acts on both objects and maps in the source category. So if T : A -> B and x : Ob(A) then Tx is an object in Ob(B) and T was being used here as an object map. But, if y is also an object in Ob(A) (also written as y : Ob(A)) and there's an arrow in A called f : x -> y (which you might also write as f : A(x,y) --- serious TMTOWTDI) then Tf is an arrow in B like Tf : Tx -> Ty.
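In Haskell terms, a quick sketch (using the everyday one-parameter Functor class rather than the article's three-parameter one): the type constructor itself is the object map, and fmap is the arrow map.

    -- The type constructor [] is the object map: it sends the object
    -- Int to the object [Int].
    objectMapped :: [Int]
    objectMapped = [1, 2, 3]

    -- fmap is the arrow map: it sends f : Int -> String to
    -- fmap f : [Int] -> [String], i.e. T f : T x -> T y with T = [].
    arrowMapped :: [String]
    arrowMapped = fmap show objectMapped   -- ["1","2","3"]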
So, to parse Tμ: if T is an endofunctor T : X -> X, then for any object A : Ob(X), μ : TTA -> TA is an arrow in X(TTA, TA), thus Tμ is an arrow in X(TTTA, TTA).
To parse μ.Tμ, note that . is function composition, and we just saw that Tμ is an arrow in X(TTTA, TTA). If we follow that with an application of μ then we have μ.Tμ as an arrow in X(TTTA, TA).
(Edit: also worth noting, for more feeling of parallelism between μ and η: if you, like I did, replace μ with η in the above you get Tη as an arrow in X(TA, TTA) and η.Tη as an arrow in X(TA, TTTA).)
In more Haskelly notation, Tμ is (fmap join :: (Monad t, Functor t) => t (t (t a)) -> t (t a)) and μ.Tμ is (join . fmap join :: (Monad t, Functor t) => t (t (t a)) -> t a), where I specialized all the types to be operating on the same endofunctor/monad.
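To make that concrete, here's a small check with the list monad, where join is just concat:

    import Control.Monad (join)

    threeLayers :: [[[Int]]]
    threeLayers = [[[1,2],[3]],[[4]]]

    -- Tμ collapses the two inner layers:
    twoLayers :: [[Int]]
    twoLayers = fmap join threeLayers        -- [[1,2,3],[4]]

    -- μ∘Tμ collapses all three layers down to one:
    oneLayer :: [Int]
    oneLayer = join (fmap join threeLayers)  -- [1,2,3,4]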
This is precisely what made Haskell's monads difficult for me to understand. I kept trying to fit the square block in the round hole.
I finally had to divorce the terms in my head, much like watching the film version of a book that I have read. They are..."related" but frustration will result from assuming they will be the same.
Nice! Yeah. I had two avenues of understanding available to me when it came to monads. The first was my knowledge as a mathematically-inclined software engineer. The second was my knowledge as someone fluent in high-level mathematics.
I'll say, here was one key to my understanding once I saw it. In a purely functional language you want everything to be referentially transparent. That is, an expression should be able to be substituted for its value at any given moment in time. At the very least, you'll want the ability to denote which parts of your program are referentially transparent and which aren't.
This can be expressed as a utopian vision: in a purely functional language we'd like everything -- everything -- to be a value.
We now need some way of translating code with side effects into code that returns a value. The logical abstraction is to say that the "value" is not the work itself -- once the work is done it's out of the bag, after all -- but rather the "value" is the machine which can perform that work.
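A minimal sketch of that in Haskell (greet is just an illustrative name): an IO value is the machine, and nothing happens until the runtime executes it.

    -- greet is an ordinary value of type IO (); constructing it
    -- performs no work.
    greet :: IO ()
    greet = putStrLn "hello"

    -- Because it's a value, we can pass it around and combine it;
    -- the work only happens when main is run.
    main :: IO ()
    main = sequence_ [greet, greet]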
To a mathematician this screams "functor." If you want to move between the two spaces repeatedly this screams "adjoint functors." There's some work you want to do in Space A that's more easily expressed in Space B, so you take a thing in A, transform it to a thing in B, do the work in B, and then transform back to A.
Think in, e.g., Ruby:
sentence.split(' ').map(&:reverse).join(' ')
split(' ') and join(' ') aren't quite inverses of each other, but they are adjoint. This pair takes us from the land of space-separated strings to the land of arrays and back to the land of space-separated strings.
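For comparison, a sketch of the same round trip in Haskell, with words and unwords standing in for split(' ') and join(' '):

    -- words :: String -> [String] and unwords :: [String] -> String
    -- take us to the land of lists and back.
    reverseEachWord :: String -> String
    reverseEachWord = unwords . map reverse . words
    -- reverseEachWord "hello world" == "olleh dlrow"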
You see that all the time in programming and mathematics. I sometimes call it the "Super Mario Brothers" maneuver when I'm trying to teach it to students. Mario wants to get to the end of the level, so he hits a block, goes up the vine to cloud land, runs through cloud land, and then comes back down.
Ka-bam! This sort of intuition is exactly what actually drove home monads to me after I spent far too long "knowing" them in a "I can code with these" kind of sense. Category theory might be "impractical" for coding, but the intuitions it helps you develop are wonderful.
He's playing fast and loose with the formalisms to try to convey the intuition to people without a strong algebraic background. Ruby isn't very well behaved from a categorical perspective (heck, even Haskell doesn't get all of the way there if you want to be really formal) so it's not perfect, but I think the reasoning is morally correct and insightful.
I wouldn't say they're the tightest functors ever, but you could see them as endofunctors between the subcategories String and [String]. Some small amount of intuition would translate, although by and large this setting isn't polymorphic enough to have good behavior. After all, there are a lot of functions on strings that would not have a functorial relationship over split(' ').
Yes, this is more correct, and I was also imagining the category being something like "space-delimited strings." I realize that'd require redefining things like string concatenation to be "space-delimited concatenation" and other such bits.
Anyhow, I was playing loose and I know it. If I were writing in a more serious context I'd cross all the t's and dot all the i's. To me the more important thing was this slow gnawing in my gut, seeing hints of natural transformations and adjoint functors here and there and thinking, "Oh come on, there has to be a way to connect these dots."
Non-functional languages like Ruby don't think in this way, so there are holes all over the language.
Maybe you mean functors instead of endofunctors (an endofunctor goes from a category to itself)? But that's a minor quibble; the main problem is that I don't understand which categories you mean. What are the objects and morphisms? My best guess (and I have to guess because you didn't say) is that you mean to treat String and [String] as monoids and thus as one-object categories. That is, String is the category with a single object, whose morphisms from that object to itself are all the possible strings, with composition being concatenation; and similarly for [String] with list concatenation. If that's what you meant, several things are wrong:
1. split(" ") is not a functor because (x+y).split(" ") is not equal to x.split(" ") + y.split(" "), for example:
"a BC".split(" ") == ["a", "BC"], but
"a B".split(" ") == ["a B"] and "C".split(" ") == ["C"].
(At least join(" ") is a functor, which here just means monoid homomorphism.) I guess this problem could be fixed by replacing String with StringsThatAreEitherEmptyOrEndInSpace (you need the empty string to be the identity for composition).
2. They are not adjoint, even after fixing the monoids to make them homomorphisms! It's easy to see that if f and g are adjoint monoid homomorphisms, with unit η and counit ε, then the functions m ⟼ ε f(m) and n ⟼ g(n) η are inverses of each other. This does not happen here since, for example, join(' ') is not injective.
Can you fix the example, maybe by regarding them as categories in some other way? My gut feeling is that this won't work, but I haven't thought about it very much.
If, as I suspect, this adjointness is wrong and not easily fixable, does it still have value as an explanation? I think that's a personality question, I find it more confusing than enlightening, but can imagine some people would find it helpful.
I'm not trying to defend the example in detail; I agree that they don't have an obvious, rigorous categorical connection. I broadly think that with so little polymorphism you're not going to get a terrific categorical reading. I agree with the OP that they still "feel" adjoint and may have a better reading with a little generalization.
But yeah, it's not the greatest example. I don't mean to defend it as something rigorous.
Oh, and also, the moment this became clear to me was when I read the original monad paper. How frustrating -- all this ink spilled on explaining monads and I just needed to read the original paper to see the real motivation.
Understanding the very generalized algebraic monad helps you to understand the nature of encapsulating effects and adding complexity to categories through layering. It's absolutely not beginner material—but it's valuable material.
I don't like your take because it refuses to take that deeper dive into the mathematical monad, despite alluding to it. Knowing the relationship between `Monad` and `Category` the type classes is almost never useful in programming Haskell. Knowing the relationship between `Monad`, monads, and their categorical underpinning is a different matter.
> Understanding the very generalized algebraic monad helps you to understand the nature of encapsulating effects and adding complexity to categories through layering.
I don't see how the general notion of category theoretical monad helps with this. In fact it's debatable whether the notion of monad is really the right treatment for universal algebra at all. Lawvere theories are much simpler and capture almost all of what you need. See: https://www.dpmms.cam.ac.uk/~martin/Research/Publications/20...
> Knowing the relationship between `Monad` and `Category` the type classes is almost never useful in programming Haskell.
I disagree. I use (<=<) all the time. It's basically (.) with the Kleisli wrapping and unwrapping done for you.
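For instance, a small sketch with two made-up helpers:

    import Control.Monad ((<=<))

    safeHead :: [a] -> Maybe a
    safeHead (x:_) = Just x
    safeHead []    = Nothing

    safeRecip :: Double -> Maybe Double
    safeRecip 0 = Nothing
    safeRecip x = Just (1 / x)

    -- (<=<) composes a -> m b arrows the way (.) composes a -> b:
    recipOfHead :: [Double] -> Maybe Double
    recipOfHead = safeRecip <=< safeHead
    -- recipOfHead [4, 2] == Just 0.25; recipOfHead [] == Nothing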
I'm not arguing that (>=>) isn't great or that Monads are better than Lawvere Theories, nor that what you presented isn't interesting and useful. Instead, I'm responding from the point of view of someone who read a lot of terse accounts of what typeclass Monads were back when I was first learning them.
The category law account for the Monad laws is useful once you've already learned to think categorically, at which point a full definition is great and provides indications of the next places to examine. Defining that a "Monad is really a Category k with an identification between k a b and a -> k () b" seems most useful to people who could already write "Monads Made Difficult", but not those who are trying to struggle to see how all these connections play out.
Monads Made Difficult is difficult, but lays the groundwork for a lot of the categorical machinery that forms the environment where Monads were born. That's important.
Are you saying that "Monads Made Difficult" is useful to "those who are trying to struggle to see how all these connections play out"?
My position is that it's definitely not. I don't think it's even really useful to experts, because, wait for it ... Haskell Monads really have very little to do with category theoretical monads! If all you want to do is to know how Haskell Monads work then I think that level of generality is unhelpful. If, on the other hand, you want to learn category theory then this is a fine approach :)
(NB I'm not claiming my article is for beginners either)
In summary, I think the article is a great way to learn about category theory by building on your knowledge of Haskell, but it is more complicated than necessary to help with understanding Haskell Monads.
I can agree with that. I suppose I believe, from my own personal experience, that a categorical approach to monads, even if they are distinct from Haskell `Monad`s, is a very good way to learn more advanced concepts in understanding Haskell Monads.
I don't think either article is a great place for a beginner, but instead am coming from a point of view toward learning "advanced" functional programming and getting a flavor for categorical/squiggol style thinking. To me, it was greatly helpful to just see the pure math approach and then realize Haskell Monads as a simple example, and less useful to see categorical concepts played out in the simplified Haskell domain.
Categories just don't make nearly as much sense as objects when all you see them as is a convenient way to talk about composition rules. They're dramatically powerful when you connect them to preorders, databases, graphs, or homology. My biggest moment of categorical bliss came from seeing the Paths monad on graphs, even though that is far removed from the programming domain.
This is explicitly not a writeup for beginners (ergo the title); it's a side-by-side presentation of the /very/ general categorical definition and how you can model it in Haskell. The three parameter form of morphism composition (i.e. c y z -> c x y -> c x z) would be quite unwieldy in day-to-day programming, but it does illustrate the underlying definitions better to have categories explicit.
I don't mean to nitpick, but the three parameter form is exactly what you use in day-to-day Haskell programming. You simply let c be the infix (->), and you get (y -> z) -> (x -> y) -> (x -> z). We just tend to think of it as a two parameter form because a lot of the time, you can think of (->) as syntax, as opposed to the type that it is.
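A sketch against base's Control.Category, just to see both readings side by side:

    import Prelude hiding ((.), id)
    import Control.Category

    -- The three-parameter form, for any Category c:
    compose :: Category c => c y z -> c x y -> c x z
    compose = (.)

    -- Instantiating c to (->) recovers everyday function composition:
    showSucc :: Int -> String
    showSucc = show `compose` (+ 1)   -- showSucc 2 == "3"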
It's great for two classes of people: those who understand monads from a Haskell/software engineering perspective and want to learn the mathematical perspective, and those like me who understand them from the mathematical perspective and want to learn them from the Haskell/software engineering perspective.
That's pretty much my point: I don't think mathematical monads are a good background for Haskell Monads. It's a bit like saying the ZF axioms are the mathematical background for Data.Set.
That's not a problem with Stephen's approach, that's simply what he was writing about. I found it to be a great concise presentation. Your post and his post have different goals, that does not make either one better than the other.
This is a post about categorical monads, I assume this is why it doesn't use the standard symbols that the Haskell monad uses (>>=, return) and instead refers to the natural transformations (η, μ) as mathematicians do.
To all those who are replying to me that I have misinterpreted this as beginner material, well the first sentence does say that it's an "introduction to Haskell monads".
Yes. Thank you. Yes yay yes. I studied mathematics. I've been programming for 15 years, 10-or-so professionally. I've always wanted a "real math" explanation of monads.
They now make sense to me. Seeing why mu needs to go from T^2 -> T and the role it plays concretely WRT the list and IO examples makes it really clear to me.
God bless you for not using some crazy analogy like bacon or whatever.
There was a UReddit course several months back, those videos might still be up. Where to go really depends on what you want to learn. If you just want to get a general feel for it, Wikipedia's probably the best place. If you want to actually learn the subject, it's going to take some studying. If you don't have any abstract algebra you'll almost certainly benefit from learning some first. I found Dummit & Foote and Artin to both be useful texts for abstract algebra, and Awodey to be useful for category theory.
Try the free online textbook Category Theory for Computing Science by Michael Barr and Charles Wells [1]. It's a true textbook with exercises and lots of content.
If you have time, Steve Awodey's book "Category Theory" is both beginner-friendly and helpful. You can get a PDF of an early pre-print online. He covers monads towards the end. If you are an experienced Haskell programmer, I've seen a few "Category Theory for Haskell programmers" tutorials around. That all said, I don't think that knowledge of Category Theory is a prerequisite to effective programming in Haskell (or any other language that features monads).
It first clicked for me when reading Barr and Wells' Category Theory for Computing Science [0], but I don't know about your mathematical background. Category theory is algebra, so it's probably advisable to study basic group theory before tackling category theory. (I have a hard time seeing how functors make sense until you've understood the general concept of a homomorphism, which is perhaps easiest to do in the context of groups).
Hrm, not sure about that. I know of no college CS course that requires category theory, let alone having the expectation that a decent CS degree will give you exposure to it.
As an example, the University of Chicago, my alma mater, doesn't expose CS students to Category Theory, and of all places it's where you'd expect it the most: it has a notoriously "theoretical" computer science department tied closely to the mathematics department, and it's where Category Theory was invented.
Even in a math degree, one wouldn't typically touch on Category Theory in a classroom setting until one studied Algebraic Topology. Before that students might come across it as a neat sideshow, but nothing they'd be interested in using to solve an actual mathematical problem. There are plenty of students who get a BS in mathematics without ever making a serious go at it.
It's also not really that useful as a first-order field of mathematics. It's mostly useful as a way of organizing other mathematical things and as a kind of general vernacular. Even mathematicians sometimes lovingly call it "abstract nonsense".
I see CT as a kind of mathematical interstate system. If you want to go back and forth between Chicago and LA it's fantastic, but most of your work is being done in Chicago or LA itself. The interstate system itself isn't that interesting most of the time.
Yes, I agree: you can easily do a bachelor's degree in either mathematics or computer science without ever being taught category theory (though the chances are you'd hear it mentioned a few times).
More specific evidence to go with jfarmer's: I read mathematics at the University of Cambridge -- like UChicago, an absolutely first-rate institution and not one that gives its students an easy ride -- and there category theory is not part of the undergraduate curriculum. It is one of the courses you can take in "Part III", which is a one-year taught master's degree[1].
[1] Until very recently, the best way to describe it was "a one-year taught master's degree that inexplicably isn't actually a master's degree". But they've fixed that now.
I thought Hask was supposed to be the category of Haskell types with functions between them. My Haskell is pretty rusty... how exactly does (->) resolve to this meaning?
Also, on a more theoretical level, since Haskell types themselves belong to the category Hask, by definition, to what extent can Haskell truly model category theory when the objects of a category are evidently being represented here as Haskell types?
1. Hask has Haskell types as objects, and functions between them as morphisms. "a" is a type, "b" is a type, and "a -> b" is an arrow between them.
2. Haskell cannot model (the entirety of) category theory. Everything in Haskell is in Hask, and the only way to get to an entirely different category (say Set) is on paper. This great talk gives some insight into the issue: https://vimeo.com/67174266
1. Hask has exponentials, so (a -> b) is the exponential object: final in the category of objects c equipped with arrows from a * c to b.
2. It's incorrect that the only category in Haskell is Hask, after all it's easy enough to formulate subcategories of Hask and do your algebra there. I don't know how to answer to what degree you can model category theory, but you can do a lot.
Regarding 2, you're right of course about subcategories of Hask, but I'm more interested in completely different categories.
My question really boils down to this: given the Category class provided in this article, which mathematical categories occur as instances of this Category? Are they necessarily all subcategories of Hask?
I think the answer to that question is better understood in the dependently-typed literature. Haskell does not have easily accessible dependent types, but many of the techniques could be translated and would help to answer your question.
Offhandedly, I'd say you could model many categories of interest, but since you're working in a "challenging" fragment of constructive logic there are a lot of limitations on what can be modeled. At the end of the day, your objects will always be Haskell types, but a lot can be encoded in those types.
Again, though, the DT literature is really where people are attacking this question. Haskell's Category is usually just used when there's a meaningful subcategory of Hask that you want to overload composition and identity on.
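For instance, a minimal sketch of that usual use, with a hypothetical Mono wrapper for functions we promise (by convention, not via the type system) are monotone:

    import Prelude hiding ((.), id)
    import Control.Category

    -- Same objects as Hask, but the arrows are only those functions
    -- we promise are monotone.
    newtype Mono a b = Mono { runMono :: a -> b }

    -- Overloading identity and composition via the Category class:
    instance Category Mono where
      id = Mono (\x -> x)
      Mono g . Mono f = Mono (g . f)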
I would like to give a word of encouragement to anyone who, like me, thought this insight into Monads was all Greek. Understanding any of the material in this tutorial is not a prerequisite to writing practical computer programs in Haskell.
You don't have to completely understand Monads in order to use them.
Just as you learn to add two numbers together in preschool and only later learn about the axioms of associativity and commutativity, you should start by learning IO first and think about the more general ideas later.
Nevertheless, this was quite an interesting article. Maybe I'll try to put some thought into it and see if I can learn something that would help me level up my Haskell skills.
I prefer http://web.jaguarpaw.co.uk/~tom/blog/posts/2012-09-02-what-i..., but that's probably because I wrote it.