Haskell's Monad is a very, very special case of the mathematical monad. The problem with Stephen Diehl's approach is that it treats the topic in full generality, making it much harder to see that the special case is actually much simpler.
As a counterpoint, I'm fluent in mathematics but know virtually no Haskell. Your write-up presumes the exact opposite and is therefore totally unsuitable for someone like me. :)
I mean that quite literally: I can make neither heads nor tails of your write-up. It looks like a bunch of text with funny characters to me. This write-up, OTOH, is written in my language using idioms I understand, and relates it to Haskell without involving too much of the machinery of the language.
Personally, I found the Haskell in this article far harder than the math, which was more or less trivial to anyone who knows some category theory. For instance, the following
    newtype FComp g f x = FComp { unCompose :: g (f x) }

    instance (Functor b c f, Functor a b g) => Functor a c (FComp g f) where
      fmap f (FComp xs) = FComp $ fmap (fmap f) xs
just does not parse easily at all to my mind. In fact, I don't see how someone who doesn't know Haskell could truly understand the above. unCompose isn't even ever used.
And for me, as someone experienced with programming languages but only casually poking at the math, it didn't take that long to learn Haskell's syntax. But syntax that gratuitously uses η and μ (which look vaguely like mirror images of each other) is confusing, and I expect I would have a hard time remembering which is which (of course, I would probably learn to read Greek letters instinctively after spending some time reading them). And after at least 10 minutes, I still cannot figure out what "Tμ" in "μ∘Tμ" means, syntactically, while no matter how confusing the Haskell code is, it can at least be parsed unambiguously.
It's really not the syntax that's the issue. It's the fact that the author of the article is embedding the category-theoretic concepts into the language at the type level, so wrapper types like Id and FComp have to be introduced as "hacks" to represent functors.
My main takeaway from this article is that at the end of the day Haskell is a programming language and not the most natural setting for abstract mathematics, even if there are many connections between the two areas.
On the other hand, embedding mathematics in the types of a language corresponds to stating theorems (and writing values of those types corresponds to proving them), so it's not such an unnatural setting in at least one way.
Funnily, η and μ are chosen precisely because they are mirror images of one another—their functionalities are in some sense mirror images of one another as well. η approaches a single-layer of T from the bottom and μ approaches it from the top!
(Edit: extra funnily, I actually flipped these two in my first writeup. This one is corrected, but I wanted to leave the note!)
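In Haskell terms, as a sketch (using the usual identification of T with a Monad instance): η is return and μ is join.

    import Control.Monad (join)

    -- η builds up to a single layer of T from below: a -> t a
    eta :: Monad t => a -> t a
    eta = return

    -- μ collapses down to a single layer of T from above: t (t a) -> t a
    mu :: Monad t => t (t a) -> t a
    mu = join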
The real trouble with parsing mathematics is that the syntax is overloaded at every opportunity. This is nice once you're in the same "mathematical mindset" as the author, because it lets the author focus your attention purely on the most important concepts being discussed (and everyone expects that there will be a laborious translation step once you start trying to prove things with more rigor). That said, until you're in that mindset, it's a bear and a half.
For instance, T is a functor, which means it acts on both objects and maps in the source category. So if T : A -> B and x : Ob(A), then Tx is an object in Ob(B); here T is being used as an object map. But if y is also an object in Ob(A) (also written y : Ob(A)) and there's an arrow in A called f : x -> y (which you might also write as f : A(x,y) --- serious TMTOWTDI), then Tf is an arrow in B, namely Tf : Tx -> Ty.
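In Haskell terms, as a sketch (taking T = Maybe just as an example): the object map is the type constructor itself, and the arrow map is fmap.

    -- Object map: the type constructor Maybe sends Int to Maybe Int.
    -- Arrow map: fmap sends f : x -> y to Tf : Tx -> Ty.
    f :: Int -> String
    f = show

    tf :: Maybe Int -> Maybe String   -- Tf : T Int -> T String
    tf = fmap f

    main :: IO ()
    main = print (tf (Just 3))  -- Just "3"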
So, to parse Tμ: if T is an endofunctor T : X -> X, then for any object A : Ob(X), "μ : TTA -> TA" is an arrow in X(TTA, TA), and thus Tμ is an arrow in X(TTTA, TTA).
To parse μ.Tμ, note that . is composition of arrows. We just saw that Tμ is an arrow in X(TTTA, TTA); if we follow it with an application of μ, then μ.Tμ is an arrow in X(TTTA, TA).
(Edit: also worth noting, for more of a feeling of the parallelism between μ and η: if you, like I did, replace μ with η in the above, you get Tη as an arrow in X(TA, TTA) and η.Tη as an arrow in X(TA, TTTA).)
In more Haskelly notation, Tμ is (fmap join :: (Monad t, Functor t) => t (t (t a)) -> t (t a)) and μ.Tμ is (join . fmap join :: (Monad t, Functor t) => t (t (t a)) -> t a), where I specialized all the types to be operating on the same endofunctor/monad.
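And a runnable version of that, specialized to the list monad (a sketch with made-up names):

    import Control.Monad (join)

    -- Tμ at the list monad: apply join inside one layer of T.
    tmu :: [[[a]]] -> [[a]]
    tmu = fmap join

    -- μ.Tμ: collapse three layers down to one.
    muTmu :: [[[a]]] -> [a]
    muTmu = join . tmu

    main :: IO ()
    main = do
      let x = [[[1,2],[3]],[[4]]] :: [[[Int]]]
      print (muTmu x)                     -- [1,2,3,4]
      print (muTmu x == (join . join) x)  -- True: the associativity law μ.Tμ = μ.μT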
This is precisely what made Haskell's monads difficult for me to understand. I kept trying to fit the square block in the round hole.
I finally had to divorce the terms in my head, much like watching the film version of a book that I have read. They are..."related" but frustration will result from assuming they will be the same.
Nice! Yeah. I had two avenues of understanding available to me when it came to monads. The first was my knowledge as a mathematically-inclined software engineer. The second was my knowledge as someone fluent in high-level mathematics.
I'll say, here was one key to my understanding once I saw it. In a purely functional language you want everything to be referentially transparent; that is, an expression can be substituted for its value at any given moment. At the very least, you'll want the ability to denote which parts of your program are referentially transparent and which aren't.
This can be expressed as a utopian vision: in a purely functional language we'd like everything -- everything -- to be a value.
We now need some way of translating code with side effects into code that returns a value. The logical abstraction is to say that the "value" is not the work itself -- once the work is done it's out of the bag, after all -- but rather the "value" is the machine which can perform that work.
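Here's a minimal Haskell sketch of that idea (the names are mine, purely illustrative): the IO value is the machine, and running it is what performs the work.

    -- greet is a value that describes work; defining it prints nothing.
    greet :: IO ()
    greet = putStrLn "hello"

    main :: IO ()
    main = do
      let machine = greet -- passing the value around is referentially transparent
      machine             -- running the machine performs the work
      machine             -- running it again performs the work again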
To a mathematician this screams "functor." If you want to move between the two spaces repeatedly this screams "adjoint functors." There's some work you want to do in Space A that's more easily expressed in Space B, so you take a thing in A, transform it to a thing in B, do the work in B, and then transform back to A.
Think in, e.g., Ruby:
sentence.split(' ').map(&:reverse).join(' ')
split(' ') and join(' ') aren't quite inverses of each other, but they are adjoint. This pair takes us from the land of space-separated strings to the land of arrays and back to the land of space-separated strings.
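For comparison, the same round trip sketched in Haskell, where words and unwords play the roles of split(' ') and join(' '):

    -- Land of strings -> land of word lists -> back again.
    reverseEachWord :: String -> String
    reverseEachWord = unwords . map reverse . words

    main :: IO ()
    main = putStrLn (reverseEachWord "hello world")  -- "olleh dlrow"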
You see that all the time in programming and mathematics. I sometimes call it the "Super Mario Brothers" maneuver when I'm trying to teach it to students. Mario wants to get to the end of the level, so he hits a block, goes up the vine to cloud land, runs through cloud land, and then comes back down.
Ka-bam! This sort of intuition is exactly what actually drove home monads to me after I spent far too long "knowing" them in a "I can code with these" kind of sense. Category theory might be "impractical" for coding, but the intuitions it helps you develop are wonderful.
He's playing fast and loose with the formalisms to try to convey the intuition to people without a strong algebraic background. Ruby isn't very well behaved from a categorical perspective (heck, even Haskell doesn't get all of the way there if you want to be really formal) so it's not perfect, but I think the reasoning is morally correct and insightful.
I wouldn't say they're the tightest functors ever, but you could see them as endofunctors between the subcategories String and [String]. Some small amount of intuition would translate, although by and large this setting isn't polymorphic enough to have good behavior. After all, there are a lot of functions on strings that would not have a functorial relationship over split(' ').
Yes, this is more correct, and I was also imagining the category being something like "space-delimited strings." I realize that'd require redefining things like string concatenation to be "space-delimited concatenation" and other such bits.
Anyhow, I was playing loose and I know it. If I were writing in a more serious context I'd cross all the t's and dot all the i's. To me the more important thing was this slow gnawing in my gut, seeing hints of natural transformations and adjoint functors here and there and thinking, "Oh come on, there has to be a way to connect these dots."
Non-functional languages like Ruby don't think in this way, so there are holes all over the language.
Maybe you mean functors instead of endofunctors (an endofunctor goes from a category to itself)? But that's a minor quibble; the main problem is that I don't understand which categories you mean. What are the objects and morphisms? My best guess (and I have to guess because you didn't say) is that you mean to treat String and [String] as monoids and thus as one-object categories. That is, String is the category with a single object whose morphisms from that object to itself are all the possible strings, with composition given by concatenation; and similarly for [String] with list concatenation. If that's what you meant, several things are wrong:
1. split(" ") is not a functor because (x+y).split(" ") is not equal to x.split(" ") + y.split(" "), for example:
"a BC".split(" ") == ["a", "BC"], but
"a B".split(" ") == ["a B"] and "C".split(" ") == ["C"].
(At least join(" ") is a functor, which here just means monoid homomorphism.) I guess this problem could be fixed by replacing String with StringsThatAreEitherEmptyOrEndInSpace (you need the empty string to be the identity for composition).
2. They are not adjoint, even after fixing the monoids to make them homomorphisms! It's easy to see that if f and g are adjoint monoid homomorphisms, with unit η and counit ε, then the functions m ⟼ ε f(m) and n ⟼ g(n) η are inverses of each other. This does not happen here since, for example, join(' ') is not injective.
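And here's the mechanical check of point 1 promised above, a sketch using Haskell's words as a stand-in for Ruby's split(" "):

    main :: IO ()
    main = do
      print (words ("a B" ++ "C"))      -- ["a","BC"]
      print (words "a B" ++ words "C")  -- ["a","B","C"], so split is not a homomorphism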
Can you fix the example, maybe by regarding them as categories in some other way? My gut feeling is that this won't work, but I haven't thought about it very much.
If, as I suspect, this adjointness is wrong and not easily fixable, does it still have value as an explanation? I think that's a personality question: I find it more confusing than enlightening, but I can imagine some people would find it helpful.
I'm not trying to defend the example in detail; I agree that they don't have an obvious, rigorous categorical connection. I broadly think that with so little polymorphism you're not going to get a terrific categorical reading. I agree with the OP that they still "feel" adjoint and may admit a better reading with a little generalization.
But yeah, it's not the greatest example. I don't mean to defend it as something rigorous.
Oh, and also, the moment this became clear to me was when I read the original monad paper. How frustrating -- all this ink spilled on explaining monads and I just needed to read the original paper to see the real motivation.
Understanding the very generalized algebraic monad helps you to understand the nature of encapsulating effects and adding complexity to categories through layering. It's absolutely not beginner material—but it's valuable material.
I don't like your take because it refuses to take that deeper dive into the mathematical Monad, despite [alluding] to it. Knowing the relationship between `Monad` and `Category` the type classes is almost never useful in programming Haskell. Knowing the relationship between `Monad`, monads, and their categorical underpinning is a different matter.
> Understanding the very generalized algebraic monad helps you to understand the nature of encapsulating effects and adding complexity to categories through layering.
I don't see how the general notion of category theoretical monad helps with this. In fact it's debatable whether the notion of monad is really the right treatment for universal algebra at all. Lawvere theories are much simpler and capture almost all of what you need. See: https://www.dpmms.cam.ac.uk/~martin/Research/Publications/20...
> Knowing the relationship between `Monad` and `Category` the type classes is almost never useful in programming Haskell.
I disagree. I use (<=<) all the time. It's basically (.) with the Kleisli wrapping and unwrapping done for you.
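For example, a small sketch (safeHead and safeDiv are made-up helpers):

    import Control.Monad ((<=<))

    safeHead :: [a] -> Maybe a
    safeHead (x:_) = Just x
    safeHead []    = Nothing

    safeDiv :: Int -> Int -> Maybe Int
    safeDiv _ 0 = Nothing
    safeDiv x y = Just (x `div` y)

    -- Kleisli composition reads just like (.), minus the plumbing.
    headThenDiv :: [Int] -> Maybe Int
    headThenDiv = safeDiv 100 <=< safeHead

    main :: IO ()
    main = print (headThenDiv [4,5], headThenDiv [0])  -- (Just 25,Nothing)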
I'm not arguing that (>=>) isn't great or that Monads are better than Lawvere Theories, nor that what you presented isn't interesting and useful. Instead, I'm responding from the point of view of someone who read a lot of terse accounts of what typeclass Monads were back when I was first learning them.
The category law account for the Monad laws is useful once you've already learned to think categorically, at which point a full definition is great and provides indications of the next places to examine. Defining that a "Monad is really a Category k with an identification between k a b and a -> k () b" seems most useful to people who could already write "Monads Made Difficult", but not those who are trying to struggle to see how all these connections play out.
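For reference, a minimal sketch of the Monad-to-Category direction of that identification (Kleisli here is my own wrapper, not an import): Kleisli arrows a -> m b form a category with return as identity and (<=<) as composition.

    import Control.Monad ((<=<))

    newtype Kleisli m a b = Kleisli { runKleisli :: a -> m b }

    idK :: Monad m => Kleisli m a a
    idK = Kleisli return

    compK :: Monad m => Kleisli m b c -> Kleisli m a b -> Kleisli m a c
    compK (Kleisli g) (Kleisli f) = Kleisli (g <=< f)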
Monads Made Difficult is difficult, but lays the groundwork for a lot of the categorical machinery that forms the environment where Monads were born. That's important.
Are you saying that "Monads Made Difficult" is useful to "those who are trying to struggle to see how all these connections play out"?
My position is that it's definitely not. I don't think it's even really useful to experts, because, wait for it ... Haskell Monads really have very little to do with category-theoretic monads! If all you want is to know how Haskell Monads work, then I think that level of generality is unhelpful. If, on the other hand, you want to learn category theory, then this is a fine approach :)
(NB I'm not claiming my article is for beginners either)
In summary, I think the article is a great way to learn about category theory by building on your knowledge of Haskell, but it is more complicated than necessary to help with understanding Haskell Monads.
I can agree with that. I suppose I believe, from my own personal experience, that a categorical approach to monads, even if they are distinct from Haskell `Monad`s, is a very good way to learn more advanced concepts in understanding Haskell Monads.
I don't think either article is a great place for a beginner; instead I'm coming from a point of view of learning "advanced" functional programming and getting a flavor for categorical/squiggol-style thinking. To me, it was greatly helpful to just see the pure math approach and then realize Haskell Monads as a simple example, and less useful to see categorical concepts played out in the simplified Haskell domain.
Categories just don't make nearly as much sense as objects in their own right when all you see them as is a convenient way to talk about composition rules. They're dramatically powerful when you connect them to preorders, databases, graphs, or homology. My biggest moment of categorical bliss came from seeing the Paths monad on graphs, even though that is far removed from the programming domain.
This is explicitly not a writeup for beginners (ergo the title); it's a side-by-side description of the /very/ general categorical definition and how you can model it in Haskell. The three-parameter form of morphism composition (i.e. c y z -> c x y -> c x z) would be quite unwieldy in day-to-day programming, but having the categories explicit does illustrate the underlying definitions better.
I don't mean to nitpick, but the three parameter form is exactly what you use in day-to-day Haskell programming. You simply let c be the infix (->), and you get (y -> z) -> (x -> y) -> (x -> z). We just tend to think of it as a two parameter form because a lot of the time, you can think of (->) as syntax, as opposed to the type that it is.
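Concretely, Control.Category's (.) has exactly that shape, and the (->) instance recovers ordinary composition (a small sketch):

    import Prelude hiding ((.))
    import Control.Category ((.))

    -- (.) :: Category c => c y z -> c x y -> c x z
    -- Instantiating c to (->) gives back ordinary function composition.
    double :: Int -> Int
    double = (*2)

    incr :: Int -> Int
    incr = (+1)

    main :: IO ()
    main = print ((incr . double) 3)  -- 7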
It's great for two classes of people: those who understand monads from a Haskell/software engineering perspective and want to learn the mathematical perspective, and those like me who understand them from the mathematical perspective and want to learn them from the Haskell/software engineering perspective.
That's pretty much my point: I don't think mathematical monads are a good background for Haskell Monads. It's a bit like saying the ZF axioms are the mathematical background for Data.Set.
That's not a problem with Stephen's approach, that's simply what he was writing about. I found it to be a great concise presentation. Your post and his post have different goals, that does not make either one better than the other.
This is a post about categorical monads, I assume this is why it doesn't use the standard symbols that the Haskell monad uses (>>=, return) and instead refers to the natural transformations (η, μ) as mathematicians do.
To all those who are replying to me that I have misinterpreted this as beginner material: well, the first sentence does say that it's an "introduction to Haskell monads".
I prefer http://web.jaguarpaw.co.uk/~tom/blog/posts/2012-09-02-what-i..., but that's probably because I wrote it.