The main problem is that category theorists keep writing stuff claiming that, if only you get through the math, you'll eventually be able to apply category theory to typical everyday programming problems. And by the time they get to the Yoneda lemma or whatever, they've forgotten they're supposed to show some applications. All they show is how to express category-theoretic concepts in a program. And that's mostly because the demonstrated applications are few (I have spent a lot of time looking and have found little convincing evidence).
Having learned the subject myself, I'm not convinced it's useful for programmers. Hell, it's not even useful for all areas of mathematics, and that is its raison d'être.
Category theory can be very useful, but you don't use it the same way you use other kinds of math. You can apply optimization theory, for example, by noticing that a problem has a certain form, and therefore a certain algorithm will converge to a solution. Applications of category theory are usually more subtle. You're not likely to quote some theorem from category theory that finishes off a problem the way selecting an optimization algorithm does.
I had been skeptical of applications of category theory, and to some extent I still am. Many reported applications of category theory aren’t that applied, and they’re not so much applications as post hoc glosses.
At the same time, I've seen real applications of categories, such as the design of LINQ mentioned above. I've been part of projects where we used category theory to guide mathematical modeling and software development. Category theory can spot inconsistencies and errors in much the way dimensional analysis does in engineering, or type checking does in software development. It can help you ask the right questions. It can guide you to include the right things and leave the right things out.
If you can stand up a Cartesian-closed category, it is a model of a lambda calculus. So you can make a compiler plugin that takes lambda terms (a large subset of Haskell) and compiles them into all sorts of exotic settings (dataflow diagrams, derivatives of the function, ...).
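Here's a minimal sketch of that idea, loosely in the spirit of Conal Elliott's "compiling to categories". The class names (Cat, Cartesian) and the Diagram printer are my own toy encoding, not a real library API. The same point-free term gets two meanings: run it as a function, or print it as a "dataflow diagram":

    -- Two interpretations of one Cartesian-category vocabulary.
    class Cat k where
      idC  :: k a a
      (.:) :: k b c -> k a b -> k a c

    class Cat k => Cartesian k where
      fork :: k a b -> k a c -> k a (b, c)
      exl  :: k (a, b) a
      exr  :: k (a, b) b

    -- Interpretation 1: ordinary Haskell functions.
    instance Cat (->) where
      idC  = id
      (.:) = (.)

    instance Cartesian (->) where
      fork f g x = (f x, g x)
      exl = fst
      exr = snd

    -- Interpretation 2: a toy "dataflow diagram" printer.
    newtype Diagram a b = Diagram String

    instance Cat Diagram where
      idC = Diagram "id"
      Diagram g .: Diagram f = Diagram (g ++ " . " ++ f)

    instance Cartesian Diagram where
      fork (Diagram f) (Diagram g) = Diagram ("<" ++ f ++ ", " ++ g ++ ">")
      exl = Diagram "exl"
      exr = Diagram "exr"

    -- One term, two meanings: \(x, y) -> (y, x)
    swapC :: Cartesian k => k (a, b) (b, a)
    swapC = fork exr exl

    main :: IO ()
    main = do
      print (swapC (1 :: Int, "hi"))                  -- ("hi",1)
      case swapC :: Diagram (Int, String) (String, Int) of
        Diagram d -> putStrLn d                       -- <exr, exl>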
> Having learned the subject myself, I'm not convinced it's useful for programmers.
I think it is demonstrably useful to people who design programming languages with strong type systems. Not a very common use case in practice, but it did find an application. Damned with faint praise, perhaps. I haven't seen anyone demonstrate a use outside that niche.
I've been down that rabbit hole, and while a select few people do it, the vast majority of people doing this rely on PL papers that long preceded the attempt to put category theory in the mix.
> The main problem is that category theorists keep writing stuff claiming that, if only you get through the math, you'll eventually be able to apply the category theory to typical everyday programming problems.
There is no "typical everyday programming problems." It's an incredibly broad discipline, and one person's typical is another's esoteric.
When you're designing an API, it's useful to know when you're dealing with one of the common structures that occur in category theory. There's a fairly high cost to missing that: you can end up with an API that feels "broken", where the pieces don't fit together properly or the different parts aren't consistent with one another.
It's hard to write something that is both accessible and well-motivated.
The best uses of category theory are when the morphisms are far more exotic than "regular functions". E.g. it would be nice to describe a circuit of live queries (like the https://materialize.com/ stuff) with proper caching, joins, etc. Figuring this out is a bit of an open problem.
Haskell's standard library's Monad and friends are watered down to the point that they are barely category theory, but they are also quite useful and would not have been readily invented without category theory. So even if you have no taste for "abstract nonsense", maybe you can still accept that it's left a trail of more accessible "semi-abstract semi-nonsense" in its wake.
This stuff takes time. If it's not your cup of tea, no need to make yourself an early adopter. See also things like Lean where fancy type systems are finally reaching "regular non-CS" mathematicians with great success.
On the pure math side, Locales (https://ncatlab.org/nlab/show/locale) are so much more beautiful an axiomatization scheme than regular topological spaces, even if one is just doing regular classical rather than constructive math. We would not have discovered them without category theory.
The basic (pre-category-theory) idea is: instead of making up new weird math just for logical purposes, start thinking more abstractly and use "regular" math concepts. The section says "models", but it should work for syntax too, not just semantics (the syntax is the "initial" semantics).
Category theory takes all that order/lattice stuff and generalizes it in the same way programming generalizes logic ("whether" true/false to "which" inhabitant of a type). So it is definitely useful for people trying to explore the space of programming language designs.
The reason category theory might be useful to "regular programmers", not just programming language designers, basically boils down to the belief that programming and language design are not so different after all. I can't really argue that explicitly, but think: "where does designing a library interface end, and designing a domain-specific language begin?" That's the basic idea.
I was a college math major. Category theory wasn't one of the offered courses at my college, so I didn't study it specifically. But I didn't get through any of my math classes without working the problems and proofs, and without having a teacher to help guide me through the material. It can't be read like a book. At least that's my experience.
And as I get older, being interested in the topic helps more and more.
As a counterexample, I have self-studied queuing theory, extreme value analysis, combinatorics, generalised linear regression, etc. and had no problem with it.
Category theory I just can't get through. Either it does not fit with my brain or I fail to see the practical implications, but something makes it not stick.
(Curiously, differential equations (beyond the basics) are in a similar spot for me, even though they are obviously immediately practical in a huge number of areas!)
This oughtn't be super surprising, even the DE comment at the end - people do just have bits of maths they gel with more than others!
Even so, CT is not easy to motivate without a lot of experience in a few specific areas of maths - it's definitely a very "meta" subject (even though a lot of these blogs pretend it's super applicable to everyday engineering) and is a very different lens from the maths taught up to that point.
Category theory is a great tool for thinking about programming language design (especially types in functional programming). In some sense that translates to better programming, but there’s a lot more math topics that are likely to be directly applicable day to day.
Do you know how to evaluate a polynomial efficiently? Fit a curve to points? Solve a linear system? What the Fourier transform is for? If no, hold off on the categories.
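For the curious, the first item is Horner's rule. A minimal Haskell sketch (toy code, coefficients listed highest degree first):

    -- Horner's rule: n multiplications and n additions, no explicit powers.
    horner :: Num a => [a] -> a -> a
    horner coeffs x = foldl (\acc c -> acc * x + c) 0 coeffs

    main :: IO ()
    main = print (horner [2, 0, 1] 3)   -- 2*x^2 + 0*x + 1 at x = 3 ==> 19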
Bézier curves parameterise a curve with points. When people talk about "fitting a curve" I would expect they mean fitting a model rather than that the curve is fully determined by the points. Typically people fit linear or low-order polynomials, it'd be a weird day where someone wanted to use a Bézier for their model - those curves weren't designed for being statistically tractable or interesting.
To expand upon roenxi's answer: Bézier curves are an example of creating a cubic (degree-3) parametric polynomial curve by using four points to describe where you want it to go.
Curve fitting is about starting with a cloud of points, and trying to decide how to draw a polynomial (of whatever degree you want) that does a good job of describing the trend line of those points.
Then you can use that polynomial to approximately interpolate between the samples, or to extrapolate beyond them, for example.
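To make the contrast concrete, here's a toy Haskell sketch of both (my own illustrative code): a cubic Bézier fully determined by its control points, and a least-squares line fit, the simplest case of "fit a model to points":

    type Point = (Double, Double)

    -- Cubic Bézier in Bernstein form: the four control points *are* the curve.
    bezier3 :: Point -> Point -> Point -> Point -> Double -> Point
    bezier3 (x0, y0) (x1, y1) (x2, y2) (x3, y3) t =
      ( b0*x0 + b1*x1 + b2*x2 + b3*x3
      , b0*y0 + b1*y1 + b2*y2 + b3*y3 )
      where
        s  = 1 - t
        b0 = s*s*s
        b1 = 3*s*s*t
        b2 = 3*s*t*t
        b3 = t*t*t

    -- Least-squares line: (slope, intercept) minimizing squared vertical error.
    fitLine :: [Point] -> (Double, Double)
    fitLine pts = (slope, ybar - slope * xbar)
      where
        n     = fromIntegral (length pts)
        xbar  = sum (map fst pts) / n
        ybar  = sum (map snd pts) / n
        slope = sum [ (x - xbar) * (y - ybar) | (x, y) <- pts ]
              / sum [ (x - xbar) ^ (2 :: Int) | (x, _) <- pts ]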
As a math course, I felt category theory was interesting as a way of unifying ideas across different areas of mathematics. But I've never quite seen its value for programming (maybe it's just me).
As primarily an engineer doing maths, with code on the side, I feel the exact same.
It’s been way more of use to me in pulling together parts of topology, linear algebra, geometric algebra, homology et al than discovering furtive programming abstractions.
If you're talking about toposes, many logics do not correspond to them. It's the other way around: many category theorists/logicians needed jobs, found them in computing, and hence the category-theoretic perspective in CS.
You don't know what you're talking about. Applications of category theory to computer science predate the first US computer science department. There weren't any computer science jobs to speak of.
Categorical logic is larger than elementary topoi. I would wager that you can't name a logic not expressible in category theory.
>Applications of category theory to computer science predate the first US computer science department.
This doesn't sound plausible: according to Wikipedia, Eilenberg and Mac Lane introduced the concepts of CT in 1942, and CT was originally used mainly in homological algebra, with applications in logic/semantics coming later. Certainly CT was given credence by Grothendieck and Serre, working on (and solving) the Weil conjectures in the 50s and 60s. Lawvere's 1963 thesis predated categorical logic (according to his own words, <http://www.tac.mta.ca/tac/reprints/articles/5/tr5abs.html> p. 12, "but the method here, and indeed in the whole ensuing categorical logic, is to exploit the uniqueness by using adjointness itself as an axiom"). The first computer science department in the US was started in 1962 at Purdue University. Meanwhile, for contrast, the proofs of Church and Turing were published in 1936.
>I would wager that you can't name a logic not expressible in category theory.
I prefaced my statement with an "if"; apparently you're not talking about toposes. The interpretability of one theory in another is a concern for logicians, and I'm not familiar with any restrictions on category theory; indeed, it can be used for logics. I do not know topos theory, but I was basing my statement on what I heard Colin McLarty say in an interview of his <https://www.youtube.com/watch?v=81sPQGIWEfM> (I can't give a timestamp); he definitely named logics that cannot be expressed by the topos of a space.
I feel that you've used two straw men: 1) specifically naming the US, and 2) ignoring my "if topos" and substituting "expressible in CT".
Even though I went pretty deep into math/CS in college and worked as a SWE for five years, I never really "got" the connection between category theory and actual practical software problems until I started working on a new project with complex uses of types and functional programming. IMO most programmers are not working on projects like that, where category theory is even relevant. And for me, even though it was relevant, beyond being able to identify something by its category-theory vocabulary term, actually knowing category theory wasn't really helpful.
Also, you should probably study axiomatic set theory and abstract algebra before taking on category theory. Even in materials like this, where they try to bring you up to speed on those subjects, you won't have had the exposure time to them (or perhaps to higher math in general) to understand things well. You'll also skip over topics like https://en.wikipedia.org/wiki/Multiplicative_group_of_intege... which aren't category theory per se but prime your brain to think about the properties of sets, relations within/between sets, and repeatable "structures" across sets.
The other problem I ran into is that most programming languages don't have sophisticated enough type/functional systems to properly implement a lot of the fancy stuff where category theory is relevant. For example, even though TypeScript has all the information necessary to know that class A is abstract and B extends it, or that class C is parameterized by types subject to some constraint, it can't implement at compile time a lot of the checks it technically should in order to handle Functors (https://en.wikipedia.org/wiki/Functor_(functional_programmin...). The languages that do handle these things properly are typically lacking in other ways that made them not worth it (for me) to use; it was better to handle the problem inelegantly in the less sophisticated language.
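For contrast, here's roughly what the relevant abstraction looks like in Haskell, which can quantify over the container type f itself (a higher-kinded type, which TypeScript can't express directly). A toy sketch; MyFunctor mirrors the standard Functor class, renamed to avoid clashing with the Prelude:

    -- Functor abstracts over the container f itself, not just its contents.
    class MyFunctor f where
      fmap' :: (a -> b) -> f a -> f b

    data Tree a = Leaf a | Node (Tree a) (Tree a)
      deriving Show

    instance MyFunctor Tree where
      fmap' g (Leaf a)   = Leaf (g a)
      fmap' g (Node l r) = Node (fmap' g l) (fmap' g r)

    main :: IO ()
    main = print (fmap' (* 2) (Node (Leaf 1) (Leaf 2)) :: Tree Int)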
Exactly. While Category Theory is interesting, and the way it can abstract things like the diagonal argument is very elegant, every time I try to learn it I end up feeling I would have been better off learning almost anything else.
There is nothing to “get” that is worth getting unless you are interested in pure math. As a programmer we already know how to compose types/functions/applications/systems. No need for category theory to understand how to do that.
Perhaps another way of asking the question is: Are there any results, either about individual programs or in PL theory as a whole, that were made simpler / clarified / generalized because of category theoretic insights?
Yes, but it's much, much faster and more productive to just learn Haskell's type system (mainly Functors, Monads and Applicatives) as 3 individual "design patterns" than it is to figure out how you can even begin to apply the Yoneda lemma to whatever programming problem you have in front of you.
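For reference, here are those three "design patterns" as their standard Haskell signatures (slightly simplified: the real classes have a few more methods and laws; the Prelude hiding is just so the snippet compiles standalone):

    import Prelude hiding (Functor, fmap, Applicative, pure, (<*>), Monad, return, (>>=))

    class Functor f where
      fmap :: (a -> b) -> f a -> f b

    class Functor f => Applicative f where
      pure  :: a -> f a
      (<*>) :: f (a -> b) -> f a -> f b

    class Applicative m => Monad m where
      return :: a -> m a
      (>>=)  :: m a -> (a -> m b) -> m b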
Category Theory has birthed some useful abstractions, but you don't really need any of CT to learn and use them.
The basics of Abstract Algebra, on the other hand, are absolutely worth the bit of time it takes to learn them.
Groups, semigroups, rings, fields, monoids; distributivity, identity, zeros, associativity, commutativity: these are pretty trivial to learn. Most people already know the underlying concepts, and they pop up all the time in programming.
Monoids are incredibly useful. Most people already know them and intuitively use them, but it helps to name and understand the pattern (a small sketch follows below).
Category Theory just doesn't have the same return on investment for software development.
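Here's the promised monoid sketch (toy code): summary statistics combine associatively with an identity element, so the same fold works sequentially, in chunks, or across parallel workers:

    data Stats = Stats { count :: Int, total :: Double }

    instance Semigroup Stats where
      Stats c1 t1 <> Stats c2 t2 = Stats (c1 + c2) (t1 + t2)

    instance Monoid Stats where
      mempty = Stats 0 0

    ofSample :: Double -> Stats
    ofSample x = Stats 1 x

    mean :: Stats -> Double
    mean (Stats c t) = t / fromIntegral c

    main :: IO ()
    main = print (mean (foldMap ofSample [1, 2, 3, 4]))  -- 2.5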
There is a very useful perspective in which categories are just typed monoids, and the monoid operation can only be applied when the types "line up". For example, here are some useful operations which do not form a monoid:
- PUSH(n) where n is a floating-point number
- POP
- CLEAR
- ADD, SUB, MUL, DIV
- ID
These can be interpreted as operations on a stack of floating-point numbers in the obvious way, PUSH(1.2) * [3.14] == [1.2, 3.14], POP * [1, 2, 3] == [2, 3], ADD * [1, 2, 5] == [3, 5], CLEAR * [1, 2, 3] == [], ID * [1, 2, 3] == [1, 2, 3], etc. However, not all of the compositions of stack operations are legal. For example, ADD * PUSH(1) * PUSH(2) is fine and equivalent to PUSH(3), but ADD * PUSH(1) * CLEAR is illegal.
Ok, so our stack operations don't form a monoid. But they obviously can still be composed, sometimes, so what do we have if not a monoid? They form a category! There is one object for each natural number, representing the height of the stack. So there are arrows like PUSH(3.14) : Height_{n} -> Height_{n+1} for all n, and POP : Height_{n} -> Height_{n-1} whenever n >= 1, and ADD : Height_{n} -> Height_{n-2} whenever n >= 2.
Another common example is matrices. Square matrices form a monoid, but what about arbitrary rectangular matrices? They don't form a monoid, but they do form a category where the objects are natural numbers, and the arrows N -> M are just the MxN matrices. You can't multiply any two matrices, but if you have a P -> Q matrix (QxP) and a Q -> R (RxQ) matrix then you can multiply them to get a P -> R matrix (RxP).
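For the Haskell-inclined, here's a toy sketch of the stack-operations example above, with the "objects are stack heights" idea enforced by the type checker, so that e.g. ADD after CLEAR fails to compile. My own encoding, but it really is a Control.Category instance:

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    import qualified Control.Category as C
    import Data.Kind (Type)

    data Nat = Z | S Nat

    -- StackOp m n: an operation taking a stack of height m to height n.
    data StackOp :: Nat -> Nat -> Type where
      Id    :: StackOp n n
      Push  :: Double -> StackOp n ('S n)
      Pop   :: StackOp ('S n) n
      Add   :: StackOp ('S ('S n)) ('S n)
      Clear :: StackOp n 'Z
      Seq   :: StackOp m n -> StackOp n p -> StackOp m p

    -- Objects are heights; composition only typechecks when they line up.
    instance C.Category StackOp where
      id    = Id
      g . f = Seq f g

    -- Interpret onto a plain list. The height discipline means the partial
    -- patterns below are unreachable when run on a stack of the indexed height.
    run :: StackOp m n -> [Double] -> [Double]
    run Id        xs           = xs
    run (Push x)  xs           = x : xs
    run Pop       (_ : xs)     = xs
    run Add       (a : b : xs) = (a + b) : xs
    run Clear     _            = []
    run (Seq f g) xs           = run g (run f xs)
    run _         _            = error "stack height mismatch"

    -- ADD . PUSH 1 . PUSH 2 typechecks (and is semantically PUSH 3);
    -- ADD . PUSH 1 . CLEAR is rejected at compile time.
    example :: StackOp n ('S n)
    example = Add C.. Push 1 C.. Push 2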
Monads are a traditional stumbling block for Haskell newbies, including me when I was one. I found that those articles I linked demystified them more than any number of "monad tutorials" ever could.
The thing about questions like this is that the complexity of mathematical explanations depends heavily on what each reader considers "simple." Consider two different approaches to understanding a concept, expressed in information-theoretic terms:
This additional complexity layer of categorical framing has a nonzero cognitive cost, which varies based on the learner's intuitions and background. The investment in learning categorical thinking only becomes worthwhile when the learner can amortize its cost by envisioning its broad applicability - when they can see how categorical frameworks enable systematic problem decomposition and generalization. This implies they've already recognized the limitations and redundancies of non-categorical approaches, understanding intuitively how many concepts would need to be reinvented as the problem evolves within its conceptual framework (Gestell).
However, there exists a middle path that often goes unrecognized as categorical thinking, despite embodying its essence. This approach involves incrementally discovering generalizations -- "oh, this can be generalized with this type", or "oh, if I wrap this in another type I can do something else later on", or "oh, this syntactic sugar for this particular operator overload feels quite nice."
Yes, there are a number of them. Here are some examples off the top of my head:
- Moggi was studying the problem of equivalence of programs, and noted that the traditional approach of modeling a program as a total function Input -> Output is problematic. He pioneered the use of monads and Kleisli categories as a foundation for reasoning about equivalence of real programs, including all the real-world nastiness like non-termination, partiality (e.g. throwing an exception that kills the program), non-determinism, and so on. (A tiny Kleisli illustration follows after this list.) https://person.dibris.unige.it/moggi-eugenio/ftp/lics89.pdf
- Linear logic (and its close relative, affine logic) was the inspiration behind Rust's ownership model, from what I understand. Linear logic was originally described in terms of the sequent calculus by Girard (http://girard.perso.math.cnrs.fr/linear.pdf), but later work used certain categories as a model of linear logic (https://ncatlab.org/nlab/files/SeelyLinearLogic.pdf). This answered and clarified a number of questions stemming from Girard's original work.
- Cartesian-closed categories (CCCs) form models of the simply-typed lambda calculus, in the sense that any lambda term can be interpreted as a value in a CCC. Conal Elliott pointed out that this means that a lambda term doesn't have just one natural meaning; it can be given a multitude of meanings by interpreting the same term in different CCCs. He shows how to use this idea to "interpret" a program into a circuit that implements the program. http://conal.net/papers/compiling-to-categories/
- There is a classical construction about minimizing a DFA due to Brzozowski which is a bit magical. Given a DFA, do the following process twice: (a) get an NFA for the reverse language by reversing all edges in the DFA and swapping start / accept nodes, then (b) drop any nodes which are not reachable from a start node in the NFA. The result will be the minimal DFA that accepts the same language as your original DFA! Bonchi, Bonsangue, Rutten, and Silva analyzed Brzozowski's algorithm from a categorical perspective, which allowed them to give a very clear explanation of why it works along with a novel generalization of Brzozowski's algorithm to other kinds of automata. https://alexandrasilva.org/files/RechabilityObservability.pd...
- I would also put the development of lenses in this list, but they haven't leaked very far outside of the Haskell universe yet so I don't think they are a compelling example. Check back in 5 years perhaps. Here's a blog post describing how lenses relate to jq and xpath: https://chrispenner.ca/posts/traversal-systems
- I've personally had success in finding useful generalizations of existing algorithms by finding a monoid in the algorithm and replacing it with a category, using the fact that categories are like "many-kinded monoids" in some sense. I haven't written any of these cases up yet, so check back in 2 years or so. In any case, they've been useful enough to drive some unique user-facing features.
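Here's the tiny Kleisli illustration promised in the first bullet (toy code, not Moggi's notation): programs that may fail are arrows in the Kleisli category of Maybe, and (>=>) is that category's composition, with return as identity:

    import Control.Monad ((>=>))

    safeDiv :: Double -> Double -> Maybe Double
    safeDiv _ 0 = Nothing
    safeDiv x y = Just (x / y)

    safeSqrt :: Double -> Maybe Double
    safeSqrt x
      | x < 0     = Nothing
      | otherwise = Just (sqrt x)

    -- Compose like ordinary functions; failure is handled by the
    -- category's composition rather than by ad hoc control flow.
    pipeline :: Double -> Maybe Double
    pipeline = safeDiv 100 >=> safeSqrt

    main :: IO ()
    main = do
      print (pipeline 4)  -- Just 5.0
      print (pipeline 0)  -- Nothing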
This comment pairs very well with the recent thread on Heaviside’s operator calculus. He got ahead of theory, did groundbreaking work, and was ultimately superseded by more principled approaches. I think this illustrates a kind of intellectual leapfrog. Sometimes we do things that are right, but it’s not clear why they work, and other times we have theoretical structures that open up new realms of practice.
I just realized I botched the description of Brzozowski's algorithm, step (b) should be "determinize the NFA using the powerset construction". Mea culpa.