Monads don’t compose; effects do. ‘IO a’ works great until you need to add another effect, for example ‘Maybe’. Then you need to bring in monad transformers, build your own monad combining the two, and rewrite all your code to lift effects from one monad into the other. And you have to do this every time you want to add a new effect.
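To make the lifting concrete, here is a minimal sketch assuming the `transformers` package that ships with GHC. The function `parsePositive` is a made-up example, not anything from a real codebase. It shows the two kinds of plumbing that appear once `Maybe` is combined with `IO`: `Maybe` values get wrapped into the stack, and plain `IO` actions get lifted.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Maybe (MaybeT (..))
import Text.Read (readMaybe)

-- Hypothetical example: parse a string, log it, and fail on non-positive input.
parsePositive :: String -> MaybeT IO Int
parsePositive s = do
  -- A plain Maybe value has to be wrapped into the combined monad.
  n <- MaybeT (pure (readMaybe s))
  -- A plain IO action has to be lifted through the MaybeT layer.
  lift (putStrLn ("parsed " ++ show n))
  if n > 0
    then pure n
    else MaybeT (pure Nothing) -- failure expressed in the combined monad
```

Running it means peeling the layer off again with `runMaybeT (parsePositive "3")`, which gives an `IO (Maybe Int)`.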
> Not to mention you need monadic and nonmonadic versions of every higher order function (or so it feels like) - map / mapM
This is more a weakness of Haskell's standard library (which, despite its reputation, is not very friendly to category theory) than an inherent problem with monads. A more general `map` would look something like this:
class Category c => FunctorOf c f where
  map :: c a b -> c (f a) (f b)
fmap :: FunctorOf (->) f => (a -> b) -> f a -> f b
fmap = map
type Kleisli m a b = a -> m b
mapM :: (Monad m, FunctorOf (Kleisli m) f) => (a -> m b) -> f a -> m (f b)
mapM = map
type Op c a b = c b a
contramap :: FunctorOf (Op (->)) f => (a -> b) -> f b -> f a
contramap = map
type Product c d (a,x) (b,y) = (c a b, d x y)
bimap :: FunctorOf (Product (->) (->)) f => (a -> b, x -> y) -> f (a, x) -> f (b, y)
bimap = map
-- and so on
but this would require somewhat better type-level programming support for the ergonomics to work out.
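For comparison, here is a cut-down version that should compile with today's GHC, as a sketch only. It assumes the `Kleisli` newtype from `Control.Arrow` in place of the type synonym above (a synonym cannot be given a `Category` instance), and the names `fmap'` and `mapM'` are mine, chosen to avoid clashing with the Prelude. The explicit wrapping and unwrapping in `mapM'` is exactly the ergonomic cost mentioned above.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, FlexibleContexts #-}
import Prelude hiding (map)
import Control.Arrow (Kleisli (..))
import Control.Category (Category)

-- A functor whose source and target category are both c.
class Category c => FunctorOf c f where
  map :: c a b -> c (f a) (f b)

-- Lists are functors on ordinary functions...
instance FunctorOf (->) [] where
  map = fmap

-- ...and also on Kleisli arrows of any monad.
instance Monad m => FunctorOf (Kleisli m) [] where
  map (Kleisli f) = Kleisli (traverse f)

fmap' :: FunctorOf (->) f => (a -> b) -> f a -> f b
fmap' = map

-- The newtype has to be wrapped and unwrapped by hand.
mapM' :: (Monad m, FunctorOf (Kleisli m) f) => (a -> m b) -> f a -> m (f b)
mapM' f = runKleisli (map (Kleisli f))
```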
> Monads need to wrap each other, effects are more composable
It's really trickier than algebraic effects make it seem, though. Haskell-style monad transformers, as a stack of wrappers, make you pick a concrete ordering of effects in advance (e.g. there's a difference between `State<S, Result<E, T>>` and `Result<E, State<S, T>>`, using Rust syntax), but effect systems like the one in Koka have to make the same decision too, either by running interpreters in a specific order or by sticking to a single possible ordering, e.g. by using one more powerful monad. And then there are questions around higher-order effects, that is, effects with operations that take effectful arguments: they have to be able to "weave" other effects through themselves while preserving their behaviour, and this weaving seems to depend on the concrete choice of effects, so it is not easily composable. In a sense, languages like Koka or Unison have to be restricted in some way, giving up on some kinds of effects. I'm not saying that's a bad thing, though; it's still an improvement over having a single effect (IO) or no effects at all.
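To show what the ordering changes in practice, here is a small Haskell sketch assuming the `transformers` package. The two program names are mine, and the mapping onto the Rust-ish types above is only approximate. Both programs bump a counter and then fail; what differs is whether the updated state survives the failure.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Control.Monad.Trans.State.Strict (State, StateT, modify, runState, runStateT)

-- State layered over Either: an error throws the state away.
stateOverEither :: StateT Int (Either String) ()
stateOverEither = do
  modify (+ 1)
  lift (Left "boom")

-- Except layered over State: the state survives the error.
exceptOverState :: ExceptT String (State Int) ()
exceptOverState = do
  lift (modify (+ 1))
  throwE "boom"

resultA :: Either String ((), Int)
resultA = runStateT stateOverEither 0 -- Left "boom"

resultB :: (Either String (), Int)
resultB = runState (runExceptT exceptOverState) 0 -- (Left "boom", 1)
```

Note that the two programs are written differently (the `lift` sits in a different place), which is the refactoring cost of committing to an order up front.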
Being able to change the ordering of effects on the fly is a benefit of algebraic-effect systems. As you mentioned, `State<S, Result<E, T>>` and `Result<E, State<S, T>>` behave very differently. Algebraic effects let you switch between the two behaviors when you run the effects, whereas with monad transformers you have to refactor all your code to use `State<S, Result<E, T>>` instead of `Result<E, State<S, T>>` or vice versa.
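A rough way to see this in Haskell, without a real effect system, is to describe the operations as data and give them meaning only in a handler. This is a hand-rolled free-monad sketch using only `base`. All names here (`Op`, `Free`, `runDiscard`, `runKeep`) are mine, and a single monolithic handler is far cruder than what Koka or Unison actually provide. The point is only that `program` is written once and the ordering is picked at the call site.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- The effect operations as plain data; handlers decide what they mean.
data Op next
  = Get (Int -> next)
  | Put Int next
  | Throw String
  deriving Functor

data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a) = Pure (g a)
  fmap g (Free x) = Free (fmap (fmap g) x)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g <*> x = fmap g x
  Free g <*> x = Free (fmap (<*> x) g)

instance Functor f => Monad (Free f) where
  Pure a >>= k = k a
  Free x >>= k = Free (fmap (>>= k) x)

get' :: Free Op Int
get' = Free (Get Pure)

put' :: Int -> Free Op ()
put' s = Free (Put s (Pure ()))

throw' :: String -> Free Op a
throw' e = Free (Throw e)

-- Written once, with no commitment to an ordering.
program :: Free Op ()
program = do
  n <- get'
  put' (n + 1)
  throw' "boom"

-- Handler 1: an error discards the state.
runDiscard :: Free Op a -> Int -> Either String (a, Int)
runDiscard (Pure a) s = Right (a, s)
runDiscard (Free (Get k)) s = runDiscard (k s) s
runDiscard (Free (Put s' k)) _ = runDiscard k s'
runDiscard (Free (Throw e)) _ = Left e

-- Handler 2: the state at the point of failure is kept.
runKeep :: Free Op a -> Int -> (Either String a, Int)
runKeep (Pure a) s = (Right a, s)
runKeep (Free (Get k)) s = runKeep (k s) s
runKeep (Free (Put s' k)) _ = runKeep k s'
runKeep (Free (Throw e)) s = (Left e, s)
```

Here `runDiscard program 0` gives `Left "boom"`, while `runKeep program 0` gives `(Left "boom", 1)`, from the same unmodified `program`.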
You can recover the ability to reorder effects by using MTL-style type classes, so you could write that as `M<T> where M: MonadState<S> + MonadError<E>`, in Rust-ish syntax. But that makes the number of trait/typeclass implementations explode: with one trait and one type per transformer, every transformer needs an instance of every trait, which is O(N^2). Algebraic effect systems don't really have that issue. I also have a hunch that algebraic effects (or, well, delimited continuations in general) are probably easier to optimize than monad transformers, too.
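In Haskell terms that looks roughly like the following sketch, assuming the `mtl` package is available (it is normally bundled with GHC). The names `prog`, `discardsState` and `keepsState` are mine. The program is written once against the two classes, and the concrete stack is chosen only by the type at the call site.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.Except (MonadError, runExceptT, throwError)
import Control.Monad.State (MonadState, modify, runState, runStateT)

-- Polymorphic in the monad: no transformer order is fixed here.
prog :: (MonadState Int m, MonadError String m) => m ()
prog = do
  modify (+ 1)
  throwError "boom"

-- m = StateT Int (Either String): the error discards the state.
discardsState :: Either String ((), Int)
discardsState = runStateT prog 0 -- Left "boom"

-- m = ExceptT String (State Int): the state survives the error.
keepsState :: (Either String (), Int)
keepsState = runState (runExceptT prog) 0 -- (Left "boom", 1)
```

This only works because `mtl` already ships instances such as `MonadState s (ExceptT e m)` and `MonadError e (StateT s m)`, which is the quadratic instance table in action.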