The user group explicitly rejected multithreading at the OCaml Users conference a couple of years back. (There was a working implementation called oc4mc.)
The reasons were that it would slow down single-threaded performance, and that threads as a programming model are not robust (compared to, e.g., forking, message passing, MPI, etc.). Also, actual graphs of 4- and 8-way SMP performance showed pretty poor scaling for real problems, so the benefits aren't that great compared to going for full MPI, which you have to do for NUMA anyhow.
GHC has a compiler flag (-threaded) to enable the multithreaded runtime. Why can't OCaml have that? It is a bit rigid to have to build two versions of a binary, but for folks who run their own code, that's no problem at all.
Sure, you can run OCaml and oc4mc. You'll also end up with two different binaries. oc4mc isn't as well tested, so I guess you'll hit more bugs. This still doesn't solve the horrible programming model [I've been cursing a multithreaded C program for the past few days. It's amazing how much stuff you have to remember and how much simply doesn't work in C when you've got threads]. Nor does it let your program scale past the limit of a single NUMA node. Whereas message passing à la Erlang lets you scale over NUMA and across machines and networks.
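To make the forking/message-passing style concrete, here is a minimal sketch in stock OCaml using only the Unix and Marshal modules from the standard distribution. It is not anything from oc4mc or from the conference discussion; the names sum_range, spawn, collect and parallel_sum are made up for illustration, and the workload (summing integer ranges) is a stand-in for real work. Each worker is a separate process with its own heap, and the only communication is one marshalled message over a pipe.

```ocaml
(* Sketch only: fork one worker per chunk and collect results over pipes.
   Needs the unix library, e.g.
     ocamlfind ocamlopt -package unix -linkpkg sum.ml -o sum *)

(* Hypothetical workload: sum the integers in [lo, hi). *)
let sum_range (lo, hi) =
  let acc = ref 0 in
  for i = lo to hi - 1 do acc := !acc + i done;
  !acc

(* Fork a child that computes [f x] and sends the result back as one
   marshalled message. Returns the child's pid and the read end of the pipe. *)
let spawn f x =
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
    (* Child: no shared state with the parent, only the pipe. *)
    Unix.close rd;
    let oc = Unix.out_channel_of_descr wr in
    Marshal.to_channel oc (f x) [];
    close_out oc;
    exit 0
  | pid ->
    (* Parent: keep only the read end. *)
    Unix.close wr;
    (pid, Unix.in_channel_of_descr rd)

(* Receive the single message from a worker, then reap the child. *)
let collect (pid, ic) : int =
  let v = Marshal.from_channel ic in
  close_in ic;
  ignore (Unix.waitpid [] pid);
  v

let parallel_sum chunks =
  let workers = List.map (spawn sum_range) chunks in
  List.fold_left (fun acc w -> acc + collect w) 0 workers

let () =
  let total = parallel_sum [ (0, 250); (250, 500); (500, 750); (750, 1000) ] in
  Printf.printf "%d\n" total  (* prints 499500 *)
```

Swapping the pipe for a socket would, in principle, let the same structure span machines, which is the scaling argument being made above; the sketch does no error handling and assumes each result fits in one marshalled value.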