The general idea is to have a worker/handler which has one or more pure functions to perform some piece of work.
This worker should be stateless. If it does need state, you should put it in a centralized location that supports concurrent access (e.g. a DB, though technically it's no longer pure at that point).
Your orchestration script gives tasks to your workers and collects the results. The key idea is that you can horizontally scale the number of workers because they are stateless.
A simple example of this might be an orchestrator script that creates a bunch of tasks (e.g. adding two numbers) and pushes them onto an input queue. The workers take a task off the queue and push the result onto a result queue. The orchestrator takes the results off and aggregates them.
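Roughly what that could look like in Go, using channels as the input and result queues (just a sketch; the task/worker names are made up):

```go
package main

import (
	"fmt"
	"sync"
)

type task struct{ a, b int }

// add is the pure function the worker applies; it has no state of its own.
func add(t task) int { return t.a + t.b }

// addWorker is stateless: it just pulls tasks and pushes results.
func addWorker(tasks <-chan task, results chan<- int, wg *sync.WaitGroup) {
	defer wg.Done()
	for t := range tasks {
		results <- add(t)
	}
}

func main() {
	tasks := make(chan task)
	results := make(chan int)

	// Fan out: the worker count can be scaled because workers hold no state.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go addWorker(tasks, results, &wg)
	}

	// Orchestrator: create tasks and push them onto the input queue.
	go func() {
		for i := 0; i < 10; i++ {
			tasks <- task{a: i, b: i}
		}
		close(tasks)
	}()

	// Close the result queue once all workers are done.
	go func() {
		wg.Wait()
		close(results)
	}()

	// Aggregate the results.
	sum := 0
	for r := range results {
		sum += r
	}
	fmt.Println("total:", sum) // 2*(0+1+...+9) = 90
}
```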
A more complex example is a webserver communicating through RPCs.
Not GP but I think the fundamental idea is to think about "who calls who" and have pure functions be called/scheduled by a stateful coordination mechanism.
Usually queues/channels/event loops are involved at the top level, especially if you're doing async IO. If you're doing parallel computation, then you'd probably use fan-in/fan-out/waitgroup logic at the top that calls into pure functions. The general point is "only this piece of code worries about coordinating state".
For async IO, think of Go's concurrency primitives, Clojure's core.async, or Erlang/Elixir as examples of coordinating state through messages.
For parallelism specifically, look at this pretty cool Rust crate:
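In Go terms, "only this piece of code worries about coordinating state" might look something like this: a single owner goroutine holds the state and everything else talks to it through messages (just a sketch, names invented for illustration):

```go
package main

import "fmt"

type readReq struct{ reply chan int }

func main() {
	incr := make(chan int)
	reads := make(chan readReq)

	// The single stateful coordinator: all mutation lives here and nowhere else.
	go func() {
		counter := 0
		for {
			select {
			case n := <-incr:
				counter += n
			case r := <-reads:
				r.reply <- counter
			}
		}
	}()

	// Callers stay effectively pure: they just send and receive messages.
	for i := 0; i < 5; i++ {
		incr <- 1
	}
	req := readReq{reply: make(chan int)}
	reads <- req
	fmt.Println("counter:", <-req.reply) // 5
}
```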
> Most FP persistent data structures take the form of a tree and perform a path copy from leaf to root for the change, reusing as much of the tree as possible. What this results in is a single contention point at the root of the tree for CAS operations. Now expand this view to a larger domain model of many entities, e.g. Customers, Orders, Products, etc., all interrelated and that model needs one or more entry points. Each of these entry points become a contention hotspot as the model is mutated. Also with trees and linked lists being the underlying data structures, performance is limited due to indirection causing cache misses.
So really it matters a lot what you're doing. FP is not a panacea for concurrency and parallelism. As an application programmer who does mostly IO and very little heavy calculation, for example, I find it great.
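To make the contention point in the quote concrete, here's a hedged little Go sketch: the nodes are immutable and each "update" is a path copy, but every writer still has to CAS the one shared root pointer, which is the hotspot being described. I've degenerated the tree to a persistent list to keep it short, and all the names are made up:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Nodes are immutable once constructed; "updates" build a new root.
type node struct {
	value int
	next  *node
}

var root atomic.Pointer[node] // the single shared entry point

// insert is the path copy: a new root pointing at the old, untouched structure.
func insert(old *node, v int) *node {
	return &node{value: v, next: old}
}

func length(n *node) int {
	c := 0
	for ; n != nil; n = n.next {
		c++
	}
	return c
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			// Every writer contends on the same CAS over the root pointer.
			for {
				old := root.Load()
				if root.CompareAndSwap(old, insert(old, v)) {
					return
				}
				// Lost the race: rebuild against the newer root and retry.
			}
		}(i)
	}
	wg.Wait()
	fmt.Println("inserts that landed:", length(root.Load())) // 8
}
```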
> > Most FP persistent data structures take the form of a tree and perform a path copy from leaf to root for the change, reusing as much of the tree as possible. What this results in is a single contention point at the root of the tree for CAS operations. Now expand this view to a larger domain model of many entities, e.g. Customers, Orders, Products, etc., all interrelated and that model needs one or more entry points. Each of these entry points become a contention hotspot as the model is mutated. Also with trees and linked lists being the underlying data structures, performance is limited due to indirection causing cache misses.
But the root of a tree is not mutated. Not mutating anything in the whole tree is the whole point. It will stay there until garbage collected (once no one has a reference to it any longer, only to other versions / other roots). Not sure I am understanding what he is saying correctly.
Maybe he is talking about the kind of "reduce step" one has to deal with when all the parallel-running functions return results and those results need to be combined? They might form a new tree-like thing. Sometimes there is no way around a sequential section in the program, since the problem is inherently sequential. But I don't see the contention hotspots.