Polars can do a lot of useful processing while streaming a very large dataset, without ever holding much more than one row in memory at a time. Are there any simple ways to achieve such map/reduce tasks with pandas on datasets that may vastly exceed the available RAM?
Not currently. But I imagine that, if pandas adopts Arrow as its backing store in a future version, it should be able to do something like that through proper use of the Arrow API. Arrow is built with this kind of processing in mind and is continually adding more compute kernels that can work in a streaming fashion where possible. The Dataset abstraction in Arrow allows for defining complex column "projections" that can execute in a single pass like this. Polars may be leveraging this functionality in Arrow.
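In the meantime, for simple map/reduce-style aggregations, a common partial workaround is chunked reading with the `chunksize` parameter of `pd.read_csv`, which keeps only one chunk in memory at a time. A minimal sketch, assuming a CSV file with a numeric column (the file name `data.csv` and column `x` are made up for illustration):

```python
import pandas as pd

# Create a small example file so the sketch is self-contained.
pd.DataFrame({"x": range(10)}).to_csv("data.csv", index=False)

total = 0.0
count = 0
# Read the file in fixed-size chunks instead of all at once;
# only the current chunk is ever held in memory.
for chunk in pd.read_csv("data.csv", chunksize=4):
    total += chunk["x"].sum()   # per-chunk "map" + partial "reduce"
    count += len(chunk)

mean = total / count            # final "reduce" across chunks
print(mean)                     # 4.5 for the values 0..9
```

This only covers reductions that can be expressed as running accumulators (sums, counts, min/max, etc.); anything requiring a global view, such as a sort or a median, needs a different approach.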