Working with small datasets in memory can be fast, but larger datasets consume more memory and CPU.
The purpose of a database is to store and retrieve data from disk efficiently, and to return only the data you actually need.
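As a minimal sketch of that point, assume a hypothetical SQLite file with an `orders` table; the table name, columns, and threshold here are illustrative, not from any real system. The difference is whether the filtering happens in the database or in application memory:

```python
import sqlite3

# Hypothetical example: an SQLite file with an `orders` table
# containing `id`, `customer_id`, and `total` columns.
conn = sqlite3.connect("example.db")

# Wasteful: pull the entire table into application memory, then filter there.
all_rows = conn.execute("SELECT id, total FROM orders").fetchall()
large_orders = [(order_id, total) for order_id, total in all_rows if total > 100]

# Better: let the database do the filtering and return only the rows you need.
large_orders = conn.execute(
    "SELECT id, total FROM orders WHERE total > ?", (100,)
).fetchall()

conn.close()
```

The second query keeps both the I/O and the working set proportional to the result you actually use, which is the whole point of pushing the work down to the database.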
Most database interactions also happen over a network, which is always slower than (and in addition to) disk retrieval. You should avoid this except when the data cannot fit on disk, or be worked with in memory, on the same system.
Services should not be split apart except for the same reasons (workloads that exceed a single system's computation or memory), because splitting them incurs the same costs (network and disk latency).
We used to perform billions of complex computations in seconds on single systems with single CPUs, reading from slow disks on slow machines with far smaller memory footprints.
This article is a parable about the consequences of not understanding that.
The white papers from Google, Amazon, Netflix, and the like describe organizations building solutions for applications whose data cannot fit on disk or in memory, or whose computation cannot be handled by a single system. They do not apply to 99% of the application architectures that adopt them, including many developed by Google, Amazon, and the rest.