
Why is this being downvoted? Seems like a fair counter-point to me.


I didn't downvote it, but apart from the fact that async IO is not meant to be faster (it's all about throughput, after all), the benchmark is flawed, and it's been discussed in full before: https://news.ycombinator.com/item?id=23496994


asyncio is meant to be "faster" for IO-heavy tasks with low compute. The benchmark tests requests per second, which is indeed directly testing what you'd expect it to test.

It's been discussed before, but the outcome of that discussion (in the link you brought up) was highly divided. There was no conclusion, and it is not clear whether the benchmark was flawed.

The discussion is also littered with people who don't understand why async is fast only for certain types of workloads and slow for others, and with assumptions that the test focused on compute rather than IO, which is evidently not the case.
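
To illustrate the distinction, here's a minimal sketch (the task sizes and timings are mine, purely illustrative): on one event loop, 100 IO-bound waits overlap and finish in about a second, while CPU-bound work gains nothing from async.

    import asyncio
    import time

    async def io_task():
        # Stand-in for a network call: the coroutine suspends while "waiting".
        await asyncio.sleep(1)

    def cpu_task():
        # Stand-in for compute: holds the GIL, nothing to await on.
        return sum(i * i for i in range(5_000_000))

    async def main():
        start = time.perf_counter()
        await asyncio.gather(*(io_task() for _ in range(100)))
        # Roughly 1 second total, not 100: the waits overlap on one event loop.
        print(f"100 IO-bound tasks: {time.perf_counter() - start:.2f}s")

        start = time.perf_counter()
        for _ in range(4):
            cpu_task()  # async buys nothing here; each call runs to completion
        print(f"4 CPU-bound tasks: {time.perf_counter() - start:.2f}s")

    asyncio.run(main())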


> asyncio is meant to be "faster" for IO-heavy tasks with low compute.

The point is that it's not meant to be any faster than a parallel pool of processes performing the same heavy IO without blocking all requesting clients. asyncio is about packing as many concurrent socket interactions into a single process as possible, i.e. optimising for throughput by giving up the speed that gets eaten up by task context switching. Hence the flaws in the benchmark: it was run on the same machine where Postgres was operating; it used a different number of processes for sync and async workloads; and the connection pool was not set up to prevent blocking when a coroutine tries to acquire a connection from an exhausted pool (for benchmark purposes it should have no upper bound and should be pre-populated with already established connections).
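
For the record, what I mean by "pre-populated and effectively unbounded" would look roughly like this with asyncpg (the pool size and DSN here are placeholders of mine, not the benchmark's actual values):

    import asyncpg

    async def make_pool(dsn: str, concurrency: int) -> asyncpg.Pool:
        # asyncpg has no truly unbounded pool, but sizing min_size == max_size
        # at (or above) the expected concurrency pre-establishes every
        # connection up front, so no coroutine ever waits in pool.acquire().
        return await asyncpg.create_pool(dsn, min_size=concurrency,
                                         max_size=concurrency)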


> The benchmark used a different number of processes for sync and async workloads.

Wrong. The worker counts are the same. See the chart with the benchmark results: http://calpaterson.com/async-python-is-not-faster.html

> The benchmark was run on the same machine where Postgres was operating.

This wouldn't affect the variance between sync and async results very much because both frameworks were run on the same machine.

> the connection pool was not set up to prevent blocking when a coroutine tries to acquire a connection from an exhausted pool.

Real-world connection pools have an upper bound. I don't see why setting one to be closer to reality is not a good test.

Also, you're completely wrong about the connection pool blocking when it is exhausted. See the source code:

https://github.com/calpaterson/python-web-perf/blob/master/a...

If all connections are exhausted, the coroutine yields, so the process can still handle compute and incoming requests.
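
For reference, a rough sketch of that behaviour with an asyncpg pool (the DSN and pool sizes are illustrative): acquiring from an exhausted pool awaits instead of blocking the process, so other coroutines keep running.

    import asyncio
    import asyncpg

    async def handler(pool: asyncpg.Pool, i: int) -> int:
        # If the pool is exhausted, this await parks the coroutine; the
        # event loop keeps serving other requests in the meantime.
        async with pool.acquire() as conn:
            return await conn.fetchval("SELECT $1::int", i)

    async def main():
        pool = await asyncpg.create_pool("postgresql://localhost/test",
                                         min_size=2, max_size=2)
        # 20 concurrent handlers share 2 connections: none of them blocks
        # the process, they just queue up on acquire().
        print(await asyncio.gather(*(handler(pool, i) for i in range(20))))

    asyncio.run(main())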

> (for benchmark purposes it should have no upper bound and should be pre-populated with already established connections).

Disagree. The real world sets an upper bound. There's nothing wrong with simulating this in a test.


> Wrong. The worker counts are the same.

I see that aiohttp has 5, uwsgi has 16, and gunicorn has 12, 14, or 16 depending on the web framework. Is this your definition of "the same"?

The author says:

> The rule I used for deciding on what the optimal number of worker processes was is simple: for each framework I started at a single worker and increased the worker count successively until performance got worse.

That's not how a benchmark is supposed to be conducted. One doesn't fit the worker count to whatever result one finds "optimal"; one uses the same number of workers, finds the bottlenecks, and either eliminates them or explains why they cannot be eliminated without affecting the benchmark invariants.

> this wouldn't affect the variance between sync and async results very much because both frameworks were run on the same machine.

It will affect the variance. Firstly, the DB will spawn processes on the same machine, pgbouncer will spawn processes on the same machine, and they will all compete for the same CPUs, where the order of preemptive context switches affects individual benchmark runs differently. On top of that, there are periodic and expensive WAL checkpoints, and fsync competes with the benchmark for kernel system calls, interrupts, and context switches, so the multi-process worker setup may be affected dramatically. If you don't believe that external processes can affect the numbers to the point where they become incomparable, try surfing the Internet in your browser at random while running a benchmark.

> Real-world connection pools have an upper bound. I don't see why setting one to be closer to reality is not a good test.

Because benchmarks are not real-world workloads: they are designed to show the unbounded performance of the implementation detail selected for the test, with external resources treated as non-exhaustible so as to avoid side effects external to the functionality being tested.

> Also, you're completely wrong about the connection pool blocking when it is exhausted. See the source code:

> If all connections are exhausted, the coroutine yields, so the process can still handle compute and incoming requests.

I didn't say that it wouldn't yield. I said that the coroutine will be held up at the point where it tries to acquire a connection that doesn't exist in the pool, which affects the benchmark. Now, instead of one suspension at the network socket call that queries Postgres, the coroutine will yield and wait twice: at the exhausted connection pool, and at the network socket call after the connection is acquired. This is exactly why resources should be unbounded, why the DB should be on a separate machine (unbounded spawning of connection processes upon request), and why the number of OS workers should be the same in all benchmarks: the sync version will also block twice, and the consequences of blocking there are much more dramatic and different than in the async case, *which is the point* of a proper benchmark: https://github.com/calpaterson/python-web-perf/blob/master/s...
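
A simplified sketch of the two suspension points I'm describing (the handler and query are illustrative, not the benchmark's actual code):

    async def handle_request(pool):
        # Wait #1: with a bounded, exhausted pool the coroutine is parked here
        # until some other request releases a connection.
        async with pool.acquire() as conn:
            # Wait #2: the usual suspension on the socket round-trip to Postgres.
            return await conn.fetchrow("SELECT 1")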



