for each setting, we need to remember the trade-offs
in this particular case, the trade-off is increased recovery time after a potential crash
it is possible to conduct another benchmark that measures this recovery time, showing how it grows (in the "avg" case -- "normal" TPS; in the "worst"* case -- increased TPS, e.g. a massive UPDATE with a random IO access pattern), collect another set of interesting data, and then combine the two sets to support the decision
as a result, we will end up with something like: we decided to use 32 GiB for max_wal_size and 15 min for checkpoint_timeout, and we know that if we crash in the worst* case, the DB will need up to 10 minutes to recover; the benefit is that we have much less disk IO stress during massive writes.
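a minimal sketch of what that decision looks like in postgresql.conf (the values here are just the example numbers from above, not a recommendation; checkpoint_completion_target is included as it is commonly tuned alongside these two):

```
# stretch checkpoints out: fewer, less bursty, at the cost of longer recovery
max_wal_size = '32GB'
checkpoint_timeout = '15min'
# spread checkpoint writes over most of the interval to smooth disk IO
checkpoint_completion_target = 0.9
```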
___
*) "worst" here has two components that show up when writes are massive with a random IO pattern of block writes:
- excessive writes from frequent checkpoints
- additionally, more WAL needs to be written due to full_page_writes=on (if enabled), since each checkpoint forces a fresh full-page image for every block touched afterwards
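the "up to 10 minutes to recover" figure above can be sanity-checked with a back-of-envelope model: in the worst case, roughly max_wal_size of WAL must be replayed, at a replay rate bounded by the random IO speed of the data volume. a minimal sketch (the 60 MiB/s replay rate is a hypothetical assumption -- measure your own via a crash-recovery benchmark):

```python
# Back-of-envelope estimate of worst-case crash recovery time:
# WAL volume to replay divided by replay throughput.

def estimate_recovery_seconds(max_wal_size_gib: float,
                              replay_mib_per_sec: float) -> float:
    """Upper-bound recovery time in seconds."""
    wal_mib = max_wal_size_gib * 1024  # GiB -> MiB
    return wal_mib / replay_mib_per_sec

# e.g. 32 GiB of WAL replayed at ~60 MiB/s (assumed random-IO-bound rate)
minutes = estimate_recovery_seconds(32, 60) / 60
print(round(minutes, 1))  # -> 9.1
```

this is only a first approximation (it ignores that the WAL distance at crash time may be well below max_wal_size, and that full-page images replay faster than random single-row changes), which is exactly why the real benchmark described above is still worth running.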