Hacker News | jdreaver's comments

https://en.wikipedia.org/wiki/ReStructuredText

This format really took off in the Python community in the 2000s for documentation. The Linux kernel has used it for its documentation for a while now as well.


I recently discovered that `perf` itself can spit out flamegraphs. My workflow has been:

    $ perf record -g -F 99 ./my-program
    $ perf script report flamegraph

You can also run `perf script -F +pid > out.perf` and then open `out.perf` in Firefox's built-in profile viewer (which is super neat): https://profiler.firefox.com


There are plausible scenarios where a region can go down for days or more at a time, like natural disasters. I'm not terribly worried about a region going away _forever_, but during a regional outage long enough to start losing business, having data in multiple regions is important so you can restore in another region (if you aren't able to fail over quickly).


You are correct that storage is cheaper in S3, but S3 charges per request to GET, LIST, POST, COPY, etc. the objects in your bucket. Block storage can be cheaper when you are frequently modifying or querying your data.


That's a lot of requests.


It is, but it's not _that_ many. AWS pricing is complicated, but for fairly standard services and assuming bulk discounts at the ~100 TB level, your break-even points for requests/network vs. storage happen at:

1. (modifications) 4200 requests per GB stored per month

2. (bandwidth) Updating each byte more than once every 70 days

You'll hit the break-even sooner, typically, since you incur both bandwidth and request charges.

That might sound like a lot, but updating some byte in each 250KB chunk of your data once a month isn't that hard to imagine. Say each user has 1KB of data, 1% are active each month, and you record login data. You'll have 2.5x the break-even request count and pay 2.5x more for requests than storage, and that's only considering the mutations, not the accesses.
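A rough sketch of that arithmetic, using assumed pricing in the ballpark of S3 Standard with a bulk discount (~$0.021 per GB-month of storage, $0.005 per 1,000 mutating requests); the exact numbers will differ by region and tier:

```python
# Assumed pricing (not exact AWS figures): bulk-discounted storage
# and the standard PUT/COPY/POST request rate.
storage_per_gb_month = 0.021     # $/GB-month
cost_per_request = 0.005 / 1000  # $/request ($5 per million)

# Requests per GB per month at which request cost equals storage cost.
break_even = storage_per_gb_month / cost_per_request  # ~4200

# The login example: 1 KB per user means ~1M users per GB stored;
# 1% are active each month, one write per active user.
writes_per_gb_month = 1_000_000 * 0.01  # 10,000 requests/GB/month

ratio = writes_per_gb_month / break_even  # roughly the 2.5x above
print(round(break_even), round(ratio, 1))
```

Note that 1 GB / 250 KB is about 4,200 chunks, which is where the "one update per 250KB chunk per month" framing comes from.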

You can reduce request costs (though not bandwidth) by batching requests, but that isn't tenable until you reach a certain scale because of latency, and even when it is, you might find that user satisfaction and retention are more expensive than the extra requests you're trying to avoid. Batching is a tool for reducing costs on offline workloads.


Ok, there are definitely cases where it would be more expensive, like using it for user login data.

But for metrics, like you would use for prometheus:

- Data is essentially append-only. There usually isn't any reason to modify metrics after you have recorded them.

- The bulk of your data isn't going to be used very often. It will probably be processed by monitors/alerts, and maybe the most recent data will be shown in a dashboard (and that data could be cached on disk or in memory). But most of it is just going to sit there until you need it for an ad-hoc query, and you should probably have an index to reduce how much data you need to read for those.

- This metrics data is very amenable to batching. You do probably want to make recent data available from memory or disk for alerts, dashboards, queries, etc. But for longer term storage it is very reasonable to use chunks of at least several megabytes. If your metrics volume is low enough that you have to use tiny objects, then you probably aren't storing enough to be worried about the cost anyway.
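A minimal sketch of that batching idea: buffer samples in memory and flush only once the batch reaches a few megabytes, so per-request costs amortize over large chunks. The `upload_chunk` callback here is a placeholder for whatever object-store client call you use (e.g. an S3 `put_object`), not a real API:

```python
FLUSH_BYTES = 8 * 1024 * 1024  # flush in ~8 MB chunks

class MetricsBuffer:
    """Accumulates serialized metric samples and flushes in large chunks."""

    def __init__(self, upload_chunk):
        self.upload_chunk = upload_chunk  # callable taking a bytes payload
        self.buf = bytearray()

    def record(self, line: bytes):
        self.buf += line
        if len(self.buf) >= FLUSH_BYTES:
            self.flush()

    def flush(self):
        if self.buf:
            self.upload_chunk(bytes(self.buf))
            self.buf = bytearray()
```

In practice you'd also flush on a timer so a quiet period doesn't strand data in memory indefinitely.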


Huge +1 to this, but I would also add walking _at least_ 8000 steps per day. I still had some minor, nagging pain until I started walking more. Turns out humans are not meant to sit all day!

I can highly recommend a book called _Built to Move_ [0]. It tells you to do a lot of things that many people consider common sense, like walk every day, eat vegetables, sleep 8 hours, etc. However, it also explains _why_ to do these things pretty concisely. The most impactful argument it made to me was that you can't counteract sitting for 12 hours a day with any amount of exercise. You have to sit less and move around more.

[0] https://thereadystate.com/built-to-move/


It appears that the problem is not sitting too much, but rather sitting in chairs specifically. Apparently, hunter-gatherer peoples also spend about 10 hours a day sitting. But they sit on the ground. Or kneel or squat. And they don't have the issues we get from sitting too much:

https://www.pnas.org/doi/10.1073/pnas.1911868117

So... the end-game of ergonomic chairs might be no chair at all.


Given the tables in the results section, it would seem that the people in the study don't have long periods where they don't move. "average sedentary bout lengths" hover between 15 and 20 minutes.

So the problem with "sitting in chairs specifically" is probably not the chair, but the fact that the chair facilitates longer "sedentary bout lengths". If this is correct, then the commenter suggesting to get up and move every so often is probably on point.


Makes sense. That said, fidgeting and moving around is spontaneous when on the floor, you don’t have to be reminded to do it. Also, no chair is cheaper than an expensive chair.


> fidgeting and moving around is spontaneous when on the floor, you don’t have to be reminded to do it

Indeed, it's actually what prompted me to go look over the document.

I remember, as a kid, when out and about and before getting into the habit of sitting in a chair all day every day, I would sit on the floor or on random objects, like stones or tree trunks in the countryside. I wouldn't be able to sit still for long periods of time and would need to at least change positions.

Whereas now, in my "ergonomic chair", I can sit for more than one hour at a time with minimal, if any, changes in position. Ditto for my couch (which wasn't marketed as "ergonomic" in any way).

That being said, I've tried using a computer in other positions, like putting the laptop on a coffee table and squatting or sitting on the floor in front of it, or having it rest on my thighs while squatting. It gets tiring very quickly, especially in the shoulders and neck area.

So, in my case, what seems to work best is to get up regularly and walk around the room for a bit.


Most of my company uses VS Code or IntelliJ IDEs, which are officially supported by our developer productivity team, but many of us use Emacs and Vim (I'm an Emacs user). I spend most of my time in Go, C, Rust, and the plethora of "infrastructure"-related languages like Puppet, YAML, Starlark, Python, bash, SQL, etc. I also sometimes use more of the common languages in our company's stack like Ruby, Java, and Python.

My experience using Emacs at work for the past 15 years has been outstanding. I find that when I join a new company, there is sometimes a bit of legwork getting Emacs working with potentially bespoke SSH, tooling, or VPN configs (for remote development), but once it works I don't touch it. I touch a lot of languages at work, including more I didn't mention above, and not having to leave Emacs to learn a new tool is a huge boon to productivity. I get all the niceties of an IDE via LSP and some other Emacs packages, including autocomplete, code navigation, Github Copilot, and more.

I don't ever tell anyone they _should_ learn Emacs at work, but once in a while someone sees me use it while screen sharing and they get interested.


The Elements of Computing Systems: Building a Modern Computer from First Principles [0] [1]

Easily one of the most interesting and engaging textbooks I've read in my entire life. I remember barely doing any work for my day job while I powered through this book for a couple weeks.

Also, another +1 to Operating Systems: Three Easy Pieces [2], which was mentioned in this thread. I read this one cover to cover.

Lastly, Statistical Rethinking [3] really did change the way I think about statistics.

[0] https://www.nand2tetris.org/

[1] https://www.amazon.com/Elements-Computing-Systems-second-Pri...

[2] https://pages.cs.wisc.edu/~remzi/OSTEP/

[3] https://xcelab.net/rm/statistical-rethinking/


Would Statistical Rethinking help me interpret web app metrics? E.g. if I have a canary out and the response times are longer after x requests, is that significant?


I've found that Statistics is one of those topics that changes your world view about everything. You can consider pretty much any issue statistically, and that will enrich your perspective significantly. In that sense, Statistical Rethinking will help. However, it's a book on Bayesian stats, it's quite dense, and examples are coded in R. It may be overkill for web app metrics interpretation. For that you may be better served with basic stats & inference, frequencies, descriptive statistics, percentiles, basic distributions, data visualization (e.g., trend lines, scatter plots, boxplots, histograms), etc.

To be clear though, Statistical Rethinking is a beautiful piece of work. You can check out the author's lectures[0] and see how much they suit your needs.

[0] https://www.youtube.com/playlist?list=PLDcUM9US4XdPz-KxHM4XH...


The book is a bit more foundational than that. It teaches you about Bayesian statistics, and discusses (among other things) why the concept of binary yes/no statistical significance is usually not the best way of evaluating a hypothesis with data.

However, for your question specifically, the choice of prior matters less when you have lots of data, and presumably a web app seeing hundreds or thousands of requests per second can gather enough data within a few seconds to determine whether the canary has a different latency profile than the deployed version. Also, presumably you would use an uninformative prior for a test like that. If I were trying to prevent latency regressions in an automated deployment pipeline, I would just compare latency samples after 1 minute with a t-test or something similarly simple.
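To make the t-test idea concrete, here is a toy sketch using only the standard library. The latency samples are synthetic (Gaussian, in milliseconds); in a real pipeline you'd collect per-request samples from each version:

```python
import math
import random
import statistics

random.seed(42)
baseline = [random.gauss(100, 10) for _ in range(500)]  # deployed version
canary   = [random.gauss(105, 10) for _ in range(500)]  # ~5 ms slower

def welch_t(a, b):
    """Welch's t statistic: mean difference over the pooled standard error."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.mean(b) - statistics.mean(a)) / se

t = welch_t(baseline, canary)
# With hundreds of samples, |t| well above ~3 is a strong signal that
# the canary's latency distribution has shifted.
```

In practice you'd compare against a p-value threshold (e.g. via `scipy.stats.ttest_ind` with `equal_var=False`), and you may want to compare tail percentiles too, since latency distributions are rarely Gaussian.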


I agree with basically everything you are saying, except I think a sprinkle of rote memorization can go a long way in some domains. Whenever I read a math book, memorizing definitions and some key theorems helps me apply them in problems. With programming, however, I tend to do zero rote memorization.


That's probably so you can rotate your keys without downtime.


> RDS is slow. I've seen select statements take 20 minutes on RDS which take a few seconds on _much_ cheaper baremetal.

I'm sure you observed this, but concluding that RDS is slow as a blanket statement is totally wrong. You must have had different database settings between the two Postgres instances to see a difference like that. A three-orders-of-magnitude performance gap indicates something is off in the comparison.


You could easily observe this with a cache-cold query performing lots of random IO. EBS latency is on the order of milliseconds, while even cheap bare metal nowadays is microseconds.


Also, RDS caps out around 20k IOPS. You can hit 1 million IOPS on a large machine with a bunch of SSDs. Imagine running 50 RDS databases instead of 1.

It's a huge bummer that EBS is the only durable block storage in AWS, since the performance is so bad. Has anyone had luck using instance storage? The AWS white papers make it seem like you could lose data there for any number of reasons, but the performance is so much better. Maybe a synchronous replica in a different AZ?


I've used Aurora and the IO is much better there than on vanilla RDS. Aurora Postgres is basically a fork of Postgres with a totally different storage system. There are some neat re:Invent talks on it if you are interested.


We use Aurora, actually. It's a lot more scalable, but also pretty expensive. The IO layer is multi-tenant, and unfortunately when it goes wrong, you have no idea why and no recourse. I don't think I've ever had a positive experience with AWS support about it either. We've had IO latency go from <2ms to >10ms and completely destroy throughput. Support tells us to try optimizing our queries like we are idiots.

