I'm looking for resources/tips on how to learn system design concepts: backend engineering and the various components that go into building large-scale systems (load balancing, message queues, CDNs, etc.). I'd also like to learn data concepts like database sharding, non-relational database models, streaming/batch processing (lambda architecture, kappa architecture, etc.) and more.
In terms of what I've done so far...
I've read
- Designing Data-Intensive Applications (https://dataintensive.net/) - This book has been recommended a ton on HN/Twitter/Reddit, and I found it extremely useful for getting a better idea of how data systems work and the different types (OLAP vs. OLTP, data warehouse vs. lake, etc.), and for getting a better understanding of distributed systems.
- Quastor (https://www.quastor.org/) - This is a newsletter that sends out summaries of system design blog posts from Big Tech engineering blogs. It's been helpful to me for getting an understanding of the different components involved in the tech stacks at various companies and how companies think about building scalable systems.
- MIT Distributed Systems (https://www.youtube.com/playlist?list=PLrw6a1wE39_tb2fErI4-W...) - This is a series of lectures by Robert Morris (co-founder of YC) on distributed systems and their properties. Each lecture picks a specific tool/technology (Google File System, ZooKeeper, Apache Spark, etc.) and discusses it. I've really enjoyed reading the papers and watching the lectures.
I'd really love any recommendations from the HN community on how else I should be learning about system design.
Thank you very much!
(It is pretty hands on.)
After you are done with the initial learning, find an academic machine learning Discord or something similar. There will always be people there who will be very happy to find someone to clean their data. It's a great way of getting hands-on with data engineering.
System design is best learnt through fires.
A good angle of attack is: pick a certification like AWS SAA or an equivalent (AWS, Azure, Google Cloud. Doesn't really matter. Just pick a mid-level certification). Then do the labs. They will quickly point out the holes in your knowledge/understanding. The free tier will take care of your needs, and cloud providers most of the time forgive the first surprise bill (it happens to everyone).
Soon data engineering and systems design merge anyways. This path is like a cheat-code for forcing convergence. Otherwise, the mind will quickly forget most theory.
---
Adding to this:
When you get hands on, you will find how the mind lies to you about how much you know. Mind maps are a nice way to reliably detect holes in understanding. Sit down with paper & pen once a week or so, and make a big map of everything you know.