Ask HN: Resources on learning System Design (back end/data engineering)?
116 points by LewisVerstappen on July 7, 2022 | 19 comments
I'm looking for resources/tips on how to learn concepts in system design. Things around backend engineering and all the various components that go into building large scale systems (load balancing, message queues, CDNs, etc.). I'd also like to learn concepts around data like database sharding, non-relational database models, streaming/batch (lambda architecture, kappa architecture, etc.) and more.

In terms of what I've done so far...

I've read

- Designing Data Intensive Applications (https://dataintensive.net/) - This book has been recommended a ton on HN/twitter/reddit and I found it extremely useful to get a better idea of how data systems work, different types (OLAP vs. OLTP, data warehouse vs. lake, etc.) and get a better understanding of distributed systems.

- Quastor (https://www.quastor.org/) - This is a newsletter that sends out summaries of system design blog posts from Big Tech engineering blogs. It's been helpful to me for getting an understanding of the different components involved in the tech stacks at various companies and how companies think about building scalable systems.

- MIT Distributed Systems (https://www.youtube.com/playlist?list=PLrw6a1wE39_tb2fErI4-W...) - This is a series of lectures by Robert Morris (co-founder of YC) on distributed systems and their properties. Each lecture picks a specific tool/technology (Google File System, ZooKeeper, Apache Spark, etc.) and then discusses it. I've really enjoyed reading the papers and watching the lectures.

I'd really love any recommendations from the HN community on how else I should be learning about system design.

Thank you very much!



Beyond the theory, if you are having trouble getting your hands dirty, here is the path I took for data engineering:

(It is pretty hands on.)

After you are done with the initial learning, find an academic machine learning Discord or something similar. There will always be people there who will be very happy to find someone to clean their data. It's a great way of getting hands-on with data engineering.

System design is best learnt through fires.

A good angle of attack is: pick a mid-level certification like AWS SAA or an equivalent (AWS, Azure, Google Cloud; it doesn't really matter which). Then do the labs. They will quickly point out the holes in your knowledge and understanding. The free tier will take care of your needs, and cloud providers usually forgive the first surprise bill (it happens to everyone).

Soon data engineering and systems design merge anyways. This path is like a cheat-code for forcing convergence. Otherwise, the mind will quickly forget most theory.

---

Adding to this:

When you get hands on, you will find how the mind lies to you about how much you know. Mind maps are a nice way to reliably detect holes in understanding. Sit down with paper & pen once a week or so, and make a big map of everything you know.


But why data engineering here?


Many modern systems are built around data. So, data engineering and systems engineering are very close.


It's mostly personal bias and circumstances.

Data engineering and systems design are incredibly close. After some time, they completely mesh together.

And from personal experience, the volume of knowledge needed to become practically effective is actually smaller on the data engineering side than on the general software development side.

I've found it easier to get data gigs and simultaneously strengthen the knowledge.

(I love databases, so that's also a factor)

The bonus $$$$ is also a nice side effect.


Really fantastic advice. Thank you very much!


You're welcome.

Also, keep in mind that the first month could be incredibly frustrating. (It was for me.)

Once you understand the industry meta but have not yet learned the tools, it may turn into an exercise in frustration management.

But it is also a lot of fun.

All the very best.


I like the series "Software Architecture Monday" by Mark Richards

https://www.youtube.com/playlist?list=PLdsOZAx8I5umhnn5LLTNJ...

Currently 140 videos, each a 5-to-15-minute, bite-sized introduction to a different software architecture topic, from specific patterns to more conceptual and management things. I especially like that it is pretty much completely free of the usual YouTube clickbait nonsense à la "If you DON'T use X NOW, you are DOOMED!!!"


Here are a few more:

1. http://highscalability.com/

2. https://www.infoq.com/architecture-design/presentations/

3. https://blog.bytebytego.com/ - the book by the same author is a good starting point too

4. https://gist.github.com/vasanthk/485d1c25737e8e72759f

A general search on Google and YouTube will also get you a ton of material.


Along with Designing Data Intensive Applications, this is my favorite resource:

https://github.com/donnemartin/system-design-primer


You've read about most of the concepts if you've gone through that material.

> How else I should be learning about system design.

I'd recommend putting that reading to use and building some kind of architecture-astronaut system with it to see if you truly understand it.


This guy is the only one on YouTube I like. The rest often take the answer for granted.

https://www.youtube.com/c/SystemDesignInterview


Yes, this guy is awesome, but has stopped adding content.


DDIA is a good book and I've read it at least 3 times, but there is no substitute for real world experience.

I also think a lot of system design material is oddly positioned in that it caters towards synthetic interview scenarios where you are designing mega scale systems but somehow are also the single person responsible for all of load balancing, CDNs, message queuing, databases etc ...

In the real world it doesn't really work that way for product engineers.

It's good to know about that stuff, but there are lots of other topics that would have higher ROI, like learning how to plan and sequence work properly, how to reduce different kinds of risk during build and deploy, how to migrate complex, existing systems towards better architectures without disrupting the business, how to actually become someone who gets to design systems ...


Are there any resources you’d recommend for learning the high ROI items you mentioned?


I find the book System Design Interview by Alex Xu very helpful.

It's focused on passing System Design Interviews, but I found it to be a great primer on System Design in general.

He also has a new website to teach you System Design Fundamentals at ByteByteGo[0].

--

[0]: https://bytebytego.com/


I think the next step after reading that material is to just get experience in it.

Consider applying to some software companies that are massive in scale. AWS, GCloud, Azure, Netflix. People whose systems need good design.

Once you get into a place like that, go after positions on teams that own huge systems, and get mentorship.


Small plug for what I am working on: https://architecturenotes.co, which is designed to help with what you are trying to learn.


https://dataintensive.net/ covers a huge number of tradeoffs involved in handling data and is my top technical book recommendation. Glad to see it on your list.




