"Small Twitter clone" can be taken two ways, and this one, I think, is the less interesting of the two. (Not to denigrate the project for what it is; it's a good idea.)
On a one-machine scale, Twitter can be a single Postgres table, or a single ElasticSearch index. At the scale Twitter actually runs at, it's a whole lot more.
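To make the "single table" point concrete, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for Postgres so it runs anywhere; the table layout, the 280-character limit, and the helper names (`open_store`, `post`, `timeline`) are all my own assumptions, not anything from the tutorial.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the one table (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS tweets (
            id         INTEGER PRIMARY KEY AUTOINCREMENT,
            author     TEXT NOT NULL,
            body       TEXT NOT NULL CHECK (length(body) <= 280),
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    # One index is enough for "recent tweets by these authors".
    conn.execute(
        "CREATE INDEX IF NOT EXISTS tweets_author_id ON tweets (author, id)"
    )
    return conn


def post(conn, author, body):
    """Insert one tweet and return its auto-assigned id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO tweets (author, body) VALUES (?, ?)", (author, body)
        )
    return cur.lastrowid


def timeline(conn, following, limit=20):
    """Newest-first tweets from the given authors (the follow list is passed in)."""
    if not following:
        return []
    placeholders = ",".join("?" for _ in following)
    query = (
        "SELECT id, author, body FROM tweets "
        f"WHERE author IN ({placeholders}) ORDER BY id DESC LIMIT ?"
    )
    return conn.execute(query, (*following, limit)).fetchall()
```

Everything here, ordering by an auto-increment ID included, quietly assumes one writer on one machine, which is exactly the assumption that breaks as the scale goes up.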
I'd really love a tutorial—or possibly a whole book—that instead took you through setting up the sort of distributed system that is required to make a Twitter clone run at a Twitter-like scale. Either through IaaS APIs, or on your own with something like OpenStack DevStack.
It could probably start where this tutorial ends—with a one-node system running a Twitter-backend-alike monolith. And then each chapter would increase the scale, introduce a problem the scale causes, and then walk you through adding in an additional component: a message queue; a fragment cache; app-level health checks; a search indexing cluster; distributed logging + request tracing; geographic sharding; multi-master DB replication—in order to solve that scaling problem.
There would also be scale-points that would require changes in the business logic: making IDs globally unique and sortable à la https://github.com/twitter/snowflake; deprecating but keeping around old APIs as new ones are added; "Ball of Mud" refactorings; isolated Enterprise clusters of the app; etc.
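For the ID point, here is a rough sketch of a Snowflake-style generator. The bit layout (41-bit millisecond timestamp, 10-bit worker ID, 12-bit sequence) and the epoch constant follow my recollection of the Snowflake project, so treat them as assumptions; the class and parameter names are invented for illustration.

```python
import threading
import time

EPOCH_MS = 1288834974657  # custom epoch in ms (assumed; any fixed past instant works)
WORKER_BITS = 10          # up to 1024 workers
SEQUENCE_BITS = 12        # up to 4096 IDs per worker per millisecond
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    """Hypothetical generator: IDs are unique per worker and sort roughly by time."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock  # injectable so the layout can be tested
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Refuse to hand out IDs that could collide with earlier ones.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: spin until the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )
```

Because the timestamp sits in the high bits, sorting by ID approximates sorting by creation time across machines, with no coordination needed beyond giving each worker a distinct `worker_id`.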
Bonus points if later chapters actually go back and rip out solutions that were introduced in earlier chapters—not because they were mistakes, but just because they were right for 10^3/s but not 10^6/s. And bonus bonus points if they assume an SLA that requires that such switchovers occur without downtime.
I really like this idea! It has a very real world feel and I'm not sure there is another course that does what you're talking about.
I think the course is more advanced than my target audience could handle right now, but it might be a nice follow-up to a bigger web app course.
I'll definitely write this down as an idea for a future course. I'd like to create a series of courses that take people from zero to employable and I think that would be a nice addition. Thanks!