Until recently, rewriting an established open source project could take years. With LLMs, that's changed. We're rewriting CRIU in Zig, and expect it to be complete in months, not years.
We make Architect: a Kubernetes runtime that hibernates idle pods in place and wakes them in 50ms with TCP connections intact. Five engineers. You'd be the 6th.
If tracing an x86 instruction in the morning and hunting a control-plane race in the afternoon both sound fun, and you insist on measuring rather than guessing, this is the job.
Customers run Architect for workloads where cold starts hurt: real-time voice & video AI agents, long-warming JVM apps, stateful data services that can't be rescheduled cheaply. 1.0 shipped Q1 2026; you'd join mid-way through 2.0. Seed-stage, VC-funded, a few years of runway. Fully distributed across the Americas and Europe.
- Control plane: per-node DaemonSet streaming checkpoints; admission controller resizing hibernated pods in place.
- Networking and migration: eBPF/XDP at line rate; cross-node live migration to production; cross-cloud next.
You're a senior generalist. Years across the stack: assembly to frontends, hardware-near, comfortable in x86. Tests ship with the code, decisions get worked out in writing, and you measure rather than guess. Strong in Go; willing to use Zig, Rust, or C.
Bonus: eBPF/XDP, CRIU, Linux kernel internals, containerd, gVisor, live migration, or public writing in kernel/container/eBPF land. Strong systems depth and the willingness to pick up the rest is enough on its own.
After 15 months & more than 100 million requests served by our Phoenix + PostgreSQL app running on Fly.io, I would be hard-pressed to find a reason to complain.
- Some deploys failed, and re-running the pipeline fixed them.
- In early July 2023, 9k requests from Frankfurt returned 503s. The issue lasted 10 seconds.
- While experimenting with machines, after many creations & deletions, one volume could not be deleted. Next day, the volume was gone.
That's about it after 15 months of running production workloads on Fly.io.
I'm sorry to hear that many of you didn't have the best experience. I know that things will continue improving at Fly.io. My hope is that one day, all these hard times will make for great stories. This gives me hope: https://community.fly.io/t/reliability-its-not-great/11253
changelog.com used to be WordPress, then became a Phoenix app because it needed features that were hacky to implement & then manage in WP. It's more of a podcasting platform these days rather than a CMS.
My first Supermicro just turned 9 and it's still running strong, with a fresh install of Ubuntu 20.04 & k3s over the holidays. The second Supermicro turned 5, and has been running FreeBSD all this time like a champ. They are both loft guardians.
A bunch of bare metal hosts run on Scaleway / Online, and different VMs & managed services run in DigitalOcean, Linode, AWS & GCP. I sometimes spin up the odd bare metal instance on Equinix Metal (formerly Packet).
A diverse fleet means that there's always something new to learn and try out. A single large host would make me anxious, as no internet provider or power grid is 100% reliable and available. Also, software upgrades sometimes fail, and things get messed up all the time, which is when I find it most efficient to just start from scratch. A single host makes that less convenient.
Every approach has its pros and cons, which is why my main workstation is a 20 Xeon W with 64GB RAM & 1TB NVMe : ). Yes, there is a backup workstation which doubles up as a mobile one, meaning that it can work without power or hard internet for almost a day. Options are good ; )
> Does this imply there is a cloud abstract layer that should come
crossplane.io comes closest afaik
> And is k8s the simplest possible abstraction? And if not - what is?
If you are asking about the simplest possible abstraction for container scheduling and orchestration, then I believe Nomad from HashiCorp or Docker Swarm are simpler. As for managed solutions with wide adoption in all types of environments and the largest investment to date, I am not aware of anything on par with K8S.
We are both! I would also add lazy to that paradox. My surname is a letter off, and that's as close as it gets : )
The devil is in the details: there is more to it than dynamic & static content. We are using Fastly, otherwise we couldn't serve all the traffic that we do.
The best part is that it's all public - https://github.com/thechangelog/changelog.com - and we welcome contributions, especially those that simplify our setup without compromising on resiliency and availability. I'm looking forward to yours ; )
K8S is an API that the majority agrees on, which is rare. There is a lot of amazing tooling and a staggering amount of ongoing innovation, all built on solid concepts: declarative models, emitted metrics (the /proc equivalent, but with larger scope) and versioned infrastructure as data (a.k.a. GitOps).
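To make "declarative models" concrete, here is a minimal sketch of the idea, not how any real K8S controller is implemented: you describe the state you want, and a loop works out what has to change. The `reconcile` function and the resource names are made up for illustration.

```python
def reconcile(desired, observed):
    """Return the actions needed to move the observed state to the desired state.

    Both arguments map a (hypothetical) resource name to its spec.
    """
    actions = []
    for name, spec in desired.items():
        if name not in observed:
            actions.append(("create", name))
        elif observed[name] != spec:
            actions.append(("update", name))
    for name in observed:
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Desired state lives in version control; observed state comes from the cluster.
desired = {"web": {"replicas": 3}, "worker": {"replicas": 2}}
observed = {"web": {"replicas": 1}, "cron": {"replicas": 1}}

print(reconcile(desired, observed))
```

The point is that nobody writes the steps by hand. You change the desired state, and the same loop figures out the create, update and delete calls every time.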
For someone who is known as the King of Bash (self-proclaimed) - https://speakerdeck.com/gerhardlazu/how-to-write-good-bash-c... - and after a decade of Puppet, Chef, Ansible and oh wow that sweet bash https://github.com/gerhard/deliver - even if all my workstations and work servers (yup, all running k3s) are provisioned with Make (bash++), I still think that K8S is the better approach to running production infrastructure. Using simple and well-defined components (e.g. external-dns, ingress-nginx, prometheus-operator etc.) that adhere to a universal API, and are maintained by many smart people all around the world, is a better proposition than scripting in my opinion.
At the end of the day, I'm in it for the shared mindset, great conversations and a genuine desire to do better, which I had not seen before K8S & the wider CNCF. I will go out on a limb here and assume that I love scripting just as much as you do, but go beyond this aspect and you will discover that there's more to it than "thin install scripts that deploy containers" (which are not just glorified jails or unikernels).
I think you've hit the nail on the head - the point is not just Kubernetes, it's that you can build standard infrastructure on top. Any software can (in theory) be set up with a Helm chart and configured in a standard way through YAML ConfigMaps, rather than some esoteric config files or scripts which are different for every piece of software.
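As a rough illustration of "configured in a standard way", here is a small sketch that builds a ConfigMap manifest. It emits JSON rather than YAML only to stay within the Python standard library (kubectl accepts either). The `config_map` helper, the `app-config` name and the keys are all made up for the example.

```python
import json


def config_map(name, namespace, data):
    """Build a ConfigMap manifest as plain data.

    ConfigMap values must be strings, so everything is converted with str().
    """
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": {key: str(value) for key, value in data.items()},
    }


# Hypothetical settings for some app; the same shape works for any software.
manifest = config_map("app-config", "default", {"LOG_LEVEL": "info", "PORT": 4000})

print(json.dumps(manifest, indent=2))
```

Whatever the software is, its configuration ends up in the same envelope (`apiVersion`, `kind`, `metadata`, `data`), so the same tooling can diff, version and apply it.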