We've moved a lot of services into Kubernetes and broken things up into smaller and smaller micro-services. It definitely eliminates a lot of the complexity for developers ... but you trade it for operational complexity (e.g. routing, security, mismatched client/server versions, resiliency when a dependency isn't responding). I still believe that overall software quality is higher with micro-services (our Swagger documents serve as living ICDs), but don't kid yourself that you're going to save development time. And don't fall into the trap of shrinking your micro-services too small.
The big trade-off is the ability to rewrite a large part of the system if a business pivot is needed. That was the bane of the previous company I worked at: engineering and operations were top notch, but unfortunately it was done too soon, and it killed the company because it could not adjust to a moving market (i.e. customer and sales feedback was ignored because a lot of new features would have required an architecture change that was daunting). It was very optimized for use cases that were becoming irrelevant. In my small startup, where product-market fit is still moving, I always thank myself that everything is under-engineered in a monolith when signing a big client that asks for adjustments.
Unique storage for multiple services sounds like a recipe for disaster.
The purpose of splitting services, or at least one of them, is to decouple parts of the code at a fundamental level, including storage and overall ownership thereof.
You're probably better served with a modular monolith if you really can't break storage up.
No, only one service is reading/writing; everything else just calls that. Still, things get quite lost when it involves talking to multiple other teams and needing to keep everything in sync.
Ok, but then what's the point of splitting it in the first place?
The way I see it is to split your domain so that a team owns not only the code, but also the model, the data, the interface and the future vision of a small enough area.
If a service owns all the data, then someone who needs to make any change is bottlenecked by it and they would need knowledge beyond their domain.
So the key is defining the right domains (or domain boundaries). Unfortunately most people just split before thinking about the details of this process, so the split will sooner or later hit a wall of dependencies.
We need a synchronous workflow and then asynchronous workflows. That was the primary reason. That doesn't mean it must be split, but since we're running on multiple hosts anyway, it wasn't hard to split off the asynchronous functions into a separate batch service.
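A minimal sketch of that split, with illustrative names: the synchronous path validates and enqueues, while a batch worker (a thread here, a separate service on its own host in the setup described above) drains the queue asynchronously.

```python
# Sync/async split sketch. `handle_request` is the synchronous workflow;
# `batch_worker` stands in for the separate batch service. All names here
# are hypothetical placeholders, not from any real codebase.
import queue
import threading

jobs: "queue.Queue[dict]" = queue.Queue()
processed = []

def handle_request(payload: dict) -> dict:
    # Synchronous workflow: validate, enqueue, return immediately.
    if "id" not in payload:
        raise ValueError("missing id")
    jobs.put(payload)
    return {"accepted": payload["id"]}

def batch_worker():
    # Asynchronous workflow: drains the queue in the background.
    while True:
        job = jobs.get()
        if job is None:  # shutdown sentinel
            break
        processed.append(job["id"])
        jobs.task_done()

t = threading.Thread(target=batch_worker)
t.start()
handle_request({"id": 1})
handle_request({"id": 2})
jobs.put(None)  # tell the worker to stop
t.join()
print(processed)  # [1, 2]
```

In production the queue would be something durable (e.g. a message broker) rather than an in-process `queue.Queue`, which is exactly why the async half is easy to peel off onto another host.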
> And don't fall into the trap of shrinking your micro-services too small.
^ this
I think the naming decision of the concept has been detrimental to its interpretation. In reality, most of the time what we really want is "one or more reasonably-sized systems with well-enough-defined responsibility boundaries".
Perhaps "Service Right-Sizing" would steer people to better decisions. Alas, "Microservices" objectively sounds sexier.
> It definitely eliminates a lot of the complexity for developers
We're currently translating a 20 year old ~50MLOC codebase into a distributed monolith (using a variety of approaches that all approximate strangler). I have far less motivation to go to work if I know that I will be buried in the old monorepo. I can change, build, and ship a service change in less than an hour. Touching the monorepo is easily 1.5 days for a single change.
We seem to be gaining far more in terms of developer productivity than we are losing to operational overhead.
Sorry ... I should have said "don't kid yourself that you'll save time" instead of developer time. We do indeed have a faster change cycle on every service which is a win even if we're still burning (in general) the same number of hours over the whole system.
I also should have mentioned that it's definitely more pleasant for those in purely development roles. Troubleshooting, resiliency and system effects don't impact everyone (and I actually like those types of hard problems). I'd also suggest that integrating tracing, metrics, and logging in a consistent way is imperative. If you're on Kubernetes, using a proxy like Istio (Envoy) or Linkerd is a great way to get retries, backoff, etc. established without changing code.
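For a sense of what the mesh proxy is sparing you from, here's a sketch of the retry-with-exponential-backoff boilerplate you'd otherwise hand-roll in every client. The flaky upstream is simulated; all names are illustrative.

```python
# Client-side retry with exponential backoff plus jitter -- the kind of
# resiliency logic Istio/Envoy or Linkerd can apply at the proxy layer
# so the application code stays untouched.
import random
import time

def call_with_retries(fn, max_attempts=3, base_delay=0.1):
    """Retry fn() on ConnectionError, backing off exponentially."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of retries; let the caller degrade gracefully
            # Back off 0.1s, 0.2s, 0.4s, ... plus up to 100ms of jitter
            # so a herd of clients doesn't retry in lockstep.
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))

calls = {"n": 0}
def flaky_upstream():
    # Simulated dependency that fails twice, then recovers.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("upstream not responding")
    return "ok"

print(call_with_retries(flaky_upstream))  # prints "ok" on the third attempt
```

Multiply this by every service-to-service call and the appeal of pushing it into a sidecar is obvious.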
Finally, implementing a healthcheck endpoint on every service, and having failures properly degrade dependent services, is really helpful both in troubleshooting and ultimately in creating a UI with graceful degradation (toasts with messages about what's not currently available are great). I have great hopes for the healthcheck RFC that's being developed at https://github.com/inadarei/rfc-healthcheck.
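A minimal sketch of such an endpoint, loosely following the draft's `application/health+json` format (`"status"` is one of `"pass"`/`"warn"`/`"fail"`, with per-dependency detail under `"checks"`). The dependency check and check name here are hypothetical placeholders.

```python
# Healthcheck endpoint sketch in the spirit of the draft at
# https://github.com/inadarei/rfc-healthcheck. Field names follow the
# draft; "postgres:connections" is an invented example check.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def build_health() -> dict:
    db_ok = True  # placeholder: ping your real dependency here
    return {
        "status": "pass" if db_ok else "fail",
        "version": "1",
        "checks": {
            "postgres:connections": [{"status": "pass" if db_ok else "fail"}],
        },
    }

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        health = build_health()
        # 503 on failure lets Kubernetes probes and dependent services
        # react without parsing the body.
        self.send_response(200 if health["status"] == "pass" else 503)
        self.send_header("Content-Type", "application/health+json")
        self.end_headers()
        self.wfile.write(json.dumps(health).encode())

# To serve it:
# HTTPServer(("", 8080), HealthHandler).serve_forever()
```

Dependent services can then surface a `"fail"` from a downstream check as a degraded state rather than a hard error, which is what makes the graceful-degradation UI possible.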
That’s an encouraging story to hear. The thing I’ve noticed is that the costs of moving a poorly written monolith to a microservice architecture can be incredibly high. I also think that microservice design really needs to be thought through and scrutinized, because poorly designed microservices start to suck really quickly in terms of maintenance.
There's also a trade-off in how easy it is to understand the system. If you have one monolith, in most cases it's a single code base you can navigate through and understand exactly who calls whom and why.