
At Arist (YC S20) we've found that we can have our cake and eat it too by using Ruby on Jets, which is a nearly 100% drop-in replacement for Rails that runs on AWS Lambda. Our service sends messages to hundreds of thousands of people at scheduled moments in the day, and traffic is incredibly spiky, so combining the productivity of Rails and the Ruby ecosystem with the cost effectiveness and scalability of Lambda was a no-brainer. It also passed muster recently when we subjected our platform to a comprehensive penetration test.

We also get the benefits of a mono-repo and the benefits of microservices in the same application footprint, because every controller method is automatically deployed as its own independent lambda (this is core to how Jets works), yet we're still in the usual Rails mono-repo format we all know and love. The very tight integration between ApplicationJob and background lambdas has also been killer for us.
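
To make that concrete: a Jets app looks almost exactly like a Rails app. A rough sketch (class names here are illustrative, not our actual code; check the Jets docs for the exact API):

    # app/controllers/messages_controller.rb -- each public action below is
    # deployed as its own Lambda function when you run `jets deploy`.
    class MessagesController < ApplicationController
      def index
        render json: Message.all
      end
    end

    # app/jobs/send_scheduled_job.rb -- jobs deploy as background lambdas;
    # the rate/cron declaration becomes an EventBridge schedule.
    class SendScheduledJob < ApplicationJob
      rate "1 day"
      def deliver
        # ... send the day's scheduled messages ...
      end
    end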

One thing I've always said is the real difficulties in software development happen at the seams where services connect with other services. This sort of strategy (and particularly, the mono-repo format) minimizes the number of seams within your application, while still giving you the scalability and other benefits of microservices and serverless.



Is AWS Lambda really cost-effective? It has been many years since I was part of a team that assessed AWS Lambda as _workers_, but the resource limitations at the time, alongside the cost calculations, made PHP+VMs the cost-effective choice by orders of magnitude.


It's loved because it means your technical costs are always proportional to your actual computing needs, and your computing needs should (theoretically) always be proportional to your total company revenue.

From a business standpoint, that's a pretty great pitch.

Is it actually the most cost effective solution? No more so than any other tool. It depends on exactly what you're building, how, and how you measure the cost. AWS can be extremely costly, or cheap, depending on your engineering needs, constraints, and practices.


I have been at exactly one (web) company in my life where the cost of computing resources actually mattered, and compared to bandwidth, even that didn't matter.

The companies I have been at where computing costs did matter were doing extremely specialized, long-running calculations that are inappropriate for lambda.

I'm sure there are rare exceptions, but this all smells like premature optimization. Developer salaries are, 9.99 times out of ten, the cost you want to optimize.


Another thing this workflow really enables is that it frees you up to have unlimited, arbitrary staging environments. In our case we have a GitHub Action set up that creates a fully deployed version of the app for every PR automatically, and a bot that comments on the PR with a link to the deployment. You can of course do this with any kind of infrastructure, but only with a serverless sort of architecture is it virtually free to do this kind of thing. On the DB side, we allocate extremely tiny instances, so it is still quite affordable to have one running for each of our branches. They are automatically destroyed when the PR is closed, so the overhead is basically $5/month for each PR that is open for a whole month (in practice they are open for a few days at most).
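
The moving parts are pretty simple. A minimal sketch of the deploy step (the env var names and the JETS_ENV_EXTRA approach here are illustrative, not our exact GitHub Action):

    #!/usr/bin/env ruby
    # Called from CI with PR_NUMBER set; gives each PR its own stack.
    pr = ENV.fetch("PR_NUMBER")

    # Jets supports "extra" environments, which suffix the CloudFormation
    # stack name so per-PR deployments don't collide with each other.
    env = { "JETS_ENV" => "staging", "JETS_ENV_EXTRA" => pr }

    if ENV["PR_CLOSED"] == "true"
      # Tear the stack down when the PR closes (jets delete may prompt for
      # confirmation unless you pass its skip-confirmation flag).
      system(env, "jets delete") || abort("teardown failed")
    else
      system(env, "jets deploy") || abort("deploy failed")
    end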


This is an interesting use case, thanks for sharing. Although these days, setting up Kubernetes environment emulators is supported by most CI systems out of the box.


With a spiky workload it's almost always worth it, because we have to scale up 1000x and then 10 minutes later scale back down to 1x for the rest of the day, and this can happen at any time, randomly. You can also do this with load balancers, but in practice I've found them to be much slower to scale and much more costly for the same workload.

For reference, 90% of our bill is database related currently.


> 90% of our bill is database related currently.

Huh. With the context of the rest of the comment, I realize (the very obvious comparison) that a database engine designed to shard to many thousands of small workers could potentially be a very attractive future development path.

If, that is, the current trends in cloud computing (workers, lambdas, etc.) continue and some other fundamental shift doesn't come along and overtake them.

Which is probably (part of) the reason why this doesn't exist, since I think I've basically just described the P=NP of storage engineering :)


> Huh. With the context of the rest of the comment, I realize (the very obvious comparison) that a database engine designed to shard to many thousands of small workers could potentially be a very attractive future development path.

Yes. I've been patiently waiting for the database community to realize this for the last 5 years now :)


I have no delusions I'd be able to viably make a dent in the area (at least anytime soon), but I do wonder how this would actually work.

The optimal solution would of course be to shard both the compute I/O and the storage footprint, so each worker only needed to hold onto maybe 1-100MB of data.

Perhaps some existing (simpler) designs could be modified to "hyper-shard" the compute angle, but would still likely require carrying around a large percentage of the database.

In any case, you'd need an internal signalling fabric capable of (cost-effectively) handling very bursty many-to-many I/O across thousands of endpoints to make consensus work in realtime.

It would honestly be really interesting to see how something like this would work in practice.
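
For what it's worth, the routing half of that idea is easy to sketch; the hard parts are everything this toy ignores (replication, consensus, rebalancing):

    require "digest"

    # Toy illustration of "hyper-sharding": route each key to one of many
    # tiny workers, each owning only a ~1-100MB slice of the data.
    class ShardRouter
      def initialize(shard_count)
        @shard_count = shard_count
      end

      # Stable hash so a given key always lands on the same shard/worker.
      def shard_for(key)
        Digest::SHA256.hexdigest(key.to_s).to_i(16) % @shard_count
      end
    end

    router = ShardRouter.new(4_096)
    router.shard_for("user:12345")  # => index of the lambda-sized worker to invoke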


I have worked with many teams and found Lambda to be by far the more cost-effective option. Did your calculations include the time lost waiting to deploy solutions while infrastructure gets spun up, the payment for staff or developers spending time putting infrastructure together instead of building solutions, the time spent maintaining that infrastructure, or the cost of running servers at 2am when there is no traffic? Perhaps even the cost of running a fat relational database scaled for peak load that bills you by the hour, again even when there is no traffic?

Serverless as an architectural pattern is about more than just Lambda, and it tends to incorporate a multitude of managed services to remove the cost of management and configuration overhead. When you use IaC tools like the Serverless Framework that are built to help developers put together an application, as opposed to provisioning resources, you can get things up fast, pay only for usage, and scale amazingly.


Funnily enough, I have had almost the opposite experience. In my experience IaC and serverless bring all the troubles of DevOps to "regular" developers. Your plain vanilla mid-level full-stack dev now needs to understand not only a bunch about FE & BE code, but also a much bigger bunch about servers, networking, VPCs, etc. than in a non-serverless setup.

How do you resolve this in your projects? (Serious question).

This is such a big problem for some of the projects that they are now only able to hire senior developers (which brings its own set of problems).


But VPCs and networking and distributed computing aren't serverless. Serverless is using AWS Lambdas or GCP functions and not dealing with VPCs beyond having an endpoint to hit.

There's no getting around the networking and such - that's the "full" part of full-stack (FS) - it's more than simply FE+BE. Maybe call them distributed systems engineers instead.

What it sounds like, though, is that your organization (regardless of what we call the roles) is large enough to organize into FE, BE, and FS roles, with FS running the platform, owning the fleet, and working on the system itself so that FE and BE can work without having to know about the fleet - FS folks build internal tools that the rest of the org use to do their jobs, shielding them from the implementation details of your fleet.


Again, though, on my framework-ification point -- in our case we have a full VPC setup, and Jets actually allocates the VPC for us; we just configure it in our application.rb file. I'm sure the Serverless Framework has something similar. Either way, we have gotten away without a dedicated devops person (or persons) because of how much the framework does for you.
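
The config ends up being a few lines in application.rb, roughly this shape (the IDs are placeholders and the exact option names may vary by Jets version):

    # config/application.rb
    Jets.application.configure do
      # Attach every function to the VPC so lambdas can reach the database.
      config.function.vpc_config = {
        subnet_ids: ["subnet-aaa111", "subnet-bbb222"],  # placeholder IDs
        security_group_ids: ["sg-ccc333"]                # placeholder ID
      }
    end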


Yeah, framework-ification of this process has been the real differentiator in the last 5 years that has taken lambda from being obtuse and glitchy to work with to quite a joy if you just lean on your framework.

All of that said, I vastly prefer Google Cloud Functions personally and would switch in a heartbeat if it had capabilities like API Gateway, but it's not there yet.

I also regret that there isn't a better cross-cloud solution currently, but that's something I have a lot of open source ambitions about creating soon. I don't like serverless that much so stay tuned for something from me in the coming years, probably in Rust.


Deploying a Lambda function with Terraform or Pulumi doesn't seem to require knowledge about servers or networking to me.


They bill by the millisecond for usage now, which has helped bring down the cost a ton (at least for web services). And if you don't need fancy stuff, they have a cheaper version of API Gateway now too.

Lambda also supports Docker and is pretty damn fast now, so it's less painful to use.
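
Back-of-the-envelope numbers, using illustrative us-east-1 prices (roughly $0.0000167 per GB-second and $0.20 per million requests; check the current price list):

    requests    = 1_000_000  # invocations per month
    duration_s  = 0.1        # 100 ms average, billed per millisecond
    memory_gb   = 0.5        # 512 MB function

    gb_seconds  = requests * duration_s * memory_gb  # 50,000 GB-seconds
    compute     = gb_seconds * 0.0000166667          # ~$0.83
    request_fee = (requests / 1_000_000.0) * 0.20    # $0.20
    compute + request_fee                            # ~$1.03/month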


Most seed stage startups don't even surpass the free tier in my experience, and depending on the product this can be true for a lot of Series A startups as well.


My biggest issues with Lambda are these two limitations.

1. You have a maximum of 30 seconds to reply to the request

2. You cannot stream data from Lambda through API Gateway, though you can stream to S3, etc.


Max limit is 15 minutes now.

I agree streaming needs work, but if I had to deal with that I'd probably just use naked endpoints.


Not when triggered by API Gateway; that is still 30 seconds. For things triggered from SQS, S3, etc., it's 15 minutes.


Needing anything more than a few seconds out of API Gateway is just bad application design, though, unless we're talking about WebSockets or streaming, as you said. If it's a webhook, you risk the client hanging up if you do all that work up front (you should instead be scheduling a background lambda to do it based on the webhook params). If it's an actual user, I'm assuming you're not expecting them to wait 30+ seconds for a page to load, so it's probably some JSON API endpoint with a progress bar on the frontend. In those cases there's really no need to use API Gateway, as it isn't visible to the user anyway (you could just use your lambda naked).
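
In Jets that pattern is only a few lines. A sketch with made-up class names (the exact job/event API is per the Jets docs):

    class WebhooksController < ApplicationController
      def receive
        # Ack immediately so the caller never hits the ~30s gateway limit;
        # the heavy lifting runs in a background lambda.
        ProcessWebhookJob.perform_later(:process, payload: params.to_h)
        render json: { status: "accepted" }
      end
    end

    class ProcessWebhookJob < ApplicationJob
      def process
        data = event  # payload passed to perform_later arrives as the lambda event
        # ... long-running work here, under Lambda's 15-minute cap ...
      end
    end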

If you're talking about WebSockets, yeah, I feel ya. I haven't had to navigate that situation yet, but my plan of action is to bypass API Gateway entirely and use a naked lambda running at its full execution time, and when we approach the 15-minute boundary, send a command that resets the connection. I don't know if it would work well, but I haven't had a need for WebSockets in any project I've worked on for a while.

For streaming, like you said, maybe an Elastic Beanstalk cluster is more the way to go, depending on your workflow. If you can find a way to get it all to work in Lambda, though, it would probably be a game changer cost-wise, as long as you figure out a way to deal with resetting every 15 minutes.


I'd say that's still true in the majority of cases. Which is a shame, because I like this idea of having hardware limits for your software and being able to efficiently maximise hardware at the datacenter level for redundant, trivial applications (like serving web requests). Most web servers are half idle "just in case" because hardware is cheap and engineers' time is expensive.

I migrated a mid-sized business's (500-1000 person company) Lambda setup to EC2 spot instances and a standard app, and the costs dropped.

It would have been even cheaper on something like Hetzner, but good luck getting buy-in from the infra team.

I venture most small businesses will have less load than them. Certainly it's not worth it for my tiny side businesses.

I was running some calculations for a project, though, and if you "abuse" them with lengthy/expensive calculations run very infrequently (think of a cron job), they may end up being quite cost-effective.


We trialed switching over to Lambda for our worker queues just last month. We went from $500 a month in EC2 costs (auto-scaling spot fleet) to $500 a day.


I have always wondered if Jets was ready for production. Have you found the documentation & community to be ready for mainstream?


So what I would say is, we've gone ahead and done a lot of the upfront investment of making Jets much more production-ready than it was a year ago. Pretty much every feature path a typical startup Rails app would hit, we've used in production with Jets at this point (and in some cases had to get issues fixed, etc.). We also sponsor the Jets project at $1k/month and have a great relationship with Tung, the creator of Jets. We're at a point now where we haven't had to get anything fixed for a few months, and everything is stable and working the way we want.

While we did have to spend some $$ to get things up to snuff (particularly so we could pass SOC 2), this pales in comparison to how much we would have had to spend to do devops and the usual server wrangling to get our use-case up and running in a typical elastic beanstalk sort of situation.

One area of difficulty was OAuth, so if you ever need help implementing that in Jets, feel free to reach out or read the public issue history of how we got it working.


Hey, really appreciate the reply.


That sounds great!

Is there a blog post about the tech stack? My main concern with serverless/lambda is cold start time. How do you deal with it? What does the p99 latency look like?

Also how do you scale the usual bottleneck which is the database?


Discovered Ruby on Jets in this thread and started watching the video on the project page.

The author addressed the cold start problem here (timestamped): https://youtu.be/a0VKbrgzKso?t=439

Hope it helps :)


I've been meaning to put together a blog entry on our whole stack and will post it on HN probably in the next few months!


nice! All the best to Arist and the team!


There is an auto-warming option -- we keep it warmed up every 5 seconds so it's always super peppy -- requests are served within 100ms generally, sometimes much faster. Apdex hovers around 0.996, but some webhooks are included in there, so it's probably faster in reality.
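
The prewarming knobs live in application.rb as well; roughly (option names are per the Jets docs, verify against your version):

    Jets.application.configure do
      config.prewarm.enable = true       # schedule warm-up pings
      config.prewarm.rate = "2 minutes"  # how often the pings fire (ours is more aggressive)
      config.prewarm.concurrency = 2     # how many containers to keep warm
    end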


DB-wise, we use the usual PostgreSQL cluster setup with read replicas in several regions. We could easily partition by course or by org if we had to, but honestly we could probably scale up to Series C+ before needing to do that.


Out of interest, have you considered moving to a "serverless" DB like Aurora Postgres or even DynamoDB, to avoid the cost of unused database capacity at idle times?


I'd love to, but when I've tried to set up Aurora, it seems impossible to do multi-region with PostgreSQL (not multi-zone, but multi-region). Would love to hear how to do this if anyone has got it working. Last time I tried was about 2 years ago.


Ah ok, interesting - I haven't tried Aurora; my company uses a mixture of RDS Postgres and Dynamo. CockroachDB and Yugabyte also seem like good options, but they're a harder sell for us, not being AWS-native.

More generally, though, all of these "NewSQL" offerings feel a little too good to be true to me; I can't see how you could really have all the relational integrity of Postgres with the elastic scalability of a distributed DB without trading something off. Am I too cynical?



