
At Arist (YC S20) we've found that we can have our cake and eat it too by using Ruby on Jets, which is a nearly 100% drop-in replacement for Rails that runs on AWS Lambda. Our service sends messages to hundreds of thousands of people at scheduled moments in the day, and traffic is incredibly spiky, so combining the productivity of Rails and the Ruby ecosystem with the cost effectiveness and scalability of Lambda was a no-brainer. It also passed muster recently when we subjected our platform to a comprehensive penetration test.

We also get the benefits of a mono-repo and the benefits of microservices in the same application footprint, because every controller method is automatically deployed as its own independent lambda (this is core to how Jets works), yet we're still in the usual Rails mono-repo format we all know and love. The very tight integration between ApplicationJob and background lambdas has also been killer for us.
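
To make that concrete: a Jets app looks almost exactly like a Rails app. A rough sketch (class names here are illustrative, not our actual code; check the Jets docs for the exact API):

    # app/controllers/messages_controller.rb -- each public action below is
    # deployed as its own Lambda function when you run `jets deploy`.
    class MessagesController < ApplicationController
      def index
        render json: Message.all
      end
    end

    # app/jobs/send_scheduled_job.rb -- jobs deploy as background lambdas;
    # the rate/cron declaration becomes an EventBridge schedule.
    class SendScheduledJob < ApplicationJob
      rate "1 day"
      def deliver
        # ... send the day's scheduled messages ...
      end
    end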

One thing I've always said is the real difficulties in software development happen at the seams where services connect with other services. This sort of strategy (and particularly, the mono-repo format) minimizes the number of seams within your application, while still giving you the scalability and other benefits of microservices and serverless.



Is AWS Lambda really cost-effective? It has been many years since I was part of a team that assessed AWS Lambda as _workers_, but the resource limitations at the time, alongside the cost calculations, made PHP+VMs the cost-effective choice by orders of magnitude.


It's loved because it means your technical costs are always proportional to your actual computing needs, and your computing needs should (theoretically) always be proportional to your total company revenue.

From a business standpoint, that's a pretty great pitch.

Is it actually the most cost effective solution? No more so than any other tool. It depends on exactly what you're building, how, and how you measure the cost. AWS can be extremely costly, or cheap, depending on your engineering needs, constraints, and practices.


I have been at exactly one (web) company in my life where the cost of computing resources actually mattered, and compared to bandwidth, even that didn't matter.

The companies I have been at where computing costs did matter were doing extremely specialized, long-running calculations that are inappropriate for lambda.

I'm sure there are rare exceptions, but this all smells like premature optimization. Developer salaries are, 9.99 times out of ten, the cost you want to optimize.


Another thing this workflow really enables is that it frees you up to have unlimited, arbitrary staging environments. In our case we have a GitHub Action set up that creates a fully deployed version of the app for every PR automatically, and a bot that comments on the PR with a link to the deployment. You can of course do this with any kind of infrastructure, but only with a serverless sort of architecture is it virtually free to do this kind of thing. On the DB side, we allocate extremely tiny instances, so it is still quite affordable to have one running for each of our branches. They are automatically destroyed when the PR is closed, so the overhead is basically $5/month for each PR that is open for a whole month (in practice they are open for a few days at most).
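
The moving parts are pretty simple. A minimal sketch of the deploy step (the env var names and the JETS_ENV_EXTRA approach here are illustrative, not our exact GitHub Action):

    #!/usr/bin/env ruby
    # Called from CI with PR_NUMBER set; gives each PR its own stack.
    pr = ENV.fetch("PR_NUMBER")

    # Jets supports "extra" environments, which suffix the CloudFormation
    # stack name so per-PR deployments don't collide with each other.
    env = { "JETS_ENV" => "staging", "JETS_ENV_EXTRA" => pr }

    if ENV["PR_CLOSED"] == "true"
      # Tear the stack down when the PR closes (jets delete may prompt for
      # confirmation unless you pass its skip-confirmation flag).
      system(env, "jets delete") || abort("teardown failed")
    else
      system(env, "jets deploy") || abort("deploy failed")
    end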


This is an interesting use case, thanks for sharing. Although these days, setting up Kubernetes environment emulators is supported by most CI systems out of the box.


With a spiky workload it's almost always worth it, because we have to scale up 1000x and then 10 minutes later scale back down to 1x for the rest of the day, and this can happen at any time, randomly. You can also do this with load balancers, but in practice I've found them to be much slower to scale and much more costly for the same workload.

For reference, 90% of our bill is database related currently.


> 90% of our bill is database related currently.

Huh. With the context of the rest of the comment, I realize (the very obvious comparison) that a database engine designed to shard to many thousands of small workers could potentially be a very attractive future development path.

If, that is, the current trends in cloud computing (workers, lambdas, etc.) continue and some other fundamental shift doesn't come along and overtake them.

Which is probably (part of) the reason why this doesn't exist, since I think I've basically just described the P=NP of storage engineering :)


> Huh. With the context of the rest of the comment, I realize (the very obvious comparison) that a database engine designed to shard to many thousands of small workers could potentially be a very attractive future development path.

Yes. I've been patiently waiting for the database community to realize this for the last 5 years now :)


I have no delusions I'd be able to viably make a dent in the area (at least anytime soon), but I do wonder how this would actually work.

The optimal solution would of course be to shard both the compute I/O and the storage footprint, so each worker only needed to hold onto maybe 1-100MB of data.

Perhaps some existing (simpler) designs could be modified to "hyper-shard" the compute angle, but would still likely require carrying around a large percentage of the database.

In any case, you'd need an internal signalling fabric capable of (cost-effectively) handling very bursty many-to-many I/O across thousands of endpoints to make consensus work in realtime.

It would honestly be really interesting to see how something like this would work in practice.
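
For what it's worth, the routing half of that idea is easy to sketch; the hard parts are everything this toy ignores (replication, consensus, rebalancing):

    require "digest"

    # Toy illustration of "hyper-sharding": route each key to one of many
    # tiny workers, each owning only a ~1-100MB slice of the data.
    class ShardRouter
      def initialize(shard_count)
        @shard_count = shard_count
      end

      # Stable hash so a given key always lands on the same shard/worker.
      def shard_for(key)
        Digest::SHA256.hexdigest(key.to_s).to_i(16) % @shard_count
      end
    end

    router = ShardRouter.new(4_096)
    router.shard_for("user:12345")  # => index of the lambda-sized worker to invoke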


I have worked with many teams and found Lambda to be by far the more cost-effective option. Did your calculations include the time lost waiting to deploy solutions while infrastructure gets spun up, the payment for staff or developers spending time putting infrastructure together instead of building solutions, the time spent maintaining that infrastructure, or the cost of running servers at 2am when there is no traffic? Perhaps even the cost of running a fat relational database scaled for peak load that bills you by the hour, again even when there is no traffic?

Serverless as an architectural pattern is about more than just Lambda, and it tends to incorporate a multitude of managed services to remove the cost of management and configuration overhead. When you use IaC tools like the Serverless Framework that are built to help developers put together an application, as opposed to provisioning resources, you can get things up fast, pay only for usage, and scale amazingly.


Funnily enough, I have had almost the opposite experience. In my experience IaC and serverless bring all the troubles of DevOps to "regular" developers. Your plain vanilla mid-level full-stack dev now needs to understand not only a bunch about FE & BE code, but also a much bigger bunch about servers, networking, VPCs, etc. than in a non-serverless setup.

How do you resolve this in your projects? (Serious question).

This is such a big problem for some of the projects that they are now only able to hire senior developers (which brings its own set of problems).


But VPCs and networking and distributed computing aren't serverless. Serverless is using AWS Lambdas or GCP functions and not dealing with VPCs beyond having an endpoint to hit.

There's no getting around the networking and such - that's the "full" part of full-stack (FS) - it's more than simply FE+BE. Maybe call them distributed systems engineers instead.

What it sounds like, though, is that your organization (regardless of what we call the roles) is large enough to organize into FE, BE, and FS roles, with FS running the platform, owning the fleet, and working on the system itself so that FE and BE can work without having to know about the fleet - FS folks build internal tools that the rest of the org use to do their jobs, shielding them from the implementation details of your fleet.


Again, though, on my framework-ification point -- in our case we have a full VPC setup, and Jets actually allocates the VPC for us; we just configure it in our application.rb file. I'm sure the Serverless Framework has something similar. Either way, we have gotten away without a dedicated devops person (or persons) because of how much the framework does for you.
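
The config ends up being a few lines in application.rb, roughly this shape (the IDs are placeholders and the exact option names may vary by Jets version):

    # config/application.rb
    Jets.application.configure do
      # Attach every function to the VPC so lambdas can reach the database.
      config.function.vpc_config = {
        subnet_ids: ["subnet-aaa111", "subnet-bbb222"],  # placeholder IDs
        security_group_ids: ["sg-ccc333"]                # placeholder ID
      }
    end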


Yeah, framework-ification of this process has been the real differentiator in the last 5 years that has taken lambda from being obtuse and glitchy to work with to quite a joy if you just lean on your framework.

All of that said, I vastly prefer Google Cloud Functions personally and would switch in a heartbeat if it had capabilities like API Gateway, but it's not there yet.

I also regret that there isn't a better cross-cloud solution currently, but that's something I have a lot of open source ambitions about creating soon. I don't like serverless that much so stay tuned for something from me in the coming years, probably in Rust.


Deploying a Lambda function with Terraform or Pulumi doesn't seem to require knowledge about servers or networking to me.


They bill by the millisecond for usage now, which has helped bring down the cost a ton (at least for web services). And if you don't need fancy stuff, they have a cheaper version of API Gateway now too.

Lambda also supports Docker and is pretty damn fast now, so it's less painful to use.
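
Back-of-the-envelope numbers, using illustrative us-east-1 prices (roughly $0.0000167 per GB-second and $0.20 per million requests; check the current price list):

    requests    = 1_000_000  # invocations per month
    duration_s  = 0.1        # 100 ms average, billed per millisecond
    memory_gb   = 0.5        # 512 MB function

    gb_seconds  = requests * duration_s * memory_gb  # 50,000 GB-seconds
    compute     = gb_seconds * 0.0000166667          # ~$0.83
    request_fee = (requests / 1_000_000.0) * 0.20    # $0.20
    compute + request_fee                            # ~$1.03/month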


Most seed stage startups don't even surpass the free tier in my experience, and depending on the product this can be true for a lot of Series A startups as well.


My biggest issues with Lambda are these two limitations.

1. You have a maximum of 30 seconds to reply to the request

2. You cannot stream data from Lambda through API Gateway, though you can stream to S3, etc.


Max limit is 15 minutes now.

I agree streaming needs work, but if I had to deal with that I'd probably just use naked endpoints.


Not when triggered by API Gateway; that is still 30 seconds. For things triggered from SQS, S3, etc., it's 15 minutes.


Needing anything more than a few seconds out of API Gateway is just bad application design, though, unless we're talking about WebSockets or streaming, as you said. If it's a webhook, you risk the client hanging up if you do all that work up front (you should instead be scheduling a background lambda to do it based on the webhook params). If it's an actual user, I'm assuming you're not expecting them to wait 30+ seconds for a page to load, so it's probably some JSON API endpoint with a progress bar on the frontend. In those cases there's really no need to use API Gateway, as it isn't visible to the user anyway (you could just use your lambda naked).
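
In Jets that pattern is only a few lines. A sketch with made-up class names (the exact job/event API is per the Jets docs):

    class WebhooksController < ApplicationController
      def receive
        # Ack immediately so the caller never hits the ~30s gateway limit;
        # the heavy lifting runs in a background lambda.
        ProcessWebhookJob.perform_later(:process, payload: params.to_h)
        render json: { status: "accepted" }
      end
    end

    class ProcessWebhookJob < ApplicationJob
      def process
        data = event  # payload passed to perform_later arrives as the lambda event
        # ... long-running work here, under Lambda's 15-minute cap ...
      end
    end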

If you're talking about WebSockets, yeah, I feel ya. I haven't had to navigate that situation yet, but my plan of action is to bypass API Gateway entirely and use a naked lambda running at its full execution time, and when we approach the 15-minute boundary, send a command that resets the connection. I don't know if it would work well, but I haven't had a need for WebSockets in any project I've worked on for a while.

For streaming, like you said, maybe an Elastic Beanstalk cluster is more the way to go, depending on your workflow. If you can find a way to get it all to work in Lambda, though, it would probably be a game changer cost-wise, as long as you figure out a way to deal with resetting every 15 minutes.


I'd say that's still true in the majority of cases. Which is a shame, because I like this idea of having hardware limits for your software and being able to efficiently maximise hardware at the datacenter level for redundant, trivial applications (like serving web requests). Most web servers are half idle "just in case" because hardware is cheap and engineers' time is expensive.

I migrated a mid-sized business's (500-1000 person company) Lambda setup to EC2 spot instances and a standard app, and the costs dropped.

It would have been even cheaper on something like Hetzner, but good luck getting buy-in from the infra team.

I venture most small businesses will have less load than them. Certainly it's not worth it for my tiny side businesses.

I was running some calculations for a project, though, and if you "abuse" them with lengthy/expensive calculations run very infrequently (think of a cron job), they may end up being quite cost-effective.


We trialed switching over to Lambda for our worker queues just last month. We went from $500 a month in EC2 costs (auto-scaling spot fleet) to $500 a day.


I have always wondered if Jets was ready for production. Have you found the documentation & community to be ready for mainstream?


So what I would say is, we've gone ahead and done a lot of the upfront investment of making Jets much more production-ready than it was a year ago. Pretty much every feature path a typical startup Rails app would hit, we've used in production with Jets at this point (and in some cases had to get issues fixed, etc.). We also sponsor the Jets project at $1k/month and have a great relationship with Tung, the creator of Jets. We're at a point now where we haven't had to get anything fixed for a few months, and everything is stable and working the way we want.

While we did have to spend some $$ to get things up to snuff (particularly so we could pass SOC 2), this pales in comparison to how much we would have had to spend to do devops and the usual server wrangling to get our use-case up and running in a typical elastic beanstalk sort of situation.

One area of difficulty was OAuth, so if you ever need help implementing that in Jets, feel free to reach out or read the public issue history of how we got it working.


Hey, really appreciate the reply.


That sounds great!

Is there a blog post about the tech stack? My main concern with serverless/lambda is cold start time. How do you deal with it? What does the p99 latency look like?

Also how do you scale the usual bottleneck which is the database?


Discovered Ruby on Jets in this thread and started watching the video on the project page.

The author addressed the cold start problem here (timestamped): https://youtu.be/a0VKbrgzKso?t=439

Hope it helps :)


I've been meaning to put together a blog entry on our whole stack and will post it on HN probably in the next few months!


nice! All the best to Arist and the team!


There is an auto-warming option -- we keep it warmed up every 5 seconds so it's always super peppy -- requests are served within 100ms generally, sometimes much faster. Apdex hovers around 0.996, but some webhooks are included in there, so it's probably faster in reality.
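
The prewarming knobs live in application.rb as well; roughly (option names are per the Jets docs, verify against your version):

    Jets.application.configure do
      config.prewarm.enable = true       # schedule warm-up pings
      config.prewarm.rate = "2 minutes"  # how often the pings fire (ours is more aggressive)
      config.prewarm.concurrency = 2     # how many containers to keep warm
    end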


DB-wise, we use the usual PostgreSQL cluster setup with read replicas in several regions. We could easily partition by course or by org if we had to, but honestly we could probably scale up to Series C+ before needing to do that.


Out of interest, have you considered moving to a "serverless" DB like Aurora Postgres or even DynamoDB, to avoid the cost of unused database capacity at idle times?


I'd love to, but when I've tried to set up Aurora, it seems impossible to do multi-region with PostgreSQL (not multi-zone, but multi-region). Would love to hear how to do this if anyone has got it working. Last time I tried was about 2 years ago.


Ah ok, interesting - I haven't tried Aurora; my company uses a mixture of RDS Postgres and Dynamo. CockroachDB and Yugabyte also seem like good options, but they're a harder sell for us, not being AWS-native.

More generally, though, all of these "NewSQL" offerings feel a little too good to be true to me; I can't see how you could really have all the relational integrity of Postgres with the elastic scalability of a distributed DB without trading something off. Am I too cynical?



