
za_mike157 · 2025-07-22T14:19:47 1753193987

Hey! Founder of Cerebrium here.

- Runpod is one of the cheapest, but that comes at the price of reliability (critical for businesses)
- We have faster cold starts, with something special launching soon here
- Iterating on your application using CPUs/GPUs in the cloud takes just 2–10 seconds, compared to several minutes with Runpod due to Docker push/pull
- We allow you to deploy in multiple regions globally for lower latency and data residency compliance
- We provide a lot of software abstractions (fire-and-forget jobs, websockets, batching, etc.) whereas Runpod just deploys your Docker image
- We are SOC 2 and GDPR compliant

With that all being said - we are working on optimisations to bring down pricing

benterix · 2025-07-23T09:30:12 1753263012

Thanks, makes sense.

za_mike157 · on Sept 19, 2024

I haven't used SkyPilot so I am unfamiliar with the experience and performance.

However, some of the situations where you would want to use Cerebrium over SkyPilot are:
- You don't want to manage your own hardware
- Reduced costs: serverless runtime and low cold starts (unclear if SkyPilot offers this, and what the performance is like if it does)
- Rapid iteration: unclear what the deployment process on SkyPilot is and how long projects take to go live
- Observability: it looks like you would just have k8s metrics at your disposal

za_mike157 · on Sept 19, 2024

I think we used this UI kit: https://minimals.cc/

za_mike157 · on Sept 19, 2024

I guess then the next question would be: how quickly can they start executing your container from a cold start when a workload comes in? Typically we see companies at around 30-60s

za_mike157 · on Sept 19, 2024

Do you mean why the individual file names aren't quoted?

You can see an example config file at the bottom of that link you attached - agreed we should probably make it more obvious

mdaniel · on Sept 19, 2024

heh, I don't need an example in the docs, the whole repo is filled with examples, but unless you expect some poor soul to do $(grep -r ^include . | sort | uniq) and guess from there, what I'm saying is that the examples -- including the bare bones one in your documentation -- do not SPECIFY what the glob syntax is. The good thing about standards is that there are so many to choose from, so: python's glob, golang's glob, I'm sure rust-lang has one, bash, ... I'm sure I could keep going
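To make the ambiguity concrete, here is a minimal sketch using two matchers that both ship in Python's standard library. They are picked only for illustration; nothing here says which syntax the config file actually follows, and the path and pattern are made up.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def compare(path: str, pattern: str) -> tuple[bool, bool]:
    """Return (fnmatch result, pathlib result) for the same path and pattern."""
    # fnmatch-style matching: "*" matches any run of characters, "/" included.
    shell_style = fnmatchcase(path, pattern)
    # pathlib-style matching: "*" never crosses a path separator.
    component_style = PurePosixPath(path).match(pattern)
    return shell_style, component_style


# Hypothetical file and include pattern, purely for demonstration.
shell_style, component_style = compare("src/pkg/main.py", "src/*.py")
print(shell_style, component_style)
```

Same pattern, same path, two different answers, which is why the docs need to name the dialect.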

As for the quoting part, it's mysterious to me why a structured file would use a quoted string for what is obviously an interior structure. Imagine if you opened a file and saw

  fred = "{alpha: ['beta', 'charlie''s dog', 'delta']}"

wouldn't you strongly suspect that there was some interior syntax going on there?

Versus the sane encoding of:

  fred:
    alpha:
    - beta
    - charlie's dog
    - delta

in a normal markup language, no "inner/outer quoting" nonsense required

But I did preface it with my toml n00b-ness and I know that the toml folks believe they can do no wrong, so maybe that's on purpose, I dunno

za_mike157 · on Sept 18, 2024

Thanks Tom! Excited to support you and the team as you grow

za_mike157 · on Sept 18, 2024

Thank you - appreciate the kind words! Happy to continue supporting you and the team.

za_mike157 · on Sept 18, 2024

Thank you - updated! My team makes fun of my spelling all the time!

za_mike157 · on Sept 18, 2024

Thanks for pointing that out!

za_mike157 · on Sept 18, 2024

Modal is a great platform!

In terms of cold starts, we seem to be very comparable from what users have mentioned and tests we have run.

Easier config/setup is feedback we have gotten from users, since we don't have any special syntax or a "Cerebrium way" of doing things. That makes migration pretty easy and doesn't lock you in, which some engineers appreciate. We just run your Python code as is, with an extra .toml setup file.

Additionally, we offer AWS Inferentia/Trainium nodes, which offer a great price/performance trade-off for many open-source LLMs, even compared with using TensorRT/vLLM on Nvidia GPUs, and get rid of the scarcity problem. We plan to support TPUs and others in future.

We are listed on AWS Marketplace as well as others, which means you can subtract your Cerebrium cost from your committed cloud spend.

Two things we are working on that will hopefully make us a bit different are:
- GPU checkpointing
- Running compute in your own cluster, to use credits or for privacy concerns

Where Modal does really shine is training/data-processing use cases which we currently don't support too well. However, we do have this on our roadmap for the near future.