
Does the Cloud Functions solution meet your current performance requirements? If so, don't worry about moving it.

The main benefit you'd see immediately is the toolchain (e.g. Docker containers, existing build systems, etc.).



I haven't launched it, so I'm not sure about performance. My main concerns are cold starts and concurrency. It's my understanding that Cloud Run has higher concurrency per container instance, so my guess would be that Cloud Run would give me fewer cold starts than Cloud Functions. However, since Cloud Run is a generic runtime, I'd imagine that cold starts there would be on the scale of seconds compared to milliseconds for Cloud Functions.


Cloud Functions PM here.

Your intuition around concurrency is correct: Cloud Functions has "per instance concurrency" of 1. Cloud Run lets you go significantly higher than that (default 80). This means that our infrastructure will generally create more instances to handle a request spike when using Cloud Functions vs. Cloud Run.

Creating an instance incurs a cold start. Part of that cold start is due to our infrastructure (generally this is small), but the other part is in your control. For example: if you create a client that takes X seconds to initialize, that initialization time will manifest as part of your cold start, so your cold start will be at least X seconds.
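One common way to keep that initialization cost to a single request per instance is to construct the client lazily and reuse it. A minimal Go sketch, where expensiveClient and its slow construction are hypothetical stand-ins for a real client library:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// expensiveClient stands in for a real client (e.g. a database or
// API client) whose construction dominates the cold start.
type expensiveClient struct{ created time.Time }

var (
	client     *expensiveClient
	clientOnce sync.Once
)

// getClient initializes the client exactly once per instance; every
// later request reuses it, so only the first request pays the cost.
func getClient() *expensiveClient {
	clientOnce.Do(func() {
		time.Sleep(50 * time.Millisecond) // simulate slow initialization
		client = &expensiveClient{created: time.Now()}
	})
	return client
}

func main() {
	a := getClient()
	b := getClient()
	fmt.Println("same instance reused:", a == b) // prints "same instance reused: true"
}
```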

This has a few practical implications:

* writing code for Cloud Functions is generally more straightforward as single concurrency solves many problems regarding shared variables. You may also see some benefits in terms of monitoring/metrics/logging since you only need to think about one request at a time.

* you will likely see a higher incidence of cold starts on Cloud Functions during rapid scale-up, such as in response to a sudden traffic spike

* the impact of a given cold start will depend heavily on what you're doing in your container

* though I haven't validated this experimentally, I would expect that the magnitude of any given cold start (i.e., total latency contribution) would be roughly the same on Cloud Run as Cloud Functions IF you're running the same code
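To make the shared-variable point concrete: with per-instance concurrency of 1, a bare package-level counter is safe as-is, but once an instance serves many requests at a time (as on Cloud Run by default) the same code needs synchronization. A hedged sketch, with goroutines standing in for concurrent requests:

```go
package main

import (
	"fmt"
	"sync"
)

// With concurrency 1 (Cloud Functions), a plain requestCount++ in the
// handler would be safe. With concurrent requests per instance (Cloud
// Run's default of 80), the counter must be guarded, e.g. with a mutex.
var (
	mu           sync.Mutex
	requestCount int
)

func handleRequest() {
	mu.Lock()
	requestCount++
	mu.Unlock()
}

func main() {
	// Simulate 100 concurrent requests hitting one instance.
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			handleRequest()
		}()
	}
	wg.Wait()
	fmt.Println(requestCount) // prints 100 with the mutex; unpredictable without it
}
```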


Ah, thanks for the details there! So, given that my Cloud Functions project is a Go app (and would be the exact same code between Functions and Run), if I were to run that in a very minimal container (something like Alpine), I could get roughly the same cold start time as Cloud Functions, but fewer of them since I can respond to multiple requests using the same instance.

I'll probably do some experimentation on my end as well to test. Any suggestion how long I should wait between tests to ensure a cold start on both Cloud Functions and Cloud Run?


I think you can force cold starts between your tests by re-deploying your function/container. You could (optionally) leave a small buffer (<1 minute) after the deployment to ensure that traffic has fully migrated.


I spoke too soon. The deploy itself will bring up an instance, rather than your first request doing so. To force a cold start, you could set concurrency to '1' and send two concurrent requests. You should see a log entry such as the following when a new instance starts up: "This request caused a new container instance to be started and may thus take longer and use more CPU than a typical request." Alternatively, you could set up an endpoint that shuts down the server (which will shut down the instance - not advised for production code).

As an aside, the "K_REVISION" environment variable is set to the current revision. You can log or return this value to test whether traffic has migrated to a new version (instead of waiting a minute).


Yes on both counts.


Disclosure: Cloud Run Engineer

I'd encourage you to test your particular app, but you should expect similar cold start times in Cloud Run.

You can set "Maximum Requests per Container" on container deployment, so you control whether a container has single concurrency (i.e. "Maximum Requests per Container = 1"). If your app is not CPU-bound and you allow multiple concurrent requests (the default), you should see fewer cold starts.


Thanks very much! Could you answer the following questions about cold start times in Cloud Run or point me to a good resource:

1. I think I have a pretty good understanding of what's going on with the lifecycle of Cloud Functions that leads to the cold start times. What happens with Cloud Run? Does it need to download the whole Docker image to a machine to run it? Seems like that would take longer.

2. App Engine has 'warmup requests', which I think are great. Is there any equivalent on Cloud Run, or plan to add?

3. Is the time that an instance is kept warm during idle similar between Cloud Functions and Cloud Run?

Thanks!


1. Both cases grab the image and run it. Better per-layer caching (including very aggressive caching of common layers) is coming soon, so stay tuned.

2. No current equivalent, though there are thoughts on exposing more scaling control knobs (e.g. max-instances, min-instances). Max is easy; min is harder because of the cost implications. GAE was billed on "instance hours" but Run is billed on CPU time, so if you set "min-instances=1" you're paying for a VM. Something like Run on GKE (where you're already paying for the compute) probably makes more sense for exposing these controls.

3. Yes, though since Run can be multi-concurrent, for certain (most?) load profiles you're going to have way fewer cold starts because the instance is already handling requests.


Just wanted to say thanks to you and the rest of the GCP crew for being all over this thread. Most appreciated!



