
Funny you say this. At my last job I managed a 1.5TB Perforce depot with hundreds of thousands of files and had the problem of “how can we speed up CI?”. We were on AWS, so I synced the repo, created an EBS snapshot and used that to make a volume, with the intention of reusing it (as we could shove build intermediates in there too).
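
Roughly the shape of it, as a boto3 sketch (not our actual code; the IDs, region and IOPS figure are placeholders):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Snapshot the volume holding the synced workspace (hypothetical volume ID).
    snap = ec2.create_snapshot(
        VolumeId="vol-0123456789abcdef0",
        Description="p4 workspace + build intermediates",
    )
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

    # Later, stamp out a fresh volume for a new CI agent from that snapshot.
    vol = ec2.create_volume(
        SnapshotId=snap["SnapshotId"],
        AvailabilityZone="us-east-1a",
        VolumeType="io1",
        Iops=3000,
    )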

It was faster to just sync the workspace over the internet than it was to create the volume from the snapshot, and a clean build was quicker from the just-synced workspace than from the snapshotted one, presumably something to do with how EBS volumes work internally.

We just moved our build machines to the same VPC as the server and our download speeds were no longer an issue.




When you create an EBS volume from a snapshot, the content is streamed in from S3 on a pull-through basis. You can enable fast snapshot restore (FSR) on the snapshot, which makes volumes created from it fully initialized up front, but it’s an extra-cost option.
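
For reference, enabling it is a single call against the snapshot (a sketch; the snapshot ID and AZs are placeholders):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Opt the snapshot into fast snapshot restore for the AZs the CI agents use.
    # Volumes created from it in those AZs are fully initialized at creation,
    # billed per snapshot, per AZ, per hour.
    ec2.enable_fast_snapshot_restores(
        AvailabilityZones=["us-east-1a", "us-east-1b"],
        SourceSnapshotIds=["snap-0123456789abcdef0"],
    )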


Yeah, this is exactly my point. Despite provisioning (and paying for) io1 SSDs, it doesn’t matter, because you’re still pulling the data through on demand over a network connection to access it.

It was faster to just not do any of this. At my current job we pay $200/mo for a single bare-metal server, and our CI is about 50% quicker than it was, for 20% of the price.


Hmm I don't know that making a new volume from a snap should fundamentally be faster than what a P4 sync could do. You're still paying for a full copy.

You could have possibly had existing volumes with mostly up-to-date workspaces. Then you're just paying for the attach time and the sync delta.
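
Something like the following is the pattern I mean (a rough sketch; the IDs and device path are placeholders, and the mount/p4 steps run on the agent itself):

    import subprocess
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Attach a pre-warmed workspace volume to the CI agent (hypothetical IDs;
    # device naming varies by instance type).
    ec2.attach_volume(
        VolumeId="vol-0123456789abcdef0",
        InstanceId="i-0123456789abcdef0",
        Device="/dev/sdf",
    )
    ec2.get_waiter("volume_in_use").wait(VolumeIds=["vol-0123456789abcdef0"])

    # Then, on the agent, mount it and only pay for the delta since last build.
    subprocess.run(["mount", "/dev/sdf", "/workspace"], check=True)
    subprocess.run(["p4", "sync"], cwd="/workspace", check=True)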


> I don't know that making a new volume from a snap should fundamentally be faster than what a P4 sync could do. You're still paying for a full copy.

My experience with running a C++ build farm in the cloud is that in theory all of this is true, but in practice it costs an absolute fortune and is painfully slow. At the end of the day it doesn’t matter if you’ve provisioned io1 storage; you’re still pulling it across something that vaguely resembles a SAN, and most of the operations that AWS performs are not as quick as you think they are. It took about 6 minutes to boot a Windows EC2 instance, for example. Our incremental build was actually quicker than that, so we spent more time waiting for the instance to start up and attach to our volume cache than we did actually running CI. The machines were expensive enough that we couldn’t justify keeping them running all day.
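
To make the waiting concrete: before a single compiler ran, something like the following had to finish (a sketch; the AMI and instance type are placeholders, not our actual config):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Launch an on-demand Windows agent and block until it reports healthy;
    # this wait alone was routinely longer than an incremental build.
    run = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",
        InstanceType="c5.4xlarge",
        MinCount=1,
        MaxCount=1,
    )
    instance_id = run["Instances"][0]["InstanceId"]
    ec2.get_waiter("instance_status_ok").wait(InstanceIds=[instance_id])
    # Only now can the cached workspace volume be attached and the build start.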

> You could have possibly had existing volumes with mostly up to date workspaces.

This is what we did for incremental builds. The problem was that when you want an extra instance, that volume needs to be created. We also saw roughly a 5x difference in speed (IIRC; this was 2021 when I set this up) between a no-op build on a freshly mounted volume and a no-op build in a workspace we had just built in.


I used to use FUSE and overlayfs for this. I’m not sure it still works well, as I’m not a build engineer and I only did it for myself.

It’s a lot faster in my case (a little over 3TiB for the latest revision only).
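
For anyone curious, the overlay half of that is a single mount (a sketch; the paths are placeholders, and the FUSE layer that faults files in on demand isn't shown):

    import subprocess

    # Read-only synced depot as the lower layer, per-build scratch as the upper;
    # writes land in upperdir, so the shared lower copy stays pristine.
    # workdir must be an empty dir on the same filesystem as upperdir.
    subprocess.run(
        [
            "mount", "-t", "overlay", "overlay",
            "-o", "lowerdir=/srv/p4-mirror,upperdir=/scratch/upper,workdir=/scratch/work",
            "/workspace",
        ],
        check=True,
    )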


There’s a service called p4vfs [0] which does this for p4. The problem we had with it at the time was that our build tool scanned everything (which was slow in and of itself), and that caused p4vfs to pull the files anyway. So it didn’t actually help.

[0] https://help.perforce.com/helix-core/server-apps/p4vfs/curre...


VMware?


What about it?



