I tend to create sparsebundles to clone my git repos within, to get around the overhead of having huge numbers of inodes on a volume. (Copying, deleting, unpacking archives, Spotlight indexing—all are way slower when you have the worktrees and .git directories from a thousand large repos splayed out across your disk.) So I was a little worried here.
Thankfully, I had manually been setting my sparsebundles back to HFS+ on creation, because I saw no reason to make them APFS containers.
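For anyone wanting to do the same, this is roughly the hdiutil incantation I mean; the size, volume name, and paths are just placeholders, and the exact filesystem personality string can be checked with `hdiutil create -help`:

    # Create a sparsebundle formatted as journaled HFS+ instead of the
    # default APFS; it only consumes disk space as the volume fills up.
    hdiutil create -type SPARSEBUNDLE -fs HFS+J -size 100g \
        -volname Repos ~/Repos.sparsebundle

    # Mount it and clone into it as usual.
    hdiutil attach ~/Repos.sparsebundle
    git clone https://example.com/some/repo.git /Volumes/Repos/repo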
TBH, an I/O system where a file system nested inside a loopback device on another file system is faster than using the host file system directly sounds kinda broken / poorly scaling to me.
Having Spotlight ignore .git directories and the like is probably wise, I would agree with that. But it's text, even if it's basically garbage text (from a user perspective). So I can understand how a sparsebundle is a decent end-around.
The Finder in general ends up basically being useless for me for similar reasons; I have dozens of random dependency files I don't even recognize pop up in "All My Files".
Spotlight ignores hidden directories (e.g. .git) and directories whose names end in .noindex. You can create a file in one with a unique name and try to mdfind it to verify this.
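A rough sketch of that check, assuming Spotlight indexing is enabled on the volume and using made-up file names; the second mdfind should come back empty:

    mkdir -p ~/indexed-dir ~/scratch.noindex
    echo test > ~/indexed-dir/canary-a1b2c3.txt
    echo test > ~/scratch.noindex/canary-d4e5f6.txt
    sleep 10                          # give mdworker a moment to index
    mdfind -name canary-a1b2c3.txt    # should print the file's path
    mdfind -name canary-d4e5f6.txt    # should print nothing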
I'm sure it gets the events. It probably has to walk back up the tree to determine if the file is hidden. Dunno how much work it does. I presume it doesn't actually do the metadata extraction from the files. (But my presumption is based on "surely they wouldn't do that".)
The biggest offender for me when I touch a lot of files is Dropbox. It seems to use a lot of CPU when, e.g., an Xcode update is being installed. I've read that they had to listen to events for the whole volume because the more specific APIs weren't giving them the data they needed, but you'd think they could fast-path the files that were outside their sandbox.
Is your dev drive a platter drive or an SSD? I've found that the last few major releases of OS X have big performance issues on systems with old-school hard drives. (Frequent beach-balling, etc.)
Honestly, I don't think there's any way to get around it if you've got an indexer daemon in the mix. It's pretty much the same as trying to store billions of rows in an RDBMS table, except that file systems + metadata indexes don't have any concept of table partitioning.
Space-shared APFS volumes inside a container give you the “table partitioning” you want. You can even set them up with different case-sensitivity options: all your dev work on a case-sensitive volume, for instance, and Adobe software on a case-insensitive volume in the same space-shared container.
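Something along these lines, assuming the container is disk1 (see `diskutil apfs list`) and that the personality strings match what `diskutil listFilesystems` reports on your machine:

    # Add a case-sensitive volume for dev work to an existing container.
    diskutil apfs addVolume disk1 "Case-sensitive APFS" Dev

    # Add a regular case-insensitive volume alongside it; both draw from
    # the container's shared free space.
    diskutil apfs addVolume disk1 APFS Apps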
True! My disk-image-centered workflow comes from before APFS volumes were a thing; I haven't bothered to re-evaluate it. (It is nice that I can just schlep one of these volumes around by copying one file, rather than waiting for thousands/millions of small files to compress, copying the archive, and then decompressing on the other end, though. Do you know if there's an easy method of doing a block-level export from an APFS volume to a sparsebundle disk image, or vice-versa? That'd be killer for this.)
Well, APFS is much better suited to this kind of workflow. Create a space-shared volume inside your container and turn Spotlight off on that particular volume (and, if you'd like, make that volume case-sensitive). There's no need to separate that out onto a disk image.
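Turning Spotlight off per volume is just mdutil; the volume name here is an assumption:

    # Disable Spotlight indexing on the dev volume only,
    # then erase whatever index already exists there.
    sudo mdutil -i off /Volumes/Dev
    sudo mdutil -E /Volumes/Dev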
There's still the problem of Time Machine (or any other backup software you use) needing to do a complete deep scan of the volume to ensure you haven't made any changes. If you know a git repo is effectively in a "read-only" state—something you just keep around for reference, or an old project you might "get back to" a few years from now—it can speed up those backups dramatically to put the repo into some kind of archive. Disk images, for this use-case, are just archives you can mount :)
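In that spirit, a read-only compressed image is one cheap way to archive a dormant repo while keeping it mountable; the paths below are placeholders:

    # Pack a dormant repo into a read-only, zlib-compressed disk image.
    hdiutil create -srcfolder ~/code/old-project -format UDZO \
        ~/Archives/old-project.dmg

    # Later, mount it read-only to poke around.
    hdiutil attach ~/Archives/old-project.dmg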