How is OSTree working out? Are you still using it?
It seems cool, but for some reason I don't see it mentioned very much, maybe because it's focused more on embedded system-image use cases rather than "the cloud", which seems to be more popular on this site?
I would like to see Docker/OCI containers "broken up" into just file formats and just content-addressed data/networking. Not a weird registry with a local API that's different from its remote API. It should be more like git!
Not sure if OSTree fits there -- as far as I remember, it's inspired by git but focused on a different use case. This recent project seemed interesting too, but it's aimed more at a machine-learning use case: https://news.ycombinator.com/item?id=33969908
OSTree is still working very well for us. At the time I wrote the article we had been using OSTree (and the build system I described in the article) for 4 years; 3 years later not much has changed. In those 3 years we have started building images for different architectures and our build system / OSTree handled it just fine, as you'd expect (it already handled cross-compilation, but only for a single architecture).
Our build system also builds Docker containers (for our cloud services) but instead of a Dockerfile we use a custom yaml format that lists the base image (like "FROM" in a Dockerfile), the apt packages to install, and the commands to run (like "RUN"). Then we create a lockfile of all the apt packages, download them (into a local OSTree repository, but that's an implementation detail), and install them with a custom "docker run" command + "docker commit". We end up with the base layer + the apt layer + a single layer produced by concatenating all the RUN commands + a layer with our compiled Rust binaries and Python files. We use apt2ostree to generate the lockfile (really it's our patched version of aptly doing the work) but we use docker (not OSTree) to build the Docker layers. We use Docker's standard push/pull mechanisms to deploy these containers.
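The yaml files look something like this (a simplified illustration, not one of our real files; ubuntu_version and apt_dependencies are the keys read in the snippet below, and the "run" key name here is made up):

ubuntu_version: "20.04"
apt_dependencies:
  - python3
  - ca-certificates
run:
  - pip3 install -r requirements.txt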
To hook this up to Ninja we use "marker files" (files in the build directory) to track whether this work has been done (e.g. you need to regenerate the apt layer if that layer's "marker file" is older than the lockfile).
def build_docker_container(name):
    yamlfile = f"{name}/docker-base-image.yaml"
    with copen(yamlfile) as f:
        data = yaml.safe_load(f)
    ubuntu_version = data["ubuntu_version"]
    image = docker_pull(f"ubuntu:{ubuntu_version}")
    image = docker_apt_install(image, data.get("apt_dependencies"),
                               lockfile="%s/Packages-%s-amd64.lock" % (
                                   name, ubuntu_version),
                               ubuntu_version=ubuntu_version)
    cmd = "..."   # RUN commands from yaml
    deps = [...]  # dependencies from files listed in yaml
    return docker_mod(image, cmd, deps)
(Where "copen" is like "open" but it tracks which files have been read by "configure" itself, to detect if we need to re-run "configure").
What I really like about the Python+Ninja combo is that you can pass these targets around as Python variables — you don't have to come up with an explicit filename for each one (the target name is generated by each helper function, e.g. `docker_pull` returns "_build/docker/${name}"). This makes it so convenient to compose these build rules. And if you ever need to debug your build system, everything is very explicit in the generated ninja file.
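For example, the pattern looks roughly like this (a hypothetical sketch using the ninja_syntax.py helper module from the ninja repo, not our actual code):

import ninja_syntax

writer = ninja_syntax.Writer(open("build.ninja", "w"))
writer.rule("docker_pull",
            command="docker pull $image && touch $out")

def docker_pull(image_name):
    # The output doubles as the "marker file": it's only touched when the
    # pull succeeds, so ninja knows whether this step needs re-running.
    target = "_build/docker/%s" % image_name.replace(":", "-")
    writer.build(outputs=[target], rule="docker_pull",
                 variables={"image": image_name})
    return target

base = docker_pull("ubuntu:20.04")  # "base" is just a string you can pass to other helpers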
We don't have a ton of these containers so it has worked well enough for us because the apt layer changes rarely, and the layer above it is small/fast. The main thing we get (that you don't get from vanilla docker/apt) is lockfiles and reproducibility.
I realised this is another example of "trees" as first-class citizens in a build system. In my comment above the tree we're passing around is a docker layer; in my LWN article it's an OSTree ref. We use the former for our cloud containers, the latter for our embedded device rootfs and systemd-nspawn containers.
I suppose we could use systemd-nspawn on our cloud servers too, instead of docker, but when we wrote the build system we were already using docker so it was the expedient thing to do at the time.
I'd be interested in seeing the Python config and Ninja output, to see how it works. Right now it looks to me like the dependencies are more implicit than explicit, e.g. with your copen example.
---
The system I ended up with is more like Bazel, but it's not building containers, so it's a slightly different problem (although I guess Bazel can do that now too). But I'm interested in building containers incrementally without 'docker build'.
I made some notes on #containers in the oilshell Zulip about incorrect incremental builds with 'docker build'. It's also slow and has a bunch of ad hoc mechanisms to speed it up.
I definitely like the apt lockfile idea ... However, I also have a bunch of other source tarballs and testdata files that I might not want to check into git. How do you handle those -- put them in OSTree?