
To achieve that, it is enough to hash inputs and cache the resulting outputs. Repeating a build from scratch with an empty cache would not necessarily have to yield the same hashes all the way down to the last artifact, but that's actually a simplification of the whole process, and not a bad thing per se.
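A minimal sketch of the "hash inputs, cache outputs" idea, in Python. The names `input_key` and `BuildCache` are made up for illustration and do not correspond to any particular build tool. The cache key covers only the command and the input contents, so the output itself never needs to be bit-for-bit reproducible for a cache hit.

```python
import hashlib


def input_key(inputs: dict[str, bytes], command: str) -> str:
    """Derive a cache key from the command and the contents of all inputs."""
    h = hashlib.sha256()
    h.update(command.encode())
    for name in sorted(inputs):  # sort so dict ordering cannot change the key
        h.update(b"\0" + name.encode() + b"\0")
        h.update(hashlib.sha256(inputs[name]).digest())
    return h.hexdigest()


class BuildCache:
    """Hypothetical in-memory cache: input key -> build output."""

    def __init__(self):
        self.store = {}
        self.runs = 0  # how many times the real build action was executed

    def build(self, inputs, command, action):
        key = input_key(inputs, command)
        if key not in self.store:
            self.runs += 1
            self.store[key] = action(inputs)
        return self.store[key]
```

With a warm cache, the second build of identical inputs returns the stored output without running the action again, even if the action would have produced different bytes this time.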


Outputs are used as inputs later. If everything is deterministic, you can actually cache everything by hash.


> To achieve that it is enough to hash inputs, and cache resulting outputs.

Thing is, inputs can be nondeterministic too - some programs embed (or used to embed) the current git commit hash into the final binary, so that `./foo --version` gives bug triage a quick and easy way to check whether the user is running a version from years ago.


Adding the Git hash is reproducible, assuming you build from a clean tree (which the build script can check). Embedding the current date and time is the canonical cause of non-reproducibility, but that can be worked around in most cases by embedding the commit and/or author date of the commit instead.
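A rough sketch of that workaround, in Python. It assumes `git` is on the PATH and honors the `SOURCE_DATE_EPOCH` environment variable convention if it is set. The function names (`build_timestamp`, `tree_is_clean`, `version_stamp`) are hypothetical helpers, not part of any existing tool.

```python
import os
import subprocess
import time


def _git(repo_dir: str, *args: str) -> str:
    """Run a git command in repo_dir and return its stdout."""
    result = subprocess.run(
        ["git", "-C", repo_dir, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def tree_is_clean(repo_dir: str = ".") -> bool:
    """True if there are no uncommitted or untracked changes."""
    return _git(repo_dir, "status", "--porcelain").strip() == ""


def build_timestamp(repo_dir: str = ".") -> int:
    """Use SOURCE_DATE_EPOCH if set, else the committer date of HEAD."""
    epoch = os.environ.get("SOURCE_DATE_EPOCH")
    if epoch is not None:
        return int(epoch)
    # %ct is the committer date as a Unix timestamp
    return int(_git(repo_dir, "log", "-1", "--format=%ct").strip())


def version_stamp(commit: str, timestamp: int) -> str:
    """Format a version string that depends only on the commit, not on 'now'."""
    when = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(timestamp))
    return f"{commit[:12]} ({when})"
```

A build script could refuse to stamp the binary when `tree_is_clean()` is false, and otherwise embed `version_stamp(commit, build_timestamp())`. Two builds of the same commit then embed the same string regardless of when they run.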


This is only a problem if those nondeterministic inputs are actually included in the hash. This is often not the case, because the values are included implicitly in the build rather than explicitly.
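To make the implicit-vs-explicit distinction concrete, here is a toy sketch in Python. Everything in it is hypothetical: `compile_step` stands in for a build action that quietly reads an environment variable, and `hashed_env_vars` is an invented knob for choosing which variables take part in the cache key.

```python
import hashlib

cache = {}


def key_of(explicit: dict[str, str]) -> str:
    """Hash only the inputs that were declared explicitly."""
    h = hashlib.sha256()
    for name in sorted(explicit):
        h.update(f"{name}={explicit[name]}\0".encode())
    return h.hexdigest()


def compile_step(source: str, env: dict[str, str]) -> str:
    """Stand-in build action that implicitly depends on env['USER']."""
    return f"{source} built-by={env['USER']}"


def cached_build(source, env, hashed_env_vars=()):
    explicit = {"source": source}
    for var in hashed_env_vars:  # opt-in: make an implicit input explicit
        explicit["env:" + var] = env[var]
    key = key_of(explicit)
    if key not in cache:
        cache[key] = compile_step(source, env)
    return cache[key]
```

When `USER` is left out of the key, a second build under a different user still hits the cache and returns the first user's artifact. Once it is listed in `hashed_env_vars`, the key changes and the step reruns. Whether the first behavior is a feature or a bug depends on whether that value is supposed to matter.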

(Just playing devil’s advocate here.)



