I just don't understand how the decision of which bits of a project need rebuilding can be so complex.
If I edit 50 lines of code in a 10GB project, then rebuild, the parts that need rebuilding are the parts that read those files when they were last built.
So... The decision of what to rebuild should take perhaps a millisecond and should certainly be doable locally.
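Concretely, the model I have in mind is roughly this (a minimal sketch with made-up names, not any real build tool's API): if every task recorded which files it read during the last build, deciding what to rebuild is a set lookup.

    # Hypothetical record of task -> files it read during the last build.
    last_build_reads = {
        "compile:foo.o": {"foo.cc", "foo.h", "common.h"},
        "compile:bar.o": {"bar.cc", "bar.h"},
        "link:app":      {"foo.o", "bar.o"},
    }

    def tasks_to_rebuild(changed_files):
        """Every task that read one of the changed files when it last ran."""
        changed = set(changed_files)
        return {task for task, reads in last_build_reads.items() if reads & changed}

    print(tasks_to_rebuild(["common.h"]))  # {'compile:foo.o'}

(Of course this assumes the read sets were recorded completely and correctly, which is where the trouble starts.)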
50 lines in a single .cc source file which is only compiled once to produce the final artifact - sure, easy to handle.
Now consider that you are editing 50 lines of source for a tool which will then need to be executed on some particular platform to generate some parts of your project.
Now consider that you are editing 50 lines defining the structure and dependency graph of your project itself.
• Adding a file. It hasn't been read before, so no tasks in your graph know about it. It helps if you can intercept and record which file patterns a build tool looks for, but you often can't know that, because programs frequently match against directory contents themselves, in ways you can't easily intercept.
• File changes that are effectively no-ops: editing a comment in a core utility shouldn't recompile the entire project. More subtly, editing method bodies in a Java program doesn't require its users to be recompiled, but editing class definitions or exposed method prototypes does.
• "Building" test cases.
• You don't want to repeat work that has been done before, so you want to cache it (e.g. switching branches back and forth shouldn't rebuild everything).
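That last point is usually handled with a content-addressed action cache: key each action by a digest of its command line plus the contents of its inputs, and reuse stored outputs on a hit, so switching branches back and forth mostly hits cache. A rough sketch, with made-up names rather than any particular system's API:

    import hashlib, json

    def action_key(command, input_paths):
        """Digest of the command plus the content of every declared input."""
        h = hashlib.sha256(json.dumps(command).encode())
        for path in sorted(input_paths):
            with open(path, "rb") as f:
                h.update(hashlib.sha256(f.read()).digest())
        return h.hexdigest()

    cache = {}  # action key -> stored outputs

    def run_cached(command, input_paths, run_fn):
        key = action_key(command, input_paths)
        if key not in cache:
            cache[key] = run_fn(command)  # only executed on a cache miss
        return cache[key]

Note the catch hiding in input_paths: the cache is only correct if the declared inputs really are the complete set of things the action reads, which loops back to the first bullet.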
If your system is C based, tup [0] fulfills your request by watching the outputs of the specified commands. It isn't, however, appropriate for systems like Java that create intermediate files the developer didn't write [1].
Back to Bazel, I am of the impression that some of its complexity comes from the requirement to handle heterogeneous builds. For example, some Python development requires resolving both Python dependencies and C ones. Being good at either is a bunch of work; handling both means either rolling your own polyglot system or coordinating two existing systems as second-class citizens.
It is entirely possible that by changing 50 lines of code in a 10GB project you have to rebuild the entire project, if everything (indirectly) depends on what you just changed.
It is not at all uncommon for changes to percolate out into larger impacts than you'd expect, though, especially in projects that attempt whole-program optimizations as part of the build.
Consider anything that builds a program which is then used at build time. That's not uncommon now that ML models have grown significantly. Change that tool, and suddenly you have to rebuild the entire project if you didn't split it out into a separate graph. (I say ML, but really any simple linter or similar tool is the same here.)
> the parts that need rebuilding are the parts that read those files when they were last built...
...and the transitive closure of those parts, which is where things get complicated. It may be that the output didn't actually change, so you can prune the graph there with some smarts. It may be that the thing changed was a tool used in many other rules.
And you have to know the complete set of outputs and inputs of every action.
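A rough sketch of that pruning, which only works if you do know every action's full input and output sets (names and structure are purely illustrative): walk the graph in dependency order, re-run an action only if one of its input digests changed, and record output digests so an unchanged output doesn't dirty its consumers.

    import hashlib
    from graphlib import TopologicalSorter

    def digest(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    def incremental_build(actions, deps, last_digests):
        """actions: name -> (inputs, outputs, run_fn); deps: name -> set of upstream action names."""
        new_digests = dict(last_digests)
        for name in TopologicalSorter(deps).static_order():
            inputs, outputs, run = actions[name]
            if any(digest(p) != last_digests.get(p) for p in inputs):
                run()  # some input changed on disk since the last build: re-run
            # Record current digests. If a re-run produced byte-identical
            # outputs, downstream actions see no changed inputs and are
            # skipped (the "prune the graph there" early cutoff).
            for p in list(inputs) + list(outputs):
                new_digests[p] = digest(p)
        return new_digests

The same structure also covers the tool case: a codegen tool's binary just appears in the input set of every action that invokes it, so changing it dirties all of them.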