
> It's really a workflow automation tool,

That's true.

> and the UX for that is actually pretty close to what you would want.

That is so not true. Make has deeply woven into it the assumption that the product of workflows are files, and that the way you can tell the state of a file is by its last modification date. That's often true for builds (which is why make works reasonably well for builds), but often not true for other kinds of workflows.
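For context, the timestamp assumption described here boils down to a comparison like the following (a minimal Python sketch of make's staleness rule, not make's actual implementation):

```python
import os

def is_stale(target, prerequisites):
    """Roughly make's core rule: rebuild if the target is missing
    or older (by mtime) than any prerequisite."""
    if not os.path.exists(target):
        return True
    t = os.path.getmtime(target)
    return any(os.path.getmtime(p) > t for p in prerequisites)
```

Anything whose "freshness" can't be expressed as an mtime comparison falls outside this model.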

But regardless of that, a tool that makes a semantic distinction between tabs and spaces is NEVER the UX you want unless you're a masochist.



> Make has deeply woven into it the assumption that the product of workflows are files, and that the way you can tell the state of a file is by its last modification date.

I've always wondered whether Make would be seen as less of a grudging necessity, and more of an elegant panacea, if operating systems had gone the route of Plan 9, where everything is—symbolically—a file, even if it's not a file in the sense of "a byte-stream persisted on disk."

Or, to put that another way: have you ever considered writing a FUSE filesystem to expose workflow inputs as readable files, and expect outputs as file creation/write calls—and then just throw Make at that?


> everything is—symbolically—a file

How are you going to make the result of a join in a relational database into a file, symbolically or otherwise?


On plan 9, you'd do something like:

     ctlfd = open("/mnt/sql/ctl", ORDWR);
     write(ctlfd, query, strlen(query));
     n = read(ctlfd, resultpath, sizeof resultpath - 1);
     resultpath[n] = '\0';


     resultfd = open(resultpath, OREAD);
     read(resultfd, result, sizeof result);
     close(resultfd);
     close(ctlfd);
This is similar to the patterns used to open network connections or create new windows.


And how would you use that in a makefile?


Something like this would be ideal.

    /mnt/sql/myjoin:
      echo "<sql query>" > /mnt/sql/myjoin
It's just representing writing and reading a database as file operations; they map pretty cleanly. Keep in mind that Plan 9 has per-process views of the namespace, so you don't have to worry about other processes messing up your /mnt/sql.


I think you're missing the point. I have a workflow where I have to perform some action if the result of a join on a DB meets some criterion. How does "make" help in that case?


In that case, it doesn't help much. For Plan 9 mk, there's a way to use a custom condition to decide if the action should be executed:

    rule:Pcheck_query.rc: prereq rules
        doaction
Where check_query may be a small shell script:

     #!/bin/rc

     # redirect stdin/stdout to /mnt/sql/ctl
     <> /mnt/sql/ctl {
            # send the query to the DB
            echo query
            # print the response, check it for
            # your condition.
            cat `{sed 1q} | awk '$2 != "condition"{exit(1)}'
     }
But I'm not familiar with an alternative using Make. You'd have to do something like:

    .PHONY: rule
    rule: prereq rules
        check_query.sh && doaction


> You'd have to do something like:

That's exactly right, but notice that you are not actually using make at all any more at this point, except as a vehicle to run

check_query.sh && doaction

which is doing all the work.


It's being used to manage the ordering of that with other rules, and to run independent steps in parallel.


OK. So here's a scenario: I have a DB table that keeps track of email notifications sent out. There is a column for the address that the email was sent to, and another for the time at which the email was sent. A second table keeps track of replies (e.g. clicks on an embedded link in the email). Feel free to assume additional columns (e.g. unique ids) as needed. When some particular event occurs, I want the following to happen:

1. An email gets sent to a user

2. If there is no reply within a certain time frame, the email gets sent again

3. The above is repeated 3 times. If there is still no reply, a separate notification is sent to an admin account.

That is a common scenario, and trivial to implement as code. Show me how make would help here.


I wouldn't; I don't think that make is a great fit for that kind of long running job. It's a great tool for managing DAGs of dependent, non-interactive, idempotent actions.

You have no DAG, and no actions that can be considered fresh/stale, so there's nothing for make to help with. SQL doesn't have much to do with that.


> How are you going to make the result of a join in a relational database into a file, symbolically or otherwise?

A file that represents the temporary table that has been created. Naming it is harder, unless the SQL query writer was feeling nice and verbose.


You would probably need another query language, but that would come with time, after people had gotten used to the idea.

With that said, there are NoSQL databases these days whose query language is easily expressed as file paths. CouchDB, for example.


A very simple approach is to use empty marker files to make such changes visible in the filesystem. Say,

    dbjoin.done: database.db
        echo "your query" | sqlite3 $<
        touch $@


you could mount the database as a filesystem


With respect to Make, does the database (mounted as a filesystem) retain the accurate information that Make needs to operate as designed (primarily the timestamps)? To what level of granularity is this data present within the database, and what is the performance of the database accessed in this way? Will it tell you that the table was updated at 08:40:33.7777, or only that the whole database was altered at a specific time?


You're talking about a theoretical implementation of a filesystem with a back-end in a relational database. The question is only whether the information is available.

Say directories map to databases and files map to tables and views. You can create new tables and views by either writing data or an appropriate query. Views and result files would be read-only while data files would be writable. Writing to a data file would be done with a query which modifies the table and the result could be retrieved by then reading the file -- the modification time would be the time of the last update which is known.
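As a toy illustration of that mapping (names invented; this is a sketch of the idea, not a real filesystem implementation), the key point is just that a table's last-update time is known, so a "file" backed by it can report a meaningful mtime:

```python
import time

class TableFile:
    """Toy model: a 'file' backed by a table, whose mtime is the time
    of the last update, so mtime-based staleness checks have something
    real to compare against."""
    def __init__(self):
        self.rows = []
        self.mtime = time.time()

    def write(self, row):
        # writing the file = an INSERT/UPDATE; modification time is known
        self.rows.append(row)
        self.mtime = time.time()

    def read(self):
        # reading the file = a SELECT over the current contents
        return list(self.rows)
```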

Views and queries could be cached results from the last time they were run, which could be updated/rerun by touching them, or they could be dynamic and update whenever a table they reference is updated.


> but often not true for other kinds of workflows.

Examples? I mean, there are some broken tools (EDA toolchains are famous for this) that generate multiple files with a single program run, which make can handle only with subtlety and care.

But actual tasks that make manages are things that are "expensive" and require checkpointing of state in some sense (if the build was cheap, no one would bother with build tooling). And the filesystem, with its monotonic date stamping of modifications, is the way we checkpoint state in almost all cases.

That's an argument that only makes sense when you state it in the abstract as you did. When it comes down to naming a real world tool or problem that has requirements that can't be solved with files, it's a much harder sell (and one not treated by most "make replacements", FWIW).


> Examples?

Anything where the relevant state lives in a database, or is part of a config file, or is an event that doesn't leave a file behind (like sending a notification).


Like, for example?

To be serious, those are sort of contrived. "Sending a notification" isn't something you want to be managing as state at all. What you probably mean is that you want to send that notification once, on an "official" build. And that requires storing the fact that the notification was sent and a timestamp somewhere (like, heh, a file).

And as for building into a database... that just seems weird to me. I'd be very curious to hear about systems that have successfully done this. As just a general design point, storing clearly derived data (it's build output from "source" files!) in a database is generally considered bad form. It also introduces the idea of an outside dependency on a build, which is also bad form (the "source" code isn't enough anymore, you need a deployed system out there somewhere also).


I need to send an email every time a log file updates, just the tail, simple make file:

  send: foo.log
          tail foo.log | email

  watch make send
Crap, it keeps sending it. Ok, so you work out some scheme involving temporary files which act as guards against duplicate processing. Or you write a script which conditionally sends the email by storing the hash of the previous transmission and comparing it against the hash of the new one.

That last option actually makes sense and can work well and solves a lot of problems, but you've left Make's features to pull this off. For a full workflow system you'll end up needing something more than files and timestamps to control actions, though Make can work very well to prototype it or if you only care about those timestamps.
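That hash-comparison scheme could look something like this (a sketch; `send` is a stand-in for whatever actually emails the tail, and `.last_sent_hash` is an invented state file):

```python
import hashlib, os

STATE = ".last_sent_hash"  # invented marker file holding the previous hash

def maybe_send(tail_text, send):
    """Call send(tail_text) only if it differs from what was last sent."""
    new = hashlib.sha256(tail_text.encode()).hexdigest()
    old = open(STATE).read() if os.path.exists(STATE) else None
    if new == old:
        return False          # duplicate of the last transmission: suppress
    send(tail_text)
    with open(STATE, "w") as f:
        f.write(new)
    return True
```

Note that the dedup state still ends up in a file, just not one make knows about.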

================

Another issue with Make is that it's not smart enough to know that intermediate files may change without those changes being important. Consider that I change the comments in foo.c or reformat it for some reason. This generates a new foo.o because the foo.c timestamp is updated. Now it wants to rebuild everything that uses foo.o because foo.o is newer than those targets. Problem: foo.o didn't actually change, and a check of its hash would reveal that. Make doesn't know about this. So you end up making a trivial change to a source file and could spend the afternoon rebuilding the whole system, because your build system doesn't understand that nothing in the binaries is actually changing.
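The hash check being described is roughly this (a sketch; the dict stands in for a persistent build database of content hashes):

```python
import hashlib

def needs_rebuild(path, hash_db):
    """Trigger downstream rebuilds only if path's content actually
    changed; hash_db is a dict standing in for a persistent build DB."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    if hash_db.get(path) == digest:
        return False          # same bytes as last build: skip dependents
    hash_db[path] = digest    # record the new content hash
    return True
```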


How would you fix that with your preferred make replacement? None of that has anything to do with make, you're trying to solve a stateful problem ("did I send this or not?") without using any state. That just doesn't work. It's not a make thing at all.


Lisper was replying to the OP who suggested using Make for general workflows. Make falls apart when your workflow doesn't naturally involve file modification tasks.

With regard to my last comment (the problem with small changes in a file resulting in full-system recompilation), see Tup. It maintains a database of what's happened. So when foo.c is altered it will regenerate foo.o. But if foo.o is not changed, you can set it up to not do anything else. The database is updated to reflect that the current foo.c maps to the current foo.o, and no tasks depending on foo.o will be executed. Tup also handles the case of multiple outputs from a task. There are probably others that do this, it's the one I found that worked well for my (filesystem-based) workflows.

With regard to general workflows (that involve non-filesystem activities), you have to have a workflow system that registers when events happened and other traits to determine whether or not to reexecute all or part of the workflow.


I mean you're just describing make but with hashes instead of file modification times. It's probably the most common criticism of make that its database is the filesystem. If file modification times aren't meaningful to your workflow then of course make won't meet your needs. But saying the solution is 'make with a different back-end' seems a little silly, not because it's not useful, but because they're not really that different.

GNU make handles multiple outputs alright, but I will admit that if you want something portable it's pretty hairy.


I love Tup, and have used it in production builds. It is the optimal solution for the problem that it solves, viz, a deterministic file-based build describable with static rules. To start using it, you probably have to "clean up" your existing build.

I don't use it anymore, for several reasons. One is that it would be too off-the-wall for my current work environment. The deeper reason is that it demands a very static view of the world. What I really want is not fast incremental builds, but a live programming environment. We're building custom tooling for that (using tsserver), and it's been very interesting. It's challenging, but one tradeoff is that you don't really care how long a build takes, incremental or otherwise.


    send: foo.log
          tail foo.log | email
          touch send


Correct, that works for this example. But if you have a lot of tasks that involve non-filesystem activities you'll end up littering your filesystem with these empty files for every one of them. This can lead to its own problems (fragility, you forgot that `task_x` doesn't generate a file, or it used to generate one but no longer does, etc.).


> you'll end up littering your filesystem with these empty files for every one of them

These files are information just like files that are not empty.


You're misusing make here. This should be a shell script or a program that uses inotify/kqueue, or a loop with sleeps and stat calls.


just make "send" not be a phony target.

How about "touch send"?

Now "touch -t" will allow you to control the timestamp.

md5sum, diff would be your friends.

Anyway, my C compiler doesn't provide that info.


What about, for example, a source file that needs to be downloaded and diffed from the web? What about when you need to pull stuff from a database? You can hack your way around but it's not the most fun.


WRT the web file, curl can be run so that it only downloads the file if the remote copy has been modified more recently than the file on disk.

DBs are harder (yet possible), but not a common request that I’ve seen.


curl -z only works if the server has the proper headers set - good luck with that. The point is, it's great to be able to have custom conditions for determining "needs to be updated", for example.


You can always download to a temporary location and only copy to the destination file if there is a difference. You don't need direct support from curl or whatever other tool generates the data.
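That download-then-compare idea, sketched in Python (function name invented; urllib is used here as the downloader, but any tool works):

```python
import filecmp, os, shutil, tempfile, urllib.request

def fetch_if_changed(url, dest):
    """Download to a temp file and replace dest only when the bytes
    differ, so dest's mtime only moves on real changes."""
    fd, tmp = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as out, urllib.request.urlopen(url) as r:
        shutil.copyfileobj(r, out)
    if os.path.exists(dest) and filecmp.cmp(tmp, dest, shallow=False):
        os.remove(tmp)        # unchanged: keep the old file and its mtime
        return False
    shutil.move(tmp, dest)    # new or changed: mtime updates
    return True
```

Run as a recipe for the destination file, this keeps make's mtime-based staleness check honest without any server-side header support.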


A language that uses lots of parens to delimit expressions is incredibly bad UX, especially when you try to balance a complex expression, but fortunately there are tools like Paredit to deal with that, so I can write my Emacs Lisp with pleasure just about every day. Similarly, any decent editor will help you out with using correct indentation in Makefiles.

Last modification date is not always a correct heuristic to use, but it's quite cheap compared to hashing things all the time.

Make is a tool for transforming files. I wonder how it's not quite natural and correct for it to assume it's working with files?


> Make has deeply woven into it the assumption that the product of workflows are files

You're referring to a standard Unix tool, an operating system where EVERYTHING is a file.


Sometimes things in workflows are sending/retrieving data over a network. It may be turning on a light. It could be changing a database. Make has no way of recognizing those events unless you've tied them to your file system. Do you really want an extra file for every entry or table in a database? It becomes fragile and error prone. A real workflow system should use a database, and not the filesystem-as-database.


> Sometimes things in workflows are sending/retrieving data over a network. It may be turning on a light. It could be changing a database. Make has no way of recognizing those events

Why should Make violate basic software design rules and fundamental Unix principles? Do you want your build system to tweak lights? Set up a file interface and add it to your makefile. Do you want your build system to receive data through a network? Well, just go get it. Hell, the whole point of REST is to access data as a glorified file.


The filesystem is, and has always been, a database.


But it's not true that everything is a file. A row in a relational database, for example, is not a file, even in unix.


> A row in a relational database, for example, is not a file, even in unix.

Says who? Nothing stops you from creating an interface that maps that row to a file.

That's the whole point of Unix.

Heck, look at the /proc filesystem tree. Even cpu sensor data is available as a file.


Ha, even eth0 is a file! You can open a network connection by opening this file! Erm... no, that doesn't work.

Then a process! You spawn a process by opening a file! Erm... again, no.

You want me to continue?


In 9front you can. You can even import /net from another machine. Bam, instant NAT.


> Nothing stops you from creating an interface that maps that row to a file.

That's true, nothing stops you, though it is worth noting that no one actually does this, and there's a reason for that. So suppose you did this; how are you going to use that in a makefile?


[flagged]


This seems downvoted, but I would second the opinion. If you're capable of representing a dependency graph, you should be able to handle the tabs. If `make` does your job and the only problem is the tabs, it's not masochism, just pragmatism.


HN has a pretty strong anti-make bias. People here would much rather use build tools that are restricted to specific languages or not available on most systems. Using some obscure hipster build tool means it's a dependency. Though these people who are used to using language-specific package manager seem to take adding dependencies extremely lightly.



