Julia-based Makefile replacement for research workflows

I’m playing around with DrWatson to organize research workflows. It feels like it would be nice to also have a Makefile in a project repository that defines which scripts produce which output files. The idea would be to simply run make in the project root and have that run all my Julia scripts to produce output files, plots, and reports.

As far as I can tell, this is not something that DrWatson has built-in: it can keep track of which script (and which revision) generated a particular output file, but not really the inverse direction, and without any good way to automate the entire project.

One problem with Makefiles that I ran into was that, by default, DrWatson likes to put equal signs in the output filenames, which Makefiles have problems with. Thus (apart from ugly workarounds), I’d either have to convince DrWatson not to do that, or switch to something other than a Makefile.

Is there some package in Julia that can replace make for these kinds of workflows? Something that takes a data structure defining a map from a target to one or more dependencies, and runs a particular piece of Julia code to create/update the target if it doesn’t exist or is older than the dependency?

This doesn’t have to be a full replacement of make suitable for driving software compilation. Something that can emulate a relatively simple Makefile rule like

plot.pdf: plot.jl data/*.dat
    julia $<  # equivalent to `julia plot.jl`

would be sufficient. I could probably code up something quick and dirty in any particular project, but maybe there is an existing solution? I saw the thread “Alternatives to Makefile and shebang scripts?”, but all projects mentioned there seem to be abandoned.

A huge added benefit of using Julia for this would also be that the entire “make” can happen in a single julia process, reducing the JIT overhead. Also, it would work on non-Unix systems.

Associated issue: https://github.com/JuliaDynamics/DrWatson.jl/issues/315

Hmm… Makeitso.jl looks like an interesting project in terms of tracking a full dependency graph, but it seems pretty far removed from anything file-based like make.

In the meantime, “quick and dirty” would look something like this src/makerules.jl file, which is then included in a make.jl file in the project root (and which in turn can be run from a traditional Makefile). The AbstractRule/ScriptRule should be easy to extend to more complicated situations. This doesn’t do anything like recursive dependencies. The assumption is that the RULES are ordered by hand so that all prerequisites exist by the time the rule is evaluated. That’s where something like Makeitso.jl would come in, presumably, but I’m not sure it’s worth the trouble.
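Since the file itself is not reproduced here, the following is only a minimal sketch of what such a rules file might look like, not the actual makerules.jl. The field names, the `needsupdate`/`build`/`make` functions, and the choice to run scripts with `Base.include(Main, ...)` are all assumptions made for illustration; only the names AbstractRule, ScriptRule, and RULES come from the description above.

```julia
# Hypothetical sketch; only AbstractRule/ScriptRule/RULES are named in the post.
abstract type AbstractRule end

# A rule that (re)builds `target` by running a Julia script in the current process.
struct ScriptRule <: AbstractRule
    target::String
    script::String
    prerequisites::Vector{String}
end

# The script itself counts as a dependency, as in `plot.pdf: plot.jl data/*.dat`.
dependencies(r::ScriptRule) = [r.script, r.prerequisites...]

# Assumes every rule type has a `target` field and a `dependencies` method.
function needsupdate(r::AbstractRule)
    isfile(r.target) || return true
    return any(d -> mtime(d) > mtime(r.target), dependencies(r))
end

# Running the script via `include` keeps everything in one Julia process.
build(r::ScriptRule) = Base.include(Main, r.script)

# Evaluate the rules in the given (hand-ordered) sequence; no recursion.
function make(rules::Vector{<:AbstractRule})
    built = String[]
    for r in rules
        if needsupdate(r)
            @info "Building $(r.target)"
            build(r)
            push!(built, r.target)
        end
    end
    return built  # list of targets that were actually rebuilt
end

# Example rule list, ordered by hand (paths are placeholders):
# RULES = [ScriptRule("plots/plot.pdf", "scripts/plot.jl", ["data/a.dat"])]
# make(RULES)
```

A new kind of rule would only need its own struct plus `dependencies` and `build` methods.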

Makeitso is very much file based in that every target is cached on disk. What you probably mean is the ability to produce source and object files as part of a compilation process as in make. I guess this can be easily achieved in Makeitso.jl by creating those output files as a side effect of the recipe. You would have two files on disk then: one containing the payload and another pilot file that keeps track of the recipe hash and the completion timestamp.

I will give some thought to how this can be streamlined (perhaps a dedicated macro?).

What I do think is missing in Makeitso.jl is the ability, as in the amazing Pluto.jl, to have the dependency graph created and maintained automatically. But unlike Pluto, which is reactive, Makeitso or any make-like tool should be imperative: expensive-to-build targets are only generated on explicit request of the target or its dependent targets.

Yes, that’s exactly what I mean. Sorry I wasn’t clear.

I guess this can be easily achieved in Makeitso.jl by creating those output files as a side effect of the recipe.

Right, it would certainly be possible to use Makeitso.jl as a “backend” in the code I have in makerules.jl/make.jl. Specifically, it would allow discarding the manual order of the rules that I have now, and instead have Makeitso.jl figure out the order based on missing/outdated dependencies, exactly like make does it.
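To illustrate what “figuring out the order” amounts to, here is a minimal sketch of a recursive, make-style resolution written in plain Julia. It does not use Makeitso.jl at all; the `Rule` struct, the `build!` function, and the rule table are hypothetical names chosen for this example, and the sketch does not report dependency cycles.

```julia
# Hypothetical rule: a list of dependency paths and a zero-argument recipe.
struct Rule
    deps::Vector{String}
    recipe::Function
end

# Build `target`, first bringing its dependencies up to date (depth-first).
# Returns the list of targets that were rebuilt, in build order.
function build!(rules::Dict{String,Rule}, target::String,
                done=Set{String}(), log=String[])
    target in done && return log          # already visited in this run
    push!(done, target)
    if !haskey(rules, target)
        # Plain source file: it must exist, there is nothing to build.
        isfile(target) || error("no rule to make target $target")
        return log
    end
    rule = rules[target]
    for dep in rule.deps
        build!(rules, dep, done, log)     # prerequisites first
    end
    # Rebuild if the target is missing or older than any dependency.
    if !isfile(target) || any(d -> mtime(d) > mtime(target), rule.deps)
        rule.recipe()
        push!(log, target)
    end
    return log
end
```

With this, the rules can be stored in any order, and requesting the final target pulls in whatever is missing or outdated along the way.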

You would have two files on disk then: one containing the payload and another pilot file that keeps track of the recipe hash and the completion timestamp.

That actually seems less than ideal in this context. I don’t know if there would be a way to tweak Makeitso.jl to use the mtime of the “payload file” as the timestamp.

I think there are two reasons not to get rid of the pilot files. (i) If produced variables are still in memory, the timestamp does not require poking files on disk (I know this is not of particular concern in the use case of interest to you), and (ii) the pilot file also stores the recipe hash so it can track invalidation of not only the dependent targets but also of the build script itself.
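Point (ii) can be illustrated with a small sketch of a pilot file that records a hash of the build script next to the target. This is not Makeitso.jl's actual file format or API; the `.pilot` suffix, the two-line layout, and the function names are assumptions made up for this example. It uses only the SHA standard library.

```julia
using SHA  # Julia standard library

# Hypothetical layout: `<target>.pilot` holds the recipe hash on line 1
# and the completion time (seconds since the epoch) on line 2.
pilotfile(target) = target * ".pilot"

# SHA-256 of the script contents, as a hex string.
recipehash(script) = bytes2hex(sha256(read(script)))

# Record that `target` was just built by the current version of `script`.
function record!(target, script)
    open(pilotfile(target), "w") do io
        println(io, recipehash(script))
        println(io, time())
    end
    return nothing
end

# A target is stale if it or its pilot file is missing,
# or if the script has changed since the recorded build.
function isstale(target, script)
    (isfile(target) && isfile(pilotfile(target))) || return true
    storedhash = readline(pilotfile(target))
    return storedhash != recipehash(script)
end
```

A pure mtime comparison would also notice an edited script, but a hash additionally ignores edits that are reverted and does not depend on file timestamps being preserved.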

Hello all, have you found a solution? I am about to start using ShowYourWork for the research pipeline (and paper writing) but wanted to know if there is any solution in Julia.