So I was talking in chat with @kevbonham and others about tools that are missing in the Julia ecosystem. It quickly became apparent that monitoring workflows (distributed, or just stacks of scripts) is mostly a Python game right now. Meanwhile, we have people scaling over HPC and doing heavy lifting with Julia! So think Apache Airflow, but more integrated and geared toward Julia.
I’m making an RFC on a project before it begins because I don’t know what everyone would want from a tool like this. I know what I’d want. I’m thinking:
- A programmatic DAG handler with emitters & listeners. At the very least it should track the state of each task and be aware of errors.
- A WebUI to monitor these jobs, kill them, reset them, inspect them.
- Should be able to glue/dispatch a variety of languages/tasks through some kind of abstract interface.
- Handle version control of the pipeline, and potentially the data pouring through it (if any)?
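To make the first bullet a bit more concrete, here is a rough sketch of what I mean by a stateful DAG handler with emitters and listeners. It is in Python only because that is where Airflow lives; a Julia version would presumably sit on top of a graph package instead. Every name here (`Pipeline`, `State`, `add_task`, `on_event`) is made up for illustration and is not a proposed API.

```python
from enum import Enum
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class State(Enum):
    PENDING = "pending"
    SUCCESS = "success"
    FAILED = "failed"
    SKIPPED = "skipped"


class Pipeline:
    """Hypothetical minimal DAG runner: tracks per-task state and emits events."""

    def __init__(self):
        self.tasks = {}      # task name -> callable
        self.deps = {}       # task name -> set of upstream task names
        self.state = {}      # task name -> State
        self.errors = {}     # task name -> exception raised by that task
        self.listeners = []  # callables invoked as listener(name, state)

    def add_task(self, name, func, after=()):
        # Requiring upstream tasks to exist already also rules out cycles.
        missing = set(after) - self.tasks.keys()
        if missing:
            raise ValueError(f"unknown upstream tasks: {sorted(missing)}")
        self.tasks[name] = func
        self.deps[name] = set(after)
        self.state[name] = State.PENDING

    def on_event(self, listener):
        self.listeners.append(listener)

    def _emit(self, name, state):
        # Record the new state, then notify every listener.
        self.state[name] = state
        for listener in self.listeners:
            listener(name, state)

    def run(self):
        # static_order() yields each task after all of its upstream tasks.
        for name in TopologicalSorter(self.deps).static_order():
            if any(self.state[d] is not State.SUCCESS for d in self.deps[name]):
                self._emit(name, State.SKIPPED)  # an upstream task did not succeed
                continue
            try:
                self.tasks[name]()
                self._emit(name, State.SUCCESS)
            except Exception as exc:
                self.errors[name] = exc
                self._emit(name, State.FAILED)
        return self.state
```

A WebUI or logger would then just be another listener registered with `on_event`, and anything downstream of a failed task gets marked as skipped rather than run. Killing, resetting, and dispatching to other languages are left out of this sketch entirely.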
What I’m thinking as far as dependencies…
DrWatson.jl, heavy use of Distributed.jl, Interact.jl, LightGraphs.jl, MetaGraphs.jl, LibGit2?
What I would like to hear about:
- What tools are you all finding useful in your Julia workflows?
- What tools do you wish you could incorporate into your Julia workflows?
- Wanna pitch in once the ball gets rolling on this?
- Do you foresee any serious issues here, or should I get the ball rolling now?
I think, superficially, the core functionality of this is somewhat trivial; it’s just abstracting over the monumental effort the community has already put in to make life easier. Maybe I’m overlooking something, though.