Hello,
I mean this Python package
Thanks.
Which part are you interested in? Unlike Python, Julia supports reproducible environments out of the box, so if that’s your goal, you don’t need a package.
Thanks Tim, I am interested in the part in which you generate a paper PDF with a link to the specific script that generates each plot. For example, in Python the script is included like this:
\begin{figure}
\begin{centering}
\includegraphics{figures/mandelbrot.pdf}
\caption{This is a pretty visualization of the Mandelbrot set.}
\label{fig:mandelbrot}
\script{mandelbrot.py}
\end{centering}
\end{figure}
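For context, a script like mandelbrot.py could look roughly like this; only the file name comes from the thread, and the implementation below is a hypothetical sketch:

```python
# Hypothetical sketch of a figure script like mandelbrot.py
# (only the file name is from the thread; the code is illustrative).
import os

import numpy as np


def mandelbrot(width=400, height=300, max_iter=50):
    """Return escape-iteration counts over a grid in the complex plane."""
    x = np.linspace(-2.0, 0.6, width)
    y = np.linspace(-1.0, 1.0, height)
    c = x[None, :] + 1j * y[:, None]
    z = np.zeros_like(c)
    counts = np.zeros(c.shape, dtype=int)
    for i in range(max_iter):
        mask = np.abs(z) <= 2.0  # points that have not escaped yet
        z[mask] = z[mask] ** 2 + c[mask]
        counts[mask] = i
    return counts


if __name__ == "__main__":
    # Imported here so the computation above stays dependency-light.
    import matplotlib

    matplotlib.use("Agg")  # headless backend, e.g. for CI builds
    import matplotlib.pyplot as plt

    os.makedirs("figures", exist_ok=True)
    plt.imshow(mandelbrot(), cmap="magma", extent=(-2.0, 0.6, -1.0, 1.0))
    plt.savefig("figures/mandelbrot.pdf")  # the path \includegraphics expects
```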
I have used the jlcode package, with some tweaks. Here is an example:
And the repo with how to use the code: GitHub - m3g/jlcode_example: Example of the use of jlcode and the JuliaMono font to write Julia code in LaTeX
Thanks for this resource.
Here are a few more packages you may be interested in for showing output alongside code:
- Literate.jl: generates Markdown from a Julia script
- Weave.jl: generates PDF or HTML from a Julia script
- Pluto.jl: a web notebook that can be exported to HTML or PDF
Note that in Julia it is best to put most code in functions rather than scripts to obtain good performance. So instead of referencing a script mandelbrot.py, you would want to write and then call a function plot_mandelbrot(...). (That function definition may be in the same file or in a separate included file.)
To take your organization one step further, you may want to define all your functions in a package hosted on GitHub. (There are templates set up to make this easy.) Then you could either just reference the package URL in your output figures, or you could use weave on a small script containing only simple function calls to your package to give more explanation.
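As a sketch of what that function-based layout might look like (the function names and plotting package here are illustrative, not from the thread):

```julia
# Hypothetical sketch: a Mandelbrot plot reorganized as Julia functions.
# Names (escape_count, plot_mandelbrot) are illustrative.
using Plots

function escape_count(c::Complex; max_iter::Integer=50)
    z = zero(c)
    for i in 1:max_iter
        z = z^2 + c
        abs2(z) > 4 && return i
    end
    return max_iter
end

function plot_mandelbrot(; width=400, height=300, max_iter=50)
    xs = range(-2.0, 0.6; length=width)
    ys = range(-1.0, 1.0; length=height)
    counts = [escape_count(complex(x, y); max_iter) for y in ys, x in xs]
    heatmap(xs, ys, counts; colorbar=false)
end

# The "script" then reduces to a single call:
# savefig(plot_mandelbrot(), "figures/mandelbrot.pdf")
```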
I think there are some misconceptions in this thread about what ShowYourWork does. I’m one of the maintainers of the package - it does quite a bit more than make code reproducible or generate TeX. While it is written in Python (and JavaScript), it’s actually language independent and you can execute arbitrary scripts at each build step, including Julia code.
It’s basically a way of weaving code execution, data processing, etc., into the source of a LaTeX project in a reproducible way (e.g., \variable{mytable.tex} would declare mytable.tex as a node in the data processing graph, to be generated by some script).
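For instance, a generated table might be declared like this. The table wrapper is illustrative; only \variable{mytable.tex} comes from the description above, and I’m assuming it also inputs the file’s contents at that point:

```latex
\begin{table}
  \centering
  \caption{A table produced by a pipeline script.}
  \label{tab:mytable}
  \variable{mytable.tex}% declares mytable.tex as a node in the build graph
\end{table}
```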
It uses Snakemake (a modern version of make, but one which also allows Python syntax inside the make file) to declare dependencies in the build process, which lazily runs your entire research pipeline from raw data (which might be stored and versioned on Zenodo), to processed data (which might be uploaded to the cloud as well), to final plots, tables, or even single numbers.
The goal is for the command showyourwork to run your entire research analysis pipeline, all the way from raw data to final PDF compilation, in a reproducible way. Changing a single version number of any dependency results in all dependent tasks being re-run.
SYW also has really nice GitHub Actions integration, and will re-generate dependencies of your paper each time something changes. There’s even an action that will generate a latexdiff whenever there is a PR to your SYW-based repo.
It uses conda for version management, which can install specific versions of julia, and you can totally include a {Manifest,Project}.toml and have the Snakemake file re-compute every Julia step whenever those change. (Similarly for whatever other languages are used in your analysis.)
I thought it might also be helpful if I demonstrate how Julia integration would work. Here’s an example repository: GitHub - MilesCranmer/showyourwork_julia_example. Just fork it.
The following modifications were made to the default template:
- Defined src/scripts/paths.jl, replacing src/scripts/paths.py (just a convenience file which defines paths when you include() it).
- Created a Project.toml to define Julia dependencies.
- Created two example scripts in src/scripts/: data.jl, to create a dataset and save it to mydata.csv, and plot.jl, to plot the dataset and save it to myplot.png.
- Created three Snakemake rules:
  - julia_manifest creates Manifest.toml from the Project.toml.
  - data calls data.jl, and depends on Manifest.toml.
  - plot calls plot.jl, and depends on mydata.csv and Manifest.toml.
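As a sketch, the julia_manifest rule could look roughly like this; the exact rule in the example repo may differ, but Pkg.instantiate() is the standard way to materialize a Manifest.toml from a Project.toml:

```
rule julia_manifest:
    input: "Project.toml"
    output: "Manifest.toml"
    shell: "julia --project=. -e 'using Pkg; Pkg.instantiate()'"
```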
- Configured showyourwork.yml to map .jl to julia.
The Snakefile also defines the JULIA_PROJECT as ".". These three Julia jobs are dependencies of the final rule, which compiles the LaTeX document using tectonic. The generated PDF and arXiv tarball will contain myplot.png.
For example, the rule plot:
rule plot:
    input:
        "Manifest.toml",
        data="src/data/mydata.csv"
    output: "src/tex/figures/myplot.png"
    script: "src/scripts/plot.jl"
This Julia script is then able to reference the variable snakemake:
using Gadfly
using Cairo  # PNG backend for Gadfly
using CSV
using DataFrames

# Input/output paths injected by Snakemake's script integration:
input_fname = snakemake.input["data"]
output_fname = snakemake.output[1]

# Read the dataset:
data = open(input_fname, "r") do io
    CSV.read(io, DataFrame)
end

# Plot x vs y:
p = plot(data, x=:x, y=:y, Geom.line)

# Save:
draw(PNG(output_fname, 10cm, 7.5cm), p)
In ms.tex, we can define the corresponding figure as:
\begin{figure}[h!]
\centering
\includegraphics[width=0.5\textwidth]{figures/myplot.png}
\caption{A figure.}
\label{fig:fig1}
\script{../scripts/plot.jl}
\end{figure}
This will add a hyperlink in the compiled PDF to the script used to generate the figure:
(This hyperlink also refers to the exact git SHA at that point in time!)
Hi Miles, thank you very much for your explanation and Julia example. I like the idea of being able to re-run the whole work pipeline that goes from raw data to the paper PDF. I didn’t know that ShowYourWork is language independent.
If instead of conda I use juliaup for version management, could I still use this package?
Thanks.
If instead of conda I use juliaup for version management, could I still use this package?
Definitely. In that example repo I didn’t even use conda. Snakemake will just use the first julia on PATH.
Great, thanks!
Isn’t this workflow exactly what make is made for? I think that latexmk allows very good integration with makefiles, and can actually do all of this under the hood while still producing an auto-updating preview PDF.
More precisely, how is showyourwork an improvement over a latexmk + make workflow?
make is designed for compiling programs, but snakemake is designed for data analysis workflows. See more in the snakemake docs. (You could try to do complex data analysis with make, but it would not be a fun time, especially if you need to allocate cluster resources or cloud services for running expensive steps of your workflow.) Snakemake also has native support for Julia/Python/Rust/Bash scripts, which is a nice bonus.
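To make the contrast concrete, here is a hypothetical Snakefile fragment (file and rule names invented) using plain Python inside the rules, something make has no native equivalent for:

```
# Hypothetical workflow: Python lists and expand() drive the dependency graph.
SAMPLES = ["a", "b", "c"]

rule all:
    input: "results/summary.csv"

rule process:
    input: "data/{sample}.csv"
    output: "processed/{sample}.csv"
    shell: "python process.py {input} {output}"

rule summarize:
    input: expand("processed/{sample}.csv", sample=SAMPLES)
    output: "results/summary.csv"
    shell: "python summarize.py {input} > {output}"
```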
Also, rather than latexmk, showyourwork uses tectonic, which is a modern self-contained LaTeX engine. (Again, to emphasize reproducibility.)
So at its core SYW is an integration of snakemake and tectonic. But it does more than that. There’s a good diagram on the docs which has some of the features (this diagram should be updated to include julia/rust logos for the “scripts” as well, since I think it might confuse people)
The overleaf + zenodo + github integration is handled by SYW, and the final PDF’s figures are tagged with hyperlinks to the analysis script which produced each figure.
Of course if you are just writing a theory paper you don’t need any of this, it’s more for if you want a reproducible way to work with versioned datasets & versioned analysis pipelines & versioned papers.
The GitHub integration is emphasized a lot, with specific actions to build a version of your paper at each commit (both PDF and arXiv tarball), build a latexdiff version for pull requests, etc. For example, if you look at my demonstration pull request here: [demonstration] Change text and change sin to cos by MilesCranmer · Pull Request #1 · MilesCranmer/showyourwork_julia_example · GitHub, you will see that there is a PDF showing the highlighted changes in the paper:
Vanilla snakemake + tectonic is a good option too if you don’t want the other stuff. (But even when I don’t need it, I tend to prefer SYW because of all the automation and features)
I do the same GitHub actions + latexdiff on tagged versions and PR on my papers, this is a very helpful workflow indeed ! Especially, to send back to reviewers after the first round to show them exactly what changed.
But I do that “manually” by writing my actions, makefile and stuff myself (well, I reuse and upgrade them from one paper to the next). I guess that delegating all this management to a purpose-built tool is indeed a very good idea.
You’ve made some fairly good propaganda; I might try SYW on my next project!
Hyperlinks to the scripts that made the figures look like a very interesting idea.
PS: is there a way to make SYW use something other than tectonic (at least locally)? Having to download packages on the fly might cause issues when offline. Or maybe tectonic can be told to download packages from a local repo?
I should be clear that I didn’t create SYW; I just like it enough that I help maintain parts of it. I get nothing out of it if I convince you to use it!
Not that I know of. But tectonic allows you to set a custom package proxy, so you could download CTAN yourself beforehand and point to it. And if you’ve ever previously used a package, it will be cached; i.e., this works the same way as all other modern package managers, including Julia’s (it’s really LaTeX that is the weird one).
Could you please compare it with DrWatson.jl?
They have the same driving motivation of reproducibility, but they fit into the scientific workflow at different stages: DrWatson.jl while performing the research, and ShowYourWork when presenting it. I think you could even use them together.
For example, I think you could specify a certain DrWatson.jl-versioned simulation in the ShowYourWork Snakemake file, and it could query DrWatson.jl for the raw data when running the analysis and compiling the results into the paper.
(Would be cool to even have a simple plugin between them)