Recommended workflow for creating and maintaining Jupyter tutorials. Weave?


#1

I am creating a bunch of Jupyter notebooks for students with code, markdown, and embedded LaTeX, and am wondering what the suggested workflow for this is (including version control, etc.). Hopefully the reasons for this are obvious (i.e. the .ipynb files are a pain to write, and of course don’t work well for viewing diffs).

As far as I can tell, is https://github.com/mpastell/Weave.jl the intended solution for this? If so, I could use a little advice:

  • I assume writing things as .jmd files is appropriate, but https://atom.io/packages/language-weave is confusing me a little on the possible options.
  • For editing, is vscode a good solution?
  • To output notebooks, the http://weavejl.mpastell.com/latest/notebooks/ instructions are a little confusing, but I assume I do something like convert_doc("my.jmd", "myfile.ipynb")
  • Is there a well-established way to run a regression test (hopefully with CI) of a .jmd file?

Sorry if any of these questions are obvious.


#2

The julia VS Code extension supports Weave.jl and .jmd files. Essentially you create a .jmd file in VS Code. You then get full IntelliSense on the julia code in the .jmd file. We also have three weave commands: two that compile the file and show a preview in the editor, and one that saves the file in one of the supported output formats.

But be warned: weave is slow, so all of these commands end up being slow. At some point I’m planning to add a much more interactive workflow, but that is probably quite a bit off.


#3

@davidanthoff Thanks, very helpful and looking forward to getting a better workflow in the future. In the meantime, the syntax highlighting/etc. in your extension works amazing (as always).


#4

The intention for Weave is to produce HTML, markdown, LaTeX etc. formats directly, but you can use it for converting to (and from) Jupyter notebooks as well. It doesn’t write the results of running the code into notebooks, because the output from Weave differs from the output of IJulia. If you want the code in the notebook to be run as well, you could use nbconvert to do it.

I currently mostly use VS Code or Atom with language-weave and Hydrogen (see https://github.com/mpastell/language-weave for setting keyboard to work with Hydrogen). The nice thing about Hydrogen is that it works well with remote kernels.

I use CI to run Weave tests on some .jmd documents. The tests just compare current output against a stored reference. I would be happy to see a better way to do it though.
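
The "compare current output against a stored reference" idea can be sketched roughly as follows. This is only an illustration in Python using the standard library, not the actual Weave test code; the function names diff_against_reference and check_output are hypothetical, and it assumes the woven output has already been written to a text file (e.g. markdown).

```python
import difflib
from pathlib import Path


def diff_against_reference(current, reference):
    """Return unified-diff lines; an empty list means the output is unchanged."""
    return list(
        difflib.unified_diff(
            reference.splitlines(),
            current.splitlines(),
            fromfile="reference",
            tofile="current",
            lineterm="",
        )
    )


def check_output(output_path, reference_path):
    """Compare a freshly generated output file with its stored reference file."""
    current = Path(output_path).read_text(encoding="utf-8")
    reference = Path(reference_path).read_text(encoding="utf-8")
    diff = diff_against_reference(current, reference)
    for line in diff:
        print(line)  # show what changed so the CI log is useful
    return not diff
```

A CI job could call check_output for each document and fail when it returns False. Outputs that legitimately vary between runs (timings, random numbers) would need to be excluded or normalised first.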


#5

Unfortunately I don’t see a way to make Weave start faster.

I just keep a Julia session open for calling Weave repeatedly when I’m working on a document to get a reasonable workflow.


#6

I think we have some sort of cycling for julia processes that do the weave stuff right now. Don’t remember the details, too long ago that I worked on it :wink:

What I want to add eventually would bypass Weave.jl for the preview in VS Code entirely and essentially just use it to produce final outputs. The idea would be to reuse the VS Code builtin markdown preview code, which is fast enough to run on every keystroke, i.e. that gives you a model where you can edit your md file in one column and the preview updates in real time on the right side. Then, whenever there is a code block in the jmd file, keep an internal list of output from code blocks in the VS Code extension, and just show that in the preview. If folks hit something like Ctrl+Enter in a code block, just run that code block, capture the output and then update the cached list of outputs, so that the preview updates. The idea would be to provide a much more notebook-like experience for editing these jmd files.
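
To make the cached-output idea concrete, here is a rough sketch in Python (standard library only). It is not how the VS Code extension works or will work; the OutputCache class, its methods, and the simplified chunk pattern (plain julia fences only, no chunk options) are all assumptions for illustration.

```python
import itertools
import re

FENCE = "`" * 3  # built this way to avoid writing a literal fence in the example

# Matches a fenced julia chunk in a .jmd source (simplified: no chunk options).
CHUNK_RE = re.compile(FENCE + r"julia[^\n]*\n(.*?)" + FENCE, re.DOTALL)


class OutputCache:
    """Keeps the last captured output for each code chunk, keyed by chunk index."""

    def __init__(self):
        self.outputs = {}

    def update(self, index, output):
        # Called after the user runs a single chunk (e.g. with Ctrl+Enter).
        self.outputs[index] = output

    def render(self, source):
        """Return the source with cached outputs appended after each chunk."""
        counter = itertools.count()

        def add_output(match):
            cached = self.outputs.get(next(counter))
            if cached is None:
                return match.group(0)  # chunk not run yet: show the code only
            return match.group(0) + "\n\n" + cached

        return CHUNK_RE.sub(add_output, source)
```

Rendering only does a regex substitution and a dictionary lookup, so it is cheap enough to repeat on every edit; code only runs when update is called for one chunk. A real implementation would also have to invalidate or re-key cached outputs when chunks are inserted or deleted.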


#7

I just looked at your GitHub (I like it :slight_smile: ), which links to QuantEcon, which links to a blog post about Jupinx (a Sphinx extension that converts reStructuredText (RST) files into Jupyter notebooks).

It seems interesting. (I would prefer RST as more standardized than Markdown, but your choice could be different).

The blog also refers to nbdime (tools for diffing and merging Jupyter notebooks).

Maybe too many possibilities for us! :stuck_out_tongue:


#8

@mpastell Thanks for the response. If Jupyter notebooks weren’t one of the primary targets, is there any library or technology targeting Jupyter more directly (or is it simply that it was previously of lower priority for you, but otherwise makes sense within the scope of the package)?

Even if the output is slightly different, is it possible? I am willing to accept that if one of the users reruns the code, it might look different than the old output (as long as it replaces the old output when they press <Ctrl-Enter> in Jupyter). Of course, if I used a {julia;term=true} block, that wouldn’t apply.

What this would entail is taking whatever you would normally generate, and putting it in an output for the cell? For example, if the .jmd had a julia block with a = 2 + 3 in it, then I think you just need to put your results in the outputs for the cell? The following is a snippet of this, taken directly from a Jupyter notebook:

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = 2 + 3"
   ]
  }
 ]
}
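
As a rough illustration of "putting the results in the outputs for the cell", the sketch below builds the same kind of cell in Python with the standard json module. The code_cell helper is hypothetical, and result_text stands in for whatever text a tool would capture from running the chunk; nothing here is part of Weave.

```python
import json


def code_cell(source, result_text, execution_count=1):
    """Build a notebook code cell whose stored output is `result_text`."""
    return {
        "cell_type": "code",
        "execution_count": execution_count,
        "metadata": {},
        "outputs": [
            {
                "data": {"text/plain": [result_text]},
                "execution_count": execution_count,
                "metadata": {},
                "output_type": "execute_result",
            }
        ],
        "source": [source],
    }


cell = code_cell("a = 2 + 3", "5")
# Print the cell wrapped in a "cells" list, mirroring the snippet above.
print(json.dumps({"cells": [cell]}, indent=1, sort_keys=True))
```

A complete .ipynb file also needs top-level metadata and format-version fields, which are left out here just as they are in the snippet above.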

#9

@Liso Yes, those guys are working on some great RST tools, which I hope to evaluate as well.

The main advantage I can see with markdown is that it is a lot simpler, and RST has a lot of features which are overkill for what is really wanted (i.e. a clean text way to specify jupyter notebooks with embedded markup and math). The other consideration is that the natural markup within Jupyter is markdown.

But I am open to anything with good tooling.


#10

I have a hacky unregistered package for this task: https://github.com/jverzani/WeavePynb.jl that basically does just this.


#11

It is easy to do, but leads to a bad user experience. Sometimes there is no output from Weave when there is output from IJulia and vice versa. Some packages also detect when they are running in IJulia and customize their output (for some I have adapted this in Weave as well).

nbconvert has a feature for executing notebooks:

Jupyter notebooks are often saved with output cells that have been cleared. nbconvert provides a convenient way to execute the input cells of an .ipynb notebook file and save the results, both input and output cells, as a .ipynb file.

So you could first run convert_doc in Weave and then call nbconvert

jupyter nbconvert --to notebook --execute mynotebook.ipynb

or call nbconvert using PyCall.
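
For scripting this step, a minimal Python sketch that shells out to the same command might look like the following. The helper names nbconvert_command and execute_notebook are hypothetical, and it assumes the jupyter executable is on the PATH.

```python
import shutil
import subprocess


def nbconvert_command(notebook_path, jupyter="jupyter"):
    """Return the argument list for executing a notebook with nbconvert."""
    return [jupyter, "nbconvert", "--to", "notebook", "--execute", notebook_path]


def execute_notebook(notebook_path):
    """Run nbconvert on the notebook; raises if jupyter is missing or fails."""
    jupyter = shutil.which("jupyter")
    if jupyter is None:
        raise FileNotFoundError("jupyter was not found on the PATH")
    subprocess.run(nbconvert_command(notebook_path, jupyter), check=True)
```

Calling execute_notebook("mynotebook.ipynb") would then run the command shown above, after first checking that jupyter can be found.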


#12

That sounds very good, I hope you find the time to implement it. It sounds very similar to the R notebooks (http://rmarkdown.rstudio.com/r_notebooks.html) in RStudio.

I’m happy using Hydrogen, it gives a nice preview of results inline (figure below) or in a dock and allows running whole chunks using "ctrl+alt+enter" (I rarely preview regular markdown either, so that’s not something I personally miss).


#13

I decided that this is in scope of the package so I added a notebook function to master. It uses nbconvert to run the code so you need to have it installed and in your path. You should now (after Pkg.checkout("Weave")) be able to use:

using Weave
notebook("doc.jmd")

and get a notebook with executed code as output. This is just a very quick version with no tests so feel free to report any issues.


#14

Thanks for all your work on this. Your solution is perfect (and resolves the workflow issues). I did some simple tests (with both the .jmd and .jl formats) and have found no problems. I will post them as I find them.

When you write docs for this, you may want to point out that JuliaPro installations don’t add jupyter into the path by default, but it can be found in /JuliaPro-0.6.1.1/Python/Scripts folders, depending on the installation.

@j_verzani : your https://github.com/jverzani/WeavePynb.jl package looks great too, but I suspect this may provide what you need for the future.


#15

When you are finished with this, is there any way you can post your guide for students as a blog post? Or make a post on discourse? I would find it very helpful for sharing results.


#16

For sure. It may take a few weeks, but I will make sure something is written up.


#17

I just check .ipynb files directly into git. github has some support for viewing Jupyter notebooks, but its equation rendering is sub-par, so I usually send my students to nbviewer links. e.g. see the notebook links for my linear-algebra class.

Weave doesn’t fit my use-case (teaching) very well. I don’t want to create a single report, I want to post notebooks, one or two per lecture, that my students can download and directly run (on juliabox.com or on their own computers).


#18

You can also install a git hook to strip the notebook output and clean up the .ipynb files a bit before committing them, which can help reduce the version control noise: https://github.com/kynan/nbstripout
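
To show what stripping the output amounts to, here is a small Python sketch using only the standard library. It is not nbstripout's implementation; the strip_outputs function is a hypothetical helper, and it assumes the notebook has already been loaded into a dict with json.load.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs removed."""
    stripped = json.loads(json.dumps(notebook))  # simple deep copy via JSON
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []  # drop stored results
            cell["execution_count"] = None  # drop the run counter
    return stripped
```

With outputs and execution counts cleared, re-running a notebook no longer changes the committed file, so the diffs that remain are changes to the code and text.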


#19

Yes, my plan was to write them in .jmd or the annotated julia, and then post them into git directly as the .ipynb for the class. I like your idea of sending them nbviewer links. It wouldn’t even need to be in the same repository, as I will separate my private course git where I create the material from the one I post up for a given year-class.

It really is convenient for the students to access the notebooks. However I can’t seem to get over:

  • The pure hell to write and maintain something in git that is opaque enough to be binary
  • Working in a web browser
  • With .ipynb I am unable to look at the github commits to see if a grad student introduced an error in the algebra when I ask them to make changes to the code

My hope was that in the long-run I could automate the process with CI to get Travis to generate the notebooks, but it isn’t very onerous to simply output and post them to the students for now.

Out of curiosity, doesn’t a single .jmd file per notebook with posting of the .ipynb separately (and an nbviewer link) work? If you need something fancier, I bet the QuantEcon crew creating Jupinx would love some feedback and contributions. It is Sphinx-based, where they are intending to have a single .rst file generate course pages, Jupyter notebooks, etc.

Last: thanks for posting your linear algebra notebooks. It would be great to collect a list of github repositories of courses that use Julia.


#20

Don’t know if you have seen https://julialang.org/teaching/ ? At least some of them have github links.