I am curious how feasible it is to use Colab for a set of notebooks with a “closed” set of packages precompiled into a system image (i.e. for a particular set of lecture notes). For now let’s say that I don’t care about fancy TPU or GPU stuff.
A few questions and confirmations if anyone has exposure to this:
Is anyone else working on this problem and interested in working together on the technical solution?
What is the most minimal set of examples (with the easiest instructions) to look at for using Julia 1.1 on Colab? I found JuliaTPU/XLA.jl (Julia on TPUs), but it seems very specialized.
My understanding from reading the other posts on using Colab with Julia is that the container disappears every time “Reset Runtime…” is clicked. But if I do not reset, and then load a notebook in a separate window, does the container stay up and running, and is its image reused for these notebooks?
Do Colab notebooks have any direct mounting of the user’s Google Drive? (i.e. could stuff be cached there)
Nope. The question is whether a reasonably easy solution could be found for installing into the kernel and running it. Bringing a tarball of a Julia image down is easy and fast, but the precompilation etc. takes forever.
You can probably get a good solution for your situation by building Julia with your desired pre-baked system image, then packaging the entire resulting Julia installation as a tarball and hosting it somewhere. Bringing that tarball into a Colab VM should be fairly fast (we do something like that here to support nightly Swift toolchains that are more recent than the one included in the VM).
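As a rough illustration of the "bring the tarball into the VM" step, here is a minimal Python sketch that could be run from a notebook cell. The function name fetch_and_unpack is made up for this example, and the URL and destination you pass in are whatever you chose when hosting your own build; nothing here is specific to Colab.

```python
import os
import shutil
import tarfile
import tempfile
import urllib.request


def fetch_and_unpack(url, dest_dir):
    """Download a .tar.gz archive from `url` and unpack it into `dest_dir`.

    Hypothetical helper for illustration; `url` would point at wherever
    you host your prebuilt Julia installation.
    """
    os.makedirs(dest_dir, exist_ok=True)
    # Stream the download to a temporary file so a large archive
    # never has to fit in memory.
    with tempfile.NamedTemporaryFile(suffix=".tar.gz", delete=False) as tmp:
        with urllib.request.urlopen(url) as response:
            shutil.copyfileobj(response, tmp)
        archive_path = tmp.name
    try:
        # Only unpack archives you built yourself: extractall trusts
        # the member paths stored in the archive.
        with tarfile.open(archive_path, "r:gz") as archive:
            archive.extractall(dest_dir)
    finally:
        os.remove(archive_path)
    return dest_dir
```

Since the system image is already baked into the archive, the slow precompilation step happens once at build time rather than in every fresh VM.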
Some Colab background that might be helpful:
Notebooks are .ipynb files that additionally include a kernel string (Python 2, Python 3, Swift, Julia, etc.) and a VM type (CPU, GPU, or TPU) in their JSON metadata.
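To make the metadata point concrete, here is a small Python sketch that reads those two fields out of a notebook's JSON. The kernelspec entry is part of the standard Jupyter notebook format; the "accelerator" key, the "CPU" default, and the "python3" fallback are assumptions about how Colab stores the VM type, so check a notebook you saved from Colab before relying on them.

```python
import json


def describe_runtime(notebook_json):
    """Return (kernel name, accelerator) from a notebook's JSON text."""
    notebook = json.loads(notebook_json)
    metadata = notebook.get("metadata", {})
    # Standard Jupyter field naming the kernel the notebook wants.
    kernel = metadata.get("kernelspec", {}).get("name", "python3")
    # Assumed Colab field; treated as CPU when absent.
    accelerator = metadata.get("accelerator", "CPU")
    return kernel, accelerator


# A minimal, hand-written notebook used only for illustration.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {
        "kernelspec": {"name": "julia-1.1", "display_name": "Julia 1.1"},
        "accelerator": "GPU",
    },
    "cells": [],
})

print(describe_runtime(example))
```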
Running a cell in a notebook that doesn’t yet have a VM connected causes two things to happen:
The system checks if a VM of the requested hardware type is already running under your account and launches one if it isn’t.
It then requests that the VM launch a new kernel instance for the requested language. If the VM’s Jupyter server doesn’t have a kernel registered for that language, it launches a Python kernel.
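For context on what "a kernel registered for that language" means: a Jupyter server discovers kernels through kernel.json files inside a kernels directory. The sketch below writes such a file for a Julia install that uses a custom system image. All paths, the kernel name "julia-1.1", and the helper name are hypothetical, and the argv is only modelled on what a Julia kernel typically looks like; IJulia normally generates this file for you.

```python
import json
import os


def write_julia_kernelspec(kernels_dir, julia_bin, kernel_script,
                           sysimage=None, name="julia-1.1"):
    """Write <kernels_dir>/<name>/kernel.json and return its path.

    Hypothetical helper: every path passed in is an assumption about
    where you unpacked your own Julia build.
    """
    argv = [julia_bin, "-i", "--startup-file=yes", "--color=yes"]
    if sysimage is not None:
        # -J tells Julia to start from the given system image file.
        argv += ["-J", sysimage]
    # Jupyter substitutes {connection_file} when it launches the kernel.
    argv += [kernel_script, "{connection_file}"]
    spec = {"display_name": "Julia 1.1", "argv": argv, "language": "julia"}

    spec_dir = os.path.join(kernels_dir, name)
    os.makedirs(spec_dir, exist_ok=True)
    path = os.path.join(spec_dir, "kernel.json")
    with open(path, "w") as handle:
        json.dump(spec, handle, indent=2)
    return path
```

The directory name under kernels_dir is what has to match the kernel name stored in the notebook's metadata.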
The CPU/GPU/TPU VMs come preinstalled with three Jupyter kernels (Python 2, Python 3, and Swift) and a number of packages for those languages (many common Python pip packages, and the Swift for TensorFlow toolchain).
It’s possible to use things like IPython’s shell escape (e.g., !apt-get install ...) to run arbitrary shell commands/installations in a VM connected to a notebook.
These installations will persist for the lifetime of the VM, which varies but is often around a day.
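If you would rather script the installation from Python than type shell commands into cells one at a time, a thin wrapper like the following does roughly the same job. This is only a sketch: run_shell is a made-up helper name, and the command you pass is whatever you would otherwise have typed into the cell.

```python
import subprocess


def run_shell(command):
    """Run a shell command, returning its combined stdout and stderr.

    Raises subprocess.CalledProcessError if the command exits non-zero,
    so a failed install step stops the notebook instead of being ignored.
    """
    result = subprocess.run(
        command,
        shell=True,
        check=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


print(run_shell("echo installation step would go here"))
```

Because the result only lasts as long as the VM, it helps to keep all such steps in a single setup cell that can be rerun after a reset.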
Somewhat confusingly, the Colab UI often refers to both kernels and VMs as “runtimes” (e.g., “restart runtime” means “restart kernel” while “reset runtime” means “reset VM”).
Then I opened up a preexisting Julia notebook and it worked. If you open up a new web browser window, should it reuse the existing VM (i.e. is the VM associated with a particular session, or with the user account at a particular point in time)?
Are there any major “gotchas”, i.e. Jupyter features which don’t work very well in Colab?
This was actually changed very recently (within the past day or so). You now get a nice pop-up that tells you that the requested kernel isn’t installed. Maybe we should ask Colab to let us put a link to an installation notebook for the relevant kernel into that error message for now (not sure they want to encourage the proliferation of hacking custom kernels onto Colab, but it’d be a fairly low-tech approach to clear up the confusion about how to get the custom kernel).