Google Colab compatibility broken

Google Colab is a great way to run code in the browser with free GPUs, and is amazingly useful for education. By default it uses Python, but there are a few ways to switch the kernel and use Julia instead (or use PyJulia).

However, it seems that since the recent Google Colab update, Julia 1.8.4+ installs (e.g., @JohnnyChen94’s with jill, @ageron’s with a manual install, and my own attempt with juliaup) are breaking with the following error:

ERROR: `ccall` requires the compiler
Stacktrace:
 [1] Vector{Pkg.REPLMode.Statement}(#unused#::UndefInitializer, m::Int64)
   @ Core boot.jl:459
 [2] Vector{Pkg.REPLMode.Statement}()
   @ Core boot.jl:478
 [3] collect(itr::Base.Generator{Base.Iterators.Filter{Base.var"#97#98"{typeof(isempty)}, Vector{Vector{Pkg.REPLMode.QString}}}, Pkg.REPLMode.var"#13#14"})
   @ Base array.jl:784
 [4] map(f::Function, A::Base.Iterators.Filter{Base.var"#97#98"{typeof(isempty)}, Vector{Vector{Pkg.REPLMode.QString}}})
   @ Base abstractarray.jl:2961
 [5] parse(input::String)
   @ Pkg.REPLMode fatal: error thrown and no exception handler available.
ErrorException("`ccall` requires the compiler")
unsafe_convert at pointer.jl:59
pointer at strings/string.jl:99
pointer at strings/string.jl:100
codeunit at strings/string.jl:107
getindex at strings/string.jl:227
joinpath at path.jl:316
#showerror#861 at errorshow.jl:90
showerror##kw at errorshow.jl:86
display_error at client.jl:103
No source directory specified.

I was wondering if anybody knows how this error could occur? I’ve done some initial investigations, and Colab definitely has build-essential already installed, so /usr/bin/gcc is there etc. Julia 1.8.3 and older seem to work.
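For anyone who wants to reproduce the checks above, here is a minimal sketch of a Python cell that records where the toolchain resolves and what the dynamic loader environment looks like. The function name `collect_diagnostics` is made up for illustration, and looking at `LD_LIBRARY_PATH` and `LD_PRELOAD` is only my assumption about what might be relevant, not something established here.

```python
import os
import shutil


def collect_diagnostics(tools=("gcc", "julia")):
    """Return where each tool resolves on PATH plus loader-related environment variables."""
    report = {f"which_{tool}": shutil.which(tool) for tool in tools}
    # Assumption: these dynamic-loader variables are worth recording; either may be unset (None).
    for var in ("LD_LIBRARY_PATH", "LD_PRELOAD"):
        report[var] = os.environ.get(var)
    return report


for key, value in collect_diagnostics().items():
    print(f"{key}: {value}")
```

Pasting the printed report into a bug report makes it easier to compare a working runtime with a broken one.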

It looks like the last update to Google Colab was announced in November.

Does starting julia with

LD_LIBRARY_PATH="" julia

help? This sounds like the issue “Julia 1.8.0 hangs on startup when LD_LIBRARY_PATH is set” (Issue #46409 · JuliaLang/julia · GitHub).
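If the command is launched from a Python cell instead of the shell, the same idea can be sketched as below. The helper names `clean_loader_env` and `run_with_clean_loader_path` are hypothetical, and this only mirrors the `LD_LIBRARY_PATH=""` prefix; whether it helps with the crash is a separate question.

```python
import os
import subprocess


def clean_loader_env(base=None):
    """Return a copy of the environment with LD_LIBRARY_PATH set to the empty string."""
    env = dict(os.environ if base is None else base)
    env["LD_LIBRARY_PATH"] = ""
    return env


def run_with_clean_loader_path(cmd):
    """Run cmd (a list of arguments) the way `LD_LIBRARY_PATH="" cmd` would in a shell."""
    return subprocess.run(cmd, env=clean_loader_env(), capture_output=True, text=True)
```

For example, `run_with_clean_loader_path(["julia", "-e", "println(VERSION)"])` would return a completed process whose `stdout` and `stderr` can be inspected, assuming `julia` is on `PATH`.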

Unfortunately this does not seem to fix it, but thanks for the tip.

Here’s the modified attempt, using the jill version: Google Colab

# 1. install latest Julia using jill.py
#    tip: one can install specific Julia version using e.g., `jill install -v 1.7`
!pip install jill && LD_LIBRARY_PATH="" jill install --upstream Official --confirm -v 1.8.4
# 2. install IJulia kernel
!LD_LIBRARY_PATH="" julia -e 'using Pkg; pkg"add IJulia"; using IJulia; installkernel("Julia")'
# 3. hot-fix patch to strip the version suffix of the installed kernel so that this notebook kernelspec is version agnostic
!jupyter kernelspec install $(jupyter kernelspec list | grep julia | tr -s ' ' | cut -d' ' -f3) --replace --name julia

Same error, occurring at the jill install step:

----- Post Installation -----
remove downloaded files...
remove /content/julia-1.8.4-linux-x86_64.tar.gz
remove /content/julia-1.8.4-linux-x86_64.tar.gz.asc
Done!
┌ Warning: The Pkg REPL mode is intended for interactive use only, and should not be used from scripts. It is recommended to use the functional API instead.
└ @ Pkg.REPLMode /cache/build/default-aws-shared0-3/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/Pkg/src/REPLMode/REPLMode.jl:379
ERROR: `ccall` requires the compiler
Stacktrace:
 [1] Vector{Pkg.REPLMode.Statement}(#unused#::UndefInitializer, m::Int64)
   @ Core boot.jl:459
 [2] Vector{Pkg.REPLMode.Statement}()
   @ Core boot.jl:478
 [3] collect(itr::Base.Generator{Base.Iterators.Filter{Base.var"#97#98"{typeof(isempty)}, Vector{Vector{Pkg.REPLMode.QString}}}, Pkg.REPLMode.var"#13#14"})
   @ Base array.jl:784
 [4] map(f::Function, A::Base.Iterators.Filter{Base.var"#97#98"{typeof(isempty)}, Vector{Vector{Pkg.REPLMode.QString}}})
   @ Base abstractarray.jl:2961
 [5] parse(input::String)
   @ Pkg.REPLMode fatal: error thrown and no exception handler available.
ErrorException("`ccall` requires the compiler")
unsafe_convert at pointer.jl:59
pointer at strings/string.jl:99
pointer at strings/string.jl:100
codeunit at strings/string.jl:107
getindex at strings/string.jl:227
joinpath at path.jl:316
#showerror#861 at errorshow.jl:90
showerror##kw at errorshow.jl:86
display_error at client.jl:103
No source directory specified.

I see the same error with the manual install version: Google Colab.

Here’s the line where it fails, in cell 1:

    echo "Installing Julia package $PKG..."
    LD_LIBRARY_PATH="" julia -e 'using Pkg; pkg"add '$PKG'; precompile;"'

Note that you have to delete &> /dev/null so the ccall error actually shows up.
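The same applies if the install step is driven from Python: merge stderr into the captured output instead of discarding it. A minimal sketch, where `run_and_report` is a made-up helper name:

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd and return (exit code, combined stdout and stderr) so no error text is lost."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep stderr instead of sending it to /dev/null
        text=True,
    )
    return proc.returncode, proc.stdout


# Stand-in command that fails on stderr, so the sketch runs without Julia installed.
code, output = run_and_report(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom\\n'); sys.exit(3)"]
)
print(code, output.strip())
```

In the notebook, the stand-in command would be replaced by the actual `julia -e ...` invocation.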

Update: it now looks like Google Colab is currently broken on all Julia versions. Even the one from apt-get install doesn’t seem to work. Both jill.sh and juliaup result in Julia binaries which get pointer errors.

Not really answering the question, but as you likely know Julia (and e.g. R) is not officially supported:

We’re aware that users are interested in support for other Jupyter kernels (eg R or Scala). We would like to support these, but don’t yet have any ETA.

I’m not up to speed on Amazon SageMaker, but it seems to be a competitor that supports Python and R. I found official docs for R, and something mentioning Julia and other languages here:

I can’t confirm that Julia is officially supported there (is it known to work?). Like you, I would like Julia to be officially supported in both places, or at least to work. Maybe SageMaker works for you, and if it is officially supported, even better: it might put pressure on Google, in case competition works.

so basically, Google not supporting Julia by… never supporting Julia :slight_smile:

I see no reason why this is broken, though, even though it’s not “officially supported.” Google Colab runs Ubuntu 20.04, with sudo access to apt-get, etc. You can install a bunch of different tools on it that are not, per se, officially supported. So why does installing Julia break…? Even the legacy aptitude version of Julia, version 1.4, breaks…

It’s not just the jupyter kernel. Even just installing Julia as a background tool doesn’t work.

GitHub issue: Broken on Ubuntu 20.04 inside Google Colab VM · Issue #48461 · JuliaLang/julia · GitHub

Google can do any number of things to the libraries that interfere with the JIT, or even block JIT-style code from executing.

I think Colab is not “just” a full vCPU VM running Ubuntu.

I really cannot think of any other reason why Julia is all of a sudden broken on Ubuntu 20.04.

Not sure if the following is a test for this or not, but I just tried with julia --compile=no -O0 and got the same error.

What is the context? Is there a notebook I could look at to see what’s happening, and how it (usually) makes Julia work despite not being officially supported?

In this function (is this error misleading? ccall only requires a library (.so), not a compiler. So is “No source directory specified.” the more helpful part?):

static jl_value_t *eval_value(jl_value_t *e, interpreter_state *s)  

Here is the MWE I linked on the GitHub issue: Google Colab. I wonder if it might be a problem with one of the shared libraries (libc.so?) on the Colab VM being incompatible with the one used to compile the Julia binaries?

I get all errors 1.5, 1.6, 1.7 from there:
http://wiki.seas.harvard.edu/geos-chem/index.php/Other_less-common_errors#Memory_error:_.22munmap_chunk:_invalid_pointer.22

This happens when the pointer passed to (C-library language routine free(), which is called from Fortran routine NULLIFY()) is not valid or has been modified somehow.

I will look into this more and may then edit this post. I’m sending it now since I have to run, so it’s a bit incomplete.

I could confirm your new stack trace/error. I could also start Julia (I got the banner, then a hang) with just !julia, but I would debug the third MWE’s stack trace:

!julia -e ''

munmap_chunk(): invalid pointer

signal (6): Aborted
in expression starting at none:0
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7f28bd0ea26d)
unknown function (ip: 0x7f28bd0f22fb)
unknown function (ip: 0x7f28bd0f254b)
close_unit_1 at /workspace/srcdir/gcc-12.1.0/libgfortran/io/unit.c:742
close_units at /workspace/srcdir/gcc-12.1.0/libgfortran/io/unit.c:800
unknown function (ip: 0x7f28bd49ff6a)
unknown function (ip: 0x7f28bd0a38a6)
exit at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
main at julia (unknown line)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x401098)
Allocations: 2909 (Pool: 2895; Big: 14); GC: 0

It should at least rule out some Download/curl problem I suspected from your second stacktrace.

GCC 12.1 was released on May 6, 2022. Has Julia definitely been tested with it before, or did Colab just recently update to it? (It might be unusual for the other software they run to actually rely on libgfortran.)

I was just looking at these lines, which mention 12.1. They may point to the cause, or just be a false alarm:

close_unit_1 at /workspace/srcdir/gcc-12.1.0/libgfortran/io/unit.c:742
close_units at /workspace/srcdir/gcc-12.1.0/libgfortran/io/unit.c:800

It’s non-deterministic: sometimes I get free(): invalid size and sometimes the following (otherwise the stack trace is very similar; I didn’t read too closely, but at least the addresses of the “unknown function” frames differ):

free(): invalid pointer

signal (6): Aborted
in expression starting at none:1
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7f4e4141d26d)
unknown function (ip: 0x7f4e414252fb)
unknown function (ip: 0x7f4e41426b2b)
curl_slist_free_all at /usr/local/bin/../lib/julia/libcurl.so (unknown line)
[..]

or:

free(): invalid size

signal (6): Aborted
in expression starting at none:1
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7fa50c95826d)
unknown function (ip: 0x7fa50c9602fb)
unknown function (ip: 0x7fa50c961b3b)
curl_slist_free_all at /usr/local/bin/../lib/julia/libcurl.so (unknown line)
Curl_cookie_loadfiles at /usr/local/bin/../lib/julia/libcurl.so (unknown line)
Curl_pretransfer at /usr/local/bin/../lib/julia/libcurl.so (unknown line)
multi_runsingle at /usr/local/bin/../lib/julia/libcurl.so (unknown line)
multi_socket at /usr/local/bin/../lib/julia/libcurl.so (unknown line)
curl_multi_socket_action at /usr/local/bin/../lib/julia/libcurl.so (unknown line)
curl_multi_socket_action at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/Downloads/src/Curl/Curl.jl:48 [inlined]
curl_multi_socket_action at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/Downloads/src/Curl/Curl.jl:56 [inlined]
macro expansion at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/Downloads/src/Curl/utils.jl:28 [inlined]
do_multi at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/Downloads/src/Curl/Multi.jl:114
#32 at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/Downloads/src/Curl/Multi.jl:131 [inlined]
lock at ./lock.jl:185
#31 at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/Downloads/src/Curl/Multi.jl:128 [inlined]
macro expansion at ./asyncevent.jl:281 [inlined]
#666 at ./task.jl:134
jfptr_YY.666_32825.clone_1 at /usr/local/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/gf.c:2377 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/gf.c:2559
jl_apply at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/julia.h:1843 [inlined]
start_task at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/task.c:931
Allocations: 2908 (Pool: 2895; Big: 13); GC: 0

Installing known registries into ~/.julia
double free or corruption (out)
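Since the abort message changes from run to run, one way to get a feel for the distribution is to repeat the command and tally the first non-empty line of output. This is only a sketch; `tally_first_lines` is a hypothetical helper, and the run count is arbitrary.

```python
import subprocess
from collections import Counter


def tally_first_lines(cmd, runs=10):
    """Run cmd `runs` times and count the first non-empty line of combined stdout/stderr."""
    counts = Counter()
    for _ in range(runs):
        proc = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
        lines = [line.strip() for line in proc.stdout.splitlines() if line.strip()]
        counts[lines[0] if lines else "<no output>"] += 1
    return counts
```

Calling `tally_first_lines(["julia", "-e", ""], runs=20)` on an affected runtime would then show how often each message (for example `free(): invalid pointer` versus `munmap_chunk(): invalid pointer`) appears.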

At the least, I doubt it’s an incompatible libc.so.

curl_multi_socket_action

https://curl.se/libcurl/c/curl_multi_socket_action.html

When the application has detected action on a socket handled by libcurl, it should call curl_multi_socket_action with the sockfd argument set to the socket with the action. […] The curl_multi_socket_action function informs the application about updates in the socket (file descriptor) status by doing none, one, or multiple calls to the socket callback function set with the CURLMOPT_SOCKETFUNCTION option to curl_multi_setopt. They update the status with changes since the previous time the callback was called.

Get the timeout time by setting the CURLMOPT_TIMERFUNCTION option with curl_multi_setopt.

signal (6): Aborted
in expression starting at none:1
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7f3c1db7826d)
unknown function (ip: 0x7f3c1db802fb)
unknown function (ip: 0x7f3c1db81f9f)
curl_slist_free_all at /usr/local/bin/../lib/julia/libcurl.so (unknown line)

It looks like Colab uses gcc 9.4.0. The printout

/workspace/srcdir/gcc-12.1.0/libgfortran/io/unit.c:742

is actually (I think) debugging information leaking from the machine used to compile Julia binaries. I remember seeing this issue before, where the Julia compilation directory structure was being printed instead of a relative path.
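To make that distinction easier to see in a long backtrace, the frames can be sorted by path prefix. A rough sketch: the two prefixes are simply the ones that show up in the traces in this thread, treating them as build-machine paths is my interpretation rather than something confirmed, and `frame_origin` is a made-up name.

```python
# Assumption: paths under these prefixes come from the machine that built the Julia binaries.
BUILD_PREFIXES = ("/workspace/srcdir/", "/cache/build/")


def frame_origin(frame_line):
    """Classify one backtrace line of the form '<symbol> at <path>[:line] ...'."""
    _, sep, location = frame_line.partition(" at ")
    tokens = location.split()
    if not sep or not tokens:
        return "unknown"
    path = tokens[0].split(":")[0]  # drop a trailing ':<line number>' if present
    if path.startswith(BUILD_PREFIXES):
        return "build-machine"
    if path.startswith("/"):
        return "local"
    return "unknown"


trace = [
    "gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)",
    "unknown function (ip: 0x7f28bd0ea26d)",
    "close_unit_1 at /workspace/srcdir/gcc-12.1.0/libgfortran/io/unit.c:742",
]
for line in trace:
    print(frame_origin(line), "|", line)
```

Frames tagged as local are the ones that actually resolve to files on the Colab VM.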

Not sure if that matters but:

  • Need at least Julia 1.8.4 when executing programs compiled with GCC12
  • There are potential issues with system’s and Julia libcurl for some Linux distros #48419

I have now raised the issue on the Google Colab issues page: Colab update breaks compatibility with JuliaLang · Issue #3385 · googlecolab/colabtools · GitHub.

Update: On the Colab issue @ragman-google shared the following workaround, which works (!) as a temporary solution until they are able to patch the current runtime:

  • Open command palette (bottom left menu → second icon from the bottom)
  • Search for “use fallback runtime version” and hit enter

Colab will now use the older runtime version, which still works with Julia :tada:. So if you maintain any Colab notebooks (@ageron @JohnnyChen94 @jonathan-laurent), you should add temporary instructions at the top of the notebook that users should switch to the old runtime version to get things working. Then, hopefully the current runtime will be patched soon :crossed_fingers:.
