Execution of notebook in Literate.jl not working but notebook executes fine in Jupyter: serialization issue

I have a Julia script that runs fine, and I can generate a Jupyter notebook from it with Literate.jl if I use execute=false. If I open the generated notebook in Jupyter, it executes fine. But if I specify execute=true in Literate, execution fails. The problem occurs at the point where I am deserialising an MLJ model stored as a JLSO file. I get

[warn | JLSO]: UndefVarError: ##302 not defined

and the deserialised object is apparently not useable, throwing the exception caught by Literate.

Unfortunately this does not happen with every MLJ model and I’m struggling to isolate the issue. Before proceeding too far, I wanted to check if someone had some idea about what might be going on.

My models are not built using macros.

@fredrikekre @oxinabox @Pangoraw

If you can post some code that reproduces it, it would be much simpler to debug.

@fredrikekre Thanks for the offer of help. I’ve pared down my original script which is copied below.

Some observations:

  • The plain booster model (from EvoTrees.jl) deserialises fine
  • If booster is wrapped in TunedModel (to give tuned_booster) the wrapped model fails to deserialise
  • If I replace the booster model with a simpler decision tree (from MLJDecisionTreeInterface.jl), the wrapped model does not fail.
  • To reiterate my earlier comment, the plain .jl script runs fine, and a Jupyter notebook generated with execute=false executes fine when launched from Jupyter.

To reproduce the error:

  • Name the script below “notebook.jl” and start the Julia REPL from a directory containing the script
  • Create a new Julia environment with only Literate.jl in it (version 2.12.1)
  • Run using Literate; Literate.notebook("notebook.jl", ".")

The full stack trace is given under “details” at the end.

The notebook.jl script

using Pkg
Pkg.activate(temp=true)
Pkg.add(name="EvoTrees", version="0.9.4")
Pkg.add(name="MLJBase", version="0.19.6")
Pkg.add(name="MLJSerialization", version="1.1.3")

Pkg.add(name="MLJTuning", version="0.6.16")

using MLJBase
using MLJSerialization
import EvoTrees

#-

const X, y = data = make_blobs(100, 3, centers=2)

# ## booster

const Booster = EvoTrees.EvoTreeClassifier
const booster  = Booster()

mach = machine(booster, X, y) |> fit!
MLJSerialization.save("booster.jlso", mach)

# Try to deserialize:

machine("booster.jlso") # works!

# ## tuned

using MLJTuning

r1 = range(booster, :max_depth, lower=2, upper=5)
r2 = range(booster, :η, lower=-3, upper=-2, scale=x->10^x)

tuning = RandomSearch()

tuned_booster = TunedModel(model=booster,
                           range=[r1, r2],
                           tuning=tuning,
                           measures=[brier_loss, auc, accuracy],
                           resampling=StratifiedCV(nfolds=2, rng=123),
                           n=2)

mach = machine(tuned_booster, X, y) |> fit!
MLJSerialization.save("tuned_booster.jlso", mach)

# Try to deserialize:

machine("tuned_booster.jlso") # fails using Literate.notebook(..., execute=true)

The stack trace

julia> using Literate; Literate.notebook("notebook.jl", ".")
[ Info: Precompiling Literate [98b081ad-f1c9-55d3-8b20-4c87d4299306]
[ Info: generating notebook from `~/GoogleDrive/Julia/MLJ/MLJ/examples/telco/notebook.jl`
[ Info: executing notebook `notebook.ipynb`
┌ Error: error when executing notebook based on input file: `~/GoogleDrive/Julia/MLJ/MLJ/examples/telco/notebook.jl`
└ @ Literate ~/.julia/packages/Literate/mGrly/src/Literate.jl:703
ERROR: LoadError: TaskFailedException

    nested task error: IOError: stream is closed or unusable
    Stacktrace:
      [1] check_open
        @ ./stream.jl:386 [inlined]
      [2] uv_write_async(s::Base.PipeEndpoint, p::Ptr{UInt8}, n::UInt64)
        @ Base ./stream.jl:1018
      [3] uv_write(s::Base.PipeEndpoint, p::Ptr{UInt8}, n::UInt64)
        @ Base ./stream.jl:981
      [4] unsafe_write(s::Base.PipeEndpoint, p::Ptr{UInt8}, n::UInt64)
        @ Base ./stream.jl:1064
      [5] unsafe_write
        @ ./io.jl:361 [inlined]
      [6] write
        @ ./strings/io.jl:185 [inlined]
      [7] print
        @ ./strings/io.jl:187 [inlined]
      [8] with_output_color(f::Function, color::Symbol, io::IOContext{Base.PipeEndpoint}, args::String; bold::Bool)
        @ Base ./util.jl:77
      [9] printstyled(io::IOContext{Base.PipeEndpoint}, msg::String; bold::Bool, color::Symbol)
        @ Base ./util.jl:105
     [10] emit(handler::Memento.DefaultHandler{Memento.DefaultFormatter, IOContext{Base.PipeEndpoint}}, rec::Memento.DefaultRecord)
        @ Memento ~/.julia/packages/Memento/Qk5GZ/src/handlers.jl:211
     [11] log(handler::Memento.DefaultHandler{Memento.DefaultFormatter, IOContext{Base.PipeEndpoint}}, rec::Memento.DefaultRecord)
        @ Memento ~/.julia/packages/Memento/Qk5GZ/src/handlers.jl:44
     [12] log(logger::Memento.Logger, rec::Memento.DefaultRecord)
        @ Memento ~/.julia/packages/Memento/Qk5GZ/src/loggers.jl:366
     [13] _log(logger::Memento.Logger, level::String, msg::String)
        @ Memento ~/.julia/packages/Memento/Qk5GZ/src/loggers.jl:411
     [14] log
        @ ~/.julia/packages/Memento/Qk5GZ/src/loggers.jl:390 [inlined]
     [15] warn(logger::Memento.Logger, exc::UndefVarError)
        @ Memento ~/.julia/packages/Memento/Qk5GZ/src/loggers.jl:526
     [16] getindex(jlso::JLSO.JLSOFile, name::Symbol)
        @ JLSO ~/.julia/packages/JLSO/QLXip/src/serialization.jl:56
     [17] macro expansion
        @ ~/.julia/packages/JLSO/QLXip/src/file_io.jl:131 [inlined]
     [18] (::JLSO.var"#33#35"{Dict{Symbol, Any}, JLSO.JLSOFile, Symbol})()
        @ JLSO ./threadingconstructs.jl:169
    
    caused by: UndefVarError: ##257 not defined
    Stacktrace:
      [1] deserialize_module(s::Serialization.Serializer{IOBuffer})
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:965
      [2] handle_deserialize(s::Serialization.Serializer{IOBuffer}, b::Int32)
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:864
      [3] deserialize(s::Serialization.Serializer{IOBuffer})
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:782
      [4] deserialize_datatype(s::Serialization.Serializer{IOBuffer}, full::Bool)
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:1287
      [5] handle_deserialize(s::Serialization.Serializer{IOBuffer}, b::Int32)
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:835
      [6] deserialize(s::Serialization.Serializer{IOBuffer})
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:782
      [7] deserialize_datatype(s::Serialization.Serializer{IOBuffer}, full::Bool)
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:1312
      [8] handle_deserialize(s::Serialization.Serializer{IOBuffer}, b::Int32)
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:835
      [9] deserialize(s::Serialization.Serializer{IOBuffer})
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:782  
     [10] handle_deserialize(s::Serialization.Serializer{IOBuffer}, b::Int32)
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:842
     [11] deserialize_fillarray!(A::Vector{MLJBase.NumericRange{T, MLJBase.Bounded, D} where {T, D}}, s::Serialization.Serializer{IOBuffer})
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:1187
     [12] deserialize_array(s::Serialization.Serializer{IOBuffer})
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:1179
     [13] handle_deserialize(s::Serialization.Serializer{IOBuffer}, b::Int32)
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:833
     [14] deserialize(s::Serialization.Serializer{IOBuffer}, t::DataType)
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:1391
     [15] handle_deserialize(s::Serialization.Serializer{IOBuffer}, b::Int32)
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:851
     [16] deserialize(s::Serialization.Serializer{IOBuffer})
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:782
     [17] handle_deserialize(s::Serialization.Serializer{IOBuffer}, b::Int32)
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:888
     [18] deserialize
        @ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:782 [inlined]
     [19] deserialize(s::IOBuffer)
        @ Serialization /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:769
     [20] deserialize(#unused#::JLSO.Formatter{:julia_serialize}, io::IOBuffer)
        @ JLSO ~/.julia/packages/JLSO/QLXip/src/serialization.jl:10
     [21] deserialize(format::Symbol, io::IOBuffer)
        @ JLSO ~/.julia/packages/JLSO/QLXip/src/serialization.jl:4
     [22] getindex(jlso::JLSO.JLSOFile, name::Symbol)
        @ JLSO ~/.julia/packages/JLSO/QLXip/src/serialization.jl:54
     [23] macro expansion
        @ ~/.julia/packages/JLSO/QLXip/src/file_io.jl:131 [inlined]
     [24] (::JLSO.var"#33#35"{Dict{Symbol, Any}, JLSO.JLSOFile, Symbol})()
        @ JLSO ./threadingconstructs.jl:169
in expression starting at string:1
when executing the following code block in file `~/GoogleDrive/Julia/MLJ/MLJ/examples/telco/notebook.jl`

    ```julia
    machine("tuned_booster.jlso") # fails using Literate.notebook(..., execute=false)
    ```

Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] execute_block(sb::Module, block::String; inputfile::String)
   @ Literate ~/.julia/packages/Literate/mGrly/src/Literate.jl:821
 [3] execute_notebook(nb::Dict{Any, Any}; inputfile::String)
   @ Literate ~/.julia/packages/Literate/mGrly/src/Literate.jl:719
 [4] (::Literate.var"#37#39"{Dict{String, Any}})()
   @ Literate ~/.julia/packages/Literate/mGrly/src/Literate.jl:700
 [5] cd(f::Literate.var"#37#39"{Dict{String, Any}}, dir::String)
   @ Base.Filesystem ./file.jl:106
 [6] jupyter_notebook(chunks::Vector{Literate.Chunk}, config::Dict{String, Any})
   @ Literate ~/.julia/packages/Literate/mGrly/src/Literate.jl:699
 [7] notebook(inputfile::String, outputdir::String; config::Dict{Any, Any}, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})                   
   @ Literate ~/.julia/packages/Literate/mGrly/src/Literate.jl:636
 [8] notebook(inputfile::String, outputdir::String)
   @ Literate ~/.julia/packages/Literate/mGrly/src/Literate.jl:632
 [9] top-level scope
   @ REPL[7]:1

julia> versioninfo()
Julia Version 1.6.5
Commit 9058264a69 (2021-12-19 12:30 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin19.6.0)
  CPU: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake)
Environment:
  JULIA_LTS_PATH = /Applications/Julia-1.6.app/Contents/Resources/julia/bin/julia
  JULIA_PATH = /Applications/Julia-1.6.app/Contents/Resources/julia/bin/julia
  JULIA_EGLOT_PATH = /Applications/Julia-1.6.app/Contents/Resources/julia/bin/julia
  JULIA_NUM_THREADS = 5
  JULIA_NIGHTLY_PATH = /Applications/Julia-1.7.app/Contents/Resources/julia/bin/julia

This error can show up if you have stored e.g. stdout in one block and then try to use it in the next block. This will not work, since stdout is redirected for every block execution and a new stdout is set up for the next one. Maybe this is the problem?
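
To illustrate the mechanism described above, here is a minimal sketch using only Base. It is not Literate's actual code: the function name stale_handle_demo and the explicit Pipe are made up for the illustration, and Base.link_pipe! is an unexported helper. The idea is that a handle to stdout captured while one block's redirection is active (as a logger might do when it is first set up) stops being writable once that redirection is torn down.

```julia
# Sketch only: mimics "stdout is redirected per block" with a plain pipe.
function stale_handle_demo()
    original = stdout

    # "Block 1": redirect stdout to a fresh pipe for the duration of the block.
    pipe = Pipe()
    Base.link_pipe!(pipe; reader_supports_async = true, writer_supports_async = true)
    redirect_stdout(pipe.in)
    saved = stdout  # e.g. what a logger would hold on to if initialised here
    try
        println(saved, "written while the pipe is live")
    finally
        redirect_stdout(original)  # end of block 1: restore the real stdout
    end
    close(pipe.in)                 # the per-block pipe is torn down
    captured = readline(pipe.out)  # the output that was collected for block 1
    close(pipe.out)

    # "Block 2": the stored handle still points at the closed pipe.
    try
        println(saved, "written after the pipe was closed")
        return (captured = captured, failed = false, err = nothing)
    catch err
        return (captured = captured, failed = true, err = err)
    end
end

result = stale_handle_demo()
println(result.failed ? "stale handle failed: $(result.err)" : "stale handle still usable")
```

On recent Julia versions the second write should fail with an error similar to the "stream is closed or unusable" message in the nested task error above, which is consistent with the Memento handler in the trace holding an IOContext{Base.PipeEndpoint}.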

Thanks.

I’m not sure I understand the details, but this sounds like Literate.jl cannot guarantee to reproduce the execution behaviour of Jupyter notebooks in every case, right? That is, I should regard this as a limitation of Literate.jl, not some bug in the packages I am using?
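
One possible reading of the "caused by" part of the trace (this is an assumption, not something confirmed in this thread): ##257 looks like a gensym-generated module name, and the failing frames are deserialising the vector of ranges, where r2 carries the anonymous function x->10^x. If the notebook blocks are evaluated in an anonymous module rather than in Main, a function defined there would be serialised with a reference to that module, which cannot be looked up again from Main on loading. The sketch below reproduces that effect with only the Serialization stdlib. The sandbox module and the helper try_roundtrip are hypothetical and stand in for whatever module the blocks really run in.

```julia
using Serialization

# Serialise to an in-memory buffer and read the value straight back.
# Returns the deserialised value, or the exception if anything throws.
function try_roundtrip(x)
    io = IOBuffer()
    try
        serialize(io, x)
        seekstart(io)
        return deserialize(io)
    catch err
        return err
    end
end

# A module that is not bound to any name in Main (hypothetical stand-in
# for a notebook sandbox).
sandbox = Module(gensym("sandbox"))

# An anonymous function defined inside that module, like the `scale` of r2.
scale = Core.eval(sandbox, :(x -> 10^x))

outcome = try_roundtrip(scale)
println(outcome isa Exception ? "round trip failed: $(typeof(outcome))" : "round trip succeeded")
```

If that reading is right, the failure would depend on where the closure is defined rather than on EvoTrees or TunedModel themselves, which would fit the observation that the plain script and the Jupyter run (both evaluating in Main) work.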

I’m not sure if the issue described by @ablaom is really related to Literate.jl. I’ve experienced the same problem in a Pluto notebook from time to time (in this tutorial: MLJTutorial.jl - 02_models).

There we save a model:
mach2 = machine("neural_net.jlso")

… and a few lines later it is retrieved again from disk:
mach3 = machine("neural_net.jlso", XIris, yIris)

When I load that Pluto notebook (and thus execute it automatically), I often run into an error message on the line where the model is loaded.

But when I execute the two lines manually, the problem never occurs. So my impression is that there is some timing problem, i.e. when execution happens automatically, there seems to be not enough time between saving and loading.