CSV.jl fails precompiling, TypeError: in Type{...} expression, expected UnionAll, got Type{Parsers.Options}

Hi all,
Yesterday I found that CSV.jl fails to precompile when it is imported in a Jupyter notebook. To test, I tried a new environment containing only CSV, but it failed the same way. See below for the error message.
Intriguingly, CSV.jl imports and precompiles fine from the REPL or a script.
I am on Julia 1.6.2 on Mac (the official build from the Julia website), with CSV.jl v0.8.5 added using Pkg.
I wonder what I am doing wrong. Any idea? Thanks!

using CSV
┌ Info: Precompiling CSV [336ed68f-0bac-5ca0-87d4-7b16caf5d00b]
└ @ Base loading.jl:1342
ERROR: LoadError: LoadError: TypeError: in Type{...} expression, expected UnionAll, got Type{Parsers.Options}
Stacktrace:
  [1] top-level scope
    @ ~/.julia/packages/CSV/Zl2ww/src/detection.jl:164
  [2] include(mod::Module, _path::String)
    @ Base ./Base.jl:386
  [3] include(x::String)
    @ CSV ~/.julia/packages/CSV/Zl2ww/src/CSV.jl:1
  [4] top-level scope
    @ ~/.julia/packages/CSV/Zl2ww/src/CSV.jl:27
  [5] include
    @ ./Base.jl:386 [inlined]
  [6] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::Nothing)
    @ Base ./loading.jl:1235
  [7] top-level scope
    @ none:1
  [8] eval
    @ ./boot.jl:360 [inlined]
  [9] eval(x::Expr)
    @ Base.MainInclude ./client.jl:446
 [10] top-level scope
    @ none:1
in expression starting at /Users/foo/.julia/packages/CSV/Zl2ww/src/detection.jl:164
in expression starting at /Users/foo/.julia/packages/CSV/Zl2ww/src/CSV.jl:1

Failed to precompile CSV [336ed68f-0bac-5ca0-87d4-7b16caf5d00b] to /Users/foo/.julia/compiled/v1.6/CSV/jl_cgSkrZ.

also got the same issue

This has come up a couple of times recently and is due to the new breaking Parsers.jl v2.0 release. That in and of itself shouldn’t cause problems, since CSV.jl 0.8.5 has its dependencies correctly capped, but it still becomes an issue because there are unfortunate ways for package dependencies to get loaded without the right versions. See this slack thread for more details. Basically, it boils down to some 3rd-party package (in this case, probably IJulia.jl) loading a package like JSON.jl, which also depends on Parsers.jl and does have a current release compatible with Parsers.jl 2.0, so 2.0 gets loaded. Then the current environment gets switched and CSV.jl is loaded, the right versions can’t be resolved, and the already loaded Parsers module is assumed to be the “right one”.

This can usually be fixed by pinning Parsers.jl to v1.2.1 for the time being. A new CSV.jl release compatible with Parsers.jl 2.0 is coming soon, hopefully in the next week or so.
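For anyone unsure what pinning looks like in practice, here is a minimal sketch using the Pkg API. The helper name `pin_parsers` is made up for illustration, and the version is left as a parameter on purpose: this thread mentions both 1.2.1 and 1.1.2, and the resolver output further down suggests 1.2.1 may not be a registered version. It acts on whichever environment is active when it is called, and it needs registry access to run.

```julia
using Pkg

# Hypothetical helper (not part of Pkg): install one specific Parsers version
# and pin it in the currently active environment.
function pin_parsers(version::AbstractString)
    # Ask the resolver for exactly this version of Parsers.
    Pkg.add(name="Parsers", version=version)
    # Pin it so that later `up`/`add` operations do not move it.
    Pkg.pin("Parsers")
end
```

Calling something like `pin_parsers("1.1.2")` is roughly what `add Parsers@1.1.2` followed by `pin Parsers` does in Pkg REPL mode; `free Parsers` undoes the pin later.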

Thanks for the explanation. I kinda suspected what you mentioned from the line where the issue occurs and from looking through the master branch. But I did not know IJulia could behave that way, so I ended up posting this question. Good to know.

Hiya, I am not exactly a ‘newbie’ but I have not used pinning before… I wonder if you might be able to give me a hint as to how I would achieve the ‘pinning’? [and thanks for your help here]

I tried

] add Parsers@1.2.1

But got pages of ‘Unsatisfiable requirements’
Thanks for any help getting back ‘on the road’ again.

Error message:

Unsatisfiable requirements detected for package Parsers [69de0a69]:
 Parsers [69de0a69] log:
 ├─possible versions are: 0.1.0-2.0.3 or uninstalled
 └─restricted to versions 1.2.1 by an explicit requirement - no versions left

Stacktrace:
  [1] check_constraints(graph::Pkg.Resolve.Graph)
    @ Pkg.Resolve /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Resolve/graphtype.jl:978
  [2] Pkg.Resolve.Graph(versions::Dict{Base.UUID, Set{VersionNumber}}, deps::Dict{Base.UUID, Dict{VersionNumber, Dict{String, Base.UUID}}}, compat::Dict{Base.UUID, Dict{VersionNumber, Dict{String, Pkg.Types.VersionSpec}}}, uuid_to_name::Dict{Base.UUID, String}, reqs::Dict{Base.UUID, Pkg.Types.VersionSpec}, fixed::Dict{Base.UUID, Pkg.Resolve.Fixed}, verbose::Bool, julia_version::VersionNumber)
    @ Pkg.Resolve /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Resolve/graphtype.jl:371
  [3] deps_graph(ctx::Pkg.Types.Context, uuid_to_name::Dict{Base.UUID, String}, reqs::Dict{Base.UUID, Pkg.Types.VersionSpec}, fixed::Dict{Base.UUID, Pkg.Resolve.Fixed})
    @ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:537
  [4] resolve_versions!(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec})
    @ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:407
  [5] targeted_resolve(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}, preserve::Pkg.Types.PreserveLevel)
    @ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:1214
  [6] tiered_resolve(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec})
    @ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:1200
  [7] _resolve
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:1220 [inlined]
  [8] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}, new_git::Vector{Base.UUID}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform)
    @ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:1235
  [9] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform, kwargs::Base.Iterators.Pairs{Symbol, IJulia.IJuliaStdio{Base.PipeEndpoint}, Tuple{Symbol}, NamedTuple{(:io,), Tuple{IJulia.IJuliaStdio{Base.PipeEndpoint}}}})
    @ Pkg.API /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:204
 [10] add(pkgs::Vector{Pkg.Types.PackageSpec}; io::IJulia.IJuliaStdio{Base.PipeEndpoint}, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Pkg.API /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:80
 [11] add(pkgs::Vector{Pkg.Types.PackageSpec})
    @ Pkg.API /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:78
 [12] do_cmd!(command::Pkg.REPLMode.Command, repl::IJulia.MiniREPL)
    @ Pkg.REPLMode /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/REPLMode/REPLMode.jl:408
 [13] do_cmd(repl::IJulia.MiniREPL, input::String; do_rethrow::Bool)
    @ Pkg.REPLMode /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/REPLMode/REPLMode.jl:386
 [14] top-level scope
    @ In[24]:1
 [15] eval
    @ ./boot.jl:360 [inlined]
 [16] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
    @ Base ./loading.jl:1116

Thanks

Please paste the full error message, otherwise it’s impossible for people to debug for you.

Thanks - see above

Could I roll back versions of CSV? What would be the one?

It turns out that pinning the version does not work in the notebook for me. Still looking for an answer for this…

Check this: Issues with CSV in Julia 1.6.2 - #23 by leandromartinez98

And that thread in general for some other ideas.

Loading IJulia for work with Jupyter will load Parsers@v2. Activating your notebook then activates an environment requiring Parsers@v1.1.2, which will fail since Parsers@v2 is already loaded. The solution should be to pin Parsers in the environment where IJulia is loaded (before loading IJulia), which may be different than your notebook environment.
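To check whether a copy of Parsers is already loaded in the running kernel, and where it came from, something like the sketch below may help. The function name `loaded_package_path` is made up for this example, and `Base.loaded_modules` is an internal, undocumented table, so treat this as a debugging aid rather than a stable API.

```julia
# Hypothetical helper: return the source path of an already loaded package,
# or `nothing` if no package with that name is loaded in this session.
function loaded_package_path(name::AbstractString)
    # Base.loaded_modules (internal) maps package ids to loaded modules.
    for (pkgid, mod) in Base.loaded_modules
        if pkgid.name == name
            # pathof gives the file the module was loaded from (or nothing).
            return pathof(mod)
        end
    end
    return nothing
end

# Run this in a notebook cell *before* `using CSV`:
println(loaded_package_path("Parsers"))
```

If this prints a path before CSV has been imported, some other package has already pulled Parsers in, and the package directory in that path can be compared against the version your notebook environment expects.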

If you use the notebook viewer in VSCode instead of Jupyter directly, you shouldn’t have the problem at all, since VSCode separates the code required for viewing & modifying a notebook from the code running inside of that notebook.

@Sukera It is a super helpful hint! Unfortunately, Julia notebooks in VSCode did not work for me, either.

I tracked down what’s happening again with your hint, and I found that Conda.jl depends on JSON.jl, JSON.jl on Parsers.jl in turn. The Parsers.jl compat option in JSON.jl is 1, 2, which leads to loading Parsers.jl v2 by Jupyter.

So when the compat option is modified to 1, CSV.jl now gets loaded in Jupyter.
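To make the effect of that compat change concrete, here is a deliberately simplified model. It treats a compat entry as a set of allowed major versions, which is only an approximation of Pkg's real compat syntax; `allowed` and the two tuples are names invented for this sketch.

```julia
# Simplified model: a compat entry such as "1, 2" is represented by the
# major versions it admits. Real Pkg compat specifiers are richer than this.
allowed(majors, v::VersionNumber) = v.major in majors

json_compat_original = (1, 2)  # Parsers compat in JSON.jl as described above
json_compat_patched  = (1,)    # after the local edit to "1"

println(allowed(json_compat_original, v"2.0.3"))  # Parsers 2.x admitted
println(allowed(json_compat_patched,  v"2.0.3"))  # Parsers 2.x excluded
println(allowed(json_compat_patched,  v"1.1.2"))  # Parsers 1.x still admitted
```

With the patched entry, JSON.jl can no longer bring in Parsers 2.x, so whatever Jupyter loads first stays within the 1.x series that CSV.jl 0.8.5 accepts.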

I have not tested the functionality of CSV.jl fully yet. I looked through what JSON.jl uses from Parsers.jl, and I do not think this monkey patch would affect things negatively.

I wouldn’t expect it to - it’s the two different versions that muck things up, not whatever JSON.jl patches in.


Since it happens with Conda.jl as well, I suspect this is a good bit worse - anything that happens to load Parsers@v2 (even rightfully so) where subsequently something else is bound to load Parsers@v1 will run into this, because there can only be one version of a package loaded at a time. In theory, downgrading the Parsers version by loading e.g. CSV.jl@v0.8.5 should cause Conda.jl to also reload, since either Conda.jl can’t rely on 2.0 functionality that it’s now been compiled with, or CSV.jl breaks because it isn’t compatible with Parsers@v2.

The “general” solution should be to load packages in such an order that the versions of their dependencies grow maximally while obeying all version bounds. Sadly, this is not that easy to do manually. Thankfully, there are only 6 direct dependents of Parsers. The vast majority of its 1848 dependents come through JSON.jl, which has 1787 dependents (so “only” 61 dependents of Parsers that are not using JSON could be hit by this).

I have no idea how many users access each of those dependencies - I suspect most that are hit by this are using either IJulia in a Jupyter notebook, Conda or CSV directly. I’ll crosspost this finding about Conda to the issue, thanks for investigating.

@Sukera I agree with what you commented. The right way to deal with this is at the level of packages, e.g. how packages are loaded correctly and properly. I do not recommend the monkey patch that I did in any way. It is at your own risk, no warranty. :wink:

And thank you for cross-posting. Keep me posted.

I don’t mean to derail this thread but I’m curious to learn more about isolation of the notebook environment from content in Jupyter and VS Code. I tried searching a few phrases but could not turn up relevant links. @Sukera do you happen to have links to where I might learn more about how Jupyter and VS Code differ in this regard?

I don’t have a link with a writeup of the differences at hand and I don’t think such a summary exists yet. What I can tell you from my understanding though is that, as far as I know, it boils down to Jupyter using the same stacked environments that the code in question is using when it runs. This leads to the observed clash. In contrast, VSCode has a separate environment and julia instance, combined with a separate parser, to isolate the linting/code checking environment from the environment of the code that you’re writing.

This may not be 100% accurate, but I think that’s the distinction this boils down to. If the code that’s doing the linting doesn’t influence the code that’s running, it’s a better situation.
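If you want to look at the environment stack from inside a notebook cell, a small sketch like the following prints the active project and the expanded load path. It only inspects state and changes nothing.

```julia
# Show the active project and the environments Julia searches, in order,
# when resolving `using` / `import`.
println("Active project: ", Base.active_project())
for (i, entry) in enumerate(Base.load_path())
    println(i, ": ", entry)
end
```

Comparing this output between a Jupyter cell and a plain REPL session is one way to see which environments are stacked underneath the notebook.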

The strategy in VS Code is that we never load the code from the packages we use via the normal package loading mechanism, but instead we have one global module called VSCodeServer, and then we essentially include all the code from packages that we want to use into that global module. That way none of these packages appear to be “loaded” already from Julia’s point of view, and then a user can load a different version of the same package via the normal using or import mechanism.

If you want to see the details, here is some more info:

  • The top level file that we load into the notebook kernel process is this. Note how we manipulate LOAD_PATH temporarily there to make sure we will load the package VSCodeServer from a specific location on disc.
  • Inside VSCodeServer we then load lots of other packages that we use, that code starts here. But note how we never say e.g. using JSON, instead we include the root file for JSON.jl. That way JSON.jl is loaded as just a child module of VSCodeServer. If a user later at the REPL writes using JSON, the Julia module loading system will just load whatever version of JSON the user has in their environment, and that can be a different version of JSON from the one we loaded inside VSCodeServer.
  • This method of including packages gets tricky if you have a package that itself relies on some other package. We get around that by working with those packages and begging them to structure their source code slightly differently, with a packagedef.jl file, so that we can load those packages in such a way that they will actually use say the version of JSON.jl that we previously loaded into VSCodeServer, instead of trying to load a “global” version of JSON.jl. So this is not a generic approach, but it works well enough as long as all the packages we want to use make this small structural change that doesn’t really have any impact on normal operations.

The short version of all of this is that VS Code loads a lot of packages into the notebook kernel process, but all of those are “hidden” inside VSCodeServer and thus a user can load a different version of those packages in their code without problems.
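The idea can be tried out in miniature. The sketch below is not the actual VSCodeServer code; `FakePkg` and `HostServer` are made-up names, and the "package" is a single file written to a temporary directory. It shows that a module brought in via `include` becomes a child of the host module and does not show up as a loaded package.

```julia
# Write a stand-in "package" root file to a temporary directory.
dir = mktempdir()
root = joinpath(dir, "FakePkg.jl")
write(root, """
module FakePkg
greet() = "hello from the vendored copy"
end
""")

# A host module that includes the file instead of saying `using FakePkg`.
module HostServer end
Base.include(HostServer, root)

# The package code is usable, but only as a child module of HostServer.
println(HostServer.FakePkg.greet())
println(parentmodule(HostServer.FakePkg) === HostServer)

# It is not registered as a loaded package (Base.loaded_modules is internal).
println(any(id -> id.name == "FakePkg", keys(Base.loaded_modules)))
```

Because Julia's loader never saw a package named `FakePkg`, a later `using FakePkg` in user code would go through normal environment resolution, which is the property described above.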