Issues with CSV in Julia 1.6.2

compleat · September 1, 2021, 2:44pm

This morning I did a standard upgrade to Julia 1.6.2 and loaded the basic packages I use for Data Science. After adding everything, I tried to get started, beginning with the simple statement ‘using CSV’, and got a page of (apparently unresolvable) package dependency issues.

Fact:

At the moment, Julia 1.6.2 is apparently not (reliably) compatible with any version of the CSV package (a known issue; see the thread on Discourse).

What I want to know is how such basic functionality as CSV reading/writing can become broken in a standard/default installation of Julia at this late stage of its development!!!

I am posting on here because this seems to be more than just a minor software bug, but the kind of systemic failure (if I may say) that threatens the growth and acceptance of Julia itself, and (speaking as an academic) makes it very hard for me to make a case for using Julia in my teaching and research, which (given Julia’s obvious potential) I see as a terrible waste.

Apologies, but I could not let this go without noting it.

lmiq · September 1, 2021, 2:47pm

Can you provide more details of the error you are having? I have installed CSV now here in 1.6.2 and had no issues.

compleat · September 1, 2021, 2:48pm

Apologies, I should have given the link to the (very active) thread… it will take me a minute to find it…

Sukera · September 1, 2021, 2:51pm

Not sure what you mean by

as the thread in question has a solution - don’t do using JuliaFormatter (or Atom for that matter, as it’s been out of support I believe for quite a while) without having added CSV first.

You’ve also posted in that other thread (the one with an extended debugging session) that you’re not interested in “why it is failing”, even though that’s crucial to solving the individual problem you’re having.

–

Let’s back up a bit:

Are you using a custom startup.jl with using JuliaFormatter?
Are you using Atom or VSCode (or another editor)?
What does ]st for the active environment say?

compleat · September 1, 2021, 2:54pm

The ‘solution’ there is not a solution because I had exactly the same problem and the proposed solution did not work. Anyway, that is not my point, which is that Julia should be able to reliably read and write data without having to resort to bug-fixes!

lmiq · September 1, 2021, 2:59pm

I understand your frustration, bugs are very frustrating. Unfortunately they exist. Sometimes they affect only very few people and we happen to be one of the unlucky ones.

If you want to use Julia and want help, people here are most willing to help get that working, and maybe fix the bug (which may not be the same as the one in that thread). If you can, just follow what @Sukera is asking, and you may find what is going wrong there.

compleat · September 1, 2021, 3:04pm

People seem to know what has gone wrong, but that is not my point here, in a general discussion. My point is that this seems to be quite a widespread thing that started happening a couple of days ago and is already being experienced by a lot of people. I am not talking about an esoteric, advanced feature here, but the ability to read and write data, which should be sacrosanct in any software package like Julia. I am talking from the point of view of someone who regularly has to assess whether to use Julia in classes.

sostock · September 1, 2021, 3:07pm

CSV.jl is not a part of Julia. There is the DelimitedFiles stdlib, which is part of Julia and can read CSV files.
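As a minimal sketch of that route (the file name and its contents are made up for illustration), readdlm from DelimitedFiles can read a small comma-separated file with a header row:

```julia
using DelimitedFiles

# Hypothetical sample file, written to a temporary directory for illustration.
path = joinpath(mktempdir(), "sample.csv")
write(path, "x,y\n1,2.5\n3,4.5\n")

# With header=true, readdlm returns a (data, header) tuple.
data, header = readdlm(path, ',', Float64; header=true)

println(header)  # column names, as a 1×2 matrix of strings
println(data)    # values, as a 2×2 Matrix{Float64}
```

This gives a plain matrix rather than a table, so it suits simple files where every column has the same type.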

lmiq · September 1, 2021, 3:09pm

One way to get around those compatibility issues is to work on a new environment, where you can install independent versions of a package:

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.6.2 (2021-07-14)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

(@v1.6) pkg> activate MyWorkingEnvironment
  Activating new environment at `~/Downloads/MyWorkingEnvironment/Project.toml`

(MyWorkingEnvironment) pkg> add CSV
    Updating registry at `~/.julia/registries/General`
    Updating git-repo `https://github.com/JuliaRegistries/General`
   Resolving package versions...
    Updating `~/Downloads/MyWorkingEnvironment/Project.toml`
  [336ed68f] + CSV v0.8.5
    Updating `~/Downloads/MyWorkingEnvironment/Manifest.toml`
  [336ed68f] + CSV v0.8.5
  [9a962f9c] + DataAPI v1.7.0
  [e2d170a0] + DataValueInterfaces v1.0.0
  [82899510] + IteratorInterfaceExtensions v1.0.0
  [69de0a69] + Parsers v1.1.2
  [2dfb63ee] + PooledArrays v1.3.0
  [91c51154] + SentinelArrays v1.3.7
  [3783bdb8] + TableTraits v1.0.1
  [bd369af6] + Tables v1.5.0
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [9fa8497b] + Future
  [b77e0a4c] + InteractiveUtils
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [a63ad114] + Mmap
  [de0858da] + Printf
  [9a3f8284] + Random
  [9e88b42a] + Serialization
  [8dfed614] + Test
  [4ec0a83e] + Unicode

julia> using CSV

julia>

I think that should work independently of any other package you have installed before in your system. Please someone correct me if I’m wrong.
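The same steps can probably be scripted through the Pkg API instead of the Pkg REPL. This is only a sketch: the temporary directory is an assumption made so the example does not touch any existing folder, and it needs network access to download CSV.

```julia
using Pkg

# Assumption: put the environment in a temporary directory; any folder works.
env_dir = joinpath(mktempdir(), "MyWorkingEnvironment")

# Equivalent of `pkg> activate MyWorkingEnvironment`
Pkg.activate(env_dir)

# Equivalent of `pkg> add CSV`; this creates Project.toml and Manifest.toml in env_dir.
Pkg.add("CSV")

# CSV is resolved against this environment's own manifest.
using CSV
```

Packages that were already loaded in the session (for example from startup.jl) stay loaded, so it is safest to run this in a fresh Julia session.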

compleat · September 1, 2021, 3:14pm

Thanks, I think one can’t use DataFrames without CSV.
It would be great if these could be integrated, since they are so central to anything a Data Scientist would do.

GunnarFarneback · September 1, 2021, 3:17pm

Sadly not always. Under unfortunate circumstances (breaking releases of common dependencies with some packages updated and some not) you can run into interference with your startup.jl or packages like IJulia. See e.g. Reverting back to avoid precompile errors? - #21 by GunnarFarneback.

compleat · September 1, 2021, 3:23pm

I guess what I wonder is whether it might be formally acknowledged that a large part of Julia’s usage is Data Science, so that certain basic functionality like DataFrames and CSV should be included in Base and thus ‘ring-fenced’.

mcreel · September 1, 2021, 3:26pm

Out of curiosity, I just added CSV and DataFrames to my Julia 1.6.2 installation and read in a file, no problems. There may be some order of operations that is not so smooth, but that’s just a bug, and bugs tend to get fixed quickly. Opening an issue for the relevant package is the standard procedure.

(@v1.6) pkg> st
      Status `~/.julia/environments/v1.6/Project.toml`
  [336ed68f] CSV v0.8.5
  [a93c6f00] DataFrames v1.2.2
  [5fb14364] OhMyREPL v0.5.10

compleat · September 1, 2021, 3:27pm

Do you use Jupyter? I think the issues are stemming from that.

mbauman · September 1, 2021, 3:29pm

Data science is absolutely a core staple of Julia. Baking it into the Julia Language itself would be terribly stifling to its growth and progress.

What’s happened here is a core dependency of both CSV and some IDEs released a breaking change. If you’re using an older IDE like a Jupyter notebook or Atom, this causes trouble. It’ll be resolved in a week or so as everyone gets on board with the new version. In the meantime, there are ways to ensure both are using the same version.

Lots of effort has gone into making sure VS Code and Pluto both manage their dependencies separately from the code you write. This means the Julia process that manages the IDE itself is independent from the Julia that you run, completely avoiding these issues. It would indeed be great if a similar approach could work for Jupyter notebooks, but I’m not an IJulia dev nor do I know what’s required to do this.

GunnarFarneback · September 1, 2021, 3:34pm

One can see a pattern here. DelimitedFiles is built into Julia (an stdlib) and isn’t progressing at all and can’t be used with DataFrames. CSV has progressed a lot and can be used with DataFrames.

Sukera · September 1, 2021, 3:35pm

Thank you for mentioning that you observe this in Jupyter, that clears it up!

IJulia itself depends on JSON, which brings in the Parsers problem that’s been observed in the other threads. Since in this case (as @mbauman mentions) the dependencies for the IDE (IJulia/Jupyter) are not separated from the regular code, you get the same kind of error as observable in Atom (which has the exact same problem, as I understand it).

Unfortunately, there’s no one party that could see this coming - if anything, it’s a legacy problem stemming from the approach both Atom and Jupyter take for loading code, not directly a problem of Parsers releasing a new version.

If I’m not mistaken, the solution should be to pin the package (]pin Parsers@v1.1.2, possibly followed by ]up, not 100% sure) in whichever environment you’re starting the IJulia kernel from (before using IJulia or starting the jupyter kernel).
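A rough sketch of that pinning step using the Pkg API follows. It assumes a throwaway environment as a stand-in for the one the IJulia kernel is started from, and it adds Parsers explicitly first because a package has to be in the environment before it can be pinned. It needs network access, and whether this resolves a given setup is not guaranteed.

```julia
using Pkg

# Assumption: a temporary environment stands in for the environment that
# launches the IJulia kernel; in practice, activate that environment instead.
Pkg.activate(mktempdir())

# Install the Parsers version mentioned above, then pin it so that
# later `Pkg.update()` calls leave it at that version.
Pkg.add(name="Parsers", version="1.1.2")
Pkg.pin("Parsers")

# Pinned packages are marked in the status listing.
Pkg.status()
```

Once the ecosystem has caught up with the new Parsers release, Pkg.free("Parsers") removes the pin again.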

jules · September 1, 2021, 4:33pm

I just wanted to add that Julia is not a software package. It’s a language with standard libraries, which one could count as a software package, yes, but most functionality comes in third party packages. There is just not really a “governing body” that might have messed up here or whose priorities need to be straightened out because they don’t take csv import seriously.

It’s simply collateral damage of the normal process of updating and versioning disparate pieces of software. Note that if you’re at the forefront of versions, you’re bound to have issues like this once in a while. That’s just due to the complexity of the ecosystem, especially because the Julia ecosystem has so much code reuse and composition, where one tiny issue in a dependency can have ripple effects.

If you use Julia for courses, you can’t go wrong in setting up environments with known good versions, then distributing them to all your students. There might come a time when no more substantial improvements can be made to CSV and DataFrames, so people won’t have to care anymore whether they’re at the bleeding edge or not, but that time is not here yet.
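One possible shape for such a course environment is sketched below. The directory names are hypothetical, the versions are simply the ones reported as working elsewhere in this thread, and the example needs network access.

```julia
using Pkg

# Instructor side: build an environment with fixed package versions.
instructor_dir = joinpath(mktempdir(), "CourseEnv")   # hypothetical location
Pkg.activate(instructor_dir)
Pkg.add([
    Pkg.PackageSpec(name="CSV", version="0.8.5"),
    Pkg.PackageSpec(name="DataFrames", version="1.2.2"),
])

# Distribute both files: Project.toml lists the direct dependencies and
# Manifest.toml records the exact version of every package that was resolved.
student_dir = joinpath(mktempdir(), "CourseEnv")      # hypothetical location
mkpath(student_dir)
for file in ("Project.toml", "Manifest.toml")
    cp(joinpath(instructor_dir, file), joinpath(student_dir, file))
end

# Student side: activate the copied environment and install what the manifest lists.
Pkg.activate(student_dir)
Pkg.instantiate()
```

Because the manifest is tied to the Julia version it was created with, students would ideally use the same Julia release as the instructor.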

compleat · September 1, 2021, 4:44pm

OK, thanks for your thoughts. I guess I am assuming that there is a community that would like to see Julia prosper and grow, giving my thoughts from that point of view.
[BTW, I did manage to fix things with CSV by fiddling with some of the comments above.]

mbauman · September 1, 2021, 5:09pm

Please don’t be overly incendiary. This is a very engaged community who wants to see Julia prosper and grow, and you know this. You’ve previously asked how to gear your courses for success and gotten lots of engagement — and lots of suggestions on how to avoid surprises by distributing a stable and known-good environment of specific package versions to your students. I know we can continue to do better, but that’s a really good way of making sure things like this don’t happen to you or your students in the first place.

Very glad to hear you’ve resolved this!
