[SOLVED] Usage of CSV.read recently deprecated?

danilo-bc · December 2, 2020, 4:58pm

Until about a week ago, I was using the following lines:

using CSV: read
In_vec = read(in_vec_file, header=false);
In_vec = In_vec[1] # turn into usable Julia vector

They returned an Array{Float64,1} with one entry per line of my file. Now, without my changing anything, I get this error: “ERROR: ArgumentError: provide a valid sink argument, like `using DataFrames; CSV.read(source, DataFrame)`”

Did the functionality recently change? What is the easiest way to collect the result of ‘read’ as an Array? I know I can use DataFrames and Tables.matrix to work around this, but first I want to understand the problem, and second I would like to solve it without additional packages.

pdeffebach · December 2, 2020, 5:18pm

If it’s just a super clean CSV file with Float64 values, you can use readdlm from the DelimitedFiles standard library.

heliosdrm · December 2, 2020, 5:24pm

And even if the contents are not only numbers, readdlm will still work; it will just produce an Array{Any, 2}.
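
A minimal sketch of the readdlm route, assuming a file with one Float64 value per line. The temporary file and its contents are made up for illustration; only readdlm and vec are real library calls here.

```julia
using DelimitedFiles  # standard library, no extra package to install

# Hypothetical sample file: one Float64 value per line, no header.
in_vec_file = joinpath(mktempdir(), "in_vec.csv")
write(in_vec_file, "1.5\n2.5\n3.5\n")

# readdlm returns a Matrix even for a single column (here 3×1);
# vec reshapes it into a plain Vector{Float64}.
in_vec = vec(readdlm(in_vec_file, ',', Float64))
```

Passing Float64 as the element type makes readdlm fail loudly on non-numeric cells instead of silently falling back to Array{Any, 2}.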

Wikunia · December 2, 2020, 5:31pm

Have a look at CSV.File instead. I was wondering about this as well yesterday.

danilo-bc · December 2, 2020, 5:36pm

So odd, but I’m glad I’m not the only one who noticed this issue.

@pdeffebach I’ll take a look at readdlm, but is there a way to keep using the same CSV.read structure with minimal changes?

jonathanBieler · December 2, 2020, 5:37pm

You should be able to do read(in_vec_file, DataFrame, header=false) to get the previous behavior back. Apparently this was changed to avoid having DataFrames as a dependency in CSV.jl.
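
For reference, a minimal sketch of that call, assuming both CSV.jl and DataFrames.jl are installed. The temporary file and its contents are hypothetical.

```julia
using CSV, DataFrames

# Hypothetical sample file: one Float64 value per line, no header row.
in_vec_file = joinpath(mktempdir(), "in_vec.csv")
write(in_vec_file, "1.5\n2.5\n3.5\n")

# DataFrame is passed as the sink (second positional argument).
df = CSV.read(in_vec_file, DataFrame; header=false)

# First column of the DataFrame; `!` returns the stored column without copying.
in_vec = df[!, 1]
```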

pdeffebach · December 2, 2020, 5:38pm

Yeah, you can use

CSV.read(file, NamedTuple)

then work with the vector from that named tuple. You don’t even have to have DataFrames as a dependency for this.
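
A related sketch that also needs only CSV.jl: CSV.File (mentioned above) exposes each column as a property, and with header=false the columns get auto-generated names such as Column1. The sample file is hypothetical, and this is an alternative to the NamedTuple sink rather than the same call.

```julia
using CSV

# Hypothetical sample file: one Float64 value per line, no header row.
in_vec_file = joinpath(mktempdir(), "in_vec.csv")
write(in_vec_file, "1.5\n2.5\n3.5\n")

# Parse the file without materializing it into another table type.
f = CSV.File(in_vec_file; header=false)

# With header=false the first column is named Column1.
in_vec = f.Column1
```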

danilo-bc · December 2, 2020, 5:54pm

Thanks @jonathanBieler and @pdeffebach for the inputs. Using DataFrame (which should have been 1:1 with my initial solution) requires changing In_vec[1] to In_vec[!,1] which is fine, but requires “using DataFrames”.

The solution with NamedTuple gives the exact behavior I had before, where I need to do In_vec = In_vec[1] to get a usable vector.

I tried In_vec = read(in_vec_file, collect, header=false), which is a bit frustrating because it’s almost a one-liner. It outputs a 1-element Array{Any,1} similar to the NamedTuple method, requiring the second line to work.

I’m satisfied with this solution because, currently, I only have 1 data point per line. If I had multiple columns to be loaded at once (like an f(x,y) function dump), I think the NamedTuple method would give me a workable variable as well.

Only an ‘off-topic’ question remains: why doesn’t the official documentation have a simple entry on CSV.read? Is it really deprecated and not intended for future use?

pdeffebach · December 2, 2020, 5:57pm

I don’t know, I noticed that yesterday. It should be fixed.
