Deprecated csv prefix

question
deprecation

#1

Hi,

Can you please clarify the newer syntax?

csv"""
x, y, z
1, 2, 3
"""

@csv_str and the csv""" syntax are deprecated. Use CSV.read(IOBuffer(…)) from the CSV package instead.

Why was the csv prefix deprecated? Couldn’t it be moved to CSV.jl instead?


#2

We felt that the following (and variations of it) was a similarly straightforward and more general approach:

julia> using DataFrames, CSV

julia> csv = """
       x, y, z
       1, 2, 3
       """;

julia> CSV.read(IOBuffer(csv))
1×3 DataFrames.DataFrame
│ Row │ x │ y │ z │
├─────┼───┼───┼───┤
│ 1   │ 1 │ 2 │ 3 │

That’s what was happening behind the scenes. But I think it was also because there was so much going on in preparation for the DataFrames 0.11 release that nobody involved was able to prioritize it and take it on.

couldn’t it be moved to CSV.jl instead?

Yes! If you’d like to prepare a PR to move it to CSV.jl, all of the relevant code is here
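
For anyone picking this up, here is a minimal sketch of what such a string macro might look like. It assumes a recent CSV.jl, where CSV.read takes a sink type such as DataFrame as its second argument. The macro body is only an illustration, not the code that was removed from DataFrames.

```julia
using CSV, DataFrames

# Hypothetical re-creation of the csv"..." string macro (illustration only).
# The literal's contents are wrapped in an IOBuffer and handed to CSV.read.
macro csv_str(s)
    :(CSV.read(IOBuffer($s), DataFrame))
end

# Triple-quoted form, as in the original question (no spaces after the commas).
df = csv"""
x,y,z
1,2,3
"""
```

Because the macro only forwards the string to CSV.read, any parsing options would still have to go through a direct CSV.read call, which is part of why the IOBuffer approach is more general.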


#3

Thank you @cjprybol. Does the PR consist of copying the lines you highlighted into an appropriate place in CSV.jl, or does it need further work? I am a little busy this weekend, but I can try to look into it…


#4

I never quite understood the point of the csv""" family of string macros for reading in-file tables of data in the first place. Since Matrix{Any} can hold arbitrary data types, why not use that instead? The main advantages of a matrix literal are inline comments in the data tables and nice syntax highlighting in your editor.

Just add a simple helper function:

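# Convert a Matrix{Any} into a DataFrame; with header=true, row 1 holds the column names.
function array2dataframe(arr, header=true)
    if header
        # Splatting each column into a new vector narrows its element type from Any.
        DataFrame([[arr[2:end,i]...] for i=1:size(arr,2)], [arr[1,:]...])
    else
        # Recent DataFrames releases need DataFrame(arr, :auto) here.
        DataFrame(arr)
    end
end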

And then you can do this:

data = [
    :x  :y   :z
    1   2.0  "three"   # Comments work ...
    4   5.0  "six"     # ... and you get nice syntax highlighting in your editor
]

julia> df = array2dataframe(data)
2×3 DataFrames.DataFrame
│ Row │ x │ y   │ z     │
├─────┼───┼─────┼───────┤
│ 1   │ 1 │ 2.0 │ three │
│ 2   │ 4 │ 5.0 │ six   │

Sorry for hijacking the topic to mention this, but your question seemed to be answered already, and I think it would be great if DataFrames.jl (or CSV.jl) could have built-in support for reading tables in Matrix{Any} format.


#5

Hi @NickNack, I think the main benefit of the string macro is for teaching or just quick example notebooks. There is no need to manually format the csv into a matrix with spaces and convert column names to symbols, etc.

The trick you shared is interesting anyways. Thanks.