Index data from DelimitedFiles by name

feanor12 · March 16, 2021, 9:28am

I know there is CSV.jl, but I am wondering what the easiest way to access the data by their column names would be if I just use DelimitedFiles.jl.

My first approach would look like this, but maybe there is a more elegant solution.

using DelimitedFiles
data, header = readdlm(path, header=true)
# header is a 1×n matrix, so flatten the mask before using it as a column index
coldata = data[:, vec(header .== "column name")]

lmiq · March 16, 2021, 4:12pm

You can convert the header and columns to a Dict. For example, given this data:

shell> more test.dat
A B C
1 2 3
1 2 3
1 2 3

julia> data, header = readdlm("./test.dat",header=true)
([1.0 2.0 3.0; 1.0 2.0 3.0; 1.0 2.0 3.0], AbstractString["A" "B" "C"])

julia> d = Dict(header[i] => data[:,i] for i in 1:length(header))
Dict{SubString{String},Array{Float64,1}} with 3 entries:
  "B" => [2.0, 2.0, 2.0]
  "A" => [1.0, 1.0, 1.0]
  "C" => [3.0, 3.0, 3.0]

julia> d["A"]
3-element Array{Float64,1}:
 1.0
 1.0
 1.0

That copies the data. If that is not desirable, you can instead use a dictionary that only maps each header name to its column index:

julia> hind = Dict(header[i] => i for i in 1:length(header))
Dict{SubString{String},Int64} with 3 entries:
  "B" => 2
  "A" => 1
  "C" => 3

julia> data[:,hind["A"]]
3-element Array{Float64,1}:
 1.0
 1.0
 1.0

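Or, if you only need to do this once in a while (vec makes findfirst return a plain column number rather than a CartesianIndex into the 1×n header):

julia> data[:, findfirst(==("A"), vec(header))]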
3-element Array{Float64,1}:
 1.0
 1.0
 1.0
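
The one-off lookup can also be wrapped in a small function. This is only a sketch: `column` is a hypothetical helper, not part of DelimitedFiles, and the in-memory buffer is an assumed stand-in for test.dat. It raises a KeyError when the name is absent, since findfirst returns nothing in that case.

```julia
using DelimitedFiles

# Hypothetical helper (not part of DelimitedFiles): return the column of
# `data` whose header entry equals `name`, as a view into `data`.
function column(data::AbstractMatrix, header::AbstractArray, name::AbstractString)
    # vec() flattens the 1×n header matrix so findfirst returns an Int,
    # not a CartesianIndex.
    j = findfirst(==(name), vec(header))
    j === nothing && throw(KeyError(name))
    return view(data, :, j)
end

# Assumed stand-in for test.dat, read from an in-memory buffer.
io = IOBuffer("A B C\n1 2 3\n1 2 3\n1 2 3\n")
data, header = readdlm(io, header=true)

colB = column(data, header, "B")   # view of the second column
```

Returning a view avoids the copy; use `data[:, j]` in the helper instead if an independent vector is wanted.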

feanor12 · March 16, 2021, 5:07pm

The Dict option looks very nice.
Maybe views will remove the need to copy the data:

 Dict(header[i]=>view(data,:,i) for i in 1:length(header))

or a named tuple (although the code looks a bit messy):

(;((Symbol.(header[i]),view(data,:,i)) for i in 1:length(header))...)
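
To check that the views really avoid the copy, here is a minimal sketch that uses a hand-written matrix and header as an assumed stand-in for the output of readdlm. The named tuple is built from pairs with enumerate, which is one possible tidier spelling rather than the only one.

```julia
# Assumed stand-in for the matrix and header returned by readdlm(path, header=true).
data = [1.0 2.0 3.0; 1.0 2.0 3.0; 1.0 2.0 3.0]
header = ["A" "B" "C"]

# Slicing with data[:, i] allocates a new vector per column.
copies = Dict(header[i] => data[:, i] for i in eachindex(header))

# view(data, :, i) refers to the memory of data itself.
views = Dict(header[i] => view(data, :, i) for i in eachindex(header))

# Named tuple of views, built from Symbol => view pairs.
nt = (; (Symbol(h) => view(data, :, i) for (i, h) in enumerate(header))...)

data[1, 1] = 10.0   # modify the original matrix

copies["A"][1]   # 1.0: the copied column is unaffected
views["A"][1]    # 10.0: the view sees the change
nt.A[1]          # 10.0: same for the named tuple field
```

The flip side is that writing through a view (for example `views["A"][1] = 0.0`) also changes data.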

lmiq · March 16, 2021, 5:20pm

indeed
