While trying to understand why the solution proposed by @Dan, which uses `Tables.columntype(sch, :Close)`, is much faster than the others, I narrowed the difference down to how the column types are stored: as a tuple of types versus a tuple type. If I understand correctly, this is similar to what was discussed here.
julia> using CSV, Tables, DataFrames, Dates, TSFrames
julia> df=CSV.read("table.txt",DataFrame,delim=' ', ignorerepeated=true);
julia> tsdf=TSFrame(df);
julia> sch = Tables.schema(tsdf)
Tables.Schema:
:Index Date
:Open Float64
:High Float64
:Low Float64
:Close Float64
:Volume Float64
julia> using BenchmarkTools
julia> n,t= sch.names,sch.types
((:Index, :Open, :High, :Low, :Close, :Volume), (Date, Float64, Float64, Float64, Float64, Float64))
julia> @btime t[findfirst(==(:Close),$n)]
191.888 ns (0 allocations: 0 bytes)
Float64
julia> n,t= sch.names,Tuple{sch.types...}
((:Index, :Open, :High, :Low, :Close, :Volume), Tuple{Date, Vararg{Float64, 5}})
julia> @btime fieldtype(t, findfirst(==(:Close),$n))
10.010 ns (0 allocations: 0 bytes)
Float64
julia> @btime Tables.columntype(sch, :Close)
12.813 ns (0 allocations: 0 bytes)
Float64
What I couldn't dig into (*) is why `fieldtype(Tuple{t...}, i)` is so much faster than `getindex(t, i)`.
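To make the comparison easier to reproduce without the CSV file, here is a minimal sketch of the two lookups using only Base and the `Dates` standard library. The names `col_names`, `col_types`, `lookup_getindex` and `lookup_fieldtype` are mine and only stand in for `sch.names` and `sch.types`; this is not how Tables.jl itself is written. Note that in the timings above only `n` was interpolated with `$`, while `t` was read as a non-constant global, so part of the gap may come from that rather than from the lookup itself.

```julia
using Dates  # standard library, only needed for the Date column type

# Hypothetical stand-ins for sch.names and sch.types from the session above
col_names = (:Index, :Open, :High, :Low, :Close, :Volume)
col_types = (Date, Float64, Float64, Float64, Float64, Float64)

# Lookup 1: index into a tuple whose elements are types (an NTuple{6, DataType})
lookup_getindex(names, types, col) = types[findfirst(==(col), names)]

# Lookup 2: ask a tuple *type* for the type of its i-th field
lookup_fieldtype(names, T, col) = fieldtype(T, findfirst(==(col), names))

T = Tuple{col_types...}

r1 = lookup_getindex(col_names, col_types, :Close)
r2 = lookup_fieldtype(col_names, T, :Close)

# fieldtype is a builtin implemented in C rather than a Julia method,
# which is presumably why @edit cannot find a source location for it
is_builtin = fieldtype isa Core.Builtin

# With BenchmarkTools loaded, interpolate every global into the benchmark:
#   @btime lookup_getindex($col_names, $col_types, :Close)
#   @btime lookup_fieldtype($col_names, $T, :Close)
```

Both lookups return the same type; whether the timing gap survives once `t` is interpolated as well is something I have not measured here.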
(*)
```
julia> @edit fieldtype(t, findfirst(==(:Close),n))
ERROR: could not determine location of method definition
Stacktrace: