Hi,
I think my code is highly sub-optimised, is there a more monolithic way (and preferably faster) to interpolate, e.g. without using eachrow, push etc.
Code:
using DataFrames
using Interpolations
# Dates, then values
data = [2000 2001 2002 2003; 1 20 30 4; 2 30 40 5; 3 40 50 6; 4 50 60 7]
df = DataFrame(Tables.table(data))
# Collect the dates
A_x = collect(df[1,:])
nodes = (A_x,)
# Find the column interpolation for this date (decimal year)
date = 2001.5
# Where to store the result
interpolated_column = Float64[]
# Interpolation for each row containing values
for row in eachrow(df[2:end,:])
A = collect(row)
itp = interpolate(nodes, A, Gridded(Linear()))
res = itp(date)
push!(interpolated_column, res)
end
# Result
interpolated_column
Execution:
julia> using DataFrames
julia> using Interpolations
julia> data = [2000 2001 2002 2003; 1 20 30 4; 2 30 40 5; 3 40 50 6; 4 50 60 7]
5×4 Matrix{Int64}:
2000 2001 2002 2003
1 20 30 4
2 30 40 5
3 40 50 6
4 50 60 7
julia> df = DataFrame(Tables.table(data))
5×4 DataFrame
Row │ Column1 Column2 Column3 Column4
│ Int64 Int64 Int64 Int64
─────┼────────────────────────────────────
1 │ 2000 2001 2002 2003
2 │ 1 20 30 4
3 │ 2 30 40 5
4 │ 3 40 50 6
5 │ 4 50 60 7
julia> A_x = collect(df[1,:])
4-element Vector{Int64}:
2000
2001
2002
2003
julia> nodes = (A_x,)
([2000, 2001, 2002, 2003],)
julia> date = 2001.5
2001.5
julia> interpolated_column = Float64[]
Float64[]
julia> for row in eachrow(df[2:end,:])
A = collect(row)
itp = interpolate(nodes, A, Gridded(Linear()))
res = itp(date)
push!(interpolated_column, res)
end
julia> interpolated_column
4-element Vector{Float64}:
25.0
35.0
45.0
55.0
Any suggestion welcome.