Interpolate DataFrame columns

I have a DataFrame sorted on one column (X):

df = sort(DataFrame(Y = rand(4), X = rand(4)), [:X])
4×2 DataFrame
 Row │ Y         X
     │ Float64   Float64
─────┼────────────────────
   1 │ 0.211423  0.491155
   2 │ 0.839856  0.506252
   3 │ 0.344844  0.731106
   4 │ 0.196441  0.963763

I would like to create a new DataFrame whose X values are evenly spaced across [0, 1], with Y interpolated at those points.

Something like this:

N×2 DataFrame
 Row   │ Y         X
       │ Float64   Float64
───────┼────────────────────
   1   │ ...       0
   2   │ ...       0.01
....................................
   N-1 │ ...       0.99
   N   │ ...       1

I’ve tried using interpolation and extrapolation without success.

I don’t understand the question - are you saying you have some observations Y at some points X and want to interpolate the value of Y on some points X* that aren’t observed?

If so, that doesn’t seem to have anything to do with DataFrames, and you should probably look at the Interpolations.jl documentation (and the other packages mentioned on the landing page of the docs) to find an appropriate interpolation scheme for your data.

For example:

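using DataFrames, Interpolations

# Sample data, sorted on X (the knots must be in increasing order)
df = sort(DataFrame(Y = rand(4), X = rand(4)), [:X])

# Linear interpolant over the observed (X, Y) pairs; Line() extends the end
# segments linearly outside the observed X range.
# (Recent Interpolations.jl releases also provide the lowercase name linear_interpolation.)
intlin = LinearInterpolation(df.X, df.Y, extrapolation_bc=Line());
x = 0:0.02:1;   # evenly spaced target points on [0, 1]
y = intlin(x)   # interpolated (and, near the ends, extrapolated) Y values

df_int = DataFrame(Xint = x, Yint = y)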

[Image: DataFrame_Interpolations]

Yes, I guess one of the issues I’m having is converting the two DataFrame columns to the format required by interpolate.
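On the conversion question, a minimal sketch may help: as far as I can tell, no special conversion is needed, because `df.X` and `df.Y` are already plain vectors. The snippet below assumes the lower-level `interpolate` call with `Gridded(Linear())`, which accepts irregularly spaced knots passed as a tuple. The names `xs`, `ys`, `itp`, `etp`, `xnew` and `df_new` are made up for illustration.

```julia
using DataFrames, Interpolations

df = sort(DataFrame(Y = rand(4), X = rand(4)), [:X])

# DataFrame columns are ordinary vectors, so they can be passed directly.
xs = df.X   # knots; must be sorted and unique
ys = df.Y   # observed values

# Gridded(Linear()) handles unevenly spaced knots; knots go in a tuple.
itp = interpolate((xs,), ys, Gridded(Linear()))

# Without this wrapper, evaluating outside [xs[1], xs[end]] throws an error.
etp = extrapolate(itp, Line())

# Evenly spaced target grid on [0, 1], collected into a Vector.
xnew = collect(range(0, 1; length = 101))

df_new = DataFrame(X = xnew, Y = etp.(xnew))
```

Whether linear extrapolation is sensible near 0 and 1 depends on the data. With only four points the values outside the observed X range are essentially guesses, so `Flat()` could be a safer boundary choice than `Line()`.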