Linear interpolation in 4D?

Hi Forum!
Yes, another question on interpolation.
I have this 5D data that I need linearly interpolated (spline and quadratic are good, but I need to avoid any overshoot). If anyone os curios the data is something like this:

Now, columns 1 and 2 are OK, but 3:5 are peculiar they must add to 100%. Column 6 is what needs to be interpolated (actually 6:8 will all be interpolated in different variables).

I tried using Interpolations like this:

DATA_FELSIC=readdlm(“Felsic.txt”,Float64,comments=true,comment_char=’#’); #This is the IASP91 with LAB
Temperature=DATA_FELSIC[:,1]; #in K
Pressure=DATA_FELSIC[:,2]; #in Gpa
Qz=DATA_FELSIC[:,3]; #in %
A=DATA_FELSIC[:,4]; #in %
P=DATA_FELSIC[:,5];#in %
Vp=DATA_FELSIC[:,6]; #in km/s
Vs=DATA_FELSIC[:,7]; #in km/s
Rho=DATA_FELSIC[:,8]; #in g/cm3



for ti=1:size(t,1)
for pri=1:size(pr,1)
for qi=1:size(q,1)
for ai=1:size(a,1)
for pli=1:size(pl,1)
if q[qi]+a[ai]+pl[pli]==100
indx=findall( (DATA_FELSIC[:,1].==t[ti]) .& (DATA_FELSIC[:,2].==pr[pri]) .& (DATA_FELSIC[:,3].==q[qi]) .& (DATA_FELSIC[:,4].==a[ai]) .& (DATA_FELSIC[:,5].==pl[pli]));



Felsic_Rho = interpolate((t,pr,q,a,pl), Felsic_Rho_M, Gridded(Linear()));

But to fill Matrix I have used zeros and this makes my interpolation have all minimum values outside the knots.

Any way around this? Thanks!

Maybe you can use ‘upsample_trilinear‘ in NNlib for this. It upsamples the first three dims of a 5D array. Sorry, I‘m on the phone and cant give a proper example.

It looks like you’re trying to delete duplicate data in the Felsic.txt source data.

It appears you reduce each variable to a unique value, then do a nested loop to look up each dependent variable to re-create the matrix.

Here is an easier way:

using CSV
using DataFrames 
df = CSV.File("C:\\Users\\<user>\\Documents\\JuliaCode\\Felsic.txt",header=0) |> DataFrame #This is the IASP91 with LAB

In this case, your data doesn’t have any duplicate rows.

One style note: don’t end Julia lines with “;” – not needed and not recommended.

In addition, good news, your data is 4D, not 5D. Columns 3,4,5 (Qz,A,P) are a composition or mixture that adds up to 100%. So if you know 2 of the values, the third is known by subtracting those two from 100%. For example, Qz = 100 - A - P. In short, your have multicollinearity in your dataset. You only need to include any 2 of the 3 variables in your model.

More of a question than an answer: could this be a use case for NearestNeighbors.jl?

Say, one would get k-neighbors first and then use inverse distance weighting (or other) to compute the interpolated value.

I can give that a try

Well, the repeated values are no problem. I do not think there are any. My problem is how to turn that into a interpolated function.

I will give that a try too! Thanks Rafael!

Will this package do what you want?

1 Like

@RJDennis, thanks for the package link.

Could you please confirm, say for the 4d case with variables x1,x2,x3,x4, that we will need to define a range for each one (nodes / hypercube grid) and that the dependent variable y to interpolate at a point (x1ₒ,x2ₒ,x3ₒ,x4ₒ) needs to be defined everywhere over the hypercube grid?

Sounds like it could work. Can you give an example of how it works?

Just an update…
I tried ScatteredInterpolation.jl with some success but it overshoots in some positions.
Any idea if there is a similar package that includes linear interpolation. This one doesn’t.
Thanks for the help!

1 Like

I think rafael.guerra is right in that the package wont help you because it requires y to be defined at each point on a hypergrid.

@marianoarnaiz, thanks for highlighting the very interesting ScatteredInterpolation.jl package.

The results from its Shepard interpolation method look good to me:

using ScatteredInterpolation, CSV, DataFrames

df = CSV.File(filename; header=false) |> DataFrame

rename!(df,["T", "Pr", "Qz", "A", "P", "Vp", "Vs", "Rho"])

# Qz + A + P = 100 => variables domain is 4d
# interpolate Vp (Vs and Rhob) function of T, Pr, Qz and A

points = permutedims(Matrix(df[:,1:4]))

itp_Vp = interpolate(Shepard(), points, df[:,"Vp"]);
itp_Vs = interpolate(Shepard(), points, df[:,"Vs"]);
itp_Rho = interpolate(Shepard(), points, df[:,"Rho"]);

xₒ = [1000; 2e9; 50; 30]    # "T", "Pr", "Qz", "A"

Vpₒ = evaluate(itp_Vp, xₒ)        # 6.8336 km/s
Vsₒ = evaluate(itp_Vs, xₒ)        # 4.0778 km/s
Rhoₒ = evaluate(itp_Rho, xₒ)      # 2.7836 g/cc

It is not perfect. But it is the closet solution to the problem!