 # How to predict X given a set of Ys in Julia?

Hello,
I have a vector V of y values describing a curve for a series of time points x; I have a threshold level Yt. How can I predict the exact x value for Yt?
Essentially: how do I interpolate Y to find X in Julia?
I can see there is a specific package (interpolations.jl) but requires the creation of a matrix first. IS there a simpler method (without the requirement of a matrix) that works on curves (rather than linear data)?
Thank you

``````V = [100.0, 118.68686748100414, 197.53957741968554, 405.39743832303634, 909.384506158371, 2298.8440311204035, 6352.7577141124375, 19137.261442828225, 61473.086836666625, 204601.00673831577, 662816.455857954, 1.3996572716346516e6, 1.9351053374287586e6, 2.0969909161631712e6, 2.1526434991115257e6, 2.1632872818975276e6, 2.1650948890150622e6, 2.1652845329468683e6, 2.165299329101924e6, 2.165299308178196e6, 2.165300516603722e6, 2.165289152377223e6]
Yt = 500000
``````

Do you want linear interpolation between points, a polynomial fit or something else?

If you want linear interpolation, it looks like `Interpolations.jl` does exactly that, what’s your issue with it?

Generally, which function you want to fit is a choice you will have to make, and motivate, yourself. There’s an infinite number of functions which fit any finite set of points. For discussion’s sake, say you choose `model(x, p) = p .+ p*atan.(p*x .+ p)`. You can then say

``````using LsqFit
fit = curve_fit(model, x, y, [0.0, 1.0, 1.0, 0.0])
p = fit.param
``````

to get the best choice of `p`. For your query, you want the inverse, but that’s just a question of massaging the model equation with some simple arithmetic (noticing that the inverse of `atan` is `tan`). You can use the same parameters `p`.

2 Likes

Complete example:

``````using Plots, LsqFit
default(;fontfamily="Computer Modern", label="") # Vanity settings
V = [100.0, 118.68686748100414, 197.53957741968554, 405.39743832303634, 909.384506158371, 2298.8440311204035, 6352.7577141124375, 19137.261442828225, 61473.086836666625, 204601.00673831577, 662816.455857954, 1.3996572716346516e6, 1.9351053374287586e6, 2.0969909161631712e6, 2.1526434991115257e6, 2.1632872818975276e6, 2.1650948890150622e6, 2.1652845329468683e6, 2.165299329101924e6, 2.165299308178196e6, 2.165300516603722e6, 2.165289152377223e6]
V_t = 500_000
t = range(0, 60; length=length(V))
model(x, p) = 1e6*(p .+ p*atan.(p*x .+ p)) # Scaling by 1e6 to make more sense numerically
fit = curve_fit(model, t, V, [0.0, 1.0, 1.0, 0.0])
p = fit.param
t_t = (tan((V_t/1e6 - p)/p)-p)/p
plot(t, V; st=:scatter)
plot!(x->model(x, p))
plot!([t_t], [V_t]; st=:scatter)
`````` 6 Likes

Interpolations.jl allows for SPlines, you don’t need “the creation of a matrix” and “works on curves”

``````using Plots

x = 1:0.5:5
y = 1 ./ (1 .+ exp.(-x))
scatter(x,y)

using Interpolations
bsi=Interpolations.scale(interpolate(y,BSpline(Cubic(Line(OnGrid())))),x)

x = 1:0.1:5
plot!(x,bsi.(x))
`````` 7 Likes

Thank you. I tried with a Michaelis-Menten (more biologically relevant) but it does not fit:

``````x = collect(0:2.8:60)
y = V
par = [0.0, 1.0]
model = michaelis_menten(x, p) = (x .* p) ./ (x .+ p)
fit = curve_fit(model, x, y, par)
P = fit.param
Y = michaelis_menten(x, P)
``````

I am trying with a logistic but it also does not give good results:

``````model = logistic(x, p) = ((p.*x) .* (1 .- (x./p)))
fit = curve_fit(model, x, y, par)
``````

With a ‘real’ logistic, I get a bit better but still a lot off:

``````par = [10^7, 0.0, 1.0] # par = [0.0, 1.0]
model = true_logistic(x, p) = p ./ (1 .+ exp.(-p.*(x.-p)))
fit = curve_fit(model, x, y, par)
``````

How can I improve the regression with these functions?

So I’m not intimately familiar with the michaelis-menten function, but it might be worth considering adding more parameters (for instance amplitude, width, x-shift, y-shift). And try to rescale the data so the parameters are within a couple of orders of magnitude (like I did with the `1e6` above).

Plotting with linear axes will probably make the behavior more intuitively obvious.

It seem like having zeroes in your model’s parameters might not make much sense since, for instance, for `michaelis_menten` you would only have zeros as a result. In any case, The plots you show do not correspond to the `curve_fit` results either, see below changing 0.0 to 0.2 in `param`:

``````using Plots
using LsqFit

x = collect(0:2.8:60)

param = [0.2, 1.0]
model = michaelis_menten(x, p) = (x .* p) ./ (x .+ p)
fit = curve_fit(model, x, model(x,param), param)
scatter(x,model(x,param))
plot!(x,model(x,fit.param))
`````` ``````model = logistic(x, p) = ((p.*x) .* (1 .- (x./p)))
fit = curve_fit(model, x, model(x,par), par)
scatter(x,model(x,param))
plot!(x,model(x,fit.param))
`````` ``````param = [10^7, 0.2, 1.0]
model = true_logistic(x, p) = p ./ (1 .+ exp.(-p.*(x.-p)))
fit = curve_fit(model, x, model(x,param), param)
scatter(x,model(x,param))
plot!(x,model(x,fit.param))
`````` I think the problem was in the X data, now I got this fitting:

the logistic model with exponents gives me the best fitting. I’ll try now to interpolate.
Tx

Hello,
assuming the fitting phase is done (I have a function with fitted parameters that represent the observed data), how do I run the interpolation? From Interpolations, it looks that (a) the main process is linear (whereas the curve here is not linear by definition), (b) it requires a matrix (whereas I have a single value to interpolate: 500 000), (c) curved interpolation already has its functions (`Constant, Linear, Quadratic, and Cubic`) thus there was a need to do the fitting part, (d) there is no space for the function I defined in the fitting phase, for instance the object `itp = interpolate(a, BSpline(Constant()))` takes `a` as the value to be interpolated, and provides already the function for the task (`BSpline`).
How can I find the X value for Y=500000 given the function `true_logistic` I defined and the fitted parameters I have found:

``````3-element Vector{Float64}:
2.1647983679173873e6
0.7211511621759724
13.214716462754794
``````

Thank you

If you have a fitted function, you don’t need `Interpolations`. Just do

``````model(x, p) = ...
fit = curve_fit(model, x_data, y_data, p_0)
p = fit.param

new_y = model(new_x, p)
new_ys = model.(new_xs, Ref(p))
``````

At least in physics, curve fitting is strictly better than naive interpolation, because you leverage the knowledge you have about the system to find the most probable intermediate values. And you can get estimates for the quality of fit, and confidence bounds for your predictions.

4 Likes

In this case, though, I need:

``````new_x = model(new_y, p)
``````

if I put the new_y as the level of 500 000 that I want, with the fitted p, I get two million, that is obviously wrong. How do I get X given Y?
Thanks

can you try using this parametrization? to see what comes out?

``````f1(x,p) = p ./ (p .* x .^ p .- 1.0)
param = [2.165e6, 10.0, -2.0]
``````

also, those points look a lot like bode plots for amplitude of response systems

I get:

``````julia> f1(x,p) = p ./ (p .* x .^ p .- 1.0)
f1 (generic function with 1 method)
julia> param = [2.165e6, 10.0, -2.0]
3-element Vector{Float64}:
2.165e6
10.0
-2.0
julia> fit_x = f1(500000, param)
-2.1650000000866e6
``````

Does it mean that to get X given Y (Y->X) I need the inverse function and then run X->Y?

Yes, given a function `f : x, p -> y`, you’ll need to either

1. Fit `f` to your `x` and `y` data (i.e. find appropriate `p`), then find the inverse `g : y, p -> x` and use those `p` to find `x_n` given `y_n`,
2. Find `g`, then fit it to your `y` and `x` data to find `x_n` given `y_n`, or
3. Fit `f` to your `x` and `y` data, then use some numerical solving method to find `x_n` given `y_n`

The two former methods should be roughly equivalent, and in my opinion superior to the last one as long as `f` is (easilly) invertible (always do analytics as far as you can, involving numerics only at the last step). I think I would go with option 1, because it’s more immediately intuitive.

I used symbolab to find the inverse function of the logistic and followed option 1:

``````julia> model = true_logistic(x, p) = p ./ (1 .+ exp.(-p.*(x.-p))) # set regression
true_logistic (generic function with 1 method)
julia> inverse_logistic(x, p) = ( p .* p .* exp.(-p .* (x .- p)) ) ./ ( (1 .+ exp.(-p .* (x .- p))^2 )
inverse_logistic (generic function with 1 method)
julia> fit = curve_fit(model, x, y, par)
LsqFit.LsqFitResult{Vector{Float64}, Vector{Float64}, Matrix{Float64}, Vector{Float64}}([2.1647983679173873e6, 0.7211511621759724, 13.214716462754794], [57.26888702184834, 65.57394242932473, 97.69294768592027, 169.4743733818491, 308.55542081798865, 593.6144246584468, 1148.52487052776, 2116.5614422187937, 3246.5892025041103, 2642.0985198940616  …  2621.5980348929297, 1631.373413761612, 194.25900849187747, -341.1922586141154, -480.2387717682868, -499.5315029202029, -501.42603116389364, -500.9472773480229, -502.1486880327575, -490.7844598405063], [7.264828417848866e-5 -2078.1127703397933 -113.40640133348667; 8.51168462811274e-5 -2394.275920789786 -132.86858696614163; … ; 1.0000000000034532 7.689947814299762e-5 0.0; 1.0000000000034532 0.0 0.0], true, Float64[])
julia> P = fit.param
3-element Vector{Float64}:
2.1647983679173873e6
0.7211511621759724
13.214716462754794
julia> fit_x = inverse_logistic(Xp, P)
0.0
``````

but the expected value should be a bit more than 11…

Just as a sanity check, try

``````plot(x,y;st=:scatter)
plot!(x,true_logistic.(x,Ref(P)))
plot!(inverse_logistic.(y,Ref(P)),y)
``````

These should all look about the same. Without looking too close into your inverse function, it looks a bit strange that `p` is in the numerator of both functions, I would expect it to be in the delimiter for the inverse function (I.e. if the “amplitude” of `f` is large, the “amplitude” of `f^{-1}` should be small).

Having looked a bit closer at your function, i.e. manually solving `y = a/(1+exp(-b(x-c)))` for `x`, the inverse function should be

``````g(y,p) = p - log(p/y - 1)/p
``````

No idea what the inverse calculator was doing.

1 Like

interesting! I used the formula given by Wikipedia: and I changed L=a, k=b, x₀=c, thus: I’ll try your inverse, thank you!