# Nested Multivariate Regression

Title: Nested Multivariate Regression
Appreciate some guidance and sample codes. I’m new to Julia language, but have used Python and R a bit. This is a curve fitting / regression of data (csv) exercise. The equation to fit is as below.

``````w(x, h) = s(x) + (1 - s(x))*(e(x) / h) ^ (1 / n(x)
``````

whereby s(x), e(x) and n(e) can be a constant, linear, power, exponential or logarithmic format, as below.

``````	s_con(x) = a
s_lin(x) = a * x + b
s_log(x) = a * log10(x) + b
s_pow(x) = a * x ^ b
s_exp(x) = a * exp10(x * b)

e_con(x) = c
e_lin(x) = c * x + d
e_log(x) = c * log10(x) + d
e_pow(x) = c * x ^ d
e_exp(x) = c * exp10(x * d)

n_con(x) = e
n_lin(x) = e * x + f
n_log(x) = e * log10(x) + f
n_pow(x) = e * x ^ f
n_exp(x) = e * exp10(x * f)
``````

The intended workflow/algo is a below.

``````s_list = [s_con, s_lin, s_log, s_pow, s_exp]
e_list = [e_con, e_lin, e_log, e_pow, e_exp]
n_list = [n_con, n_lin, n_log, n_pow, n_exp]

for s in s_list
for e in e_list
for n in n_list
w(x, h) = min(1, max(0, s + (1 - s) * (e / h) ^ ( 1 / n)))
-- doing the solver to determine paramater of a, b, c, d, e, f
-- reuse solved w(x, h) for later analysis, e.g. plot and reports.
end
end
end
``````

The plot & report will be analyzed to determine best match s(x) e(x) and n(x) for w(x, h). Thank you.

I am not sure that there is an existing package that can do all of this, but you could try looking at

for fitting some of the models. Also, you can just write a likelihood and optimize with

That said, my major concern is conceptual: a fishing expedition where both `log` and `exp` are considered may not be a good starting point for data analysis. Also the nesting of highly nonlinear forms will yield parameters that will be very sensitive to outliers, or even just noise.

The real question is how to iterate array/ list of possible functions and aggregate into complete function for the optimisation step. I’ve done a bit with some of the solvers, e.g. LsqFit or Optim. I may try others, e.g. JuMP and GLM, as suggested. This is not social experiment data, but physics. The model is empirical at best, thus the quest for best combination (of 125 combinations) to fit the data.

I would consider something like

``````function do_everything(data, s, e, n)
# do the optimization
# return whatever you need as NamedTuple
end

results = [do_everything(data, s, e, n)
for s in s_list, e in e_list, n in n_list]
``````

Sorry for my ignorance <- Julia newbie. Is it possible to do list/array of functions like below?

s_list = [s_con, s_lin, s_log, s_pow, s_exp]

This is just my conceptual algorithm.

Yes, you can create a vector of arbitrary objects.

Thanks very much. Will try it out later this week (or next).

After many attempts, I ended up got it working in R. Will retry Julia when time is more forgiving.

Sharing the solution to my initial problem.

``````using DataFrames, CSV, Cairo, Gadfly, Dates, Optim, Compose, Statistics

# Julia ver. 1.4.0, Optim ver. 0.20.1 # required for code below

struct MySimplexer <: Optim.Simplexer end
Optim.simplexer(S::MySimplexer, initial_x) = [rand(length(initial_x)) for i = 1:length(initial_x)+1]

function loop_all(loop_flag)

function ww(param, x1, x2, loop_flag)

if loop_flag == 1
ss = param * x1 + param # linear
elseif loop_flag == 2
ss = param * log10(x1) + param # logarithmic
elseif loop_flag == 3
ss = param * x1 ^ param # power
elseif loop_flag == 4
ss = param * exp(x1 * param) # exponnenntial
else
ss = param + param # connstannt
end
ss = min(1, max(0, ss))

if loop_flag == 1
ee = param * x1 + param # linear
elseif loop_flag == 2
ee = param * log10(x1) + param # logarithmic
elseif loop_flag == 3
ee = param * x1 ^ param # power
elseif loop_flag == 4
ee = param * exp(x1 * param) # exponnenntial
else
ee = param + param # connstannt
end
ee = max(0, ee)

if loop_flag == 1
nn = param * x1 + param # linear
elseif loop_flag == 2
nn = param * log10(x1) + param # logarithmic
elseif loop_flag == 3
nn = param * x1 ^ param # power
elseif loop_flag == 4
nn = param * exp(x1 * param) # exponnenntial
else
nn = param + param # connstannt
end

ww_ww = ss + (1 - ss) * (ee / x2) ^ (1 /  nn)
return min(1, max(0, ww_ww))
end

function sqerror(param)
err = 0.0
for i in 1:length(yy)
pred_i = ww(param, x1[i], x2[i], loop_flag)
err += (yy[i] - pred_i)^2
end
return err
end

init_param = [-4.2, 1.6, 4.0, -2.2, 2.5, 4.8]

res = optimize(sqerror, init_param, nnelderMead(initial_simplex = MySimplexer()))
par = Optim.minimizer(res)

return 0
end

for i in 1:5
for j in 1:5
for k in 1:5
loop_flag = [i, j, k]
loop_all(loop_flag)
end
end
end``````