Help with dot macro broadcasting

optim
broadcasting
#1

Hi there,

I’m trying to minimize a residual function f using LsqFit.jl since it is the only package that uses Levenberg-Marquardt (trying to do a course where the instructions and assignments assume Matlab usage, and we were explicitly asked to use LM).

function residual(t,k,p)
    ...
end

which is a function of two variables t and k, and I want to optimize the parameter set p.

I’ve run through the single variable example, but still trying to wrap my head around the multivariate example.

I’ve tried doing something like this:

@. model(x,p) = residual(x[:,1], x[:,2], p)
xdata = rand(2,2)
ydata = rand(2)
p = rand(6)
fit = curve_fit(model, xdata, ydata, p)

But end up with a Dimension Mismatch error during broadcasting.

I’ve narrowed the dimension mismatch error to just calling the model

model(xdata, p)

but I have no idea why.

Simple functions like

@. model(x,p) = x[:,1] + x[:,2] + p[1]

evaluate correctly. I think I’m just not understanding the @. broadcasting stuff? Would be great if I could get some clarification.

Thanks!

1 Like

#2

@. converts every function call into a dot call. So you are effectively calling residual.(x[:,1], x[:,2], p). Here you have the dimension mismatch: x[:,1] and x[:,2] are length 2 but p is length 6.

0 Likes

#3

Is there a way to leave p out of the dot call?

edit:

To be more clear, the example on LsqFit.jl has us do

@. model(x, p) = p[1]*exp(-x*p[2])
xdata = range(0, stop=10, length=20)
ydata = model(xdata, [1.0 2.0]) + 0.01*randn(length(xdata))
p0 = [0.5, 0.5]

and evaluating works fine

model(xdata, p0)

even though xdata and p0 have different dimensions, so why isn’t a dimension mismatch happening here?

0 Likes

#4

for what i understand of broadcasting, normaly there is broadcast with one variable, so, if you have:

f(x)=2x

the broadcasting works by applying the function over the entire vector:

f.(rand(6,6))= 2*rand(6,6)

in LM, you have a vector f(x,p) = y where x an y are values, and p is a set of parameters(where X and Y can have a lot of values, the length of the vector p is defined by the formulation of the function. by broadcasting, you are setting an operation that executes the operation f over a array of values(lets say X), giving an array of results, (Y).

Now, i will suppose thatresidual(t,k,p) operates over a single value t and k, using the parameters in p., using the example in Lsqfit.jl:

residual(t,k,p) = (x, p) = p[1]*exp(-t*p[2]+k*p[3]) #without the @. macro, 
#just from single values t and k to a single value y = residual(t,k,p) 

now coming to julia, and understanding that we iterate over x and not over p, we can do a simple loop:

function multimodel(x,p) 
y = zeros(size(x)[1])
    for i = 1:size(x)[1] #length of x
    y[i] = residual(x[i,1],x[i,2],p)
    end
return y
end

there are ways to vectorize and mantain p, like using Ref() , but I’m not an expert julia programer myself :smile: , so i will go with a for loop, because loops in julia are fast.

finaly, using the same data:

xdata = rand(2,2)
ydata = rand(2)
p = rand(3)
fit = curve_fit(model, xdata, ydata, p)

can give you a result. (it gave me one, at least)

Maybe is longer that expected, but it works, and you understand what is happening

2 Likes

#5

As longemen3000 said, you can use Ref

model(x,p) = residual.(x[:,1], x[:,2], Ref(p))

LsqFit.jl example is slightly different. There, the right hand side is simply converted to

p[1].*exp.(-x.*p[2])

So p[1] and p[2] are broadcasted as scalars, i.e they are expanded to match the dimension of x.

2 Likes

#6

Thank you! your multimodel function explained to me clearly what the @. is doing

1 Like