The function to optimize in JuMP


#1

Folks, my prior experience is mainly in Matlab and I’m new to Julia, so forgive me if my question is basic.

I’ve written a likelihood function in Jupyter and want to maximize this function in JuMP. Since it’s 600 lines or so, I don’t want to type it directly in the REPL window/shell and don’t think it would work that way. I’ve saved it as a .jl file.

I find that functions that work fine on their own aren’t recognized when I try to maximize them in JuMP, at least if the function exists only inside a Jupyter file. To give a simple example, I type
include(“gammalikely.jl”)
data=[1;4;2;6;12;2]
M=Model(solver=IpoptSolver())
@variable(M,alpha>=0,start=0.0)
@variable(M,Beta>=0,start=0.0)
@NLobjective(M,Max,gammalikely2(alpha,data,Beta))
and get
"ERROR: Unrecognized function “gammalikely2” used in nonlinear expression"

Do you know how I can maximize a function if it’s defined in a file? Thanks in advance.


#2

You need to ‘register’ your user-defined function as described in the documentation here. I’ve found that section of the docs to be a bit confusing so give it a try and post back if you still have problems?


#3

Thanks. I’d thought JuMP.register() might be the key function but then I was confused becayse mysquare(x)=x^2 seemed to be typed directly into the command window in their example.

It still doesn’t seem to be working for me. Using the example of a function of mine, “gammalikely2”, JuMP.register works without complaint from the program, I define my 2 variables, but then at the NLobjective line it complains that

“ERROR: Unrecognized function “gammalikely2,jl” used in nonlinear expression”. My goal is to maximize the function gammalikely2 by optimizing the two parameters, alpha and beta, given the data inputted. I also tried typing just “gammalikely2” instead of “gammalikely2.jl”. I’ve attached a screenshot of the last 11 commands leading up to the error.

I’m also confused by the note in the documentation that “All vectors to user-defined functions are scalars, not vectors.” It seems that in the very example they give, x[1:2] is a vector with two elements. Also, what about data? Can’t that be an input? It’s not a scalar.


#4

OK, I think I solved my JuMP problems for now. The problem about the “unrecognized function” after the NLobjective line was because the third argument in the @NLobjective line should be the second argument, not the fourth argument, in the JuMP.register line above, but without the colon. To use data, I can define a function in the REPL window to use data of the dataset of my choice.

fn1(alpha,beta)=gammalikely2(alpha,data,beta)

My new problem is related to the function itself. It returns a number of type “Void”, which the JuMP optimization won’t accept. I’m not sure why or what type “Void” means, I understand integer and floating point. The convert function can’t seem to fix the problem.

fn1(alpha,beta)=convert(Float64,gammalikely2(alpha,data,beta))

fn1(1.2,2)

ERROR: MethodERrror: Cannot convert an object of type Void to an object of type Float64.

since Gammalikely2 is only 9 lines, I’ve attached it to take a look.


#5

I don’t see the attachment. Just copy and paste those 9 lines into a post, with triple backticks ``` before and after for code formatting? I’m happy to take a look.


#6

Thank you for your patience. The code for that function, including the print lines, is


function gammalikely2(alpha,data,Beta)

x=data[:,1]

println(typeof(x))

nn=size(data,1)

println(typeof(nn))

println(typeof(ones(1,nn)))

y=zeros(nn,1)

println(typeof(y))

for i=1:nn

    y[i]=(alpha-1)*log(x[i,1])-x[i,1]/Beta-log(gamma(alpha))

-alpha*log(Beta)

end

gammalikely2=ones(1,nn)*y

println(typeof(gammalikely2))

println(gammalikely2)

end

gamma (not gammalikely2) is the the gamma function and comes with Julia.


#7

Your function doesn’t actually return anything, so the return type is Void.


#8

It seems to return something to me. For example, if I type

data=[1;2;1;3;.5;1]

gammalikely(1.2,data,1.5)

It states the value of the function gammalikely2,

-7.45023

for those parameter values and data. For other parameter values or different data it returns other values.

So it seems that I must misunderstand the meaning of “return” something. If it computes the value of the variable gammalikely2 and prints it out for me, how can I change the function so that it “returns” the value of gammalikely2? Why doesn’t it “return” anything?


#9

Pardon the typo. I meant to type

gammalikely2(1.5,[1;2;1;3;.5;1],1.2)

results in the screen printing out -7.45023 in the 6th line of output, and the final line of the code is to print the value of gammalikely2, so I thought it was returning the value of gammalikely2.


#10

Returning and printing are 2 different things! Use typeof(func(args)) to figure out the type of the returned value, which is the returned value of the last line in the function by default unless you use return explicitly, print returns nothing and typeof(nothing) is Void.


#11

Thanks.

I replaced the print line at the end with return gammalikely2 and the function returns a type Float64. The function still doesn’t optimize in JuMP for me, though. The new error seems to have something do with the automatic differentiation. I type

data=[1;0;1;0;1;1]

myfun(alpha,Beta)=gammalikely2(alpha,data,Beta)

using JuMP

using Ipopt

m=Model(solver=IpoptSolver())

JuMP.register(m,:myfun,2,myfun,autodiff=true)

@variable(m,alpha>=0,start=1)

@variable(m,beta>=0,start=1)

@NLobjective(m,Max,myfun(alpha,Beta))

solve(m)

and get the error message

MethodError:Can not convert an object of type ForwardDiff.Dual{2,Float64} to an object of type Float64.

I don’t understand what that error message means and how to avoid getting that error. I haven’t found the 1 page “limitations of ForwardDiff” PDF to be helpful in writing a function that Julia can use automatic differentiation on. Any help avoiding this error and getting automatic differentiation to work is appreciated.

The code for gammalikely2 is as follows

``

function gammalikely2(alpha,data,Beta)
x=data[:,1]
nn=size(data,1)
y=zeros(nn,1)
for i=1:nn
y[i]=(alpha-1)log(x[i,1])-x[i,1]/Beta-log(gamma(alpha))-alphalog(Beta)
end
gammalikely2=ones(1,nn)*y
gammalikely2=squeeze(gammalikely2,1)
gammalikely2=gammalikely2[1]
return gammalikely2
end


#12

First off your function is ill-defined,

There is a missing operator between alpha and log(Beta) and you are taking log of 0 when x[i] is 0 which returns NaN. You should first run your function and make sure it does what you think it should.

Beta here should be beta too.


#13

In the actual function definition there isn’t a missing operator. It’s

for i=1:nn

y[i]=(alpha-1)log(x[i,1])-x[i,1]/Beta-log(gamma(alpha))-alphalog(Beta)

end

However, because the interface makes it almost impossible to copy and paste (I highlight the text, right click, and the line(s) that I just highlighted disappear so I can’t copy!) there are inevitable typos in the process of transcribing the function definition into this email.

As I said in my previous email, I’ve tested the function and for positive values of alpha and beta it gives a number, and that number is Float64. For example,

data=[3;4;7;6;4]

gammalikely2(1.5,data,2)

-12.79

typeof(gammalikely2(1.5,data,2))

Float64


#14

This version works.

using JuMP
using Ipopt

gammalikely2(alpha,data,beta) = reduce(+,
    (alpha-1) * log.(data) .- data / beta - 
    log(gamma(alpha)) - alpha * log(beta))

data=[3;4;7;6;4]
myfun(alpha,beta)=gammalikely2(alpha,data,beta)
m=Model(solver=IpoptSolver())
JuMP.register(m,:myfun,2,myfun,autodiff=true)
@variable(m,alpha>=0,start=1.)
@variable(m,beta>=0,start=1.)
@NLobjective(m,Max,myfun(alpha,beta))
solve(m)

Apparently one or more of the features you used in your function are not supported by the package used for autdiff which may or may not be expected. I will check it later.


#15

Thanks.

Would you or anyone else know the key difference that caused that version to work unlike my version? The fact that the summation is accomplished with reduce(+,…) rather than premultiplication by a row vector of ones? Or can anyone explain the error "can not convert

an object of type ForwardDiff.Dual{2,Float64} to an object of type Float64" so I can avoid it in the future? What is this ForwardDiff.Dual{2,Float64}, the vector of partial derivatives? Why does it have to be converted to a Float64 scalar?

You see, I have a much larger (about 500 line) function that I want to apply Julia’s automatic differentiation on, one that I imagine no one else has the time to read. Does it matter if the places where various parameters can affect the function value are not confined to a single line?


#16

For more details on ForwardDiff.Dual, you can refer to this nice talk from 2016 https://www.youtube.com/watch?v=r2hhRSHiQwY.

After a bit of fiddling, turns out the problem was that you were initializing y to be a Float64 zeros vector and then trying to modify its elements using a function of alpha and beta which does not return Float64 during the autodiff. During autodiff, dual numbers are passed in place of the arguments and this causes an implicit conversion request from ForwardDiff.Dual{2,Float64} to Float64 when trying to set the elements of y which obviously was not supported. To work around this, I used y = zeros(typeof(alpha),nn,1) and it worked. I am not sure if this qualifies as an issue/bug or not but at least you know what is going on.


#17

So is the moral that I always want to use functions that return Float64, or some precision of floating point, during autodiff? Or more generally that I should always use functions that return the same number type as the numbers they act on during autodiff? How can I be sure that they do this?


#18

In most cases Julia’s type inference will take care of things for you. As a general note, not just related to autodiff, try to avoid assigning the same name to data of different types, this so-called type stability. For arrays you will have to be a little careful because the type of an array does not change automatically when setting an element of the array to a different type. Instead, implicit conversion of your new element to the type of the array elements will be attempted. If unsuccessful, an error will be thrown like the one you got. So you will have to predict such error and choose the type of your array carefully as the type that will happily encompass all your data. And of course Julia will save you most of the time, because type inference works on whole array assignments that will just pick the right type for your new array. An example is ones(1,nn)*y from your previous case which produces a new array of the right element type that comes out of the addition of products of the Float64 and ForwardDiff.Dual elements. So you only have to be careful when initializing and handling array elements yourself to make sure you don’t face any type conversion error. And even then you might still get some of these errors, but that’s what debugging is for!

That’s probably the best advice I can give in that matter, AFAIK.


#19

The JuMP documentation (now correctly) links to this ForwardDiff documentation page which answers some of your questions.


#20

Thanks and sorry for being so slow about this. I’ve looked at that page entitled “Limitations of ForwardDiff”. I’m not sure which of those 4 bullet points on that page my error message,
“can not convert
an object of type ForwardDiff.Dual{2,Float64} to an object of type Float64” or more generally, "can not convert an object of type ForwardDiff.Dual{k,Float64}…"goes under.

Does it go under #4, “the types of array inputs must be subtypes of AbstractArray”? If so can I avoid this by always using the type Array, since all arrays are subtypes of AbstractArray? Or are they?