ForwardDiff: fast construction of vectors/matrices without cats

Hi,

I’m trying to use Optim to do maximum-likelihood estimation of some parameters in a custom likelihood model. In particular, I would like to incorporate autodiff to increase the accuracy of my estimates. However, setting the parameter autodiff = :forward causes errors in code that looks (approximately) like this:

function test(vec_len, p1)
    p = ones(Float64, vec_len) 
    for i = (vec_len - 1):-1:1 
        p[i] = p[i + 1] * p1
    end
    
    return p
end

I suspect it is because of the assignment into a preallocated array, which similar autodiff packages in other languages forbid. Indeed, I am able to get around this if, instead of preallocating the vector, I build it up by concatenation, doing something like p = hcat(p[1] * p1, p) inside the loop. However, this becomes pretty slow (I imagine because of the lost preallocation).
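
Roughly, the concatenation-based workaround I have in mind looks like this (just a sketch; I use vcat here since the vector grows one element at a time, and treat p1 as a scalar as in the snippet above):

function test_growing(vec_len, p1)
    p = [one(p1)]                  # start from the final element, which is 1
    for _ = 2:vec_len
        p = vcat(p[1] * p1, p)     # prepend the next element; allocates a fresh vector every iteration
    end
    return p
end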

Thus, my question: is there a way to construct such vectors in Julia either with preallocation or in some other way that doesn’t take a huge performance hit?

For completeness, the error message I get when trying to run the following code snippet:

function test(vec_len, p1)
    
    p = ones(Float64, vec_len) 
    for i = (vec_len - 1):-1:1 
        p[i] = p[i + 1] * p1
    end
    
    return p
end

lower = [0.0, 0.0, 0.0]
upper = [1.0, 1.0, 1.0]
x0 = [0.95]

result = optimize(x -> test(15, x), lower, upper, x0, Fminbox(LBFGS()), autodiff = :forward)
Optim.minimizer(result)

is:

MethodError: Cannot `convert` an object of type Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(Main, Symbol("##25#26")),Float64},Float64,1},1} to an object of type Float64

Any help would be appreciated, thanks.

I could be wrong, but I think it is because you specifically request the vector to hold Float64 values, which does not work with the dual numbers that AD packages use. You need to widen the element type.

Your code needs to be more flexible: don't hard-code the element type of p as Float64; use p = similar(p1) or p = zeros(eltype(p1), vec_len) instead. Otherwise, the assignment in the loop tries to convert a dual (the product of a float and a dual is a dual) into a float, which fails.
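
For example, outside of Optim, the difference looks like this (a minimal illustration only; the Dual is constructed by hand just to show the conversion problem):

using ForwardDiff

d = ForwardDiff.Dual(0.95, 1.0)   # a dual number, similar to what ForwardDiff feeds your objective

p_bad = ones(Float64, 3)
# p_bad[1] = p_bad[2] * d         # MethodError: cannot convert a Dual to Float64

p_good = ones(typeof(d), 3)       # element type is now the Dual type
p_good[1] = p_good[2] * d         # fine: a Dual times a Dual is a Dual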

Oh, oops, I did actually try this before. Setting p = ones(eltype(p1), vec_len) yields a similar error, unfortunately:

MethodError: Cannot `convert` an object of type Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(Main, Symbol("##31#32")),Float64},Float64,1},1} to an object of type Float64

I am new to Julia, so this might be inappropriate syntax, but setting p = similar(p1, vec_len) yields an identical error.

Yes, I was thinking about this as well but I wasn’t sure what else to change it to. Attempts along the lines of p = ones(eltype(p1), vec_len) fail (see above).

I also tried specifying ForwardDiff.Dual as the element type; this yields a different error: Cannot determine ordering of Dual tags ForwardDiff.Tag{getfield(Main, Symbol("##17#18")),Float64} and Nothing. I don't really understand what it means.

I don’t think test is doing what you intend.

You are passing in x as a 1-element vector, so
p[i] = p[i + 1] * p1 doesn’t work. You probably mean p1[1].

However, the bigger issue is that return p returns a 15-element vector (in your case). Optim needs the objective to return a scalar. Perhaps you mean return sum(p)?

function test(vec_len, p1)
    p = ones(eltype(p1), vec_len)
    for i = (vec_len - 1):-1:1 
        p[i] = p[i + 1] * p1[1]
    end
    
    return sum(p)
end

You may also want to read the Optim documentation: Optim.jl

Once you have something working, you could try to improve the speed.

Apologies, you are correct. In attempting to write a self-contained example, I overlooked this. As you note, running

function test(vec_len, p1)
    p = ones(eltype(p1), vec_len)
    for i = (vec_len - 1):-1:1 
        p[i] = p[i + 1] * p1[1]
    end
    
    return sum(p)
end

lower = [0.0]
upper = [1.0]
x0 = [0.95]

result = optimize(x -> test(15, x), lower, upper, x0, Fminbox(LBFGS()), autodiff = :forward)
Optim.minimizer(result)

does produce the correct answer on this toy example.

However, in my actual code (which does run correctly without autodiff = :forward), the function analogous to test above produces a vector of probabilities that is then used to calculate a scalar likelihood. I'll post later today with an attempt to derive a minimal subset of that code that produces this error.
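
In the meantime, the general pattern I'm aiming for looks roughly like this (loglik, probs, and data are placeholder names, not my actual model):

function loglik(θ, data)
    T = eltype(θ)                          # Float64 in a plain call, a ForwardDiff.Dual type under autodiff
    probs = ones(T, length(data))          # intermediate vector of probabilities
    for i = (length(data) - 1):-1:1
        probs[i] = probs[i + 1] * θ[1]     # stand-in for the real recursion
    end
    return -sum(log, probs)                # return a scalar for Optim to minimize
end

which I would then call as optimize(θ -> loglik(θ, data), lower, upper, x0, Fminbox(LBFGS()), autodiff = :forward), as above.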