 Is there a findmin(A) that uses local search instead of visiting all elements of matrix A?

I have a large matrix that contains sampled values of a convex 2D function. I want to find the minimum efficiently without visiting all elements, given indexes for a good starting guess close to the minimum.

For example:

x = LinRange(-3,3,100)
y = LinRange(-3,3,100)
z = (x .- 1.1).^2 .+ (y' .- 1.1).^2  # my matrices are much larger than 100x100

julia> findmin(z)
(0.0008999081726354381, CartesianIndex(69, 69))

julia> findminopt(z, CartesianIndex(65, 67))
# same answer but uses local search starting at (65,67)

Is there a package for simple hill climbing that operates on matrices? So something like Optim.jl except discrete instead of continuous optimization (and with a matrix instead of an explicit function to minimize).

1 Like

Aren’t you rather looking for the minimum of the function f(x,y) = (x-1.1)^2+(y-1.1)^2 ?

If so you should use an optimization package, like Optim.jl, for example, working directly on the analytic function f.

I’n not aware of any function that searches locally in discrete matrices for just the nearest local minimum, but it should be straightforward to roll your own. Just look at the neighboring element, move to the smallest, and stop when you are at a local minimum.

5 Likes

You could use Interpolations.jl to get a continuous function of the position in z, and then Optim.jl to minimise.

1 Like

Thanks but no, that was just an example. I don’t have the explicit function. I did end up rolling my own implementation of Coordinate descent:

function choosedirection(A, index, deltas)
best, i = findmin(get(A, index + d, Inf) for d in deltas)
dir = deltas[i]
return dir, best, index + dir
end

function descend(A, dir, best, index)
while true
newindex = index + dir
val = get(A, newindex, Inf)
if val < best
best = val
index = newindex
else
return best, index
end
end
end

function coordinate_descent(A, index::CartesianIndex{2})
same = CartesianIndex(0, 0)
deltas = CartesianIndices((-1:1,-1:1))
while true
dir, best, index = choosedirection(A, index, deltas)
dir == same && return best, index
best, index = descend(A, dir, best, index)
end
end

julia> @btime findmin(\$z)  # see calculation of z in my first post
19.900 μs (0 allocations: 0 bytes)
(0.0008999081726354381, CartesianIndex(69, 69))

julia> @btime coordinate_descent(\$z, \$CartesianIndex(60,60))
145.530 ns (1 allocation: 32 bytes)
(0.0008999081726354381, CartesianIndex(69, 69))

julia> @btime coordinate_descent(\$z, \$CartesianIndex(1,1))
210.467 ns (1 allocation: 32 bytes)
(0.0008999081726354381, CartesianIndex(69, 69))

I’m sure this can be improved upon, but I’m happy with the 100x speedup over findmin() in this example. (Of course this only works when you know the matrix was sampled from a convex function.)

5 Likes

Your very nice code seems to work well for “locally convex” functions too:

x = y = LinRange(-10,10,1001)
f(x,y) = @. 2x^3 + 6x*y^2 - 3y^3 - 150x
zmin, ci = coordinate_descent(f.(x,y'), CartesianIndex(800,600))  # (-500.0, CartesianIndex(751, 501))

using Plots; gr()
heatmap(x, y, f)
contourf!(x, y, f, levels=20, clims=(-1500,1500), widen=false)
scatter!([x[ci]], [y[ci]], ms=6, mc=:cyan, label="Array local min")  # (5.0, 0.0) --> local min of F

With proper initial guess, the array local minimum was found spot on! 5 Likes

This is a clever solution for handling the boundary. I didn’t realize get worked for arrays. But Inf as default value seems like recipe for type instabilities. How about

get(A, index + d, A[index])

and

get(A, newindex, best)

?

1 Like

Oops, I didn’t notice that potential instability since my example and real matrices were floats. Thanks! Also please note that I hacked this together in an hour, so use it at your own risk!

Your second suggestion is fine but the first one costs an extra array lookup. Also it would error if the main function is called with an out of bounds index. Maybe typemax(eltype(A)) instead?

3 Likes

Indeed, I was aware of that and just wrote it as a shorthand for for

a = A[index]
get(A, index + d, a)

But, ok, that’s still one extra lookup. The typemax solution is probably a good idea anyway.

Yeah, but maybe that should be an error anyway? I mean, how far out of bounds should you allow?

2 Likes

One of the main advantages of CartesianIndices is that they allow N-dimensional generalization. FWIW, please find a suggestion below:

function coordinate_descent(A::AbstractArray, index)
ci = CartesianIndex(index)
same = zero(ci)
deltas = -oneunit(ci):oneunit(ci)
while true
dir, best, ci = choosedirection(A, ci, deltas)
dir == same && return best, ci.I
best, ci = descend(A, dir, best, ci)
end
end

where the input argument index is now a tuple of integer indices.

4 Likes