The call to Statistics.var creates a dense quadratic matrix, so with n = 1_000 there are 500,500 terms in your objective function.
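To see where 500,500 comes from: a dense quadratic form in $n$ variables has one coefficient per unordered pair $(i, j)$ with $i \le j$ (off-diagonal pairs plus the diagonal):

$$\binom{n}{2} + n = \frac{n(n+1)}{2} = \frac{1000 \cdot 1001}{2} = 500{,}500 \quad \text{for } n = 1000.$$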
Do instead:
```julia
julia> using JuMP, Statistics

julia> function main(n)
           model = Model()
           @variable(model, x[1:n], Bin)
           @objective(model, Min, var(x .* rand(n)))
           return model
       end
main (generic function with 1 method)

julia> function my_main(n)
           model = Model()
           @variable(model, x[1:n], Bin)
           @variable(model, μ)
           @expression(model, y, x .* rand(n))
           @constraint(model, μ == sum(y) / n)
           @objective(model, Min, sum((yi - μ)^2 for yi in y))
           return model
       end
my_main (generic function with 1 method)

julia> @time main(200);
  1.320286 seconds (16.76 k allocations: 598.565 MiB, 4.88% gc time)

julia> @time my_main(200);
  0.000551 seconds (7.21 k allocations: 765.625 KiB)
```
Many thanks. The speedup is excellent. Could you please explain more about “the call to Statistics.var creates a dense quadratic matrix”? Did you mean that allocating the dense matrix is expensive? But the memory allocation by itself should not be that costly.
When dealing with numbers:
```julia
julia> using Statistics

julia> x = rand(1000);

julia> @allocated var(x)
16
```
The other issue is the sheer number of operations. The default implementation in `Statistics.var` does not exploit common subexpressions, so building the expression is an O(n^3) operation: expanding each squared term costs O(n^2), and the loop over `yi in y` contributes the extra factor of n.
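Concretely, writing the mean out in full, each summand of the variance is the square of a linear expression with $n + 1$ terms, so expanding it produces $O(n^2)$ products:

$$\sum_{i=1}^{n}\Bigl(y_i - \frac{1}{n}\sum_{j=1}^{n} y_j\Bigr)^2 = \sum_{i=1}^{n}\Bigl(y_i^2 - \frac{2}{n}\, y_i \sum_{j=1}^{n} y_j + \frac{1}{n^2}\sum_{j=1}^{n}\sum_{k=1}^{n} y_j y_k\Bigr).$$

Without reusing the shared double sum $\sum_j \sum_k y_j y_k$ across summands, naively expanding all $n$ of them touches $O(n^3)$ products. Introducing the auxiliary variable for the mean, as in `my_main`, sidesteps the expansion entirely.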
```julia
using JuMP, Statistics
import Plots  # needed for the plots below

function main(n)
    model = Model()
    @variable(model, x[1:n])
    return var(x)
end

main(10);  # warm up compilation before timing

x = [10, 25, 50, 75, 100, 125, 150, 175, 200, 225, 250]
y = [(@elapsed main(n)) for n in x]
a = Statistics.mean(y ./ x.^3)
Plots.plot(x, y)
Plots.plot!(x, a * x.^3)
```