# Very slow array creation (and Benchmark iterate error)

I am a beginner Julia user and am having a hard time developing a model because it takes a very long time to initialize certain arrays. I tried using `BenchmarkTools.jl` to get an accurate time but it generated an iterate error that I cannot replicate. Anyway, I think the time with a smaller sample size (`NumTotalSpecies`) used for testing takes too long. Could I have some suggestions for my code?

``````time_FoodWeb = @time create_FoodWeb(BM, Ropt,γ)
82.159197 seconds (155.09 M allocations: 76.550 GiB, 12.35% gc time)
161.635944 seconds (383.93 M allocations: 195.124 GiB, 12.91% gc time)
51.302236 seconds (125.56 M allocations: 63.508 GiB, 14.34% gc time)
``````
``````using Distributions, Random

const D = 0.25
const Ropt = 100
const γ= 2
const q = rand(Normal(0.5,0.2)) #Hill exponent, μ = 0.5, σ = 0.2

#const = #was a constant but removed to allow testing

NumPlantSpecies = 8 #const
NumAnimalSpecies = 12 #const
NumTotalSpecies = NumPlantSpecies + NumAnimalSpecies #const

BM_P = sort(rand(Uniform(10^0, 10^6), NumPlantSpecies)) #body mass of Plant species #const
BM_A = sort(rand(Uniform(10^2, 10^12), NumAnimalSpecies)) #body mass of Animal species #const
BM = vcat(BM_P,BM_A) #placing BM_P and BM_A together under BM as mass of all species #const

function create_FoodWeb(BM,Ropt,γ)
function create_L(BM,Ropt,γ)
L = @. ((BM/(BM'*Ropt))' * exp(1-(BM/(BM'*Ropt))'))^γ
# Weak links (Lij <p) are removed
L[:,1:NumPlantSpecies] .= 0.0
L[L .<= 0.01] .= 0.0
return(L)
end

condition = true
while condition
BM_P = sort(rand(Uniform(10^0, 10^6), NumPlantSpecies))
BM_A = sort(rand(Uniform(10^2, 10^12), NumAnimalSpecies))
BM = vcat(BM_P,BM_A)

global L = create_L(BM,Ropt,γ)
#take the link probabilities of the previous matrix and determine if L[i,j] > p
global FoodWeb = zero(L)
FoodWeb[:,1:NumPlantSpecies] .= 0.0 # keeps plants as basal species
FoodWeb[L .> rand.(Uniform(0, 1))] .= 1.0 # Sets F[i,j] = 0  wherever  L[i,j] < F[i,j]

condition = any(sum(x->x>0, FoodWeb, dims=1)  + sum(x->x>0, FoodWeb, dims=2)' .== 0.0)
#dims=(1) for column, dims=(2) for row
end
return(BM, L, FoodWeb)
end
``````

Here I wanted to use `permutedims` instead of `'` because `BM` is an array but this leads to an error.

``````        L = @. ((BM/(BM'*Ropt))' * exp(1-(BM/(BM'*Ropt))'))^γ
``````
``````ERROR: MethodError: no method matching permutedims(::Float64)
``````

The first issue I notice is that you have a lot of non-constant global variables, and you’re creating even more non-constant globals inside your `while` loop. The very first Julia performance tip is “avoid global varaibles”, and we’re not kidding about that–it can have a huge impact on the performance of your code.

The first thing I would do is make all of your global parameters `const` and remove the `global L` and `global FoodWeb` annotations. The next thing I would do is pre-allocate those matrices outside of the loop and update them in-place, rather than constructing new ones every time.

4 Likes

(If you want to pass non-`const` data into a function, use function arguments. If you want to pass data out of a function, use the return value or, occasionally, mutable arguments.)

3 Likes

In fact, avoiding global variables isn’t just a Julia performance tip. In pretty much any programming language global variables are considered bad, and they should be avoided in general.

Could you give more details on the `@btime` error? I can’t reproduce using the code you posted above and running `@btime create_FoodWeb(BM, Ropt,γ)`

That said the suggestions given above help quite a bit with performance, the following slightly simplified version gives benchmark results between 1 and 15 seconds (trials are quite variable maybe due to the randomness in the `while` condition?)

I’ve added some comments as I’m not 100% sure my simplifications haven’t accidentally changed your algorithm, so do double check!

``````using Distributions, Random, BenchmarkTools

# note BM is not required as an input to this function
# Julia convention is to only capitalise types, not variable names
function create_foodWeb(Ropt, γ, no_plant, no_animal)

function create_L(bm, Ropt, γ, no_plant)
x = (bm ./ (bm' .* Ropt))'
L = @. ( x * exp(1.0 - x) )^γ
L[:, 1:no_plant] .= 0.0
L[L .<= 0.01] .= 0.0
return L # return is a keyword rather than a function
end

# preallocating arrays
foodWeb = zeros(no_plant + no_animal, no_plant + no_animal)
L = zeros(size(foodWeb)...)
bm = zeros(no_plant + no_animal)

condition = true
while condition
# assigning directly rather than creating new BM_A/BM_P arrays and vcating
bm[1:no_plant] = sort(rand(Uniform(10^0, 10^6), no_plant))
bm[no_plant+1:end] = sort(rand(Uniform(10^2, 10^12), no_animal))

L = create_L(bm, Ropt, γ, no_plant)
foodWeb .= 0.0 # setting existing array to zero
foodWeb[L .> rand()] .= 1.0 # note rand() already samples from Uniform(0,1)
# removed the x->x>0 as entries in foodWeb are either 0 or 1 anyway
condition = any(sum(foodWeb, dims=1) + sum(foodWeb, dims=2)' .== 0.0)
end

return bm, L, foodWeb
end

Ropt = 100
γ = 2
no_plant = 8
no_animal = 12

@benchmark create_foodWeb(Ropt, γ, no_plant, no_animal)
``````

Thank you for your replies. I have adopted the format @nilshg suggested and have been pre-allocating arrays - your example was very helpful to me compared to the manual.

@rdeits @stevengj I would like to make `L` and `FoodWeb`, once generated, become global constants, as they are used as parameters for functions further on. Is there a way to do this? `return const bm, L, foodWeb` does not work.

@nilshg Regarding the `@benchmark` error, I cannot recreate it either. Maybe it was just being strained by my very slow code.

I have a few structural suggestions for you. You probably don’t want L and FoodWeb as global constants, it might be better to pass the data around so your code is flexible. Using globals limits future modularity, and makes your code harder to think about in isolation as you keep accessing state from outside of the current scope.

You should make a struct that contains the arrays, instead of passing around separate arrays.

``````struct FoodWeb{D,L,B}
foodweb::D
l::L
bm::B
end
``````

But use better names for the last two fields :). Then all your data is lumped, and you can easily pass the struct around your program, and add new fields if you need them, of write a slightly different formulation and use multiple dispatch to trigger different methods.

Make `create_Foodweb `a constructor `FoodWeb() `that does the same things it currently does, but actually builds the struct as the last step and returns it.

1 Like