# Pass return of recursive function up the call stack?

Hello there, this is sort of a follow up to the question I posted yesterday about improving memory allocation in a simulation. You can find that question here for any context.

Following the advice @DNF, I decided to implement pre-allocated arrays and return a copy of the array with the updated simulation.

Now I am adding layers of complexity. I’m really working on generating training data for a Temporal Difference algorithm using random walks, but this example shows generating a random number until we reach an even number and returning that sequence.

Now I would like to generate 2 episodes 5 times such that my training data becomes a nested array typed as `Array{Array{Array{Int64}}}(undef, 5)`.

The size of 2nd array should be 2, and the inner most array is technically “unknown” in size. But I wish to limit it to 10 or less.

Here is my code example:

``````function gen_array!(cache)
for i in 2:length(cache)
num = rand(eltype(cache))
cache[i] = num
iseven(num) && return cache[1:i]
end
gen_array!(cache)
end

function episode!(episode, sim)

for i in 1:length(episode)
episode[i] = gen_array!(sim)
end
return episode
end

function simulate!(train, E)
START = Array{Int64}(undef, 10)
START[1] = 1333

EPISODES = Array{Array{Int64}}(undef, E)

for i in 1:length(train)
train[i] = episode!(EPISODES, START)
end
end

function main()
N::Int64 = 5
E::Int64 = 2

train = Array{Array{Array{Int64}}}(undef, N)
simulate!(train, E)
train
end
``````

The general idea is that we preallocate train and initialize simulation. We then start to fill our episodes through `gen_array!()`.

You’ll notice this is a recursive function.

My reasoning here is, say we have hit the size limit of the array and still have not generated an even number. I don’t want to grow the size of the vector, but restart the function and write over the array until we reach a sequence that generates a random number.

I thought this was working until I closely inspected the output

``````include("example.jl")
main (generic function with 1 method)

julia> main()
5-element Array{Array{Array{Int64,N} where N,N} where N,1}:
[[1333, 5175378229263772070], [1333, -7110769136095499899, 2890408208590898501, -2599245076049037924]]
[[1333, 5175378229263772070], [1333, -7110769136095499899, 2890408208590898501, -2599245076049037924]]
[[1333, 5175378229263772070], [1333, -7110769136095499899, 2890408208590898501, -2599245076049037924]]
[[1333, 5175378229263772070], [1333, -7110769136095499899, 2890408208590898501, -2599245076049037924]]
[[1333, 5175378229263772070], [1333, -7110769136095499899, 2890408208590898501, -2599245076049037924]]
``````

Each sequence is different, but every episode is the same. I think this has something to do with the function being recursive. That it gets so deep in the stack that once it hits the return it passes it back to the previous function call and it disappears or something.

Any ideas?

1 Like

You are allocating one array `START` and then manipulating it over and over again. Each `episode` contains a pointer to the same memory! You either need to copy start before you do something or instead of working with array of arrays use a multi-dimensional array from the getgo.

1 Like

I was thinking about multidimensional array, but doesn’t reach row need to have the same amount of elements?

Could I do something like `train = Array{Int64}(undef, at most 10, 2, 5)`?

Also doesn’t `cache[1:i]` make a copy of start and return that? So it should be changing the values of start and returning a copy if it hits an even number, if it doesn’t it just rewrites over start until it does?

So, since you asked for performance. Julia `Array`s are actually pretty heavy, and aren’t true julia objects; instead they are implemented in `array.c` and `julia.h` and then bolted on. This means that tiny arrays are no good: Almost all of the time and memory is spent on overhead (`Array` is designed for very large chunks of memory).

Presumably generation time doesn’t really matter, and most of your time will be spent on training. So it would probably pay to get creative with your data layout. Just to give an example,

``````struct RunContainer{T} <: AbstractMatrix{SubArray{T,1,Array{T,1},Tuple{UnitRange{Int64}},true}}
contents::Vector{T}
colptr::Vector{Int}
end

RunContainer{T}() where T = RunContainer(T[], [1])

Base.size(rc::RunContainer) = (2, (length(rc.colptr)-1)>>1)

Base.@propagate_inbounds function Base.getindex(rc::RunContainer, i, j)
@boundscheck checkbounds(rc, i, j)
from = rc.colptr[2*(j-1) + i]
to = rc.colptr[2*(j-1) + i + 1]
return view(rc.contents, from:(to-1))
end

function append_rc!(rc::RunContainer, run1, run2)
append!(rc.contents, run1)
push!(rc.colptr, length(rc.contents)+1)
append!(rc.contents, run2)
push!(rc.colptr, length(rc.contents)+1)
return rc
end
``````

This would be used like

``````julia> rc=RunContainer{Float64}(); append_rc!(rc, [1,2,3], [4,5,6, 7]); append_rc!(rc, [4.5], [12, 1.0]); append_rc!(rc, [1], [2]); rc
2×3 RunContainer{Float64}:
[1.0, 2.0, 3.0]       [4.5]        [1.0]
[4.0, 5.0, 6.0, 7.0]  [12.0, 1.0]  [2.0]
``````

This kind of layout is pretty good if the `view`s don’t get allocated, i.e. if your `getindex` is close to the actual use, which is hopefully inlined into the same context.

1 Like

Does this example utilize the way Julia prefers to index matrices? By column? I’m used to iterating row by row.

Would I just generate the first episode for every set and then generate the 2nd episode for every set?

The optimal iteration order is the same as the storage order, i.e. `rc[1, j][k]` then `rc[1, j][k+1]`, …, `rc[2, j][1]`, …, `rc[1, j+1][1]`, …

If you only want to generate one run at a time, you can simply `append(rc.contents, run); push!(rc.colptr, length(rc.contents)+1);`, and every second run you append will open a new pair of runs. These pairs are e.g. accessible by `view(rc, :, j)` or by `(rc[1, j], rc[2, j])`. They are not accessible by `rc[:, i]` because I did not implement `getindex(rc::RunContainer, ::Colon, j)`.

Nobody prevents you from implementing e.g.

``````Base.@propagate_inbounds function Base.getindex(rc::RunContainer, k, i, j)
@boundscheck checkbounds(rc, i, j)
from = rc.colptr[2*(j-1) + i]
to = rc.colptr[2*(j-1) + i + 1]
@boundscheck checkbounds(from:to, k)
return rc.contents[from + k - 1]
end
``````

such that the optimal memory order looks more julian (`rc[k, i, j]` is the same as `rc[i, j][k]`). This might confuse some readers of your code, though, because `rc[k, i, j]` is idiomatically used on `<:AbstractArray{T, 3} where T` instead of `<:AbstractMatrix`.

It is your choice what kind of API you want to use, as long as you provide whatever your later processing steps need.

1 Like