Pass return of recursive function up the call stack?

Hello there, this is sort of a follow-up to the question I posted yesterday about improving memory allocation in a simulation. You can find that question here for context.

Following the advice of @DNF, I decided to implement pre-allocated arrays and return a copy of the array with the updated simulation.

Now I am adding layers of complexity. I’m really working on generating training data for a Temporal Difference algorithm using random walks, but this example shows generating a random number until we reach an even number and returning that sequence.

Now I would like to generate 2 episodes 5 times such that my training data becomes a nested array typed as Array{Array{Array{Int64}}}(undef, 5).

The size of the 2nd array should be 2, and the innermost array is technically “unknown” in size, but I wish to limit it to 10 or fewer elements.

Here is my code example:

function gen_array!(cache)
    for i in 2:length(cache)
        num = rand(eltype(cache))
        cache[i] = num
        iseven(num) && return cache[1:i]
    end
    gen_array!(cache)
end

function episode!(episode, sim)

    for i in 1:length(episode)
        episode[i] = gen_array!(sim)
    end
    return episode
end


function simulate!(train, E)
    START = Array{Int64}(undef, 10)
    START[1] = 1333

    EPISODES = Array{Array{Int64}}(undef, E)

    for i in 1:length(train)
        train[i] = episode!(EPISODES, START)
    end
end

function main()
    N::Int64 = 5
    E::Int64 = 2

    train = Array{Array{Array{Int64}}}(undef, N)
    simulate!(train, E)
    train
end

The general idea is that we preallocate train and initialize simulation. We then start to fill our episodes through gen_array!().

You’ll notice this is a recursive function.

My reasoning here is: say we have hit the size limit of the array and still have not generated an even number. I don’t want to grow the vector, but instead restart the function and write over the array until we reach a sequence that ends in an even number.
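For what it’s worth, the same restart-until-even logic can also be written as a plain loop instead of recursion, which avoids any call-stack depth concerns (`gen_array_loop!` is just an illustrative name for this sketch):

```julia
# Iterative equivalent of gen_array!: overwrite cache until an even
# number appears within the size limit, then return a copy of the prefix.
function gen_array_loop!(cache)
    while true
        for i in 2:length(cache)
            num = rand(eltype(cache))
            cache[i] = num
            # cache[1:i] allocates a fresh copy of the prefix
            iseven(num) && return cache[1:i]
        end
        # no even number within the limit: fall through and overwrite again
    end
end
```
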

I thought this was working until I closely inspected the output:

julia> include("example.jl")
main (generic function with 1 method)

julia> main()
5-element Array{Array{Array{Int64,N} where N,N} where N,1}:
 [[1333, 5175378229263772070], [1333, -7110769136095499899, 2890408208590898501, -2599245076049037924]]
 [[1333, 5175378229263772070], [1333, -7110769136095499899, 2890408208590898501, -2599245076049037924]]
 [[1333, 5175378229263772070], [1333, -7110769136095499899, 2890408208590898501, -2599245076049037924]]
 [[1333, 5175378229263772070], [1333, -7110769136095499899, 2890408208590898501, -2599245076049037924]]
 [[1333, 5175378229263772070], [1333, -7110769136095499899, 2890408208590898501, -2599245076049037924]]

Each sequence is different, but every episode is the same. I think this has something to do with the function being recursive: that it gets so deep in the call stack that, once it hits the return, it passes the result back to the previous call and it gets lost somewhere along the way.

Any ideas?


You are allocating one array START and then manipulating it over and over again. Each episode contains a pointer to the same memory! You either need to copy START before you do something with it, or work with a multi-dimensional array from the get-go instead of an array of arrays.
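To make the aliasing point concrete: in the original code the same `EPISODES` buffer is stored into every `train[i]`, so all entries alias one array. A minimal sketch of one fix along these lines, allocating a fresh episodes array per training entry (`simulate_copies!` is a hypothetical name; `gen_array!` is the function from the question):

```julia
# gen_array! as in the original question.
function gen_array!(cache)
    for i in 2:length(cache)
        num = rand(eltype(cache))
        cache[i] = num
        iseven(num) && return cache[1:i]  # slicing already copies
    end
    gen_array!(cache)
end

# Sketch: give each train entry its own episodes array instead of
# reusing and overwriting one shared buffer.
function simulate_copies!(train, E)
    START = Array{Int64}(undef, 10)
    START[1] = 1333
    for i in 1:length(train)
        episodes = Array{Array{Int64}}(undef, E)  # fresh array per entry
        for j in 1:E
            episodes[j] = gen_array!(START)
        end
        train[i] = episodes
    end
    return train
end
```

With this layout no two `train` entries share memory, so the episodes come out distinct.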


I was thinking about a multidimensional array, but doesn’t each row need to have the same number of elements?

Could I do something like train = Array{Int64}(undef, at most 10, 2, 5)?

Also, doesn’t cache[1:i] make a copy of START and return that? So it should be changing the values of START and returning a copy when it hits an even number; if it doesn’t, it just writes over START until it does?
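A quick check seems to confirm that slicing does copy, and that the aliasing happens at the outer level when the same array object is stored into several slots:

```julia
# Slicing copies: mutating the source afterwards does not affect the slice.
a = [1333, 2, 3, 4]
b = a[1:2]            # independent copy of the first two elements
a[1] = 0
@assert b[1] == 1333  # slice unaffected by the later mutation

# But storing one array object into several slots aliases it.
outer = Vector{Vector{Int}}(undef, 5)
shared = [1, 2, 3]
for i in 1:5
    outer[i] = shared  # every entry points at the *same* array
end
@assert outer[1] === outer[5]
```
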

So, since you asked about performance: Julia Arrays are actually pretty heavy, and aren’t true Julia objects; instead they are implemented in array.c and julia.h and then bolted on. This means that tiny arrays are no good: almost all of the time and memory is spent on overhead (Array is designed for very large chunks of memory).

Presumably generation time doesn’t really matter, and most of your time will be spent on training. So it would probably pay to get creative with your data layout. Just to give an example,

struct RunContainer{T} <: AbstractMatrix{SubArray{T,1,Array{T,1},Tuple{UnitRange{Int64}},true}}
    contents::Vector{T}
    colptr::Vector{Int}
end

RunContainer{T}() where T = RunContainer(T[], [1])

Base.size(rc::RunContainer) = (2, (length(rc.colptr)-1)>>1)

Base.@propagate_inbounds function Base.getindex(rc::RunContainer, i, j)
    @boundscheck checkbounds(rc, i, j)
    from = rc.colptr[2*(j-1) + i]
    to = rc.colptr[2*(j-1) + i + 1]
    return view(rc.contents, from:(to-1))
end

function append_rc!(rc::RunContainer, run1, run2)
    append!(rc.contents, run1)
    push!(rc.colptr, length(rc.contents)+1)
    append!(rc.contents, run2)
    push!(rc.colptr, length(rc.contents)+1)
    return rc
end

This would be used like

julia> rc=RunContainer{Float64}(); append_rc!(rc, [1,2,3], [4,5,6, 7]); append_rc!(rc, [4.5], [12, 1.0]); append_rc!(rc, [1], [2]); rc
2×3 RunContainer{Float64}:
 [1.0, 2.0, 3.0]       [4.5]        [1.0]
 [4.0, 5.0, 6.0, 7.0]  [12.0, 1.0]  [2.0]

This kind of layout is pretty good if the views don’t get allocated, i.e. if your getindex is close to the actual use, which is hopefully inlined into the same context.


Does this example utilize the way Julia prefers to index matrices, i.e. by column? I’m used to iterating row by row.

Would I just generate the first episode for every set and then generate the 2nd episode for every set?

The optimal iteration order is the same as the storage order, i.e. rc[1, j][k] then rc[1, j][k+1], …, rc[2, j][1], …, rc[1, j+1][1], …
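Concretely, that order falls out of a column-major triple loop. A sketch that works for any matrix whose entries are runs, including the RunContainer above (`total_in_storage_order` is just an illustrative name):

```julia
# Walk a matrix-of-runs in storage order: column by column,
# run 1 then run 2 within each column, elements within a run in order.
function total_in_storage_order(m)
    s = 0.0
    for j in 1:size(m, 2)      # columns: contiguous pairs of runs
        for i in 1:size(m, 1)  # run 1, then run 2
            for x in m[i, j]   # elements within a run are contiguous
                s += x
            end
        end
    end
    return s
end
```

For the RunContainer this visits rc.contents front to back with no jumps.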

If you only want to generate one run at a time, you can simply append!(rc.contents, run); push!(rc.colptr, length(rc.contents)+1);, and every second run you append will open a new pair of runs. These pairs are e.g. accessible by view(rc, :, j) or by (rc[1, j], rc[2, j]). They are not accessible by rc[:, j] because I did not implement getindex(rc::RunContainer, ::Colon, j).
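That one-run-at-a-time pattern could be wrapped in a small helper. A sketch over plain vectors, mirroring the contents/colptr fields (`push_run!` is an illustrative name):

```julia
# Append one run to the flat storage and record where it ends.
# Run j afterwards lives at contents[colptr[j]:colptr[j+1]-1].
function push_run!(contents::Vector, colptr::Vector{Int}, run)
    append!(contents, run)
    push!(colptr, length(contents) + 1)
    return contents
end
```
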

Nobody prevents you from implementing e.g.

Base.@propagate_inbounds function Base.getindex(rc::RunContainer, k, i, j)
    @boundscheck checkbounds(rc, i, j)
    from = rc.colptr[2*(j-1) + i]
    to = rc.colptr[2*(j-1) + i + 1]
    @boundscheck checkbounds(from:to, k)
    return rc.contents[from + k - 1]
end

such that the optimal memory order looks more julian (rc[k, i, j] is the same as rc[i, j][k]). This might confuse some readers of your code, though, because rc[k, i, j] is idiomatically used on <:AbstractArray{T, 3} where T instead of <:AbstractMatrix.

It is your choice what kind of API you want to use, as long as you provide whatever your later processing steps need.
