Vector of pointers to struct array elements

Hi, I’m trying to avoid multiple vector allocations to speed up the weighted sampling of a vector of structs:

using StatsBase
function chooseBox(room::Vector{Box})
    v = [x.k for x in room]
    sample(room, Weights(v))
end

The mutable struct Box contains a field k::Int which changes during the simulation.
If I were in C I would use a vector of pointers initialized at the first function call, something like
v=[ &x.k for x in room] (sorry for the language mixing)

I’ve browsed around here looking for something about pointers and did not find what I need.
Probably there is a Julia-like elegant solution but I do not get it.
Thanks in advance for any hint :slight_smile:

1 Like

Hello and welcome to the community!
You could probably use GitHub - JuliaArrays/MappedArrays.jl: Lazy in-place transformations of arrays for this, where the map would be a function that gets the field k. That way, you never have to allocate any new vector. You may pay a slight performance hit for doing so though, so it’s worth benchmarking.

2 Likes

I think that defining a vector of pointers like that (here or in C) won’t give you any benefit if Box contains only one integer element k, or very few immutable elements.

But you could define your own struct to handle a pointer to the whole vector, like this:

julia> struct Box
         k::Int
       end

julia> struct Room{T} <: AbstractVector{T}
         v::Vector{Box}
       end
       Base.getindex(room::Room{Int},i) = room.v[i].k
       Base.size(room::Room{Int}) = size(room.v)
       Base.length(room::Room{Int}) = length(room.v)

julia> room = [ Box(rand(Int)) for i in 1:1000 ];

julia> using StatsBase, BenchmarkTools

julia> function chooseBox(room::Vector{Box})
           v = [x.k for x in room]
           sample(room, Weights(v))
       end
chooseBox (generic function with 1 method)

julia> function chooseBox2(room::Vector{Box})
           v = Room{Int}(room)
           sample(room, Weights(v))
       end
chooseBox2 (generic function with 1 method)

julia> @btime chooseBox($room)
  525.382 ns (2 allocations: 7.97 KiB)
Box(-4332088731253364447)

julia> @btime chooseBox2($room)
  773.944 ns (1 allocation: 32 bytes)
Box(-4332088731253364447)

julia> @btime chooseBox2($room)
  331.156 ns (1 allocation: 32 bytes)
Box(-4332088731253364447)

Note that chooseBox2 doesn’t allocate almost anything, probably only a reference to the original vector. But for some reason I don’t understand it takes either less or more than time than the first one (770 or 330 ns).

1 Like

Thank you. Sorry, I edited the initial post adding that the struct Box is mutable and in fact, the values of k change in time (rich-get-richer mechanism). If k would be constant I would fill the vector with the values of k once at the first instance of the function.

1 Like

You don’t to have a mutable struct to modify the array contents. In @lmiq’s example, if you have a v::Vector{Box}, you can still do v[i] = Box(newk), for example.

2 Likes

Thank you @lmiq ,
This is what I meant with “elegant” solution. Probably it’s the sample function that creates a copy of the vector, which would be a little weird imho. I need to look into it.

No It does not. That option does only one allocation of 32 bytes, which is likely the pointer to the room vector in the new Room struct. Nothing else.

edit: actually the allocation there is on the Weights function:

julia> v = rand(1000);

julia> @btime Weights($v);
  74.145 ns (1 allocation: 32 bytes)

1 Like