This struct breaks an interface

I didn’t know how to word the title, but this object makes Juno throw errors.

For context: I have some objects which are located all over the place (memory-wise). I want to query them and pass the results to Optim. Before that, I want to place the query results in vectors to get better memory locality, as Optim will be accessing the results over and over, probably sequentially. I don’t know the size of the results (I only know an upper bound at startup), and I don’t want to allocate these vectors every time I query these objects. So I created a struct called MaterializedStuff which on creation allocates these vectors with a size larger than the known bound, and defined a materialize!(ms::MaterializedStuff, stuff) function which collects the query results, writes them to the vectors, and stores the number of results. I never access the vectors directly; I only access views. This is OK because Optim will only be reading. Here’s my implementation:

mutable struct MaterializedStuff
    l::Int
    arr::Vector{Float64}
    MaterializedStuff(n::Int) = new(n, Vector{Float64}(undef, n))
end
Base.length(ms::MaterializedStuff) = getfield(ms, :l)
Base.getproperty(ms::MaterializedStuff, f::Symbol) = view(getfield(ms, f), 1:length(ms))

and the materialize! function is schematically

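function materialize!(ms::MaterializedStuff, stuff)
    arr = getfield(ms, :arr)    # raw buffer; ms.arr would return a view of the old length
    counter = 0                 # number of results written so far
    for x in stuff
        if condition(x)         # placeholder for the actual filter
            counter += 1
            arr[counter] = x    # placeholder for the actual write
        end
    end
    ms.l = counter              # store the runtime size once the loop is done
    return ms
end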

The two important points are that A) materialize! sets ms.l, which is only known at runtime (not to be confused with the n in MaterializedStuff(n::Int), which is the upper bound known at startup time), and B) getproperty returns a view with a size equal to ms.l.

I like this because as a user I can type ms.arr and be sure that it will have a variable size and won’t cause any allocations.

In Juno, with this code:

ms = MaterializedStuff(100)
ms.

The moment I type the last dot (.), I get a Juno error:

Julia Client – Internal Error

MethodError: no method matching view(::Int64, ::UnitRange{Int64})
Closest candidates are:
  view(!Matched::AbstractUnitRange, ::AbstractUnitRange{var"#s91"} where var"#s91"<:Integer) at subarray.jl:167
  view(!Matched::StepRange, ::AbstractRange{var"#s91"} where var"#s91"<:Integer) at subarray.jl:175
  view(!Matched::StepRangeLen, ::OrdinalRange{var"#s91",S} where S where var"#s91"<:Integer) at subarray.jl:179
  ...
getproperty(::MaterializedStuff, ::Symbol) at replayer.jl:143
completionreturntype(::FuzzyCompletions.PropertyCompletion) at completions.jl:269
completion(::Module, ::FuzzyCompletions.PropertyCompletion, ::String) at completions.jl:179
_broadcast_getindex_evalf at broadcast.jl:648 [inlined]
_broadcast_getindex at broadcast.jl:621 [inlined]
getindex at broadcast.jl:575 [inlined]
macro expansion at broadcast.jl:932 [inlined]
macro expansion at simdloop.jl:77 [inlined]
copyto! at broadcast.jl:931 [inlined]
copyto! at broadcast.jl:886 [inlined]
copy at broadcast.jl:862 [inlined]
materialize at broadcast.jl:837 [inlined]
fuzzycompletionadapter(::String, ::String, ::String, ::Int64, ::Int64, ::Bool) at completions.jl:158
(::Atom.var"#275#276")(::Dict{String,Any}) at completions.jl:54
handlemsg(::Dict{String,Any}, ::Dict{String,Any}) at comm.jl:169
(::Atom.var"#31#33"{Array{Any,1}})() at task.jl:356

As a user I never access ms.l, but I guess Juno’s struct inspection calls getproperty(ms, :l), which ends up calling view on an Int.

So my question is: What is the good way of implementing this idea?

The issue is this line:

Base.getproperty(ms::MaterializedStuff, f::Symbol) = view(getfield(ms, f), 1:length(ms))

If you try to access ms.l then your getproperty is called, and there is no view method for an integer. You need to check f: if it’s :l, just return the field; otherwise you can return the view you want.

This line in the stack trace:

getproperty(::MaterializedStuff, ::Symbol) at replayer.jl:143

Kind of hints that it’s not a Juno thing but your function crashing…
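
To check this without Juno, here is a minimal sketch that can be pasted into a plain REPL. It reuses the struct from the question, assuming the constructor allocates `n` slots; the `arr_view` and `err` names are only for illustration. Reading the vector field works, while reading `l` goes through the same `getproperty` and fails.

```julia
# Struct as in the question, assuming the buffer is allocated with n slots.
mutable struct MaterializedStuff
    l::Int
    arr::Vector{Float64}
    MaterializedStuff(n::Int) = new(n, Vector{Float64}(undef, n))
end
Base.length(ms::MaterializedStuff) = getfield(ms, :l)
Base.getproperty(ms::MaterializedStuff, f::Symbol) = view(getfield(ms, f), 1:length(ms))

ms = MaterializedStuff(100)

# Vector field: getproperty returns a view of the first `l` elements.
arr_view = ms.arr

# Int field: getproperty ends up calling view on an Int and throws.
err = try
    ms.l
    nothing
catch e
    e
end
println(typeof(err))
```

Anything that walks over the properties of `ms` (tab completion, for instance) would hit the same error as soon as it reaches `l`.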

Right, so my question was: how do I implement this idea in a way that doesn’t break anything?

function Base.getproperty(ms::MaterializedStuff, f::Symbol) 
    if f != :arr
        getfield(ms, f)
    else
        view(getfield(ms, f), 1:length(ms))
    end
end

That should work for what you want…I think.

Or maybe this. Not sure why I focused on the not-equal case first; it seems backward now that I think about it. Either one will work.

function Base.getproperty(ms::MaterializedStuff, f::Symbol) 
    if f == :arr
        view(getfield(ms, f), 1:length(ms))
    else
        getfield(ms, f)
    end
end

Hah, I thought we could come up with something more “elegant”. But yes, that definitely works 🙂 Thank you.

I’m not sure how you are using arr but if you wanted elegant I would probably go with:

Base.length(ms::MaterializedStuff) = ms.l
function Base.getindex(ms::MaterializedStuff, i::Int)
    # only the first ms.l entries hold valid data
    1 <= i <= ms.l || throw(BoundsError(ms, i))
    return ms.arr[i]
end
function Base.setindex!(ms::MaterializedStuff, v::Float64, i::Int)
    1 <= i <= ms.l || throw(BoundsError(ms, i))
    ms.arr[i] = v
end

That basically turns your MaterializedStuff into an array, which now has me wondering why you are redefining an array. Couldn’t you do something like:

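function materialize!(ms::Vector{Float64}, stuff, n)
    sizehint!(ms, n)            # n: the known upper bound (passed in here as an assumption)
    for x in stuff
        if condition(x)         # placeholder for the actual filter
            push!(ms, x)        # placeholder for the actual value to store
        end
    end
    return ms
end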

The sizehint! should keep the re-allocations down if that’s what you are trying to do.

The problem with turning ms into a vector is that the ms I showed you was just a sample. In my code, ms has like 10 vectors, all of the same size, and from the REPL I find it very comfortable to access them as ms.v1, ms.v2 etc.

The sizehint! suggestion might be a more proper way to do it though, thank you.
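
For the case with several vectors of the same size, one possible sketch is to have getproperty decide by the type of the field rather than by its name, so that every vector field comes back as a view and `l` comes back as-is. The struct name `MaterializedMany` and the fields `v1` and `v2` are made up for illustration; this is only one way to do it under those assumptions.

```julia
# Hypothetical struct with several equally sized buffers (names are illustrative).
mutable struct MaterializedMany
    l::Int                      # number of valid entries, set after each query
    v1::Vector{Float64}
    v2::Vector{Float64}
    MaterializedMany(n::Int) =
        new(0, Vector{Float64}(undef, n), Vector{Float64}(undef, n))
end
Base.length(ms::MaterializedMany) = getfield(ms, :l)

function Base.getproperty(ms::MaterializedMany, f::Symbol)
    x = getfield(ms, f)
    # Vector fields are exposed as views of the valid prefix;
    # anything else (here only :l) is returned unchanged.
    return x isa AbstractVector ? view(x, 1:getfield(ms, :l)) : x
end

ms = MaterializedMany(100)
getfield(ms, :v1)[1:3] .= (1.0, 2.0, 3.0)   # write to the raw buffer
ms.l = 3                                     # default setproperty! still applies
```

With this, `ms.v1` and `ms.v2` are views of length `ms.l`, and `ms.l` itself can be read, so nothing that iterates over the properties should trip on it.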

If you are not putting values into the array in order (say you need to jump ahead and add the 5th element before adding the 3rd and 4th), you might look into resize!(). You could resize!() up to the upper bound, put in the data, then resize down to the actual length. The only reason I can think of off the top of my head to manage your “own” vector is if you wanted to resize smaller, then increase the length without losing the values you had added before resizing smaller. (I think resize! would still work, but there is no guarantee that you wouldn’t lose the “extra” values.)
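
A minimal sketch of that resize! pattern, with made-up values: `n`, `used`, and `buf` are assumptions for illustration, not names from the code above.

```julia
n = 10                  # assumed upper bound on the number of results
buf = Float64[]

resize!(buf, n)         # grow to the upper bound; new slots are uninitialized
buf[3] = 30.0           # slots can now be written in any order
buf[1] = 10.0
buf[2] = 20.0

used = 3                # assumed number of valid entries after the query
resize!(buf, used)      # shrink; the first `used` elements are retained
```

After the last call `buf` holds only the three values that were written. Whether the remaining slots survive a later resize! back up is exactly the part that is not guaranteed.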

Depending on how your code is structured, it might be worth looking at StructArrays or ComponentArrays. They solve a similar problem and might make things a little simpler.
