Overcoming performance hit of accessing struct field

Hi all,

I’m working on a Monte Carlo simulator for an Ising spin model. I have been using a custom struct holding a few arrays and dicts. As an example, one of the arrays holds 512^2 Int8s, which represent the state of every point in a square lattice. I’m looking into expanding my simulation with more general properties, and I was considering MetaGraphs for that. I was expecting a slight performance hit, since during the simulation I’m for the most part just reading out some arrays, and accessing an array field of a struct with an integer index seems as simple as it gets. However, during benchmarking I found that accessing the value in the state array of my struct g with g.state[idx] takes about 30ns according to @btime, while implementing something minimal in MetaGraphs and accessing it takes about 12ns. This is pretty significant since I’m accessing these values a few hundred thousand times every 60th of a second.

If I make a ref of the state vector and access it

stateref = g.state
@btime stateref[idx]

I get around 11ns now. Looking at the source code of MetaGraphs, it seems to use a struct internally, and getting a property seems to access some array or dict stored as a field of this struct as well.

How do I optimise accessing struct fields?

How is your struct defined?

Very simply just

struct Graph
    state::Vector{Int8}
    ...
end

Actually, I just tried accessing the array through a function.

function accessState(g, idx)
    g.state[idx]
end

@btime accessState(g, idx)

and this reduces the time to 12ns again. So this is maybe only a problem when accessing the value directly through the REPL? Why is this?

Because when you run it from the REPL, the getproperty call has to be resolved first. That is, Julia needs to figure out which field .state refers to, just from the symbol :state. Inside a function, this is resolved at compile time.
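To make the difference concrete, here is a minimal sketch, assuming BenchmarkTools is installed. The struct name `Lattice` and the helper `access_state` are hypothetical stand-ins for the types in the question, not anything from MetaGraphs. It times the same field access three ways: through non-const globals, with the globals interpolated into the benchmark using `$`, and through a function.

```julia
using BenchmarkTools

# Hypothetical stand-in for the struct from the question.
struct Lattice
    state::Vector{Int8}
end

g = Lattice(rand(Int8[-1, 1], 512^2))
idx = 100_000

# g and idx are non-const globals, so the benchmarked expression cannot
# assume their types and has to look them up dynamically on every call.
@btime g.state[idx]

# Interpolating with $ passes the values in as if they were local
# variables, so the field access and indexing are compiled for concrete types.
@btime $g.state[$idx]

# Wrapping the access in a function has a similar effect: inside the
# function body the argument types are known at compile time.
access_state(g, idx) = g.state[idx]
@btime access_state($g, $idx)
```

Exact timings depend on the machine, so none are shown here. In the real simulation the lookups presumably happen inside functions anyway, so the slow first variant is likely to be an artifact of benchmarking at global scope rather than a cost of the struct itself.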


Also note that MetaGraphs.jl is not meant to be type-stable, which can slow down field access. You may be able to achieve better performance with its successor MetaGraphsNext.jl

I was looking at that, but even though MetaGraphsNext is advertised as a type stable version of MetaGraphs, a lot of functionality seems to be missing from it.

As one of the maintainers, it pains me to read that :) @f.ij if you feel something specific is missing, perhaps you could open an issue so we can work on adding it?


Thanks for your reply. I’m sorry, I didn’t mean any disrespect. I only took a quick look at the docs and did some very quick testing, so take that as you will. I think I was mostly just envisioning MetaGraphs with type annotations, whether or not that makes sense to implement. MGN seems more limiting to me than MetaGraphs, since from what I understand you’re limited to a single type of property for every edge, as opposed to MG, where you can just set_prop! an arbitrary number of properties. I guess you can define a tuple with all the types you need, but this then needs to be defined for every node in the graph. I guess this is indeed a way to achieve type stability, and it will probably be faster than MG, but it would change a lot about how I would work with it. Moreover, I got an error message when trying to initialise a MetaGraph from a non-empty grid graph from Graphs, which does work in MG. From the way MGN is set up I can kind of see why this isn’t the right way to initialise a graph in the package, but I was expecting something that would work more like the normal MG package.

Again, nothing really wrong with the package, I suppose, but from the way it was advertised, I guess I was just expecting something that works more like MetaGraphs.

Indeed, MetaGraphsNext requires you to specify the type of vertex / edge metadata in advance, and I guess most people use either custom structs or named tuples for that.
If you need to dynamically add an arbitrary number of properties, possibly with different types, then you cannot guarantee stability. In that case, and given that you are not willing to change your current workflow, it may well be that MetaGraphs suits your needs better. It is not uncommon that achieving better performance requires some amount of refactoring, and it’s perfectly okay that you don’t want to invest in it.
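As a rough illustration of that trade-off, here is a sketch in plain Julia that uses neither package. The container and function names are made up for the example, and the property names `spin` and `field` are assumptions. It compares metadata stored in a `Dict{Symbol,Any}` per vertex (arbitrary properties, MetaGraphs-style) with metadata stored as a named tuple of fixed types, and asks the compiler which return type it can infer for each accessor.

```julia
# Hypothetical per-vertex metadata, for illustration only.
# Dynamic: any property of any type can be added later.
dynamic_props = [Dict{Symbol,Any}(:spin => Int8(1), :field => 0.5) for _ in 1:4]
# Typed: the set of properties and their types are fixed up front.
typed_props = [(spin = Int8(1), field = 0.5) for _ in 1:4]

get_spin_dynamic(props, i) = props[i][:spin]
get_spin_typed(props, i) = props[i].spin

# Ask the compiler what it can infer about the return type of each accessor.
dynamic_ret = only(Base.return_types(get_spin_dynamic, (typeof(dynamic_props), Int)))
typed_ret = only(Base.return_types(get_spin_typed, (typeof(typed_props), Int)))

println(dynamic_ret)  # Any
println(typed_ret)    # Int8
```

With the `Dict{Symbol,Any}` container, every caller has to handle a value of unknown type at run time, whereas the named tuple version lets the compiler specialise on `Int8`. That is the kind of guarantee a fixed metadata type buys, at the cost of declaring the properties in advance.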

As for the error you encountered during initialization, I am ready to take a look at it and see how we can improve constructors for MetaGraphsNext. Can you whip up a Minimum Working Example and open an issue on the repo?

I understand what you’re saying, and that makes sense.

Also, sorry, it wasn’t actually an error message but a warning message. I’m guessing this is intended behaviour, since MGN uses symbols as vertex indices. Doing the following

using Graphs, MetaGraphsNext
g = Grid([N,N])
mg = MetaGraph(g, VertexData = Symbol, EdgeData = Symbol)

┌ Warning: Constructing a MetaGraph with a nonempty underlying graph is not advised.
└ @ MetaGraphsNext ~/.julia/packages/MetaGraphsNext/BYnZO/src/metagraph.jl:55
Meta graph based on a {102, 180} undirected simple Int64 graph with vertex labels of type Symbol, vertex metadata of type Symbol, edge metadata of type Symbol, graph metadata given by nothing, and default weight 1.0

gives a metagraph that doesn’t really seem to be useful to work with. I think a constructor that automatically fills the MetaGraph with vertices and edges based on the underlying graph topology would be a nice addition. Since I assume this is intended behaviour as of now, I’m not sure whether I should open an issue or not. Let me know. I did find that if you try to make a node indexed by an Int, it gives an error, but at the same time adds a node to the graph that cannot be accessed. I’m guessing this is actually unintended behaviour?

That seems long:

julia> accessState = rand(Int8, 512^2);

julia> ind = 100_000;

julia> @btime accessState[ind];
  17.836 ns (0 allocations: 0 bytes)

# with variable interpolation
julia> @btime $accessState[$ind];
  1.200 ns (0 allocations: 0 bytes)

Actually, MGN can use anything except Ints as vertex indices, Symbol is just the default choice.

As for the constructor, you used the one where you only specify the metadata types, which is mainly useful for creating empty graphs.
However, if you want to start from a pre-existing graph, you also have to give the values of the metadata, using the full constructor. I think the shortcut constructor should throw an error and not just a warning in your case, since it leaves metadata effectively un-initialized.

Regarding your last remark, can you provide a piece of code?