# Vertex neighbors in LightGraphs.jl

I am using LightGraphs.jl to implement a fast SIR algorithm, which I have done. However, I am now trying to improve the timings. In the course of investigating, I find that `neighbors(g, v)` allocates memory (I used `@time` to determine this). So I went into the `LightGraphs.jl` source code.

In method `SimpleGraphsCore//SimpleGraphsCore.jl`, I find that

``````outneighbors(g::AbstractSimpleGraph, v::Integer) = fadj(g, v)
``````

and that the `neighbors` function is really a call to `outneighbors`.

So my question: when does the memory allocation occur? The adjacency matrix is obviously preallocated unless `fadj` is a function. Ok. In the same file, I find that

``````fadj(g::AbstractSimpleGraph) = g.fadjlist
``````

It is still not clear to me where the memory allocation occurs. Given the presence of `pop!(g.fadjlist)` in the same file, it appears that we are dealing with a linked list or similar structure.

In `SimpleGraphsCore/simpledigraph.jl`, I found functions

``````function SimpleDiGraph{T}(n::Integer=0) where {T<:Integer}
fadjlist = [Vector{T}() for _ = one(T):n]
badjlist = [Vector{T}() for _ = one(T):n]
end

This routine does not preallocate space for neighbors, but rather prepares a list unfilled vectors "just in case" neighbors are called for.

So when using `push!` On `fadjlist`, I assume allocation is made either for an integer or a reference to a structure. There are two costs: 1) the cost of performing the allocation, and 2) the cost of transferring data to the allocated memory.

I also found the longer routine in `SimpleGraphsCore/simpledigraph.jl`, but I am doubtful it is called when wanting to just give the neighbors of a given vertex:
``````

function all_neighbors(g::SimpleDiGraph{T}, u::Integer) where {T}
union_nbrs = Vector{T}()
i, j = 1, 1
in_nbrs, out_nbrs = inneighbors(g, u), outneighbors(g, u)
in_len, out_len = length(in_nbrs), length(out_nbrs)
while i <= in_len && j <= out_len
if in_nbrs[i] < out_nbrs[j]
push!(union_nbrs, in_nbrs[i])
i += 1
elseif in_nbrs[i] > out_nbrs[j]
push!(union_nbrs, out_nbrs[j])
j += 1
else
push!(union_nbrs, in_nbrs[i])
i += 1
j += 1
end
end
while i <= in_len
push!(union_nbrs, in_nbrs[i])
i += 1
end
while j <= out_len
push!(union_nbrs, out_nbrs[j])
j += 1
end
return union_nbrs
end

``````

I guess my question is: if the fadjlist is computed regardless of whether I invoke the `neighbors(g,i)` function, where is the memory allocation performed. I am still confused. Thanks.

Gordon``````
1 Like

Related to the previous message, I have the following code fragment:

``````	@time for ν in neighbors(G, node_u.index)
@time findTransSIR(Q, t, τ, node_u, nodes[ν], t_max, nodes, expo_τ)
println("after findTransSIR (within neighbor loop)\n")
end
``````

and results:

``````  0.000001 seconds (1 allocation: 112 bytes)
after findTransSIR (within neighbor loop)

.....

0.000001 seconds (1 allocation: 112 bytes)
after findTransSIR (within neighbor loop)

0.000512 seconds (224 allocations: 11.062 KiB)
after neighbors
``````

From this, I deduce that most of the allocations and memory allocate occurs in the statement

``````	@time for ν in neighbors(G, node_u.index)
``````

However, separate tests indicate that `neighbors(G, i)` does not allocate anything:

``````@time for i in nv(G)
neighbors(G, i)
end
0.000001 seconds
``````

Yes, it is a small graph:

``````julia> G
{50000, 250677} undirected simple Int64 graph
``````

Here is another experiment using a mutable structure (this is used in my code):

``````mutable struct myNode
index::Int  # :S, :I, :R
status::Symbol  # :S, :I, :R
pred_inf_time::Float64
rec_time::Float64
myNode(index::Int, status::Symbol, pred_inf_time::Float64, rec_time::Float64) =
new(index, status, pred_inf_time, rec_time)
end

nodes = Array{myNode,1}(undef, nv(G))

for u in 1:nv(G)  # extremely fast
nodes[u] = myNode(u, :S, 0.8, 0.9)
end

@time for node in nodes
neighbors(G, node.index)
end
``````

Notice the creation of a mutable structure. The code produces 3 Mbytes of allocations. I would like to reduce this to zero. Are these allocations the result of `mutable`? Why would that be? I can understand why mutable would create inefficiencies, but why would it engender storage allocation? Thanks.

Gordon

`neighbors` shouldn’t allocate for `SimpleGraphs`:

``````julia> g = Graph(100_000, 10_000_000)
{100000, 10000000} undirected simple Int64 graph

julia> @benchmark neighbors(g, 1)
BenchmarkTools.Trial:
memory estimate:  0 bytes
allocs estimate:  0
--------------
minimum time:     14.221 ns (0.00% GC)
median time:      15.052 ns (0.00% GC)
mean time:        15.097 ns (0.00% GC)
maximum time:     44.687 ns (0.00% GC)
--------------
samples:          10000
evals/sample:     999
``````

I used `@time` to determine this

Don’t use `@time` to benchmark. It’s likely you’ve run into allocations during first-run compilation:

``````julia> @time neighbors(g, 1)
0.005504 seconds (3.12 k allocations: 182.458 KiB)

julia> @time neighbors(g, 1)
0.000007 seconds
``````

I always run @time 2-3 times. It is much faster than @btime.

It’s faster because it does a single sample, unlike `@benchmark` and `@btime` which do multiple samples and compute relevant statistics so you can eliminate outliers.

Also, you don’t have to worry about being bitten by the first-time compilation allocations/timings if you use the proper tools.

How do you propose doing that when you’re heap-allocating memory for all the structs via `nodes`? You’re allocating a minimum of 32 bytes per `myNode` struct.

Benchmarking in global scope with global variables will always allocate. See the performance tips in the manual.

I do not know yet. I will start by para.etrizing my structures. If my nodes are on the heap, in the best of circumstances, there should be no allo ations hust to access Node properties. If it is not possi le, i might creat arrays of fliats and i ts and forevo structures.

Oh, I agree with you (perhaps I misunderstood your goal): accessing the data should not result in allocations. Creating the struct/vector that holds the data will necessarily allocate, as will any copying of the data to new variables.

Where are you seeing the allocations that you don’t want? It’s not clear to me from the code.