Trying to identify possible optimizations (or errors) in a graph algorithm

lightgraphs

#21

So, would that be a second graph representation in LightGraphs.jl? Or would that go into a separate package like StaticGraphs.jl?

Is there a Matrix{Bool}-based implementation of the LightGraphs.jl interface already?

Next thing: One of the cool features of bitmapped graphs is lightweight (lazy) views of induced subgraphs, via masking.

As far as I see, LightGraphs.jl does not define an interface for lazy views of induced subgraphs. The easiest way would be to define a lazy view via an AbstractArray{Bool}, and then have fadj return a struct containing both the mask and the original adjacency list, as a custom iterator.
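The mask-plus-adjacency-list idea might look roughly like this (a hedged sketch; `MaskedNeighbors` is a hypothetical name, not anything in LightGraphs):

```julia
# Hypothetical sketch: a lazy view of one vertex's neighbors under a Bool
# mask, exposing only iteration (and an eltype) -- no random access.
struct MaskedNeighbors{V<:AbstractVector{Int}, M<:AbstractVector{Bool}}
    neighbors::V   # the original adjacency list of the vertex
    mask::M        # mask[v] == true means v is kept in the induced subgraph
end

# Length is not known up front without scanning the mask.
Base.IteratorSize(::Type{<:MaskedNeighbors}) = Base.SizeUnknown()
Base.eltype(::Type{<:MaskedNeighbors{V}}) where {V} = eltype(V)

function Base.iterate(mn::MaskedNeighbors, i=1)
    while i <= length(mn.neighbors)
        v = mn.neighbors[i]
        mn.mask[v] && return (v, i + 1)   # yield only unmasked neighbors
        i += 1
    end
    return nothing
end
```

For example, `collect(MaskedNeighbors([2, 3, 5], Bool[true, false, true, true, true]))` skips vertex 2 and yields `[3, 5]`.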

Does the LightGraphs.jl interface expect fadj to obey the iteration protocol + some search, or does it need AbstractArray (defined length, getindex, etc)?


#22

It would be a separate package, like StaticGraphs or SimpleWeightedGraphs.

Closest is the sparse-matrix-backed SimpleWeightedGraphs, but you don’t actually need a matrix of Bool to define edges - see StaticGraphs for a way to do it with just two vectors (that is, it’s a sparse matrix without the nzval vector). This saves you one byte per edge.

The API does not (yet) specify the interface for induced subgraphs. Each package is currently free to implement its own return value.

fadj is not part of any interface specification and indeed is only defined on SimpleGraphs. It is not necessary to implement fadj. The required interface may be found in interface.jl. The guarantee is that if you implement each of the dozen or so functions in interface.jl, all the other functions within LightGraphs will “just work”. (You might wish to override some of them to take advantage of your particular implementation’s strengths, but the general ones will work.)

Bottom line: don’t implement fadj unless you need it.
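To make that concrete, a minimal standalone sketch of what those interface functions look like for a Matrix{Bool}-backed digraph (hedged: `BoolMatrixDiGraph` is a hypothetical name, the methods are written as plain functions here, and in a real package they would extend the generics via `import LightGraphs: nv, ne, ...` -- check interface.jl for the authoritative list):

```julia
# Hypothetical sketch of the interface functions for a dense Bool adjacency
# matrix; edges, edgetype, eltype, and zero are also part of the interface
# but are omitted here for brevity.
struct BoolMatrixDiGraph
    m::Matrix{Bool}   # m[i, j] == true means there is an edge i -> j
end

nv(g::BoolMatrixDiGraph) = size(g.m, 1)
ne(g::BoolMatrixDiGraph) = count(g.m)
vertices(g::BoolMatrixDiGraph) = 1:nv(g)
has_vertex(g::BoolMatrixDiGraph, v) = 1 <= v <= nv(g)
has_edge(g::BoolMatrixDiGraph, s, d) = g.m[s, d]
outneighbors(g::BoolMatrixDiGraph, v) = findall(view(g.m, v, :))
inneighbors(g::BoolMatrixDiGraph, v) = findall(view(g.m, :, v))
is_directed(::Type{BoolMatrixDiGraph}) = true
```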


#23

Cool, I missed the description in interface.jl. That should all work for bitmaps (unless I failed at reading comprehension, neighbors(g, v) needs only length and iteration, not random access).

Is LightGraphs used a lot in the setting of smallish (<4k vertices), dense-ish (>0.02 density / >20 mean degree) graphs?

Is there a benchmark suite?

A benchmark suite would be nice to see where the bitmapped graphs fail to perform. In order to be useful, one probably needs specialized functions for at least some of the algorithms implemented in LightGraphs.jl.

Then, the way I would go about it for undirected graphs is to have two types: BMGraph{N}, where N=Base.num_bit_chunks(nv(g)), and BMSubGraph{N} which is the subgraph induced by a mask. Digraphs would be the same, except that they store both the adjacency matrix and its transpose.

This makes construction type-unstable, so users need a function barrier, but in exchange it permits local unrolling + SIMD of many loops.
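A rough sketch of that design (hedged: `BMGraph`, `degree_of`, and `build_bmgraph` are illustrative names, not existing API):

```julia
# Hypothetical sketch: the number of 64-bit chunks per column is a type
# parameter N, so per-vertex loops can unroll at compile time via ntuple.
struct BMGraph{N}
    adj::Matrix{UInt64}   # column v is v's packed neighbor bitmask (undirected: symmetric input)
end

# Kernel specialized on N: the ntuple body unrolls for small, known N.
degree_of(g::BMGraph{N}, v::Int) where {N} =
    sum(ntuple(c -> count_ones(g.adj[c, v]), Val(N)))

# Construction is type-unstable (N depends on a runtime value), so callers
# should pass the result through a function barrier before hot loops.
function build_bmgraph(m::Matrix{Bool})
    n = size(m, 1)
    nchunks = cld(n, 64)               # same count as Base.num_bit_chunks(n)
    adj = zeros(UInt64, nchunks, n)
    for j in 1:n, i in 1:n
        if m[i, j]
            c, b = divrem(i - 1, 64)
            adj[c + 1, j] |= UInt64(1) << b
        end
    end
    return BMGraph{nchunks}(adj)       # N is not inferrable from the input type
end
```

A function barrier here just means: call `g = build_bmgraph(m)` once, then hand `g` to a separate function whose body gets compiled with `N` known.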


#24

LightGraphs is designed to be generally fast across all graph types. That is, I can’t think of a case where performance drops off dramatically (in excess of algorithmic performance) based on graph size, density, etc. If you see an unwelcome performance regression between small and large graphs that can’t be explained by the algorithm, please file an issue.

We’ve had this as a to-do for a couple of years. The framework is there (in /benchmark) but it’s not in use and is probably outdated.

It’s actually very easy to stub out a new graph type to tweak it for your optimal use case. Again, it’s just a few interface functions and whatever constructors you want, and then you can think about what optimizations you need for the more advanced functions. I think StaticGraphs was done in less than an hour. So, my advice for your bitmapped graph type: give it a try! It shouldn’t take too long, and we’re happy to help in Slack.


#25

Welp, it took me a couple of days to wrap my head around what @foobar_lv2’s code was doing – bitwise operations are not something I’ve ever messed with. Still, the example was self-contained enough that I think I know how it works, and I took a stab at some code that uses it. Since there’s discussion of making a graph type based on it, I thought someone might benefit from seeing the code (even if it’s just to state how bad it is!)

I didn’t try to do all the interfaces right now, just some that I use in my code. I make no claim that it’s as performant or correct as it could be:

# I don't use/modify all these in this section of code, but I'm too lazy to clean it up...
import LightGraphs: AbstractGraph, SimpleGraph, SimpleDiGraph, AbstractEdge, SimpleEdge, has_edge, is_directed, edgetype, src, dst, ne, nv, vertices, has_vertex, outneighbors, inneighbors, edges, add_edge!, rem_edge!, indegree, outdegree


# Code from foobar_lv2 in the Julia discourse thread -- pack a Matrix{Bool}
# into UInt64 chunks, one column per vertex plus a sentinel column of ones.
function mk_chunks(m::Matrix{Bool})
    n = size(m, 1)
    n == size(m, 2) || error("adjacency matrix must be square")
    nchunks = Base.num_bit_chunks(n)      # 64-bit words needed per column
    bm_M = falses(nchunks<<6, n+1)        # rows padded to a whole number of chunks
    bm_MT = falses(nchunks<<6, n+1)
    bm_M[1:n, end] .= true                # sentinel column: mask of valid vertices
    bm_MT[1:n, end] .= true
    bm_M[1:n, 1:n] .= m
    bm_MT[1:n, 1:n] .= m'                 # transpose, for fast in-neighbor access
    M = reshape(bm_M.chunks, (nchunks, n+1))    # view the raw chunks as a UInt64 matrix
    MT = reshape(bm_MT.chunks, (nchunks, n+1))
    return M, MT
end



# BitArray-style graphs - focus on digraph for now
# I often require the degree distribution in my code, so I find it useful 
# to pre-calculate and update as needed, rather than 
# do it on the fly.

struct lightbitdigraph{G<:Array{UInt64}}
    graph::G        # packed adjacency: bit i of column j set means edge i -> j
    graphT::G       # transpose, for fast in-neighbor access
    indegree::Vector{Int64}
    outdegree::Vector{Int64}
end


# Basic outer contructor when we just have a boolean graph to start
# takes a digraph, stores the bitgraph and transpose, and the
# degree distribution.

function lightbitdigraph(g::Matrix{Bool})   # mk_chunks requires a Matrix{Bool}
    lightbitdigraph(mk_chunks(g)..., vec(sum(g, dims=1)), vec(sum(g, dims=2)))
end


# Define some needed helper functions
# I don't define all the needed interfaces here, just a few I use
# (plus a couple extra I found useful)
nchunks(g::T) where {T<:lightbitdigraph} = size(g.graph, 1)
nv(g::T) where {T<:lightbitdigraph} = size(g.graph, 2) - 1   # last column is the sentinel mask
vertices(g::T) where {T<:lightbitdigraph} = 1:nv(g)
chunks(g::T) where {T<:lightbitdigraph} = 1:nchunks(g)


# Get number of edges
# Note that self-loops are valid in some of the code here - careful.
function ne(g::T) where {T<:lightbitdigraph}
    x = 0
    @inbounds for v in vertices(g)
        for c in chunks(g)
            x += count_ones(g.graph[c, v])
        end
    end
    return x
end


# Check whether a given edge exists
function has_edge(g::T, i, j) where {T<:lightbitdigraph}
    i1, i2 = Base.get_chunks_id(i)
    # UInt64(1) avoids an InexactError when i2 == 63 (1 << 63 is a negative Int)
    @inbounds return !iszero(g.graph[i1, j] & (UInt64(1) << i2))
end


# Assert a given edge is "on"
function add_edge!(g::T, i, j) where {T<:lightbitdigraph}
    i1, i2 = Base.get_chunks_id(i)
    j1, j2 = Base.get_chunks_id(j)
    if !has_edge(g, i, j)      # keep the cached degree counts in sync
        g.outdegree[i] += 1
        g.indegree[j] += 1
    end
    @inbounds g.graph[i1, j] |= UInt64(1) << i2   # UInt64(1): 1 << 63 is a negative Int
    @inbounds g.graphT[j1, i] |= UInt64(1) << j2
    return nothing
end


# Assert a given edge is off
function rem_edge!(g::T, i, j) where {T<:lightbitdigraph}
    i1, i2 = Base.get_chunks_id(i)
    j1, j2 = Base.get_chunks_id(j)
    if has_edge(g, i, j)       # keep the cached degree counts in sync
        g.outdegree[i] -= 1
        g.indegree[j] -= 1
    end
    # note the parentheses: ~1 << i2 would shift ~1 == -2, giving the wrong mask
    @inbounds g.graph[i1, j] &= ~(UInt64(1) << i2)
    @inbounds g.graphT[j1, i] &= ~(UInt64(1) << j2)
    return nothing
end


# Toggle a given edge on/off
function toggle_edge!(g::T, i, j) where {T<:lightbitdigraph}
    i1, i2 = Base.get_chunks_id(i)
    j1, j2 = Base.get_chunks_id(j)
    if has_edge(g, i, j)       # keep the cached degree counts in sync
        g.outdegree[i] -= 1; g.indegree[j] -= 1
    else
        g.outdegree[i] += 1; g.indegree[j] += 1
    end
    @inbounds g.graph[i1, j] ⊻= UInt64(1) << i2
    @inbounds g.graphT[j1, i] ⊻= UInt64(1) << j2
    return nothing
end


# Look up the pre-calculated indegree for a given node.
# Testing showed that computing it on the fly was only a bit slower than
# the standard LightGraphs type on a 500-node graph.

function indegree(g::T, v::V) where {T<:lightbitdigraph, V<:Int}
    @inbounds return g.indegree[v]
end


# Look up the pre-calculated outdegree for a given node
function outdegree(g::T, v::V) where {T<:lightbitdigraph, V<:Int}
    @inbounds return g.outdegree[v]
end

A few notes:
In benchmarking, calculating change statistics just for existing edges (used to get total subgraph statistic counts) is actually a bit slower than the previous Matrix{Bool} graph I was using. That particular calculation seems to be an odd edge case I still can’t explain; the larger algorithm (which loops over the graph many times, toggling edges) is much, much faster.

The basic transitive triple calculation is 100x faster. The algorithm I have that loops over the graph multiple times takes <75ms, compared to ~650 ms in my original code (single Matrix{Bool}).
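Where that speedup comes from can be sketched roughly like this (hedged: `transitive_triples` is an illustrative standalone function, not part of my posted code, and it assumes bit k of adj[:, v] set means edge v -> k):

```julia
# Hypothetical sketch: with packed adjacency, the common out-neighbors of i
# and j are intersected 64 vertices at a time with a single AND + count_ones,
# instead of a Bool-by-Bool inner loop over k.
function transitive_triples(adj::Matrix{UInt64}, n::Int)
    nchunks = size(adj, 1)
    total = 0
    for i in 1:n, j in 1:n
        i == j && continue
        cid, bit = divrem(j - 1, 64)
        iszero(adj[cid + 1, i] & (UInt64(1) << bit)) && continue  # no edge i -> j
        # count k with i -> k and j -> k, 64 candidates per chunk
        # (self-loops would need extra masking here)
        for c in 1:nchunks
            total += count_ones(adj[c, i] & adj[c, j])
        end
    end
    return total
end
```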

I made a version that uses two Matrix{Bool}s (the graph and its transpose) for comparison (out of curiosity), and this is still 7x faster than that.

I still have a bit of testing to confirm I’m getting identical output, so I’m not yet posting my full code… but I thought this might be useful to someone (or to me, if someone sees something I’m doing badly wrong in defining or using the types.)