Allocations when creating graphs, inefficiencies

erlebach · July 1, 2020, 8:31pm

I am creating my own method to create a graph in 3D because graphplot has a bug, which I reported here and on Github, and I got no reply, and yet I still need the plot. Here is some MWE, which I run 2-3 times to make sure @time provides approximately correct values. @btime will not provide me with additional information in this case.

using LightGraphs
using GraphRecipes
import Plots

Plots.PlotlyBackend()

function plotGraph(graph, x, y, z)
    edges_ = []
    nb_nodes = nv(graph)
    nb_edges = ne(graph)

    p = Plots.plot([],[])

    for e in edges(graph)
        i = src(e)
        j = dst(e)
        xx = [x[i], x[j]]
        yy = [y[i], y[j]]
        zz = [z[i], z[j]]
        if i == 1
            p = Plots.plot(xx,yy,zz, color=:red)
        else
            Plots.plot!(p,xx,yy,zz, color=:red)
        end
    end
    return p
end

const n = 1000
const graph = SimpleGraph(n, 3*n)

const x = rand(n)
const y = rand(n)
const z = rand(n)
@time const p = plotGraph(graph, x, y, z);
@time display(p)

I am simply displaying the graph’s edges. With a small graph with 1000 nodes and 3000 edges, the execution of my method, plotGraph takes about 1.1 seconds (a factor of 2 or 3, either way, does not change the point I am making), while display() takes about 19 seconds to complete and then another 20 seconds to actually see the plot!!! That is huge since I wish to display 100,000 edges. The time is incredibly high, and I am hoping not to have to use OpenGL for this, which I abandoned 15 years ago!

Note that when I do not use constants, the memory allocation is about 800 Mbytes rather than 1.6 Gbytes, and the time is halved to 8 seconds, indicating that time is proportional to memory allocation.

So the question is: how is it possible to have so much memory allocation for such a small problem. It seems inconceivable to me. There is surely room for substantial speedups of this particular problem.

In 1986, I used the first Silicon Graphics, and although I was programming in C, I could easily draw 10,000 lines in real-time (10 frames per second), with almost no startup time. That is 35 years ago!!! I got faster speed even taking compiling into account. And even though it was C, this was 35 years ago and it was specialized hardware, computers of all stripes are so much faster today.

kristoffer.carlsson · July 1, 2020, 9:58pm

It seems like you are making a plot call per edge. Collect all your edges and use a single call.

erlebach · July 1, 2020, 10:50pm

Hi Kristoffer,

Yes, I realized that, and you are correct. I searched for how to collect my edges and do a single call, and the only method I came across is interspersing my coordinates with NaN, which would slow down the computation in languages like C for example, so I assumed it would be true in Julia. OpenGL had the notion of collections which were data structures especially built to collect edges, colors, and other such things, with efficiency in mind. I assumed that after all these years, modern libraries would have these kinds of features.

If there is a way to collect the edges that does not involve using NaNs, I would be interested to learn about it.

Thanks again,

Gordon

P.S. I would like to apologize for the tone of my post. I was very frustrated by the time my code was taking. I am currently experimenting with Makie.

dpsanders · July 1, 2020, 11:32pm

Makie has a LineSegments type if I remember correctly.

dpsanders · July 1, 2020, 11:33pm

No, apparently you just insert NaNs in Makie too. What aspect of it would slow down the computation?
Note that GLMakie is a high-level interface to OpenGL.

erlebach · July 2, 2020, 12:02am

Thanks. I just checked and LineSegments exists and does not require NaNs. I will test this out later this evening. The docs states “Plots a line for each pair of points in (x, y, z), (x, y), or positions.”, exactly what I was looking for. I wish that ‘graphplot’ had the same capability.

Regarding the speed of NaN in C, if you take advantage of pipelining, which Julia surely does to achieve speeds close to C, any change in variable type within a vector will slow things down, force cache misses. The NaN cannot be treated a float; an exception is probably thrown, slowing down calculations.

It is likely that on a graphics card with OpenGL hardware, that the data in LineSegments is efficiency toy encoded into textures, which are extremely efficient on the GPU.

Gordon

dpsanders · July 2, 2020, 2:30am

The point is that a NaN is a float, so everything is of the same type:

julia> NaN isa Float64
true

julia> NaN32 isa Float32
true

erlebach · July 2, 2020, 2:36am

Ok, that is cool. But the hardware must still check for the presence of the NaN via some sort of conditional. In the meantime, I have confirmed that LineSegments works as I surmised.

I am now grappling with translate, rotate, cameras. There are not many recent examples. I come from the world of OpenGL and it has been a while.

dpsanders · July 2, 2020, 3:29am

The hardware is very fast at doing this though.

Have you seen http://juliaplots.org/MakieReferenceImages/gallery/index.html?

erlebach · July 2, 2020, 3:42am

I have seen the gallery but was only moderately impressed because I did not see demos that showed
the high number of edges that could be drawn interactively, which is what I am interested in. I worked in Visualization
for many years in my youth. Also, the animated sequences appeared to have been created using record, so regardless
of how long each frame takes to render, the sequence will animate in real time. But the Gallery does demonstrate the
versatility of Makie. Have you seen the galleries of Matplotlib and D3? They are also very versatile, but not nearly as interactive.

Cheers,

Gordon

dpsanders · July 2, 2020, 3:44am

Based on my experience, you shouldn’t have any problem with ~10^6 edges. I wouldn’t want to try that in matplotlib or D3…

dpsanders · July 2, 2020, 3:45am

By the way, in case it wasn’t clear, I was suggesting to look at the gallery for examples of camera positioning and rotations. But I agree that more examples are required.

kristoffer.carlsson · July 2, 2020, 9:57am

If you think about it, there needs to be some way to indicate where a line starts and ends and the logic for checking that needs to include some type of conditional. Encoding it as a NaN is probably one of the more efficient approaches since it is inline with the data and all data is stored contigously.

Topic		Replies	Views
Looping over edges in LightGraphs New to Julia	8	2004	June 7, 2020
Edges plotted with graphplot, which should not be New to Julia	0	265	July 1, 2020
Graphing a SimpleGraph New to Julia	1	397	June 30, 2020
Graph construction performance Graphs lightgraphs	2	723	May 16, 2019
GraphRecipes versus GraphPlot: why such a big performance disparity? Graphs plotting , lightgraphs	0	497	July 22, 2020

Allocations when creating graphs, inefficiencies

Related topics