Makie - Optimal way of drawing millions of data points

First of all, big thanks to the authors and contributors of Makie!
This is the package that dragged me into Julia, and it works so well for my use cases that it feels illegal. :slight_smile:

I am working with long time series (EEG signal). A typical array has dimensions 80 x 7_000_000 (channels x time points), and I am using GLMakie to make an interactive plot of the raw signal for inspection.

The main idea is to show a subset of electrodes in a short time window (e.g. 10 seconds of signal for 20 channels) and let the user add or remove channels from the view, browse through time, or increase the displayed time span.

Right now, I am initially drawing everything with visible=false, then making visible only the subset that will be displayed. This still draws the whole time series (outside the limits of the plot), but it felt more responsive than reading and plotting chunks of data on each key press (but this was almost one year ago).

Here is an MWE of such a plot:

using GLMakie
using Statistics

function plot(data)
    fig = Figure(resolution = (1920,1080))
    ax = fig[1,1] = Axis(fig)

    step = Observable(1:20000)      # sample indices currently shown
    chanRange = Observable(1:20)    # channels currently shown

    # Arrow keys scroll through time, page up/down add or remove channels.
    on(events(fig).keyboardbutton) do event
        if event.action in (Keyboard.press, Keyboard.repeat)
            event.key == Keyboard.left   && step_back(ax, step, chanRange)
            event.key == Keyboard.right  && step_forw(ax, step, chanRange)
            event.key == Keyboard.page_down  && chans_less(data, ax, step, chanRange)
            event.key == Keyboard.page_up  && chans_more(data, ax, step, chanRange)
        end
        return Consume(false)
    end

    xlims!(ax, step.val[1], step.val[end])
    ylims!(ax, -10*chanRange.val[end]-5, -10*chanRange.val[1]+5)

    draw(data, ax, step)
    visible(data, ax, chanRange)

    return fig
end


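function draw(data, ax::Axis, step::Observable)
    for i=1:(size(data)[2])
        # Demean each channel within the window and offset it by 10 units per channel.
        lines!(ax, step, @lift(data[$step,i].-mean(data[$step,i]).-10i), color="black", visible=false)
    end
end

function visible(data, ax::Axis, chanRange::Observable)
    for j=1:(size(data)[2])
        # The loop body was cut off in the post. Presumably it toggled each line's
        # visibility; this assumes ax.scene.plots[j] is the line for channel j.
        ax.scene.plots[j].visible = j in chanRange.val
    end
end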

function step_back(ax::Axis, step::Observable, chanRange::Observable)
    step[] = step.val.-100
    xlims!(ax, step.val[1], step.val[end])
end

function step_forw(ax::Axis, step::Observable, chanRange::Observable)
    step[] = step.val.+100
    xlims!(ax, step.val[1], step.val[end])
end

function chans_less(data, ax::Axis, step::Observable, chanRange::Observable)
    chanRange[] = chanRange.val.start:chanRange.val.stop-1
    visible(data, ax, chanRange)
    ylims!(ax, -10*chanRange.val[end]-5, -10*chanRange.val[1]+5)
end

function chans_more(data, ax::Axis, step::Observable, chanRange::Observable)
    chanRange[] = chanRange.val.start:chanRange.val.stop+1
    visible(data, ax, chanRange)
    ylims!(ax, -10*chanRange.val[end]-5, -10*chanRange.val[1]+5)
end

data = rand(5000000,80);

My question is:
Is there a more optimal Makie-way to update a plot with so many points?

This approach starts to feel sluggish at around 1.5 million visible points (which is crazy good compared to things I tried in Python), but maybe there are some optimizations that could push this limit further.


Is a scatter plot the right visualization for this? Maybe a 2D histogram?

These are actually lines, as the data points are samples of continuous electrical brain activity.
For this purpose we want to see them that way, mostly to check the quality of the signal etc. That is also why it is useful to scroll through the data.

I’ve written some code like that for Beacon Biosignals.
You can actually hook into the zoom rect from Makie and use that in a timeline to navigate. That way you can create a signal that only shows the data you’re looking at, and switch to a resampled version above some threshold.
I thought this was better documented, but that’s the struct one can hook into:
I can see if I find some time to extract / open source some of the viewer code.
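As a rough illustration of the "resampled version above some threshold" idea, here is a minimal sketch of min/max decimation in plain Julia. It is not the Beacon Biosignals code; the function name `minmax_decimate` and the bucket count are made up for this example. Each bucket keeps its minimum and maximum sample, so short artifacts stay visible after downsampling.

```julia
# Hypothetical helper: reduce (t, x) to at most 2*nbuckets points by keeping
# the minimum and maximum sample of each bucket, in time order.
function minmax_decimate(x::AbstractVector{<:Real}, t::AbstractVector{<:Real},
                         nbuckets::Integer)
    n = length(x)
    n == length(t) || throw(ArgumentError("x and t must have equal length"))
    # Below the threshold there is nothing to gain: return the raw samples.
    n <= 2nbuckets && return collect(t), collect(x)

    tt = Vector{eltype(t)}(undef, 2nbuckets)
    xx = Vector{eltype(x)}(undef, 2nbuckets)
    edges = round.(Int, range(0, n; length = nbuckets + 1))
    for b in 1:nbuckets
        idx = (edges[b] + 1):edges[b + 1]      # samples belonging to bucket b
        lo = idx[argmin(view(x, idx))]         # absolute index of the minimum
        hi = idx[argmax(view(x, idx))]         # absolute index of the maximum
        i1, i2 = minmax(lo, hi)                # keep the two points in time order
        tt[2b - 1], xx[2b - 1] = t[i1], x[i1]
        tt[2b], xx[2b] = t[i2], x[i2]
    end
    return tt, xx
end
```

The result could then be passed to `lines!` in place of the raw window, e.g. with `nbuckets` set to roughly the plot width in pixels.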

Or you hook into ax.finallimits directly to choose a subset of data that fits in there. It might be faster to make one series plot and then just extract a view of a matrix of your data (series will reform it to a vector with NaN in between).
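To make the "vector with NaN in between" part concrete, here is a small sketch in plain Julia of how several channels can be packed into one x/y pair so that a single `lines!` call draws them all (Makie breaks a line at NaN). The function name `nan_joined` is hypothetical; the demeaning and the offset of 10 per channel simply mirror the MWE above, with `data` laid out as time points x channels.

```julia
# Hypothetical helper: concatenate all channels of a (time x channel) window
# into single x and y vectors, separated by NaN so the lines do not connect.
function nan_joined(t::AbstractVector{<:Real}, data::AbstractMatrix{<:Real};
                    offset::Real = 10)
    nt, nch = size(data)
    nt == length(t) || throw(ArgumentError("length(t) must equal size(data, 1)"))
    xs = Vector{Float64}(undef, (nt + 1) * nch)
    ys = similar(xs)
    k = 0
    for ch in 1:nch
        μ = sum(view(data, :, ch)) / nt        # channel mean within the window
        for i in 1:nt
            k += 1
            xs[k] = t[i]
            ys[k] = data[i, ch] - μ - offset * ch
        end
        k += 1                                 # NaN separator after each channel
        xs[k] = NaN
        ys[k] = NaN
    end
    return xs, ys
end
```

With this, one plot object holds the whole view, so changing the window means updating two vectors instead of 80 separate plots.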

There’s also

which is meant to help visualizing large signals

Yeah, I have seen a lot of EEG-related stuff on their GitHub, but wanted to give it a try from scratch to better understand Makie and Julia. I also haven’t seen any plotting stuff there.

@sdanisch & @jules - thanks for the suggestions! Will look into those options, although I do not immediately see how to do the “hooking into”.
@jules - do you have an example of the one series option?

@baggepinnen - I think I’ve seen it somewhere in the past, but forgot about it. Will try it out, thanks!

I mean something like

on(ax.finallimits) do lims
    # lims is the currently visible rectangle; select the matching subset of the data here
end

And for the series, it takes a matrix argument so you could determine which rows to display and what column steps and then do data[rows, cols] and pass that to series.
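One possible way to "determine which rows to display and what column steps" is sketched below in plain Julia. It assumes the x axis is in sample-index units, as in the MWE; the name `visible_indices` and the `maxpoints` budget are invented for this example.

```julia
# Hypothetical helper: turn visible x limits into a strided index range that
# stays inside 1:nsamples and yields at most roughly `maxpoints` samples.
function visible_indices(xmin::Real, xmax::Real, nsamples::Integer;
                         maxpoints::Integer = 4000)
    lo = clamp(floor(Int, xmin), 1, nsamples)
    hi = clamp(ceil(Int, xmax), lo, nsamples)
    stride = max(1, cld(hi - lo + 1, maxpoints))   # skip samples when zoomed out
    return lo:stride:hi
end

# Inside the `on(ax.finallimits) do lims ... end` callback, the limits would
# presumably come from the rectangle's fields, e.g.
#   xmin = lims.origin[1]; xmax = lims.origin[1] + lims.widths[1]
# and the resulting range would index the time dimension of the data.
```

Plain striding can miss short spikes, so for quality checks a min/max style reduction may be a better fit once the stride gets large.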

And on a different note:
Is there an easy way to measure the performance/speed of rendering?
Some function measuring frame rate etc.?
I tried to google it, but didn’t find anything useful.

With NVIDIA, you can turn on an FPS counter with the GeForce Experience overlay. That’d be the easiest, I suppose.
AMD might have the same.
Otherwise, you can jump in the code and log the timings here:

If you make it optional and configurable via GLMakie.set_window_config! a PR would be appreciated a lot :slight_smile:
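For the "log the timings" route, the bookkeeping itself can be as simple as the sketch below. This is plain Julia and not part of GLMakie; `FrameTimer`, `tick!` and `fps` are hypothetical names, and where exactly `tick!` would be called in the render loop is left open.

```julia
# Hypothetical rolling frame-rate counter: call tick! once per rendered frame.
mutable struct FrameTimer
    last::Float64                # timestamp of the previous frame (NaN before the first)
    durations::Vector{Float64}   # most recent frame durations in seconds
    maxlen::Int                  # number of frames to average over
end
FrameTimer(maxlen::Integer = 60) = FrameTimer(NaN, Float64[], maxlen)

function tick!(ft::FrameTimer, now::Float64 = time())
    if !isnan(ft.last)
        push!(ft.durations, now - ft.last)
        # Drop the oldest duration once the window is full.
        length(ft.durations) > ft.maxlen && popfirst!(ft.durations)
    end
    ft.last = now
    return ft
end

# Average frames per second over the stored window (NaN until two frames were seen).
fps(ft::FrameTimer) = isempty(ft.durations) ? NaN : length(ft.durations) / sum(ft.durations)
```

Printing `fps(ft)` every few hundred frames keeps the logging overhead itself from distorting the measurement.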

@mkoculak: Looks like Makie is a good, flexible platform that suits most of your needs.

Glad to see you’ve managed to put in event hooks! I’m really kind of sad I haven’t yet had time to investigate Makie further myself.

Alternate solution for this SPECIFIC problem

For this specific problem, you might want to give InspectDR a try, though. It was built expressly for this kind of problem: Plotting multiple time-domain signals containing a large amount of datapoints.

Interactivity & bindkeys

InspectDR isn’t as flexible as Makie, but:


Here is an example you can try to quickly check whether InspectDR is adequate for your immediate problem:

using InspectDR
using Colors

#Input parameters
NSIG = 15
TMAX = 60 #seconds
NSAMPLES = 1_000_000
#NSAMPLES = 10_000_000
t = range(0, TMAX, length=NSAMPLES)
fList = range(1, step=1, length=NSIG) #Hz

red = RGB24(1, 0, 0)
line_default = line(color=red, width=2)

Δt = Float64(t.step)
@info("Δt = $Δt")

#Generate data
@info("Calculating data array...")
sigA = Array{Float64}(undef, NSIG, NSAMPLES)
for (i, f) in enumerate(fList)
	sigA[i,:] = sin.(2pi*f * t)
end

#Generate plots
nstrips = NSIG
@info("Computing plots...")
mplot = InspectDR.Multiplot(title="Multi-signal time-domain plot")
plot = InspectDR.transientplot(:lin, title="EEG")
plot = add(mplot, InspectDR.transientplot([:lin for i in 1:nstrips],
	title="", #No title - use strip labels instead
	ylabels=["Potential (V)" for i in 1:nstrips]
))
#Zero-out gap between y-strips:
plot.layout[:valloc_mid] = 0
plot.xext = InspectDR.PExtents1D(min=18, max=28) #Zoom in on time span

t = collect(t) #InspectDR only supports Vector{Float64}
for (i, f) in enumerate(fList)
	sig_i = collect(sigA[i,:])
	wfrm = add(plot, t, sig_i, id="Signal $i", strip=i)
	wfrm.line = line_default #Set color, thickness, etc
end

gplot = display(InspectDR.GtkDisplay(), mplot)

Alternative APIs

There are also 2 alternative high-level APIs that can be used with InspectDR if you are curious:

Note that CMDimData.jl is probably not worth learning unless you need to post-process signal data without thinking too much about low-level data structures.

This actually sounds like a fun idea. :slight_smile:

I played around with the fps_renderloop and have some questions (probably related to GLFW function calls). Where should I start the conversation about it? As a draft of a PR on GitHub or here/Slack, etc?

@MA_Laforge thanks for the suggestion! I saw the InspectDR package a couple of days ago and will certainly give it a try.

As a draft of a PR on github or here/slack, etc?

Draft PR sounds great :slight_smile: