Makie recipe specialisation for specific types?

Hiya, so this is a bit of a weird thing to own up to, but this is the first time I’ve ever seriously sat down and thought - right this BioJulia package needs to plotting functionality! (I know right, normally I’m obsessed with benchmarks and speeding stuff up over getting plots out).

Anyway I’m working on KmerAnalysis.jl, and I’d like to use Makie, but I’m not sure on how I should proceed - whether or not I need to be defining type or full recipes, or just overloading a function, or what. So I thought if I described my aim here someone more knowlegeable would know how I should go about it.

And I’ve introduced this type

struct KmerFrequencySpectra{N} where {N}
    data::Array{UInt64, N}
    min::UInt8
end

It’s basically a data-structure that stores a frequency histogram of kmer counts. X axis the kmer count, y axis the number of kmers in a dataset that have that count.

Thing is these spectra can be N dimensional - normally 1 or 2 though.

What I want to do is plot a 1 dimensional spectra like so using points or scatter or maybe even bars:

BUT, if it’s 2 dimensional, I’d like the type to be plotted a different way - actually two different ways.
the first is a 2D histogram or heatmap which is useful for comparing samples and identifying biases in data, it looks something like this:

The other 2D plot is for when you’re quality checking a genome assembly against reads, it looks like a bar plot might be produced for a 1D spectra, but with extra colour info relating to the genome assembly. It looks something like this:

Or this in a more complex organism:

So my question is, what Makie strategy should I be using to plot this type?
I can imagine the 1D case being a type recipe for scatter, line, or bars quite easily. And perhaps it’s even a type recipe for the heatmap example. But the third plot which is called a spectra_cn plot (cn=copy number) as opposed to just being called a spectra plot, seems a little trickier and maybe warrants a specific full recipe. But then I think well I might as well just make a spectraplot recipe since recipes can also be specialised on argument types and-arrrrgh the options! Yeah I think my problem is I’m paralyzed by choice. Can anyone advise me on the route I ought to take?

There are a couple approaches I would recommend.

For the 1D case, it looks like the input data is the same whether you’re using scatter, bars or lines. In that case, I would recommend defining:

function AbstractPlotting.convert_arguments(::AbstractPlotting.PointBased, spec::KmerFrequency{1})
    # lower to a Vector{Point2f0} or a Tuple{Vector{Float32}, Vector{Float32}}
end

which should then work with all point-based plot types (lines, scatter, barplot, poly, etc).
You can also overload specific recipes (eg barplot) by:

function AbstractPlotting.convert_arguments(::Type{<: BarPlot}, spec::KmerFrequency{1})
    # lower to something barplot understands
end

which will take precedence over the generic method for barplots.

If you want 2D handling…well. There are a couple ways to go about it.

One way is to say “make your contours yourself” and just define:

function AbstractPlotting.convert_arguments(::AbstractPlotting.SurfaceLike, spec::KmerFrequency{2})
    # lower to something which e.g. heatmap or contour understand
end

which will work with contour, heatmap, surface, etc.

The other is to create your own recipe. This is really just a matter of personal preference.

For the 1D barplot, I would highly recommend rolling your own recipe. That could be as simple as:

@recipe(SpectraCN) do scene
    default_theme(scene, BarPlot)
end

function AbstractPlotting.plot!(p::SpectraCN)
    # lower to potentially multiple barplots/plots here 
end

I would personally advise against creating a single monorecipe which can generate multiple plot types, just because it can get kinda confusing for the user. This way, you can allow the user to use standard Makie plot types, while also offering them their own convenience.

2 Likes

Thanks @asinghvi17!

The following works for the 1D points based case:

function AbstractPlotting.convert_arguments(::AbstractPlotting.PointBased, spec::KmerFrequencySpectra{1})
    # lower to a Vector{Point2f0} or a Tuple{Vector{Float32}, Vector{Float32}}
    return ([Point2f0(i, j) for (i, j) in enumerate(spec.data)],)
end

AbstractPlotting.plottype(::KmerFrequencySpectra{1}) = AbstractPlotting.BarPlot

Is there a way to set defaults for stuff like the X and Y axis labels and perhaps the y limit - the first bars at low x values are often very much taller than the main distribution - it’s where all erroneous kmer values from datasets end up.

Unfortunately, not really. This goes into the territory of “not implemented yet” for recipes, I’m afraid.

If you know that no value will be around 50,000 or so, you could filter them out manually.

As you advised I created a bit of code that would find the last value in the 1D spectra with a non-zero value:

function AbstractPlotting.convert_arguments(::AbstractPlotting.PointBased, spec::KmerFrequencySpectra{1})
    tv = view(spec.data, firstindex(spec.data):findprev(x -> x > 0, spec.data, lastindex(spec.data)))
    return ([Point2f0(i, j) for (i, j) in enumerate(tv)],)
end

Which gives a very nice result now:

I tried to do the same for the 2D case but hit a bit of a snag you might have some insight on:

So I made this:

function _find_plottable_subset(spec::KmerFrequencySpectra{2})
    mat = spec.data
    furthest_row = 0
    furthest_col = 0
    for i in 1:256
        for j in 1:256
            v = mat[i, j]
            furthest_row = ifelse((j > furthest_row) & !iszero(v), j, furthest_row)
            furthest_col = ifelse((i > furthest_col) & !iszero(v), i, furthest_col)
        end
    end
    return mat[1:furthest_col, 1:furthest_row]
end

function AbstractPlotting.convert_arguments(::AbstractPlotting.SurfaceLike, spec::KmerFrequencySpectra{2})
    # lower to something which e.g. heatmap or contour understand
    return (_find_plottable_subset(spec),)
end

Which should find forthest row and column out from (0,0) in a matrix, that has a non-zero value, and subset it for plotting, however I hit an error:

julia> twodspec = spectra(kca, kcb)
Count histogram of motifs appearing more than 0 times

julia> typeof(twodspec)
KmerAnalysis.KmerFrequencySpectra{2}

julia> twodspec.data
256×256 Array{UInt64,2}:
 0x0000000000000000  0x00000000000436c5  …  0x0000000000000000  0x0000000000000000
 0x0000000000045019  0x000000000000054d     0x0000000000000000  0x0000000000000000
 0x0000000000000256  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x000000000000002f     0x0000000000000000  0x0000000000000000
 0x0000000000000019  0x0000000000000049     0x0000000000000000  0x0000000000000000
 0x0000000000000004  0x000000000000001f  …  0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x000000000000009a     0x0000000000000000  0x0000000000000000
 0x0000000000000002  0x00000000000000af     0x0000000000000000  0x0000000000000000
 0x0000000000000030  0x000000000000005b     0x0000000000000000  0x0000000000000000
 0x0000000000000006  0x0000000000000055     0x0000000000000000  0x0000000000000000
 0x000000000000001d  0x0000000000000072  …  0x0000000000000000  0x0000000000000000
 0x0000000000000018  0x0000000000000039     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x000000000000001a     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000010     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000008     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x000000000000000f  …  0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000  …  0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000006  0x0000000000000005     0x0000000000000000  0x0000000000000000
 0x0000000000000001  0x0000000000000013     0x0000000000000000  0x0000000000000000
                  ⋮                      ⋱                                       ⋮
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000  …  0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000  …  0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000  …  0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000  …  0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000  …  0x0000000000000000  0x0000000000000000

julia> heatmap(twodspec)
ERROR: BoundsError: attempt to access (Observable{Array{UInt64,2}} with 0 listeners. Value:
UInt64[0x0000000000000000 0x00000000000436c5 … 0x0000000000000000 0x0000000000000000; 0x0000000000045019 0x000000000000054d … 0x0000000000000000 0x0000000000000000; … ; 0x0000000000000000 0x0000000000000000 … 0x0000000000000000 0x0000000000000000; 0x0000000000000000 0x0000000000000000 … 0x0000000000000000 0x0000000000000000],)
  at index [3]
Stacktrace:
 [1] getindex at ./tuple.jl:24 [inlined]
 [2] calculated_attributes!(::Type{Heatmap{...}}, ::Heatmap{...}) at /Users/bward/.julia/packages/AbstractPlotting/jOgYQ/src/interfaces.jl:295
 [3] calculated_attributes! at /Users/bward/.julia/packages/AbstractPlotting/jOgYQ/src/interfaces.jl:38 [inlined]
 [4] Heatmap{...}(::Scene, ::Attributes, ::Tuple{Observable{KmerAnalysis.KmerFrequencySpectra{2}}}, ::Observable{Tuple{Array{UInt64,2}}}) at /Users/bward/.julia/packages/AbstractPlotting/jOgYQ/src/interfaces.jl:456
 [5] plot!(::Scene, ::Type{Heatmap{...}}, ::Attributes, ::Tuple{Observable{KmerAnalysis.KmerFrequencySpectra{2}}}, ::Observable{Tuple{Array{UInt64,2}}}) at /Users/bward/.julia/packages/AbstractPlotting/jOgYQ/src/interfaces.jl:628
 [6] #plot!#217(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(plot!), ::Scene, ::Type{Heatmap{...}}, ::Attributes, ::KmerAnalysis.KmerFrequencySpectra{2}) at /Users/bward/.julia/packages/AbstractPlotting/jOgYQ/src/interfaces.jl:571
 [7] plot! at /Users/bward/.julia/packages/AbstractPlotting/jOgYQ/src/interfaces.jl:540 [inlined]
 [8] #heatmap#110(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(heatmap), ::KmerAnalysis.KmerFrequencySpectra{2}) at /Users/bward/.julia/packages/AbstractPlotting/jOgYQ/src/recipes.jl:15
 [9] heatmap(::KmerAnalysis.KmerFrequencySpectra{2}) at /Users/bward/.julia/packages/AbstractPlotting/jOgYQ/src/recipes.jl:13
 [10] top-level scope at REPL[16]:1

Which is odd as the following works:

julia> heatmap(twodspec.data[1:31, 1:36])

julia> heatmap(twodspec.data[1:31, 1:36], colorrange = (0.0, 10000.0))

And twodspec.data[1:31, 1:36] results in the same matrix as my auto function in convert_arguments!:

julia> KmerAnalysis._find_plottable_subset(twodspec)
31×36 Array{UInt64,2}:
 0x0000000000000000  0x00000000000436c5  …  0x0000000000000000  0x0000000000000000
 0x0000000000045019  0x000000000000054d     0x0000000000000000  0x0000000000000000
 0x0000000000000256  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x000000000000002f     0x0000000000000000  0x0000000000000000
 0x0000000000000019  0x0000000000000049     0x0000000000000000  0x0000000000000000
 0x0000000000000004  0x000000000000001f  …  0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x000000000000009a     0x0000000000000000  0x0000000000000000
 0x0000000000000002  0x00000000000000af     0x0000000000000003  0x0000000000000000
 0x0000000000000030  0x000000000000005b     0x0000000000000014  0x0000000000000000
 0x0000000000000006  0x0000000000000055     0x0000000000000005  0x0000000000000002
 0x000000000000001d  0x0000000000000072  …  0x0000000000000000  0x0000000000000000
 0x0000000000000018  0x0000000000000039     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x000000000000001a     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000010     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000008     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x000000000000000f  …  0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000  …  0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000006  0x0000000000000005     0x0000000000000000  0x0000000000000000
 0x0000000000000001  0x0000000000000013     0x0000000000000000  0x0000000000000000
 0x0000000000000005  0x0000000000000016  …  0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000     0x0000000000000000  0x0000000000000000
 0x0000000000000000  0x0000000000000000  …  0x0000000000000000  0x0000000000000000

Why might it not work for convert_arguments!, but if I subset the matrix manually it works?

EDIT:

I found that using convert_arguments recursively solved my problem:

function AbstractPlotting.convert_arguments(p::AbstractPlotting.SurfaceLike, spec::KmerFrequencySpectra{2})
    return AbstractPlotting.convert_arguments(p, _find_plottable_subset(spec))
end
3 Likes

It looks like a conversion is not being hit. Could you try returning vectors of x and y as well as the z matrix? You may also want to try converting to Float if that fails.