I’m trying to visualize some data where I want to see some kind of distribution of a (small) set of per-sample distributions.

I doubt that the description makes much sense, but this MWE creates the type of figure I would like to see by just mapping values to a color manually to create an image:

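``````
using Plots

# This creates 21 samples with three sets of data in each sample
dummydata = [(0.1:0.1:x, 0.2:0.2:x, 0.3:0.3:x) for x in 0:0.05:1];

# Build a matrix of colors: one row per sample, with that sample's series
# laid out left to right, each series colored by its own colormap.
function colormatrix(pdata, cms)
    z = fill(colorant"white", length(pdata), maximum(x -> sum(length, x), pdata)+1)
    for (i, row) in enumerate(pdata)
        offs = 0
        for (cm, series) in zip(cms, row)
            nelems = length(series)
            nelems > 0 || continue
            z[i, 1+offs:nelems+offs] .= cm[series]
            offs += nelems
        end
    end
    return z
end

# Assumed usage (the plotting call is missing from the post; the colormaps are a guess):
# plot(colormatrix(dummydata, (cgrad(:reds), cgrad(:greens), cgrad(:blues))))
``````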

The problem with the above approach is that I want to do this for a couple of million samples, and the plot command either chokes or produces an invisibly thin horizontal line (as the x-axis has far fewer values than the y-axis).

If I instead make `z` an array of floats and let `Plots.heatmap` handle the image generation, I get something that looks good, but I can’t find a good way to separate the ranges of the color gradient, so the `series` blend into each other.

I can ensure that each `series` (as used in `colormatrix`) lies within a distinct numeric range (e.g. first series in 0-1, second in 2-3 etc.) in case that can be helpful.
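To make the "distinct numeric range" idea concrete, here is a minimal sketch that shifts each series into its own band and builds a single `NaN`-padded float matrix. The names `band` and `bandmatrix` and the `gap` keyword are hypothetical, and the sketch assumes every value already lies in 0..1. It only prepares the data; choosing a color gradient with sharp breaks between the bands is a separate step.

```julia
# Hypothetical helper: shift series k into the band [gap*(k-1), gap*(k-1)+1],
# assuming every value already lies in 0..1.
band(series, k; gap=2.0) = series .+ gap * (k - 1)

# Build one Float64 matrix, NaN-padded, with each row's series laid end to end.
function bandmatrix(pdata; gap=2.0)
    ncols = maximum(row -> sum(length, row), pdata)
    z = fill(NaN, length(pdata), ncols)
    for (i, row) in enumerate(pdata)
        offs = 0
        for (k, series) in enumerate(row)
            n = length(series)
            n > 0 || continue
            z[i, offs+1:offs+n] = band(series, k; gap=gap)
            offs += n
        end
    end
    return z
end

dummydata = [(0.1:0.1:x, 0.2:0.2:x, 0.3:0.3:x) for x in 0:0.05:1]
z = bandmatrix(dummydata)
```

With the default `gap`, the first series stays in 0..1, the second lands in 2..3 and the third in 4..5, matching the ranges mentioned above.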

I also can’t find what trick `Plots.heatmap` uses to avoid the ‘too many pixels’ problem. Can one apply it to a matrix of `Color`s? I have tried using `aspect_ratio` set to `1` or `:equal` but it does not help (enough). Setting `size` does not help either.

I don’t have to use `Plots` for this btw, it’s just what I have tried so far.

Linking this related GR example, in case it is useful.


Thanks, but I seem to get the same type of problem as with `Plots`. For example:

``````
using Plots
import GRUtils

# Not even visible to me
GRUtils.imshow(rand(1_000_000, 10))

# A very thin picture:
GRUtils.imshow(rand(1000, 10))

# This stretches and blends/aliases the pixels somehow
Plots.heatmap(rand(1_000_000, 10))
``````

I guess the last example destroys data as my screen can’t show one million vertical pixels, but since I intend to plot sorted data that should not matter as one should be able to grasp the overall distribution.
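Since the data is sorted and some loss is acceptable, one option is to merge rows yourself before plotting. The sketch below averages consecutive blocks of rows down to roughly screen height. `blockmean` is a hypothetical helper, and this is not a claim about what `Plots.heatmap` does internally, only one simple way to get a similar effect.

```julia
using Statistics

# Hypothetical helper: reduce an m×n matrix to at most `nrows` rows by
# averaging consecutive blocks of rows, ignoring NaN padding.
function blockmean(z::AbstractMatrix{<:Real}, nrows::Integer)
    m, n = size(z)
    nrows >= m && return float.(z)
    out = fill(NaN, nrows, n)
    for r in 1:nrows
        lo = div((r - 1) * m, nrows) + 1   # first row of block r
        hi = div(r * m, nrows)             # last row of block r
        for c in 1:n
            vals = filter(!isnan, view(z, lo:hi, c))
            isempty(vals) || (out[r, c] = mean(vals))
        end
    end
    return out
end

small = blockmean(rand(1_000_000, 10), 1000)
```

The reduced matrix has 1000 rows, so it can be passed to an image or heatmap call without asking the backend to draw a million rows.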

Edit: What is missing from the last example is a way to control the colormap, e.g. so that `0:0.57` uses `cgrad(:reds)` and the rest uses `cgrad(:greens)`. I’m perfectly happy with pixels being merged/aliased (whatever the term is) for the `1e6` sized dimension and stretched out for the small sized dimension.
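For illustration, a split colormap like this can be written by hand as a function from value to color. The sketch below uses plain `(r, g, b)` tuples and simple white-to-red and white-to-green ramps as stand-ins for `cgrad(:reds)` and `cgrad(:greens)`. The names `lerp` and `splitcolor` are hypothetical, and values are assumed to lie in 0..1.

```julia
# Linear interpolation between two RGB triples, t in 0..1.
lerp(a, b, t) = a .+ (b .- a) .* t

# Hypothetical two-part colormap: values up to `split` run white → red,
# values above `split` run white → green. Colors are plain (r, g, b) tuples.
function splitcolor(v; split=0.57)
    white, red, green = (1.0, 1.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    isnan(v) && return white   # treat NaN padding as background
    v <= split ? lerp(white, red, v / split) :
                 lerp(white, green, (v - split) / (1 - split))
end

colors = splitcolor.(rand(1000, 10))
```

Each tuple could then be wrapped as `RGB(r, g, b)` from Colors.jl to get a matrix of colors, at which point the "too many pixels" problem from above applies again unless the rows are merged first.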

You may want to play with the 4th channel (RGBA alpha transparency) for blending?


This other post takes a different approach, uses a single color scheme, but might be of interest, just in case.


> You may want to play with the 4th channel (RGBA alpha transparency) for blending?

As in making separate calls to `heatmap!` for each colormap and line them up? I was thinking of this, but is there a way to do this with numerical input? I’m also a bit concerned with performance as things need to run N times instead of 1. It already takes almost a minute to draw/display the image.

I suppose a hacky way could be to use a sentinel value for a fully transparent `RGBA` which is manually concatenated with the desired colormap (not sure how to do this, but I assume it is possible somehow). Is there a way to use `NaN` for this sentinel value?

I was thinking about one single plot call. We would first map the millions of dots onto a big RGBA matrix, where the 3 data families would live in separate R, G, B channels and blend via alpha. But I might be wrong.
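One possible reading of this suggestion, as a rough sketch: each cell holds an `(r, g, b, a)` tuple, series `k` drives channel `k`, and alpha marks whether the cell holds data at all. `channelmatrix` is a hypothetical name, and the sketch assumes at most three series per row with values in 0..1. Converting the tuples to a real color type is left out.

```julia
# Hypothetical sketch: one (r, g, b, a) tuple per cell. Series k drives
# channel k; alpha is 1 where data exists and 0 for padding.
# Assumes at most three series per row and values in 0..1.
function channelmatrix(pdata)
    ncols = maximum(row -> sum(length, row), pdata)
    z = fill((0.0, 0.0, 0.0, 0.0), length(pdata), ncols)
    for (i, row) in enumerate(pdata)
        offs = 0
        for (k, series) in enumerate(row)
            for v in series
                offs += 1
                z[i, offs] = ntuple(c -> c == k ? Float64(v) : (c == 4 ? 1.0 : 0.0), 4)
            end
        end
    end
    return z
end

dummydata = [(0.1:0.1:x, 0.2:0.2:x, 0.3:0.3:x) for x in 0:0.05:1]
rgba = channelmatrix(dummydata)
```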

> map first the millions of dots into a big RGBA matrix

The problem with an `RGB(A)` matrix seems to be that pixels are not blended automatically when trying to display the image.

I have no idea what algorithm blends/stretches the pixels when `Plots.heatmap(rand(1000_000, 10))` is called but it seems to do a pretty good job in my case.

I tried the following naive implementation of the overlapping-figures approach, but it seems to be too slow (it has been running for about 5 minutes now with 200k rows of dummy data):

``````
function colorplot(pdata, cms)
    plt = heatmap()
    for (i, row) in enumerate(pdata)
        offs = 0
        for (cm, series) in zip(cms, row)
            nelems = length(series)
            nelems > 0 || continue
            # One heatmap! call per series per row
            heatmap!(plt, 1+offs:nelems+offs, i:i, permutedims(series)'; c=cm)
            offs += nelems
        end
    end
    return plt
end
``````

Ok, this less naive version seems to be much faster:

``````
function colorplot(pdata, cms)
    # One NaN-filled matrix per colormap; NaN cells are left blank by heatmap
    zs = ntuple(_ -> fill(NaN, length(pdata), maximum(r -> sum(length, r), pdata)), length(cms))
    for (i, row) in enumerate(pdata)
        offs = 0
        for (z, series) in zip(zs, row)
            nelems = length(series)
            nelems > 0 || continue
            z[i, 1+offs:nelems+offs] = series
            offs += nelems
        end
    end
    # Only one heatmap! call per colormap instead of one per series per row
    plt = plot()
    for (z, cm) in zip(zs, cms)
        heatmap!(plt, z; c=cm)
    end
    return plt
end
``````

I’ll use it for now unless someone has a better way.