Rebuilding Data from Heatmaps

I need to extract scalar values from heatmaps saved as images. I don’t have access to the raw data, only compressed JPGs with inlaid colorbars.

I would think that finding an efficient inverse of the colormap would be the ideal way to retrieve the actual values. What would be the most computationally efficient way to compute that inverse?

UPDATE: Added an example. The images are from a thermal camera.

The minimum and maximum values are shown in the picture.

Running the following code plots the RGB channels along the colorbar:

using Images, ImageShow, Plots
frame_flir = Images.load("FLIR/Misc Pics/FLIR1274.jpg")

unzip(a) = map(x->getfield.(a, x), fieldnames(eltype(a)))

plot(unzip(frame_flir[31:209, 309:312]), linecolor = frame_flir[31:209, 310], label = false, xlab = "R", ylab = "G", zlab = "B")

plot(unzip(frame_flir[31:209, 309:312])[1], linecolor = :red, label = ["red" false false false])
plot!(unzip(frame_flir[31:209, 309:312])[2], linecolor = :green, label = ["green" false false false])
plot!(unzip(frame_flir[31:209, 309:312])[3], linecolor = :blue, label = ["blue" false false false])

I think this is a bit noisier than desired because of JPG compression, especially at the points where the colorbar has black notches.

Do you have some sample images you could provide us with? I think that would really increase your chances of getting good feedback. Do you know how the images were generated? Was a common colormap used, or was it something custom? Also, do you know how much data loss there may have been in the compression process? Finally, I assume the plotting library would have normalized the values before mapping them to the colors, so I don’t think you’ll be able to get the original values back without access to the raw data, unless you know min/max values maybe?

Hey. I have added a sample image and a bit of code I used to analyze the colorbar present in that image. I think it’s within reason to infer the colormap using some smoothing interpolation, such as Loess.jl. My issue is more with the process of obtaining the inverse map, Temperature = f(R, G, B), which ideally should be fast.
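One way to make such an inverse map fast, sketched here under assumptions: once the colorbar has been reduced to a list of RGB triples, quantize each channel to a few bits and precompute the nearest colorbar index for every quantized color, so that each pixel lookup becomes a single array access. The names `nearest_index`, `build_lut`, `lookup` and `index_to_value`, the 5-bit quantization, and the toy colormap are all hypothetical choices for illustration and do not come from the thread. The sketch uses only Base Julia.

```julia
# Brute-force nearest colorbar entry for one RGB triple (components in 0..1).
function nearest_index(cmap::Vector{NTuple{3,Float64}}, r, g, b)
    best, best_d = 1, Inf
    for (i, (cr, cg, cb)) in enumerate(cmap)
        d = (r - cr)^2 + (g - cg)^2 + (b - cb)^2
        if d < best_d
            best, best_d = i, d
        end
    end
    return best
end

# Precompute the nearest index for every cell of a coarse RGB grid.
# `bits` per channel is an assumed trade-off between table size and accuracy.
function build_lut(cmap::Vector{NTuple{3,Float64}}; bits::Int = 5)
    n = 2^bits
    lut = Array{Int}(undef, n, n, n)
    for k in 1:n, j in 1:n, i in 1:n
        # center of the grid cell, mapped back to 0..1
        r, g, b = (i - 0.5) / n, (j - 0.5) / n, (k - 0.5) / n
        lut[i, j, k] = nearest_index(cmap, r, g, b)
    end
    return lut
end

# Map an 8-bit pixel (0..255 per channel) to a colorbar index via the table.
# `bits` must match the value used in `build_lut`.
function lookup(lut::Array{Int,3}, r8::Integer, g8::Integer, b8::Integer; bits::Int = 5)
    s = 8 - bits
    return lut[(r8 >> s) + 1, (g8 >> s) + 1, (b8 >> s) + 1]
end

# Linear scale from colorbar index to value; `vtop` and `vbottom` are the
# limits read off the image (index 1 is the top of the colorbar).
index_to_value(idx, n, vtop, vbottom) = vtop + (vbottom - vtop) * (idx - 1) / (n - 1)

# Toy colormap: black, red, yellow, white (illustrative only).
cmap = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
lut = build_lut(cmap)
```

The table costs 2^(3·bits) entries to build once, after which the per-pixel cost no longer depends on the number of colormap entries. The price is a quantization error of up to half a grid cell per channel, which may or may not matter next to the JPG noise.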

Hmmm, what about building a k-d tree or ball tree with the colormap values and then finding the nearest neighbor for each pixel in the image? Once you get something (anything) working, it won’t be difficult to get advice from the community on how to speed it up.

Here’s an attempt to just get something working with your image above:

using ColorSchemes
using Images
using NearestNeighbors
using Statistics  # for mean

flir = Images.load(raw"C:\Users\mthel\OneDrive\Pictures\FLIR.jpeg")
colorbar = flir[31:209, 309:312]
# average across the colorbar's width to get one color per row
colors = vec(mean(colorbar, dims=2))
colormap = ColorScheme(colors)
# 3×N matrix of Float64 RGB coordinates, one column per colorbar row
kdtree = KDTree(Float64[f(c) for f in (red, green, blue), c in colors])

function pixel_to_value(pixel, kdtree)
    query_point = Float64[red(pixel), green(pixel), blue(pixel)]
    idx, _ = knn(kdtree, query_point, 1)
    # the colorbar has 179 rows: row 1 is the top (30.0), row 179 the bottom (17.6)
    normalized_val = (idx[1] - 1) / 178
    return 30.0 + (17.6 - 30.0) * normalized_val
end

img_vals = map(pixel -> pixel_to_value(pixel, kdtree), flir)

# convert back the other direction as a sanity check

function value_to_pixel(value, colormap)
    # invert the linear scale used in pixel_to_value
    normalized_val = (value - 30.0) / (17.6 - 30.0)
    return get(colormap, normalized_val)
end

recovered_img = map(val -> value_to_pixel(val, colormap), img_vals)

Below is the original image, as well as the recovered image:

You can see just from a visual inspection that there are some problems, but it’s a start!
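One possible refinement, offered as an assumption rather than a diagnosis of the problems above: a nearest-neighbor lookup can only return one of the discrete colorbar rows, so a pixel whose color falls between two rows gets snapped to one of them. Projecting the pixel onto the line segments joining consecutive colormap entries yields a fractional index instead. The function name `fractional_index` and the toy colormap are hypothetical, and the sketch uses only Base Julia.

```julia
# Closest point on the polyline through the colormap entries.
# Returns a fractional index in [1, length(cmap)].
function fractional_index(cmap::Vector{NTuple{3,Float64}}, p::NTuple{3,Float64})
    best_t, best_d = 1.0, Inf
    for i in 1:length(cmap)-1
        a, b = cmap[i], cmap[i+1]
        ab = b .- a
        len2 = sum(abs2, ab)
        # projection parameter along the segment a -> b, clamped to [0, 1]
        s = len2 == 0 ? 0.0 : clamp(sum((p .- a) .* ab) / len2, 0.0, 1.0)
        # squared distance from p to the projected point
        d = sum(abs2, p .- (a .+ s .* ab))
        if d < best_d
            best_t, best_d = i + s, d
        end
    end
    return best_t
end

# Toy colormap: black, red, yellow (illustrative only).
toy_cmap = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
t = fractional_index(toy_cmap, (0.5, 0.0, 0.0))  # halfway between entries 1 and 2
```

This loops over every segment, so it is slower than a tree query. A plausible combination would be to use the tree to find the nearest entry and then project only onto its two adjacent segments.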


Given a heatmap as a color image of size (rows, cols) and a fixed number of colors, n_colors, we can perform color image quantization with K-means to extract the basic elements that define a heatmap: an array, z_data, of size (rows, cols), holding the normalized data of the heatmap, and a colorscheme with n_colors colors.

using Clustering, Images, ColorSchemes
using Plots

function image2zvals(img::Union{Matrix{RGB{T}}, Matrix{RGBA{T}}};
               n_colors=64, maxiter=200, tol=1e-04) where T<:Real
    rows, cols = size(img)
    # kmeans expects a (channels × pixels) matrix of real numbers
    observations = Float64.(reshape(channelview(img), :, rows*cols))
    kmres = kmeans(observations, n_colors; maxiter = maxiter, tol=tol)
    nclust = nclusters(kmres)
    a = assignments(kmres)/nclust
    z_data = reshape(a, rows, cols)
    hcolors =  kmres.centers # kmres.centers is the codebook
    if fieldcount(eltype(img)) == 3
        cscheme = ColorScheme([RGB(c...) for c in eachcol(hcolors)])
    else
        # RGBA input: drop the alpha channel
        cscheme = ColorScheme([RGB(c[1:3]...) for c in eachcol(hcolors)])
    end
    return z_data, cscheme
end

img = load("heatmap-forum.jpeg");
zdata, cscheme = image2zvals(img; n_colors=64,  maxiter=200, tol=1e-04);
#Reconstructed heatmap:
plt = heatmap(zdata[end:-1:1,:], c=cgrad(cscheme), cbar=false, size=(375, 300))

To avoid posting here one more image, see this notebook displaying the heatmap recovered from the computed normalized data and the colorscheme, as well as the values of z_data.

The current version of Clustering.jl is 0.15.7, and I worked with an older one: 0.14.4.