AoG plotting: column name :x not found in data frame

Hi,

I’m trying the AlgebraOfGraphics page demo for heatmap-style visualization (Visual · Algebra of Graphics) using my own local data.

I have no issues importing the external .tsv matrix file, but the process throws an error when it comes to drawing the actual plot. Here’s the code I’m trying to get to work.

using AlgebraOfGraphics, DataFrames, CairoMakie, CSV

set_aog_theme!()

heat_matrix = CSV.read("localdata.tsv", DataFrame)

plt = data(heat_matrix) * mapping(:x, :y) * AlgebraOfGraphics.density(npoints=50)

draw(plt * visual(colormap=:viridis))

The error message is:

julia> draw(plt * visual(colormap=:viridis))
ERROR: ArgumentError: column name :x not found in the data frame

And the input data is roughly in the form below, though the full matrix is about 876 by 876:

entry_1	entry_2	entry_3	entry_4	entry_5	entry_6	entry_7	entry_8	entry_9	entry_10	entry_11	entry_12	entry_13	entry_14	entry_15
100.0	99.389389	80.807358	81.091492	80.045761	80.2612	80.591591	80.326904	80.517456	80.73204	80.916733	81.039085	80.428047	81.411896	82.012657
99.389389	100.0	80.769554	81.563904	80.253723	81.239258	80.34507	80.411316	80.983093	80.776138	80.54747	80.830902	80.215294	81.910057	82.269379
80.807358	80.769554	100.0	98.784981	87.967583	88.039368	88.1138	88.004013	84.194122	84.372253	84.497612	85.310501	82.93158	82.449257	82.363342
81.091492	81.563904	98.784981	100.0	88.240082	88.462433	87.531174	87.579102	83.479767	83.909073	84.052917	84.921959	83.114807	82.751495	82.161125
80.045761	80.253723	87.967583	88.240082	100.0	98.189163	88.235207	89.16333	84.933044	84.678314	86.075043	86.235718	83.413078	82.43309	80.658096
80.2612	81.239258	88.039368	88.462433	98.189163	100.0	88.264221	88.903702	85.049088	85.365807	86.156403	86.378174	83.042229	82.283073	80.690521
80.591591	80.34507	88.1138	87.531174	88.235207	88.264221	100.0	89.984344	84.083481	83.85466	87.144073	85.681107	83.881477	81.86351	82.695709
80.326904	80.411316	88.004013	87.579102	89.16333	88.903702	89.984344	100.0	85.202805	84.645584	84.900604	85.778015	85.173973	81.289238	82.033951
80.517456	80.983093	84.194122	83.479767	84.933044	85.049088	84.083481	85.202805	100.0	99.035522	87.151443	88.334236	84.80954	86.120651	83.791351
80.73204	80.776138	84.372253	83.909073	84.678314	85.365807	83.85466	84.645584	99.035522	100.0	87.532318	88.439926	84.654488	85.649384	83.718521
80.916733	80.54747	84.497612	84.052917	86.075043	86.156403	87.144073	84.900604	87.151443	87.532318	100.0	92.740463	85.269989	82.425095	83.434967
81.039085	80.830902	85.310501	84.921959	86.235718	86.378174	85.681107	85.778015	88.334236	88.439926	92.740463	100.0	85.421165	82.606354	83.663094
80.428047	80.215294	82.93158	83.114807	83.413078	83.042229	83.881477	85.173973	84.80954	84.654488	85.269989	85.421165	100.0	84.637268	84.210716
81.411896	81.910057	82.449257	82.751495	82.43309	82.283073	81.86351	81.289238	86.120651	85.649384	82.425095	82.606354	84.637268	100.0	84.265274
82.012657	82.269379	82.363342	82.161125	80.658096	80.690521	82.695709	82.033951	83.791351	83.718521	83.434967	83.663094	84.210716	84.265274	100.0

The dataframe clearly shows the first line as the header for each column - am pretty stumped at the moment. What could I be doing wrong?

Thank you!

plt = data(heat_matrix) * mapping(:entry_1) * AlgebraOfGraphics.density(npoints=50)

Your column names are entry_j. There are, as the error message says, no columns named :x or :y.

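In the tutorial below, the mapping for density takes a single column (the two-column form draws a 2D density, but the columns still have to exist in your data). See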

https://tutorials.pumas.ai/html/PlottingInJulia/03-AoG-Stats.html#density


Thank you for responding - do you know if there’s a way to simply plot, say, an 800x800 matrix of numbers as an x-y heatmap in AoG, with the values mapped to a color range? I can’t seem to find an actual example to study…

I get the feeling the Makie-AoG idea of a heatmap in the page linked in the opening post isn’t meant to address what I’m trying to do, since it assumes two-column input, one column per axis. And the type of matrix file I’m trying to plot as a heatmap can have tens of thousands of entries along each axis.

Just to clarify with some real world examples - attaching something I threw together in about a minute in gnuplot using a similar dataset as in the post:

And here’s a Plots.jl output with GR backend, using the same data (not cropped) I posted here:

Any lead would be helpful - so far all I’ve been finding are witty one-liners without any explanations of intent or what’s being expected for the package, IMHO.

data = [
100.0	99.389389	80.807358	81.091492	80.045761	80.2612	80.591591	80.326904	80.517456	80.73204	80.916733	81.039085	80.428047	81.411896	82.012657
99.389389	100.0	80.769554	81.563904	80.253723	81.239258	80.34507	80.411316	80.983093	80.776138	80.54747	80.830902	80.215294	81.910057	82.269379
80.807358	80.769554	100.0	98.784981	87.967583	88.039368	88.1138	88.004013	84.194122	84.372253	84.497612	85.310501	82.93158	82.449257	82.363342
81.091492	81.563904	98.784981	100.0	88.240082	88.462433	87.531174	87.579102	83.479767	83.909073	84.052917	84.921959	83.114807	82.751495	82.161125
80.045761	80.253723	87.967583	88.240082	100.0	98.189163	88.235207	89.16333	84.933044	84.678314	86.075043	86.235718	83.413078	82.43309	80.658096
80.2612	81.239258	88.039368	88.462433	98.189163	100.0	88.264221	88.903702	85.049088	85.365807	86.156403	86.378174	83.042229	82.283073	80.690521
80.591591	80.34507	88.1138	87.531174	88.235207	88.264221	100.0	89.984344	84.083481	83.85466	87.144073	85.681107	83.881477	81.86351	82.695709
80.326904	80.411316	88.004013	87.579102	89.16333	88.903702	89.984344	100.0	85.202805	84.645584	84.900604	85.778015	85.173973	81.289238	82.033951
80.517456	80.983093	84.194122	83.479767	84.933044	85.049088	84.083481	85.202805	100.0	99.035522	87.151443	88.334236	84.80954	86.120651	83.791351
80.73204	80.776138	84.372253	83.909073	84.678314	85.365807	83.85466	84.645584	99.035522	100.0	87.532318	88.439926	84.654488	85.649384	83.718521
80.916733	80.54747	84.497612	84.052917	86.075043	86.156403	87.144073	84.900604	87.151443	87.532318	100.0	92.740463	85.269989	82.425095	83.434967
81.039085	80.830902	85.310501	84.921959	86.235718	86.378174	85.681107	85.778015	88.334236	88.439926	92.740463	100.0	85.421165	82.606354	83.663094
80.428047	80.215294	82.93158	83.114807	83.413078	83.042229	83.881477	85.173973	84.80954	84.654488	85.269989	85.421165	100.0	84.637268	84.210716
81.411896	81.910057	82.449257	82.751495	82.43309	82.283073	81.86351	81.289238	86.120651	85.649384	82.425095	82.606354	84.637268	100.0	84.265274
82.012657	82.269379	82.363342	82.161125	80.658096	80.690521	82.695709	82.033951	83.791351	83.718521	83.434967	83.663094	84.210716	84.265274	100.0
]
using CairoMakie
heatmap(data)
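
For a file rather than a pasted literal, here is a minimal sketch that reads the table straight into a Matrix with the DelimitedFiles standard library. It assumes the file is tab-separated with a single header row, as in the opening post. The inline `tsv` string is only a small stand-in for localdata.tsv, with shortened values for illustration.

```julia
using DelimitedFiles  # standard library

# Small stand-in for localdata.tsv: tab-separated, header row first.
tsv = "entry_1\tentry_2\tentry_3\n100.0\t99.4\t80.8\n99.4\t100.0\t80.7\n80.8\t80.7\t100.0\n"

# header=true returns a (values, header) tuple; Float64 forces a numeric matrix.
# For the real file, pass the path instead of the IOBuffer:
#   mat, header = readdlm("localdata.tsv", '\t', Float64; header=true)
mat, header = readdlm(IOBuffer(tsv), '\t', Float64; header=true)

println(size(mat))
println(vec(header))

# With CairoMakie loaded (not run here), the matrix goes straight to Makie:
#   heatmap(mat; axis = (; aspect = DataAspect()))
```

If the data is already in a DataFrame, `Matrix(heat_matrix)` should give the same kind of matrix without re-reading the file.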


I think you’re correct that Heatmap in AoG is more for creating 2D histograms/density plots. I would do what @jar1 suggests, but if you really want to use AoG you could adapt the geometry example:

using AlgebraOfGraphics, CairoMakie
using GeometryBasics

entries = [100.0	99.389389	80.807358	81.091492	80.045761	80.2612	80.591591	80.326904	80.517456	80.73204	80.916733	81.039085	80.428047	81.411896	82.012657
99.389389	100.0	80.769554	81.563904	80.253723	81.239258	80.34507	80.411316	80.983093	80.776138	80.54747	80.830902	80.215294	81.910057	82.269379
80.807358	80.769554	100.0	98.784981	87.967583	88.039368	88.1138	88.004013	84.194122	84.372253	84.497612	85.310501	82.93158	82.449257	82.363342
81.091492	81.563904	98.784981	100.0	88.240082	88.462433	87.531174	87.579102	83.479767	83.909073	84.052917	84.921959	83.114807	82.751495	82.161125
80.045761	80.253723	87.967583	88.240082	100.0	98.189163	88.235207	89.16333	84.933044	84.678314	86.075043	86.235718	83.413078	82.43309	80.658096
80.2612	81.239258	88.039368	88.462433	98.189163	100.0	88.264221	88.903702	85.049088	85.365807	86.156403	86.378174	83.042229	82.283073	80.690521
80.591591	80.34507	88.1138	87.531174	88.235207	88.264221	100.0	89.984344	84.083481	83.85466	87.144073	85.681107	83.881477	81.86351	82.695709
80.326904	80.411316	88.004013	87.579102	89.16333	88.903702	89.984344	100.0	85.202805	84.645584	84.900604	85.778015	85.173973	81.289238	82.033951
80.517456	80.983093	84.194122	83.479767	84.933044	85.049088	84.083481	85.202805	100.0	99.035522	87.151443	88.334236	84.80954	86.120651	83.791351
80.73204	80.776138	84.372253	83.909073	84.678314	85.365807	83.85466	84.645584	99.035522	100.0	87.532318	88.439926	84.654488	85.649384	83.718521
80.916733	80.54747	84.497612	84.052917	86.075043	86.156403	87.144073	84.900604	87.151443	87.532318	100.0	92.740463	85.269989	82.425095	83.434967
81.039085	80.830902	85.310501	84.921959	86.235718	86.378174	85.681107	85.778015	88.334236	88.439926	92.740463	100.0	85.421165	82.606354	83.663094
80.428047	80.215294	82.93158	83.114807	83.413078	83.042229	83.881477	85.173973	84.80954	84.654488	85.269989	85.421165	100.0	84.637268	84.210716
81.411896	81.910057	82.449257	82.751495	82.43309	82.283073	81.86351	81.289238	86.120651	85.649384	82.425095	82.606354	84.637268	100.0	84.265274
82.012657	82.269379	82.363342	82.161125	80.658096	80.690521	82.695709	82.033951	83.791351	83.718521	83.434967	83.663094	84.210716	84.265274	100.0]


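# Build one unit square per matrix cell. The outer loop runs over columns so the
# squares come out in the same column-major order as vec(entries).
nrows, ncols = size(entries)
geometry = [Rect(Vec(col, row), Vec(1, 1)) for col in 1:ncols for row in 1:nrows]
group = vec(entries)
df = (; geometry, group)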

plt = data(df) * visual(Poly) * mapping(:geometry, color = :group)
fg = draw(plt; axis=(aspect=1,))

Which produces:


AoG can do heatmaps; it just expects long format, so you cannot pass a matrix-like dataframe but need to transform it into one that has an x, y and value column. In your case that could look like this (I’m not using AoG here but this shows off the heatmap method of Makie that you would use):

using DataFrames
using CSV
using CairoMakie
using Chain

data = """
entry_1	entry_2	entry_3	entry_4	entry_5	entry_6	entry_7	entry_8	entry_9	entry_10	entry_11	entry_12	entry_13	entry_14	entry_15
100.0	99.389389	80.807358	81.091492	80.045761	80.2612	80.591591	80.326904	80.517456	80.73204	80.916733	81.039085	80.428047	81.411896	82.012657
99.389389	100.0	80.769554	81.563904	80.253723	81.239258	80.34507	80.411316	80.983093	80.776138	80.54747	80.830902	80.215294	81.910057	82.269379
80.807358	80.769554	100.0	98.784981	87.967583	88.039368	88.1138	88.004013	84.194122	84.372253	84.497612	85.310501	82.93158	82.449257	82.363342
81.091492	81.563904	98.784981	100.0	88.240082	88.462433	87.531174	87.579102	83.479767	83.909073	84.052917	84.921959	83.114807	82.751495	82.161125
80.045761	80.253723	87.967583	88.240082	100.0	98.189163	88.235207	89.16333	84.933044	84.678314	86.075043	86.235718	83.413078	82.43309	80.658096
80.2612	81.239258	88.039368	88.462433	98.189163	100.0	88.264221	88.903702	85.049088	85.365807	86.156403	86.378174	83.042229	82.283073	80.690521
80.591591	80.34507	88.1138	87.531174	88.235207	88.264221	100.0	89.984344	84.083481	83.85466	87.144073	85.681107	83.881477	81.86351	82.695709
80.326904	80.411316	88.004013	87.579102	89.16333	88.903702	89.984344	100.0	85.202805	84.645584	84.900604	85.778015	85.173973	81.289238	82.033951
80.517456	80.983093	84.194122	83.479767	84.933044	85.049088	84.083481	85.202805	100.0	99.035522	87.151443	88.334236	84.80954	86.120651	83.791351
80.73204	80.776138	84.372253	83.909073	84.678314	85.365807	83.85466	84.645584	99.035522	100.0	87.532318	88.439926	84.654488	85.649384	83.718521
80.916733	80.54747	84.497612	84.052917	86.075043	86.156403	87.144073	84.900604	87.151443	87.532318	100.0	92.740463	85.269989	82.425095	83.434967
81.039085	80.830902	85.310501	84.921959	86.235718	86.378174	85.681107	85.778015	88.334236	88.439926	92.740463	100.0	85.421165	82.606354	83.663094
80.428047	80.215294	82.93158	83.114807	83.413078	83.042229	83.881477	85.173973	84.80954	84.654488	85.269989	85.421165	100.0	84.637268	84.210716
81.411896	81.910057	82.449257	82.751495	82.43309	82.283073	81.86351	81.289238	86.120651	85.649384	82.425095	82.606354	84.637268	100.0	84.265274
82.012657	82.269379	82.363342	82.161125	80.658096	80.690521	82.695709	82.033951	83.791351	83.718521	83.434967	83.663094	84.210716	84.265274	100.0
"""

@chain begin
    data
    IOBuffer
    CSV.read(DataFrame; delim = '\t')
    transform(eachindex => :x)
    stack(Not(:x))
    transform(:variable => ByRow(x -> parse(Int, split(x, "_")[2])) => :y)
    @aside display(first(_, 10))
    heatmap(_.x, _.y, _.value; axis = (; aspect = DataAspect()))
end

This prints out


10×4 DataFrame
 Row │ x      variable  value     y     
     │ Int64  String    Float64   Int64 
─────┼──────────────────────────────────
   1 │     1  entry_1   100.0         1
   2 │     2  entry_1    99.3894      1
   3 │     3  entry_1    80.8074      1
   4 │     4  entry_1    81.0915      1
   5 │     5  entry_1    80.0458      1
   6 │     6  entry_1    80.2612      1
   7 │     7  entry_1    80.5916      1
   8 │     8  entry_1    80.3269      1
   9 │     9  entry_1    80.5175      1
  10 │    10  entry_1    80.732       1

and the plot looks like
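
If the data is already a plain Matrix, the same long-format table can be sketched without DataFrames. This is only an illustration: the 3×3 `mat` is a stand-in with shortened values, and `long` is a hypothetical name. The AoG call in the trailing comment is an untested assumption that three positional mappings combined with `visual(Heatmap)` behave like Makie's `heatmap(x, y, value)`.

```julia
# Small symmetric stand-in for the full 876×876 matrix.
mat = [100.0  99.4  80.8;
        99.4 100.0  80.7;
        80.8  80.7 100.0]

# One entry per cell. CartesianIndices iterates in column-major order,
# which is the same order vec(mat) uses, so the three vectors line up.
idx = vec(CartesianIndices(mat))
long = (
    x = [I[1] for I in idx],   # row index, as in the stacked table above
    y = [I[2] for I in idx],   # column index
    value = vec(mat),
)

println(length(long.value))

# With AlgebraOfGraphics and CairoMakie loaded (not run here):
#   plt = data(long) * mapping(:x, :y, :value) * visual(Heatmap)
#   draw(plt; axis = (aspect = 1,))
```

A NamedTuple of equal-length vectors is used here because the geometry example earlier in the thread passes the same kind of table to `data`.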
