How to color points in scatter plot by value?

e3c6 · June 17, 2018, 11:03pm

I have a list of 3-dimensional points. I want to plot them in a plane scatter plot (using the first two-dimensions), and then color each point according to the third value.

How can I do this?

Mattriks · June 18, 2018, 12:38am

Gadfly examples here:
http://gadflyjl.org/latest/gallery/scales.html

with pre-defined color schemes:
http://juliagraphics.github.io/ColorSchemes.jl/stable/plotting.html#Gadfly-1

davidanthoff · June 18, 2018, 1:09am

With VegaLite.jl:

using VegaLite, DataFrames

df = DataFrame(x=randn(100), y=randn(100), z=randn(100))

df |> @vlplot(:point, x=:x, y=:y, color=:z)

[Figure: example1]

You can customize the color scale with any of the pre-defined color schemes:

df |> @vlplot(:point, x=:x, y=:y, color={:z, scale={scheme=:plasma}})

[Figure: example2]

You can also go entirely custom by specifying a custom piecewise scale:

df |>
@vlplot(
  :point,
  x=:x,
  y=:y,
  color={
    :z,
    scale={
      domain=[-3, -1, 1, 3], 
      range=[:red, :blue, :green, :yellow]
    }
  }
)

[Figure: example3]

And yes, I am aware that my custom color scheme example is probably a strong argument for going with the pre-defined schemes.

kristoffer.carlsson · June 18, 2018, 9:42am

PGFPlotsX.jl example:

using PGFPlotsX

@pgf Plot(
    {only_marks, scatter, scatter_src = "explicit"},
    Table(
        {x = "x", y = "y", meta = "col"}, 
         x = randn(100), y = randn(100), col = randn(100)
    )
)

juliauser · June 16, 2020, 11:55pm

How can I set xlim and ylim with VegaLite? Without defining those, the plot is totally off.

davidanthoff · June 17, 2020, 3:29am

Lian_Yunlong · June 18, 2020, 1:25am

Came across this question. Just an update:

using Plots
scatter([1,2,3],[4,5,6],color=["red","blue","black"],legend=nothing)
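
If the colors should come from the third coordinate rather than being typed by hand, the per-point color vector can be built first and then passed to `scatter` as above. A minimal sketch in plain Julia; the helper name `bin_colors`, the thresholds and the color names are made up for illustration:

```julia
# Hypothetical helper (not part of Plots.jl): pick a color name for each value
# by comparing it against ascending thresholds.
# `colors` needs exactly one more entry than `edges`.
function bin_colors(z, edges, colors)
    length(colors) == length(edges) + 1 ||
        throw(ArgumentError("need length(edges) + 1 colors"))
    # searchsortedlast returns how many edges are <= v, i.e. the bin index minus 1
    return [colors[searchsortedlast(edges, v) + 1] for v in z]
end

z = [-2.0, -0.3, 0.4, 1.7]
point_colors = bin_colors(z, [-1.0, 1.0], ["red", "blue", "black"])
```

The result could then replace the hand-written list, e.g. `scatter(x, y, color=point_colors, legend=nothing)`, with `x` and `y` of the same length as `z`.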

VivMendes · July 18, 2020, 9:23pm

Congrats on the beautiful wrapper to the PGFPlots that you have been developing. For high-quality plotting in the Julia environment, it seems second to none.

I have a similar problem to that posted by @e3c6, but I want the third dimension to be passed into SIZE rather than into COLOR in the plane scatter plot. The problem is how to size points in a scatter plot by the value of a third variable. Indeed, if it makes the solution easier, we can have color and size, but the size is crucial.

I am not an expert on LaTeX but have adapted your wrapper to some specific needs of mine. For example, I have been able to plot with two y-axes (and the quality of the output is superb because we can use all the available tricks from LaTeX), create vertical shapes (like the vspan function in Plots.jl), or any sort of shape (like the Shape function in Plots.jl). But unfortunately, I have not been successful in mapping a third variable to marker SIZE in a scatter plot.

I checked the entire PGFPlots documentation (latest version, Version 1.17 – 2020/02/29), in particular, section “4.5.12 Scatter Plots” and section “4.17.2 Changing the Appearance of Individual Coordinates”, and I also checked the internet forums like here, here and here.

I tried with both raw LaTeX code and code adapted to the PGFPlotsX wrapper, without success.

Some help would be appreciated; my clumsy piece of code follows below. The errors that come out depend on whether raw code is used or not. This may be easy, but I could not figure it out. Thanks.

using PGFPlotsX
using LaTeXStrings

push!(PGFPlotsX.CUSTOM_PREAMBLE, 
        raw"\pgfplotsset{tick label style = {font = {\fontsize{12 pt}{12 pt}\selectfont}},
                         label style = {font = {\fontsize{12 pt}{12 pt}\selectfont}},
                         legend style = {font = {\fontsize{12 pt}{12 pt}\selectfont}},
                         title style = {font = {\fontsize{12 pt}{12 pt}\selectfont}},
                        }"
        )

@pgf a = Axis({
              height = "13cm",
              width = "15cm",
              colorbar,
              "colormap/jet",
              #grid = "major",
              xlabel = L"x",
              ylabel = L"y",
              title = "A Scatter Plot: size as a marker to a third dimension", 
               },
    
    
Plot(
    {only_marks, scatter, scatter_src = "explicit"},
         raw"\visualization depends on = {5*z \as \perpointmarksize},
         scatter/@pre marker code/.append style={/tikz/mark size=\perpointmarksize}",

Table(
        {x = "x", y = "y", meta = "z"}, 
         x = randn(50), y = randn(50), z = randn(50),
        ))
)


#raw"\visualization depends on = {5*z \as \perpointmarksize},
#      scatter/@pre marker code/.append style={/tikz/mark size=\perpointmarksize}",

#visualization_depends_on = "{5*z \as \perpointmarksize},
#        scatter/@pre marker code/.append style={/tikz/mark size=\perpointmarksize}",

VivMendes · July 19, 2020, 5:13pm

Actually, on page 407, section 4.25 of the documentation of PGFPlots (Version 1.17 – 2020/02/29), there is detailed information about the introduction of the size of a third variable into a 2D scatter plot. It comes with the title:

/pgfplots/visualization depends on=〈expression〉\as〈\macro〉 (initially empty)
/pgfplots/visualization depends on=value 〈expression〉\as〈\macro〉 (initially empty)

Although the detailed explanation is centered on a 3D case (4D with the size of the third variable), the example provided is a univariate one (see figure below). I can easily replicate the figure in pure LaTeX, but I could not reproduce this simple univariate case inside PGFPlotsX. So far, it is the only case in which I have failed to apply PGFPlots functionality within the PGFPlotsX package. Do I need to pass some commands to the preamble? Thanks a lot.

feanor12 · July 19, 2020, 5:24pm

In Plots.jl there is the option of using marker_z.

using Plots
x= rand(10)
y= rand(10)
z= rand(10)
scatter(x,y,marker_z=z)

VivMendes · July 19, 2020, 6:50pm

Thanks, but this is not what I need. I can do this both in Plots and in PGFPlotsX. Your suggestion solves the problem initially posted by @e3c6: a 2D scatter plot (x,y), with the third variable/array conveyed by coloring the points according to its values. So you are adding one further dimension to the first two.

I need to pass this new dimension in terms of the size of the points, not their color. That may look trivial, but if I want to use the size of different countries by population, or their GDP levels, it makes a huge difference because some will appear as tiny points. As we can see in the figure above, in the PGFPlots documentation, each point’s size is easily visible.

Thanks a lot, anyway.

VivMendes · July 19, 2020, 7:59pm

The problem I am raising must be easy to solve, and in some fields the output I’m looking for is frequently used. For example, with Plotly (Python), the piece of code that does a similar job is as simple as this:

import plotly.graph_objects as go
import numpy as np
np.random.seed(1)

N = 100
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.rand(N)
sz = np.random.rand(N) * 30

fig = go.Figure()
fig.add_trace(go.Scatter(
    x=x,
    y=y,
    mode="markers",
    marker=go.scatter.Marker(
        size=sz,
        color=colors,
        opacity=0.6,
        colorscale="Viridis"
    )
))

The output looks like this:

The crucial part of the code looks very similar to the code in PGFPlotsX.

feanor12 · July 19, 2020, 8:24pm

You can also pass the size of the markers via the markersize argument.
Other options like markershape, markerstrokewidth, … are listed here: Series Attributes · Plots

VivMendes · July 19, 2020, 9:56pm

Thanks a lot for your help. In Plots, it is as easy as this:


using Plots
x= randn(200)
y= randn(200)
z= randn(200)
scatter(x,y, marker_z = z, markersize = 5*z,  color = :jet)
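
One caveat with `markersize = 5*z`: `randn` returns negative values for roughly half of the points, and a negative number has no obvious meaning as a marker size. A minimal sketch of rescaling the third variable into a positive range first; `rescale_sizes` and the 2 to 12 range are illustrative choices, not part of Plots.jl:

```julia
# Hypothetical helper (not part of Plots.jl): map values linearly onto
# [lo, hi] so that every marker size is positive.
# Constant input falls back to the midpoint of the range.
function rescale_sizes(z; lo = 2.0, hi = 12.0)
    zmin, zmax = extrema(z)
    zmin == zmax && return fill((lo + hi) / 2, length(z))
    return lo .+ (hi - lo) .* (z .- zmin) ./ (zmax - zmin)
end

z = [-1.5, 0.0, 0.5, 2.5]
sizes = rescale_sizes(z)
```

With that in place, `scatter(x, y, marker_z = z, markersize = rescale_sizes(z), color = :jet)` would keep the call above unchanged apart from the size argument.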

Tamas_Papp · July 20, 2020, 6:03am

@pgf Plot({ scatter, scatter_src = "y", samples = 40,
            visualization_depends_on = raw"{5*cos(deg(x)) \as \perpointmarksize}",
            "scatter/@pre marker code/.append style" = raw"{/tikz/mark size=\perpointmarksize}" },
          Expression("sin(deg(x))"))

works fine for me.

VivMendes · July 20, 2020, 10:49am

@Tamas_Papp thanks a lot for your help. I tried many ways, but not the right way: inserting the raw command in the right places. Your code works fine for me as well for this univariate case.

But when I tried to apply it to this particular case, following @kristoffer.carlsson's code above, I got an error. The code for 3D works fine for me as well; the only problem is that I still do not know how to put the size of a third variable into a scatter plot. I can do it with Plots, but it would be good to see how this works in PGFPlotsX as well. It is used a lot in many fields. It is not by mere chance that this particular type of plot is the first one we see when we visit the Plotly (Python) website.

My code, following the inputs from @kristoffer.carlsson code and your contribution, looks like this:

using PGFPlotsX
using LaTeXStrings

push!(PGFPlotsX.CUSTOM_PREAMBLE, 
        raw"\pgfplotsset{tick label style = {font = {\fontsize{12 pt}{12 pt}\selectfont}},
                         label style = {font = {\fontsize{12 pt}{12 pt}\selectfont}},
                         legend style = {font = {\fontsize{12 pt}{12 pt}\selectfont}},
                         title style = {font = {\fontsize{12 pt}{12 pt}\selectfont}},
                        }"
        )

@pgf a = Axis({
              height = "13cm",
              width = "15cm",
              colorbar,
              "colormap/jet",
              #grid = "major",
              xlabel = L"x",
              ylabel = L"y",
              title = "A Scatter Plot: size as a marker to a third dimension", 
               },
    
    
Plot(
    { only_marks, scatter, scatter_src = "explicit",
            visualization_depends_on = raw"{8*z \as \perpointmarksize}",
            "scatter/@pre marker code/.append style" = raw"{/tikz/mark size=\perpointmarksize}" 
    },
           
Table(
        {x = "x", y = "y", meta = "z"}, 
         x = randn(50), y = randn(50), z = randn(50)
    ))
)

What is my mistake? Thanks.

Tamas_Papp · July 20, 2020, 11:44am

I am not familiar enough with visualization depends on, but if you have an idea of what you want the emitted LaTeX code to look like, that would make it easier to help. I usually search StackOverflow for a LaTeX solution and then make PGFPlotsX emit it.

joa-quim · July 20, 2020, 2:11pm

Equally (or more) simple, with GMT.jl
https://www.generic-mapping-tools.org/GMT.jl/latest/gallery/scripts_agu/scatter_cart/

VivMendes · July 20, 2020, 4:06pm

Of the three hyperlinks above, the first and the third deal directly with my concerns. The only difference is that they base their plots on a built-in table data set (inside their LaTeX code), while we have data that is generated outside the LaTeX code.
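
For data generated in Julia, one way this might carry over is to put the size into its own table column and read it with `\thisrow`, as table-based LaTeX examples typically do. An untested sketch, assuming that `\thisrow{sz}` is accepted inside `visualization depends on` for table input; the column name `sz` and the scaling are arbitrary choices for illustration, and the option syntax follows @Tamas_Papp's snippet above:

```julia
using PGFPlotsX

x, y, z = randn(50), randn(50), randn(50)

# Illustrative scaling only: keep every marker size positive (at least 1).
sz = 1 .+ 2 .* abs.(z)

p = @pgf Plot(
    {
        only_marks, scatter, scatter_src = "explicit",
        # Assumption: read the size from the table column "sz" via \thisrow
        visualization_depends_on = raw"{\thisrow{sz} \as \perpointmarksize}",
        "scatter/@pre marker code/.append style" = raw"{/tikz/mark size=\perpointmarksize}",
    },
    # "z" still drives the color through meta; "sz" is an extra column for the size
    Table({x = "x", y = "y", meta = "z"}, x = x, y = y, z = z, sz = sz),
)
```

Wrapping `p` in an `Axis` as in the earlier posts should then render it, provided the assumption about `\thisrow` holds.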

VivMendes · July 20, 2020, 6:50pm

Interestingly, the piece of code that @kristoffer.carlsson provided above can easily render a 3D scatter plot that displays 4 dimensions by using color.

The code is just a repetition of @kristoffer.carlsson's code, with one more variable passed into the Table function:

using PGFPlotsX
using LaTeXStrings

push!(PGFPlotsX.CUSTOM_PREAMBLE, 
        raw"\pgfplotsset{tick label style = {font = {\fontsize{12 pt}{12 pt}\selectfont}},
                         label style = {font = {\fontsize{12 pt}{12 pt}\selectfont}},
                         legend style = {font = {\fontsize{12 pt}{12 pt}\selectfont}},
                         title style = {font = {\fontsize{12 pt}{12 pt}\selectfont}},
                        }"
      )

@pgf a = Axis({
              height = "13cm",
              width = "15cm",
              colorbar,
              "colormap/jet",
              grid = "major",
              xlabel = L"MB",
              ylabel = L"CPI",
              zlabel= L"RIT",
              ztick_distance ="4", # set the distance between each tick
              title = "A Scatter Plot: color as a marker to a fourth dimension", 
               },
Plot3(
    {only_marks, scatter, scatter_src = "explicit"},
    Table(
        {x = "x", y = "y", z= "z", meta = "y"}, 
         x = MB, y = CPI, z = RIT, 
    ))
)
#pgfsave("Scatter_4D.pdf", a)

If you manage to add “size” as another dimension, a simple 3D scatter plot can easily display 5 dimensions of a given problem. Moreover, the constant meta can be costlessly switched across different variables.
