Plot DataFrames columns in plotlyjs

I’m trying to plot a DataFrame in plotlyjs(). I’m able to plot one column but can’t plot more than one (I have around 20 that I have to plot on the same figure).

The first column of the df is the data for the x axis, all the other columns should be the different traces.

Something like this:

using DataFrames
data = DataFrame(time = collect(1.:100), T1 = rand(100), T2 = rand(100), T3 = rand(100))

So I’m able to plot the first column by doing:

using PlotlyJS
Plot(data, x=:time, y=:T1)

but I can’t plot the other columns and I’m not sure how to do it… I tried:

Plot(data,x=:time, y=[:T1 :T2 :T3])
but it doesn’t work.

I’m also having problems with the legend labels. I want each label to be the name of the column in the DataFrame (so in this example the legend should show T1, T2, T3, etc.), but it only shows the default “trace 0”.

Thanks for the help!

@espin

You should define an empty vector of AbstractTrace, data, and successively add one trace for each column:

using PlotlyJS
using DataFrames
include("plotlyju.jl")#style file: https://gist.github.com/empet/6b4f441ed96d724d67ff77f1fb8b0a79
df = DataFrame(time = collect(1.:100), T1 = rand(100), T2 = rand(100), T3 = rand(100))
data = AbstractTrace[]

for col in names(df)[2:end]
    push!(data, scatter(x=:time, y=df[col], mode="lines", name=col))
end    

layout = Layout(title="My title",  
                width=800,
                height=500)
plot(data, layout, style=plotlyju)

Thanks!

I only needed to change your y = df[col] to y = df[!,col]. I’m not sure why… maybe a Julia version difference?

Hi @empet, @espin. I have a similar problem, but with a small variation. I want to plot subsets of columns, instead of the entire set of columns. How can I do it without slicing the entire DataFrame?

Thanks.

DataFrames has deprecated df[col] in favor of df[:, col] or df[!, col]. Take a look here.
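
To illustrate the two replacements for df[col], here is a small sketch (the toy data frame is made up for illustration). As far as I know, df[!, col] gives you the stored column itself, while df[:, col] gives you a copy of it:

```julia
using DataFrames

# Toy data frame, only for illustration
df = DataFrame(time = collect(1.0:3), T1 = [0.1, 0.2, 0.3])

a = df[!, :T1]   # the column vector stored in df (no copy)
b = df[:, :T1]   # a new vector holding a copy of the column

same_object = a === df.T1    # a is the very same vector as df.T1
is_copy     = b !== df.T1    # b is a different object ...
same_values = b == df.T1     # ... with equal contents
```

For plotting either form works, since the trace only reads the values.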

You can do df.y[df.y .> 3] or similar to plot individual columns as vectors. To use the syntax with Symbols you will need to subset the data frame.

@pdeffebach Thanks for your help. Given my limited knowledge of DataFrames I am not capable of adapting your suggestion to @empet 's solution. This solution is extremely useful to plot an entire set of columns of a data frame using PlotlyJS.

To plot a particular column in PlotlyJS is quite simple. For example, if I want to plot column 24 of my Data2007 DataFrame, I just have to type:

plot(Period3, Data2007[: , 24])

My point is: what should I do if I want to plot, e.g., columns 3, 24, 58? I know how to use individual traces, and plot all of them together. @empet 's solution is very useful because it saves a lot of time. How can I integrate your solution into an approach like @empet proposes? Thanks.

I do not have enough knowledge of PlotlyJS to help you, unfortunately. It looks like the example here might help.

Thanks anyway. Your link is about what I was mentioning: plotting trace by trace. This is not very efficient if we want to have 15 traces in one plot, 10 in another, and so on.

@empet puts forward a very good solution. Probably, I will have to produce individual slices of the data frame and then apply his solution to each individual slice.

To plot only data from a subset of columns, you should retrieve the column names from their indices (see the code between the ################# lines):

using PlotlyJS
using DataFrames

df = DataFrame(time = collect(1.:100), T1 = rand(100), T2 = rand(100), T3 = rand(100))
data = AbstractTrace[]
#################
col_idx =[2, 4]
some_colnames = []
for k in col_idx
    push!(some_colnames, names(df)[k])
end
#################
for col in some_colnames     
    push!(data, scatter(x=:time, y=df[!,col], mode="lines", name=col))
end    

layout = Layout(title="My title",  
                width=800,
                height=500)
pl=plot(data, layout)
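
The same idea can be written more compactly. This is only a sketch, assuming a recent DataFrames version in which names(df) returns a vector of strings; it indexes the names directly with col_idx and passes the actual time values as x = df.time rather than a Symbol:

```julia
using PlotlyJS
using DataFrames

df = DataFrame(time = collect(1.0:100), T1 = rand(100), T2 = rand(100), T3 = rand(100))

col_idx = [2, 4]                      # positions of the columns to plot
some_colnames = names(df)[col_idx]    # column names at those positions

# One line trace per selected column, named after the column
traces = [scatter(x = df.time, y = df[!, col], mode = "lines", name = col)
          for col in some_colnames]

layout = Layout(title = "My title", width = 800, height = 500)
pl = plot(traces, layout)
```

To plot another subset, only col_idx has to change.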

@empet What a nice piece of code!!! It saves a huge amount of time if we use time series and large data sets with PlotlyJS. Congrats.

Hi @sglyon, why not make these two examples by @empet available in the documentation of PlotlyJS? Apparently, there is no information in the docs about plotting from a large data frame and having the series name be filled in with column names. There is an entry by you in GITTER in 2018 but it is quite outdated, and it was not very specific about what should be done in this context.

Hi @empet. A small add-on to your code. If you already have a vector holding the dates for the x-axis and you leave the : in x=:time, the x-axis will show 0, 1, 2, …. If instead you remove the :, so that x refers to the vector itself, the dates will appear as they were defined in the vector, e.g., 2007, 2008, …

Hi. Sorry for such a late entry into this issue. I have been trying to adapt @empet’s solution to the case of a matrix, instead of a data frame. Not surprisingly, I failed.

My problem is simple. I have a loop that produces a 400×7 Matrix{Float64}, and I want to come up with a scatter plot of those 7 columns against another 400×7 Matrix{Float64}. As one can see in the figure below, I can easily do it.

The problem I am facing is that I need to name the traces according to the entries of another Float64 array: y_grid = [0.25 , 0.4 , 0.6 , 0.9 , 1.3 , 2.0 , 3.0]. I can do it trace by trace, but this is ugly and terribly inefficient.

I know how to do it in Plots, but I am migrating all my teaching stuff into PlotlyJS. I am also using Pluto and have no compatibility problems between the former and PlotlyJS due to the workaround developed by @disberd here.

Help will be very much appreciated. Thanks.

Hi @VivMendes, couldn’t you use restyle!() as per previous post?

Hi, @rafael.guerra. Unfortunately restyle!() does not work inside Pluto. It works in VScode, but the reactivity of Pluto precludes its use. Thanks.

Hi @VivMendes,

I am not entirely sure I understood what you want to achieve, but here is a simple way of achieving what I think your intended result is:

Plotting a matrix directly is in any case a shorter way to create 7 independent scatter traces.
Using map you can easily create those scatter traces, assigning each a custom name (or any other attribute, for that matter).
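
A minimal sketch of the map idea (not the original code from this post; X and V are random placeholders standing in for the two 400×7 matrices, and the "y = …" label format is just an assumption):

```julia
using PlotlyJS

y_grid = [0.25, 0.4, 0.6, 0.9, 1.3, 2.0, 3.0]

# Placeholder data: replace with the matrices produced by your loop
X = repeat(collect(range(0.0, 10.0, length = 400)), 1, 7)   # 400×7 x values
V = rand(400, 7)                                            # 400×7 y values

# Build one scatter trace per column and name it after the matching y_grid entry
traces = map(eachindex(y_grid)) do j
    scatter(x = X[:, j], y = V[:, j], mode = "lines", name = "y = $(y_grid[j])")
end

p = plot(traces, Layout(title = "One trace per column"))
```

Any other per-trace attribute (line color, dash style, …) can be set inside the same do block.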

If what you wanted to achieve is different, please let me know.

Edit: the claim that restyle!() does not work inside Pluto is not entirely correct. You can’t restyle the output of one cell from another cell, but you can restyle a plot object before sending it as output to a cell (like inside a let block as I did above; I just think restyle is not the best approach if you only want to give custom names to traces).

@VivMendes
I don’t understand how you decide the trace name according to the values in y_grid. How are the vector elements related to the matrix columns when deciding the name to be assigned? The most natural answer, given your details above, is the one given by @disberd, but if you meant another relationship between y_grid and the names, please be more precise.

@disberd, @empet, thanks for helping. @disberd’s suggestion is exactly what I was looking for. This is from a dynamic programming problem using the value iteration function. The different lines are associated with the array y_grid.

Thanks also for the tip about restyle!. @rafael.guerra pointed it out to me some time ago, and it is extremely useful. I tried it inside Pluto using begin instead of let in the past. It threw no error, but the plot was nowhere to be seen. I will check it out with let. As always, you guys are ready to help. Thanks a lot.

@VivMendes it’s not really about anything specific of let, it also works with begin:

When it didn’t work, you probably did not use relayout! correctly.
Remember that the output of relayout! is the Layout, not the plot itself, so you have to return the Plot object as the last output of the cell to correctly display the modified plot.

@disberd, thanks for clarifying the issue. I can now understand the problem of restyle! inside Pluto. For example, the following piece of code used to work for me in VScode and Atom:

using PlotlyJS

t  = 100 
b = -0.9
init_cond = [0  1 ; 0  1 ; 0  1; 0  1]

𝓎 = [init_cond  zeros(4, t-1)]          # pre-allocation for 𝓎
α = (1.85, 1.895, 1.9, 1.9005)          # four different values for parameter α

for k in eachindex(α)                   # nested for-loop
    for 𝒾 = 1:t-1
        𝓎[k, 𝒾+2] = α[k] * 𝓎[k, 𝒾+1] + b * 𝓎[k, 𝒾]
    end
end
p10 = plot(𝓎')
restyle!(p10, 1:4, name=["α=1.85" , "α=1.895" , "α=1.9" , "α=1.9005"])

Here is the plot with the restyle! appended:

However, in Pluto, as you pointed out in your example, after using restyle! we have to finish the cell by calling the plot again. And this is the part that I was missing. The code above, inside Pluto, used to produce this annoying output (no error, no plot):

That little trick at the end of the block does the job. Once again, thank you very much. The Julia community is fabulous.
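
For anyone landing here later, a minimal sketch of that pattern, with random data standing in for my 𝓎 matrix. The only point is that the plot object, not the result of restyle!, is the last expression of the block:

```julia
using PlotlyJS

p10 = let
    p = plot(rand(100, 4))    # placeholder data: one trace per matrix column
    restyle!(p, 1:4, name = ["α=1.85", "α=1.895", "α=1.9", "α=1.9005"])
    p                         # last expression is the plot, so the cell displays it
end
```

The same works with begin … end, as long as the final line of the cell is the plot.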