Create a groupedbar in StatsPlots from a DataFrame with dates on the x-axis

Hi all,

I want to plot this CSV:

as a groupedbar like this:

  1. With dates on the x-axis (do I need to change the 01.2015 / 02.2015 format in the CSV?)
  2. I want to plot an entire column (e.g. Solar) to see its distribution with the Distributions.jl package. I was inspired by this code:
using Distributions, StatsPlots
plot(Normal(3,5), fill=(0, .5,:orange))

The question is how to get from my CSV to a distribution plot. Do I need to create a new CSV consisting of just two columns, Month and Solar?

You can obtain the csv here:
https://energy-charts.info/charts/energy/chart.htm?l=en&c=DE&chartColumnSorting=default&interval=month&year=-1

You can try the following:

using CSV, DataFrames, StatsPlots

df = CSV.read(filename, DataFrame)
df.Month = join.(split.(rpad.(string.(df.Month),6,"0"),'.'), '-')

groupedbar(df.Month, Matrix(df[:,2:end]), bar_position=:stack, bar_width=0.7, label=permutedims(names(df)[2:end]))

plot!(tickfontsize=6, xlabel="Month", ylabel="Energy(GWh)", guidefontsize=6, legendfontsize=6, legend=:outerright)
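As a hedged alternative to the `rpad` conversion above: if the Month column is read as a number, a value such as 10.2020 would presumably become 10.202 and lose its trailing zero. A minimal sketch, assuming the file really uses the `mm.yyyy` layout from the question, is to read the column as text and parse it with the Dates standard library. The sample data and column names below are made up for illustration.

```julia
using CSV, DataFrames, Dates

# Hypothetical two-row sample in the "mm.yyyy" layout mentioned in the question;
# the real file has many more columns.
sample = """
Month,Solar,Wind
01.2015,100.5,800.0
10.2020,300.0,900.0
"""

# Force Month to be read as text so it is never interpreted as a decimal number
df = CSV.read(IOBuffer(sample), DataFrame; types=Dict(:Month => String))

# Parse each string to a Date (the day defaults to 1) ...
fmt = dateformat"mm.yyyy"
dates = [Date(s, fmt) for s in df.Month]

# ... then format as "yyyy-mm" labels, which also sort chronologically
df.Month = Dates.format.(dates, "yyyy-mm")
```

The resulting `df.Month` strings can be passed to `groupedbar` as the x values, exactly as in the code above.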

NB: there should be an easier way using the @df macro
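One possible way to use the `@df` macro, sketched here under assumptions rather than taken from the thread, is to reshape the table to long format with `stack` and let `group` split the bars. The miniature table below is hypothetical; with the real data, string month labels may be reordered by the grouping, so numeric or sortable labels are safer.

```julia
using DataFrames, StatsPlots

# Hypothetical miniature table; column names and values are made up
df = DataFrame(Month = 1:3, Solar = [1.0, 2.0, 3.0], Wind = [4.0, 5.0, 6.0])

# Reshape to long format: one row per (Month, source) pair.
# stack names the new columns :variable and :value by default.
long = stack(df, Not(:Month))

# One stacked bar per month, one segment per energy source
@df long groupedbar(:Month, :value, group=:variable, bar_position=:stack,
                    xlabel="Month", ylabel="Energy (GWh)", legend=:outerright)
```

This avoids building the `Matrix` and the `permutedims(names(df)[2:end])` label row by hand, at the cost of an extra reshaping step.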


Can we plot all the months, January to December, of the year 2019?

For such an input file, the months are numbered from 1 to 12.
If you are OK with plotting them like that, just remove the line of code df.Month = join..., and it should work.

I get an error at this line:
groupedbar(df.Month, Matrix(df[:,2:end]), bar_position=:stack, bar_width=0.7, label=permutedims(names(df)[2:end]))

It seems that the file you loaded has missing data. It was not the case for any of the two files you suggested to look at above. What is the specific CSV file you are now using?

I am using the "all months" option.

Right. After you load the CSV into a DataFrame, you can run describe(df) to find out that column 15 (Waste non-renewable) has missing values: its element type is Union{Missing, Float64}.

You may proceed as follows:

df[:,15] .= coalesce.(df[:,15], 0)
df.Month = join.(split.(rpad.(string.(df.Month),6,"0"),'.'),'-')

groupedbar(df.Month, Matrix(df[:,2:end]), bar_position=:stack, bar_width=0.7, label=permutedims(names(df)[2:end]))

plot!(tickfontsize=6, xlabel="Month", ylabel="Energy(GWh)", guidefontsize=6, legendfontsize=6, legend=:outerright)
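If the position of the affected column is not known in advance, the same idea can be generalized. The following is a minimal sketch on a made-up table (names and values are hypothetical); it also assumes that treating a missing reading as zero is acceptable for the plot, which is a judgment call about the data.

```julia
using DataFrames

# Hypothetical miniature table with one gap; names and values are made up
df = DataFrame(Month = [1, 2, 3],
               Solar = [1.0, 2.0, 3.0],
               Waste = [0.5, missing, 0.7])

# Names of all columns that contain at least one missing value
cols_with_missing = [name for name in names(df) if any(ismissing, df[!, name])]

# Replace every missing with 0.0; broadcasting over a DataFrame returns
# a new DataFrame whose columns no longer allow missing
filled = coalesce.(df, 0.0)
```

Here `filled` can be passed to `Matrix(filled[:, 2:end])` in place of `df`, without having to name column 15 explicitly.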

It works but… you’ll run into a trickier problem: how to resize plots and fonts, when they are overloaded with information.

I suggest you look into Discourse and StackOverflow first, and if you’re not satisfied, open a new thread for advice.

Thanks for the describe(df) tip.

I will search Discourse and StackOverflow first before starting a new thread.

FYI, the following quick settings display the last example fine on my system (showing only the edited lines):

using Measures
plot(legendfontsize=6, widen=false, legend=:outerright, size=(1200,600), dpi=600, margins=10mm)
groupedbar!(df.Month, Matrix(df[:,2:end]), bar_position=:stack, lw=0.2, label=permutedims(names(df)[2:end]))
plot!(xrotation=90, tickfontsize=6, xlabel="Month", ylabel="Energy (GWh)", guidefontsize=7)


Wow, amazing. I created multiple plots instead:

Thank you very much @rafael.guerra


Thank you, because the sample data that you have shown is very educational about energy consumption in Germany, which is a key issue in Europe today.

I found it by accident too, through a blog about Julia whose author was writing about energy and gave the link to the database. I see that it is a key issue now. Hopefully more renewable energy will be used in Europe.
