# How can I make this plot?

I came across this article on Freakonometrics and it has the following really nice plot in it (generated with R, I believe): Has anyone seen anything similar plotted in Julia or does anyone know if this is at least possible with Julia?

1 Like

The 3D-waterfall is an easy task for PGFPlotsX.jl when I’m back in office I will post an example.

2 Likes

Awesome, I think I’ve heard of the package but I’ve not actually used it. It looks really cool!

Here we go:

``````using Random
using Distributions
using PGFPlotsX

Random.seed!(42)
#Generate Data
σ = 2      # std dev
x_min = -10 # xrange to plot
x_max = 10
μ_min = -5
μ_max = 5

dist = (μ, σ) ->  Normal(μ, σ)      # define the distribution
dists = [dist(μ, σ) for μ in -5:1:5]       # The set of distributions we're going to use
rnd = rand.(Truncated.(dists, x_min, x_max),10)  # Generates norm distributed data
dat_pdf = [(x) ->  pdf(d,x) for d in dists ] # Get the pdf of the dists

x_pnts = collect(x_min:0.2:x_max)

# first mak a axes object
axis = @pgf Axis(
{
width = raw"1\textwidth",
height = raw"0.4\textwidth",
ymajorgrids,        # plot only gridlines in y direction
xmax = x_max,         # set the plot range
xmin = x_min,
zmin = 0,
"axis background/.style={fill=gray!10}",    # add some beauty
"set layers",   # this is needed to make the scatter points appear behind the graphs
view = raw"{60}{60}", # viewpoint
},
);

# first the jellow area at the bottom
@pgf y = Plot3(
{
"no marks",
style ="{dashed}",
color = "black",
fill = "yellow",
"fill opacity = 0.65",
"on layer" = "axis background", # so we can see the grid lines trought
},
Table(x=[μ_min-σ, 0, μ_max+σ, 0], y=[length(rnd), 0, 0, length(rnd)],z=[0, 0, 0, 0] ),
raw"\closedcycle",
)
push!(axis,y) # append the area to the axis

# second the scatter dots
@pgf for i in eachindex(dists)
s = Plot3(
{
"only marks",
color = "red",
"mark options" = raw"{scale=0.4}",
"mark layer" = "like plot",     # set the markers on the same layer as the plot
"on layer" = "axis background",
},
Table(x = rnd[i], y= (length(dists)-i) *ones(length(rnd[i])), z=zeros(length(rnd[i].+0.1)) )
)
push!(axis,s)

if i%2 ==0
d= Plot3(
{
"no marks",
style ="{thick}",
color = "blue",
"fill opacity=0.25",
fill = "blue",
},
Table(x = x_pnts, y= (length(dists)-i) *ones(length(x_pnts)), z=dat_pdf[i](x_pnts) ),

)
push!(axis,d)
end

end
plot = @pgf TikzPicture({"scale"=>2},axis)

PGFPlotsX.save("plot.png",plot)

``````

The reason why I like PGFPlotsX is because you stay close to latex code, which makes it easy to adopt examples for latex.

31 Likes

This is amazing   This is indeed very nice. Please consider contributing this to the PGFPlotsX gallery.

2 Likes

Request is open:
https://github.com/KristofferC/PGFPlotsX.jl/pull/191

2 Likes

That was the whole idea Glad to see it working so well!

4 Likes

Here’s something similar that I made with Plots.jl: ``````using DataFrames
using Distributions
using GLM
using Random
using StatsBase
using StatsPlots
pyplot()

# Generate some funky heteroscedastic data
data = DataFrame(y =[rand(TruncatedNormal(1, .4n, n/2, 2n)) for n in 1:100], x=collect(1:100))

# Model the data
ols = lm(@formula(y ~ x), data)

# Get cutoff points to group data (we’ll use these to generate the distributions)
groups = vcat(0, [percentile(data.x, n) for n in 10:20:100])

dists = [
fit(Normal, [data.y[i] for i in 1:length(data.y) if groups[j - 1] < data.x[i] < groups[j]])
for j in 2:length(groups)
]

xmin = minimum(data.y)
xmax = maximum(data.y)
xrange = collect(xmin:1:xmax)

p = plot(zeros(length(xrange)) .+ groups, xrange, [pdf(dists, x) for x in xrange])

popfirst!(groups)

for i in 2:length(dists)
plot!(
zeros(length(xrange)) .+ groups[i],
xrange,
[pdf(dists[i], x) for x in xrange],
legend = false
)
end

plot!(
data.x,
data.y,
seriestype = :scatter,
markersize = 2,
markerstrokewidth = 0,
markeralpha = 0.8
)

plot!(data.x, predict(ols), line=:line)
``````
10 Likes

Thanks for working through the PR revision process with us, now it is one of the highlights of the gallery in the manual:

https://kristofferc.github.io/PGFPlotsX.jl/dev/examples/gallery.html#D-Waterfall-1

4 Likes

As you may have noticed, my knowledge with git and its processes is still very basic. So thanks for your patience and for helping me pulling this!

3 Likes

I cleaned up/improved my code a bit and posted it to a github repo for easy access:

New result: And for a plot of the residuals: Unfortunately, the `fill` keyword argument doesn’t seem to work with the 3D format. It would be nice to figure that part out…I might start a new thread for that. Anyways, here’s the code:

``````using DataFrames
using Distributions
using GLM
using Random
using StatsBase
using StatsPlots

# Set plots backend to PyPlot
pyplot()

# Generate some funky heteroscedastic data
data = DataFrame(y =[rand(TruncatedNormal(1, .4n, n/2, 2n)) for n in 1:100], x=collect(1:100))

# Model the data
ols = lm(@formula(y ~ x), data)

# Get cutoff points to group data (we’ll use these to generate the distributions)
groups = [percentile(data.x, n) for n in 0:20:100]

dists = [
fit(Normal, [data.y[i] for i in 1:length(data.y) if groups[j - 1] < data.x[i] < groups[j]])
for j in 2:length(groups)
]

# The distributions are at the 20th, 40th, 60th, 80th, and 100th percentiles so this
# next variable will store values at the 10th, 30th, etc., percentiles so that dists
# appear in the middle of the data points that they represent when plotted
distlocs = [percentile(data.x, n) for n in 10:20:100]

xmin = minimum(data.y)
xmax = maximum(data.y)

# You'll likely have to tweak the xmin/xmax values in xrange to get the desired result
xrange = collect(xmin-25:1:1.5xmax)

p = plot(
data.x,
data.y,
seriestype = :scatter,
markersize = 2,
markerstrokewidth = 0,
markeralpha = 0.8
)

plot!(data.x, predict(ols), line=:line, linestyle=:dash, linealpha=0.6)