I am part of the effort to recreate “Causal Inference: The Mixtape” in Julia, so I tried several plotting libraries on the first plot in the book (https://mixtape.scunning.com/probability-and-regression.html#ordinary-least-squares). Each library trades off code length against flexibility, and I would appreciate input on which one is best suited to recreating the book. Gadfly and VegaLite are very nice to read, but I could not manage to implement everything (maybe this is not necessary though?). Plots and Makie are extremely flexible but result in longer code. With AlgebraOfGraphics I had a hard time actually showing the plot. Here is what I tried so far:

# Shared setup: simulate the data, fit OLS, and build the annotation labels
using DataFrames,
      GLM,
      Gadfly,
      VegaLite,
      Plots,
      GLMakie,
      AlgebraOfGraphics,
      Random

Random.seed!(1);
df = DataFrame(
    x = randn(10000),
    u = randn(10000)
);
df.y = 5.5 .* df.x .+ 12 .* df.u;

# Regress y on x; coef(reg_df) is ordered [intercept, slope]
reg_df = lm(@formula(y ~ x), df)
df.fitted = predict(reg_df)
slope = "Slope = $(round(coef(reg_df)[2], digits = 7))"
intercept = "y-intercept = $(round(coef(reg_df)[1], digits = 7))"

# Gadfly: scatter plus a smoothed lm line, with a manual legend key
Gadfly.plot(df,
    x = :x, y = :y,
    Guide.title("OLS Regression Line"),
    Geom.smooth(method = :lm),
    Geom.point,
    size = [1.2pt], alpha = [0.5],
    color = [colorant"grey0"],
    Gadfly.Guide.manual_discrete_key(
        "",
        ["Fitted Values", "y"],
        shape = [Gadfly.Shape.hline, Gadfly.Shape.circle],
        color = [colorant"grey0"],
        size = [3pt, 1.2pt],
        pos = [2.7, 15])
)
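
# VegaLite: layered point mark plus a line from the regression transform
# (Vega-Lite marks take `opacity` for transparency, not `alpha`)
df |>
@vlplot(x = :x, y = :y,
        title = "OLS Regression Line",
        width = 300, height = 300) +
@vlplot(
    mark = {:point, filled = true,
            color = "black",
            size = 1.2, opacity = 0.5}
) +
@vlplot(
    mark = {:line,
            color = "black"},
    transform = [{regression = :y, on = :x}]
)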

# Plots: scatter, fitted line, then arrows and text for slope and intercept
Plots.scatter(
    df.x, df.y,
    markersize = 0.05,
    label = "y",
    title = "OLS Regression Line",
    color = :black,
    legend = :outerbottom
)
Plots.plot!(
    df.x, df.fitted,
    label = "Fitted values",
    color = :black
)
# Blue arrow from (2, -20) up to the fitted line near x = 1
Plots.quiver!(
    [2], [-20], quiver = ([-1], [25]),
    color = :blue
)
annotate!(2.1, -24,
    Plots.text(slope, :blue)
)
# Red arrow from (-2, 20) ending at the y-intercept (0, coef(reg_df)[1])
Plots.quiver!(
    [-2], [20], quiver = ([2], [-20 + coef(reg_df)[1]]),
    color = :red
)
annotate!(-1.5, 24.5,
    Plots.text(intercept, :red)
)

# Makie (GLMakie): explicit figure, axis, annotations, arrows, and legend
fig = Figure(resolution = (800, 600))
ax = fig[1, 1] = Axis(fig, title = "OLS Regression Line")
Makie.scatter!(ax,
    df.x, df.y,
    markersize = 1,
    label = "y",
    color = :black
)
Makie.lines!(ax,
    df.x, df.fitted,
    label = "Fitted values",
    color = :black
)
annotations!(ax,
    [slope, intercept],
    position = [(1.5, -25), (-3.5, 21)], color = [:blue, :red])
# Arrow origins (x, y) followed by direction components (u, v)
arrows!(ax,
    [2, -2], [-20, 20],
    [-1, 2], [25, -20 + coef(reg_df)[1]],
    color = [:blue, :red])
fig[2, 1] = Legend(fig, ax,
    orientation = :horizontal,
    tellwidth = false, tellheight = true)
fig

# AlgebraOfGraphics: layers drawn into a Makie figure, annotated via the axis
fig = Figure(resolution = (800, 600))
plt = data(df) * (
        mapping(:x, :y) *
        visual(Scatter, markersize = 1, label = "y") +
        mapping(:x, :fitted => "y") *
        visual(Lines, label = "Fitted values")) +
    data((x = [2, -2], y = [-20, 20], u = [-1, 2], v = [25, -20 + coef(reg_df)[1]])) *
    mapping(:x, :y, :u, :v) *
    visual(Arrows, color = [:blue, :red])
# draw! returns a grid of axis entries; take the Makie axis of the first one
ax = AlgebraOfGraphics.draw!(fig[1, 1], plt)[1].axis
annotations!(ax,
    [slope, intercept],
    position = [(1.5, -25), (-3.5, 21)], color = [:blue, :red])
fig[2, 1] = Legend(fig, ax,
    orientation = :horizontal, tellheight = true, tellwidth = false
)
Label(fig[0, :],
    "OLS Regression Line",
    tellwidth = false)
fig

Also tagging @epogrebnyak and linking https://github.com/danielw2904/mixtape/issues/2 (hope that is OK).