I have two datasets and I want to emphasize that one of them has a smaller “noise level” than the other. For example:
using Plots
pts1_x = [10randn() for i = 1 : 100]
pts1_y = pts1_x.^2 .+ [100randn() for i = 1:100]
pts2_x = [10randn() for i = 1 : 100]
pts2_y = pts2_x.^2 .+ [25randn() for i = 1:100]
plt = Plots.plot()
Plots.scatter!(pts1_x, pts1_y)
Plots.scatter!(pts2_x, pts2_y)
To emphasize that the orange points have a smaller scatter around the parabola, I want to superpose a semi-transparent orange band around them, and a wider semi-transparent blue band around the blue points.
Note that in general, I do not know the law generating the points (i.e., that there is an underlying parabola). I just have two sets of points, one of which has a narrower distribution and I want to emphasize this.
Any suggestions on how to make this more visually obvious? Thanks!
If you have access to the mean function (the true underlying function), you can plot it and use the ribbon keyword in Plots.jl to draw the transparent bands you are looking for. The example below estimates it with a second-order least-squares fit instead. See the Plots.jl attributes documentation: http://docs.juliaplots.org/latest/attributes/
using Plots, Statistics
pts1_x = [10randn() for i = 1 : 100]
pts1_y = pts1_x.^2 .+ [100randn() for i = 1:100]
pts2_x = [10randn() for i = 1 : 100]
pts2_y = pts2_x.^2 .+ [25randn() for i = 1:100]
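A = pts1_x.^(0:2)' # Design matrix with columns 1, x, x^2
k = A\pts1_y # Estimate linear model of order 2
yhat = A*k # Fitted values at the data points
I = sortperm(pts1_x) # Sort by x so the fitted line is drawn left to right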
plt = Plots.plot()
Plots.scatter!(pts1_x, pts1_y)
Plots.scatter!(pts2_x, pts2_y)
Plots.plot!(pts1_x[I], yhat[I], ribbon = 2std(yhat-pts1_y))
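The code above draws a band only for the first dataset. A minimal sketch of how the same fit could be applied to both datasets follows. The helper name fit_band and its order keyword are made up for illustration, and the plotting calls are left as comments because the color = 1 / color = 2 palette indexing is an assumption worth checking against your Plots.jl version.

```julia
using Statistics

# Hypothetical helper (not part of Plots.jl): least-squares polynomial fit
# plus a constant band half-width of two residual standard deviations.
function fit_band(x, y; order = 2)
    A = x .^ (0:order)'      # design matrix, one column per power of x
    k = A \ y                # least-squares coefficients
    yhat = A * k             # fitted values
    p = sortperm(x)          # sort so the line is drawn left to right
    return x[p], yhat[p], 2std(y .- yhat)
end

pts1_x = 10 .* randn(100)
pts1_y = pts1_x .^ 2 .+ 100 .* randn(100)
pts2_x = 10 .* randn(100)
pts2_y = pts2_x .^ 2 .+ 25 .* randn(100)

x1, y1, w1 = fit_band(pts1_x, pts1_y)
x2, y2, w2 = fit_band(pts2_x, pts2_y)

# With Plots.jl loaded, the bands could then be added to the scatter plot,
# reusing the scatter colors (assumed to be palette entries 1 and 2):
# Plots.plot!(x1, y1, ribbon = w1, fillalpha = 0.2, color = 1, label = "")
# Plots.plot!(x2, y2, ribbon = w2, fillalpha = 0.2, color = 2, label = "")
```

Note that this still assumes a polynomial of known order describes the trend, which the question says is not known in general.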
FWIW I think the scatter by itself is pretty clear. When you say that one has a narrower distribution, you mean a narrower distribution around a varying local mean, right? One way to accomplish your goal would be to bin your data points in the x dimension, and construct the band based on the quantiles in those bins.
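The binning idea could look roughly like the sketch below, which makes no assumption about the underlying law. The helper name binned_band, the equal-width bins, the 10%/90% quantiles, and the minpts cutoff are all illustrative choices, not something prescribed in this thread.

```julia
using Statistics

# Hypothetical helper: split the x range into `nbins` equal-width bins and
# return the bin centers with lower/upper quantiles of y inside each bin.
function binned_band(x, y; nbins = 8, lower = 0.1, upper = 0.9, minpts = 3)
    xmin, xmax = extrema(x)
    width = (xmax - xmin) / nbins
    # Bin index of each point; clamp puts x == xmax into the last bin.
    idx = clamp.(floor.(Int, (x .- xmin) ./ width) .+ 1, 1, nbins)
    centers = Float64[]
    lo = Float64[]
    hi = Float64[]
    for b in 1:nbins
        yb = y[idx .== b]
        length(yb) < minpts && continue   # skip bins with too few points
        push!(centers, xmin + (b - 0.5) * width)
        push!(lo, quantile(yb, lower))
        push!(hi, quantile(yb, upper))
    end
    return centers, lo, hi
end

pts_x = 10 .* randn(100)
pts_y = pts_x .^ 2 .+ 25 .* randn(100)
c, lo, hi = binned_band(pts_x, pts_y)

# With Plots.jl loaded, the band could be drawn by filling between the two
# quantile curves, for example:
# Plots.plot!(c, lo, fillrange = hi, fillalpha = 0.3, linealpha = 0, label = "")
```

With only 100 points the outer bins hold few samples, so the band will look ragged there; fewer bins or a larger minpts trades resolution for smoothness.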