Also related: Making @benchmark outputs statistically meaningful, and actionable
And the corresponding issue on GitHub: "use legitimate non-iid hypothesis testing" (Issue #74, JuliaCI/BenchmarkTools.jl)
Don’t mind me, I’m just doing some sweet, sweet cross-referencing.