The functions copy and collect are extremely slow on the Q factor returned by a QR factorization. Matrix produces the same result but is much faster, so an easy workaround is to call Matrix instead of copy or collect. Is there a good reason for these routines to be so much slower? If not, fixing this is sure to help some people.
The following example is with Julia 1.1.0, on an iMac.
julia> using LinearAlgebra, BenchmarkTools
julia> n = 10;
julia> q = qr(randn(n,n)).Q;
julia> @btime x = copy($q);
81.203 μs (301 allocations: 47.75 KiB)
julia> @btime y = collect($q);
80.980 μs (301 allocations: 47.75 KiB)
julia> @btime z = Matrix($q);
5.253 μs (2 allocations: 1.75 KiB)
julia> x == y == z
true
julia> n = 100;
julia> q = qr(randn(n,n)).Q;
julia> @btime x = copy($q);
114.106 ms (30002 allocations: 25.71 MiB)
julia> @btime y = collect($q);
113.197 ms (30002 allocations: 25.71 MiB)
julia> @btime z = Matrix($q);
114.343 μs (4 allocations: 156.41 KiB)
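The gap widens quickly with size: copy and collect are about 15 times slower than Matrix at n = 10 and about 1000 times slower at n = 100, and it becomes much worse with larger matrices.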
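For anyone hitting this in the meantime, here is a minimal sketch of the workaround, assuming only the LinearAlgebra standard library. The helper name dense_q is hypothetical and exists just for illustration. It materializes Q through Matrix and then checks that the result is orthogonal and reproduces the input.

```julia
using LinearAlgebra

# Hypothetical helper (not part of LinearAlgebra): build the dense Q factor
# through Matrix rather than copy or collect.
dense_q(F) = Matrix(F.Q)

n = 100
A = randn(n, n)
F = qr(A)
Qd = dense_q(F)

# Sanity checks: Qd has orthonormal columns and Qd * R recovers A.
@assert Qd' * Qd ≈ Matrix{Float64}(I, n, n)
@assert Qd * F.R ≈ A
```

If the dense matrix is only needed in order to multiply by it, applying F.Q directly (for example F.Q * x) may avoid forming it at all.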