Incredible! The runtime is down to about 20-24 seconds now. So, as I understand it, when we use @turbo we have to write granular code? Also, by transposing r1 and r2, do you mean keeping them in their original form?
This is how the test matrix looks:
julia> X_test
20000×6 Matrix{Float64}:
3295.0 4044.0 1072.0 27.0 103.0 42.0
2959.0 1825.0 2980.0 89.0 42.0 323.0
2934.0 2779.0 1236.0 176.0 349.0 594.0
2808.0 1423.0 1734.0 80.0 147.0 180.0
3164.0 1338.0 2271.0 32.0 57.0 350.0
3085.0 2598.0 1712.0 90.0 299.0 324.0
3358.0 1332.0 1655.0 32.0 98.0 175.0
⋮ ⋮
2742.0 2493.0 1753.0 64.0 26.0 446.0
2735.0 1578.0 1879.0 19.0 65.0 457.0
3091.0 2239.0 2431.0 25.0 323.0 258.0
3278.0 1931.0 2439.0 63.0 115.0 470.0
3178.0 4577.0 2154.0 42.0 93.0 551.0
3262.0 390.0 807.0 28.0 261.0 408.0
So then I rotate the matrix 90 degrees to the left to transpose it, so that the rows become columns.
julia> X_test |> rotl90
6×20000 Matrix{Float64}:
42.0 323.0 594.0 180.0 350.0 324.0 175.0 150.0 … 218.0 324.0 446.0 457.0 258.0 470.0 551.0 408.0
103.0 42.0 349.0 147.0 57.0 299.0 98.0 316.0 193.0 354.0 26.0 65.0 323.0 115.0 93.0 261.0
27.0 89.0 176.0 80.0 32.0 90.0 32.0 39.0 37.0 43.0 64.0 19.0 25.0 63.0 42.0 28.0
1072.0 2980.0 1236.0 1734.0 2271.0 1712.0 1655.0 966.0 1410.0 1976.0 1753.0 1879.0 2431.0 2439.0 2154.0 807.0
4044.0 1825.0 2779.0 1423.0 1338.0 2598.0 1332.0 5486.0 3208.0 4094.0 2493.0 1578.0 2239.0 1931.0 4577.0 390.0
3295.0 2959.0 2934.0 2808.0 3164.0 3085.0 3358.0 3307.0 … 3296.0 3112.0 2742.0 2735.0 3091.0 3278.0 3178.0 3262.0
Is this the wrong approach? I thought Julia performs better when we iterate over columns rather than rows. So by rotating the matrix, each original row becomes a column and can be accessed faster.
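To make the question concrete, here is a minimal sketch of what I mean, using a tiny stand-in matrix with made-up values rather than my real X_test. The helper name `colsums` is just something I wrote for illustration. As far as I can tell, rotl90 does turn rows into columns, but it also reverses the feature order compared with a plain transpose via permutedims, which matches the output above (42.0 ends up in the first row and 3295.0 in the last).

```julia
# Tiny stand-in for X_test: 3 samples × 2 features (made-up values)
X = [1.0 2.0;
     3.0 4.0;
     5.0 6.0]

R = rotl90(X)        # 2×3: each sample is now a column, feature order reversed
P = permutedims(X)   # 2×3: plain transpose, materialized as a new Matrix

# Hypothetical helper: sum each sample by walking down one column at a time.
# The inner loop runs over the first index, which is contiguous in memory
# because Julia arrays are stored column-major.
function colsums(A)
    out = zeros(eltype(A), size(A, 2))
    for j in axes(A, 2)
        s = zero(eltype(A))
        for i in axes(A, 1)
            s += A[i, j]
        end
        out[j] = s
    end
    return out
end

sums = colsums(P)
```

If that is right, then permutedims(X_test) would give the same column-wise layout as the rotation while keeping the features in their original order.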
I’m learning a lot from this : )