for rating in ratings
    # indices in `ratings` are zero-based, so shift to Julia's one-based indexing
    x_idx = rating[1] + 1
    y_idx = rating[2] + 1
    rv = rating[3]
    W_row = W[:, x_idx]
    H_row = H[:, y_idx]
    pred = dot(W_row, H_row)
    diff = rv - pred
    W_grad = -2 * diff .* H_row
    H_grad = -2 * diff .* W_row
    W[:, x_idx] = W_row - step_size .* W_grad
    H[:, y_idx] = H_row - step_size .* H_grad
end
Above is a piece of code whose performance degraded by roughly 26x (from 1.26 seconds to 32.8 seconds for one particular input) after I upgraded from Julia v0.5.1 to v0.6.2. Even with v0.5.1, it is 3 to 4 times slower than the same program written in C++.
In the above code, ratings is a Vector of tuples, and W and H are 2-dimensional arrays of roughly 100 by 5000.
I am guessing that it’s memory allocation that caused the problem.
Julia v0.6.2:
32.802334 seconds (148.08 M allocations: 11.332 GiB, 4.71% gc time)
Julia v0.5.1:
1.263061 seconds (18.58 M allocations: 6.835 GB, 9.55% gc time)
Why does v0.6.2 allocate much more memory?
In my C++ code, I would have preallocated memory for variables like W_row, H_row, etc. and reused the same memory across iterations. How would I do the same thing in Julia?
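To make the question concrete, here is a minimal sketch of what an allocation-free version of the inner loop might look like. It assumes the intended update is plain gradient descent on the squared error (diff = rv - pred), that each rating is an (x, y, value) tuple with zero-based x and y, and that W and H hold floating-point values. The function name sgd_epoch! is hypothetical, not something from my actual program.

```julia
# Hypothetical helper: one SGD pass over `ratings`, updating W and H in place.
# Assumes each rating is (x, y, value) with zero-based x and y.
function sgd_epoch!(W, H, ratings, step_size)
    K = size(W, 1)
    for rating in ratings
        x_idx = rating[1] + 1
        y_idx = rating[2] + 1
        rv = rating[3]

        # Dot product of column x_idx of W and column y_idx of H,
        # computed element by element so no column is copied.
        pred = zero(eltype(W))
        for k in 1:K
            pred += W[k, x_idx] * H[k, y_idx]
        end
        diff = rv - pred

        # Gradient step on both columns. The old W entry is kept in `w`
        # so that H is updated with the pre-update value of W.
        for k in 1:K
            w = W[k, x_idx]
            h = H[k, y_idx]
            W[k, x_idx] = w + step_size * 2 * diff * h
            H[k, y_idx] = h + step_size * 2 * diff * w
        end
    end
    return W, H
end
```

Called as sgd_epoch!(W, H, ratings, step_size) once per iteration, this touches W and H directly instead of creating W_row, H_row, W_grad and H_grad on every rating. Wrapping the loop in a function also keeps W, H and step_size from being accessed as non-constant globals.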
Memory allocation profiling (v0.6.2):
         0 for iteration = 1:num_iterations
         0     for rating in ratings
  59192224         x_idx = rating[1] + 1
  55158016         y_idx = rating[2] + 1
  32006688         rv = rating[3]

2298050581         W_row = W[:, x_idx]
2291205792         H_row = H[:, y_idx]
  32236624         pred = dot(W_row, H_row)
  32006688         diff = rv - pred
3971640698         W_grad = -2 * diff .* H_row
3968829312         H_grad = -2 * diff .* W_row
5795581993         W[:, x_idx] = W_row - step_size .* W_grad
5793210528         H[:, y_idx] = H_row - step_size .* H_grad
               end
Memory allocation profiling (v0.5.1):
         0 for iteration = 1:num_iterations
         0     for rating in ratings
  59192224         x_idx = rating[1] + 1
  55158016         y_idx = rating[2] + 1
  32006688         rv = rating[3]

1793490040         W_row = W[:, x_idx]
1792374528         H_row = H[:, y_idx]
  32279345         pred = dot(W_row, H_row)
  32006688         diff = rv - pred
1825048125         W_grad = -2 * diff .* H_row
1824381216         H_grad = -2 * diff .* W_row
3618250719         W[:, x_idx] = W_row - step_size .* W_grad
3616755744         H[:, y_idx] = H_row - step_size .* H_grad
               end