copy suffices. (I don’t recall ever needing deepcopy in all my years of using Julia.)
You could just multiply by a vector of 1’s to sum each row. I don’t know how well sum along a particular dimension is optimized, but matrix–vector multiplication certainly is. The code is a lot shorter, too, and works for any linear operator:
using LinearAlgebra  # for Diagonal
rowstochastic(M) = Diagonal(M * ones(eltype(M), size(M, 2))) \ M
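As a quick sanity check, here is a minimal sketch that applies this definition to a small dense matrix and confirms that each row of the result sums to 1. The matrix A is made up for illustration, and the sketch assumes no row sums to zero, since the division would break down in that case.

```julia
using LinearAlgebra  # provides Diagonal

# Same definition as in the post: divide each row of M by its row sum.
rowstochastic(M) = Diagonal(M * ones(eltype(M), size(M, 2))) \ M

# Hypothetical example matrix; every row has a nonzero sum.
A = [1.0 3.0; 2.0 2.0]
P = rowstochastic(A)

# Row sums of the normalized matrix, computed the same way (multiply by ones).
rowsums = P * ones(2)
```

Because the row sums come from a matrix–vector product and the scaling from a Diagonal solve, nothing here depends on M being a dense array.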
f.(M) works, as long as f preserves zeros (that is, f(0) == 0).
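To illustrate, here is a small sketch that assumes M is a sparse matrix from the SparseArrays standard library (the post doesn't say so explicitly, so treat that as an assumption). It broadcasts sqrt, which maps zero to zero, and checks that the result is still sparse with the same number of stored entries. The matrix itself is made up for the example.

```julia
using SparseArrays  # standard library: sparse, nnz, SparseMatrixCSC

# Hypothetical 2×2 sparse matrix with two stored entries on the diagonal.
M = sparse([1, 2], [1, 2], [4.0, 9.0])

# sqrt preserves zeros (sqrt(0) == 0), so broadcasting keeps the sparsity pattern.
S = sqrt.(M)

stored = nnz(S)  # number of stored entries in the result
```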