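Reversing the loop order gave me a significant speed-up in Julia (about 2.4x in the timings below). Julia stores arrays in column-major order, so the inner loop should run over the first index, i: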
# your original function
julia> @btime test($f, $m);
1.945 ms (0 allocations: 0 bytes)
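julia> function test2(f, m)
           size = length(f)
           # column index j in the outer loop, row index i in the inner loop,
           # so m[i,j] is read in the order it is laid out in memory
           for j in 1:size
               for i in 1:size
                   dij = m[i,j]
                   f[i] = dij
                   f[j] = -dij
               end
           end
           return f
       end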
test2 (generic function with 1 method)
julia> @btime test2($f, $m);
799.093 μs (0 allocations: 0 bytes)
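
To see why the loop order matters, here is a minimal sketch of how Julia lays out a matrix in memory. The small matrix and the helper `linear_index` are made up for illustration and are not part of the original code; the only assumption is a standard dense `Matrix`, which Julia stores column by column.

```julia
# A small 2x3 matrix, just to make the storage order visible.
m = [1 2 3;
     4 5 6]

# vec(m) returns the elements in memory order: one column after another.
@show vec(m)

# Hypothetical helper: position of element (i, j) in memory
# for a dense matrix with `nrows` rows (1-based, column-major).
linear_index(i, j, nrows) = i + (j - 1) * nrows

nrows = size(m, 1)

# Stepping to the next row within a column moves one element in memory...
@show linear_index(2, 1, nrows) - linear_index(1, 1, nrows)

# ...while stepping to the next column within a row jumps `nrows` elements.
@show linear_index(1, 2, nrows) - linear_index(1, 1, nrows)
```

With `i` as the inner loop variable, consecutive reads of `m[i,j]` are adjacent in memory. With `j` as the inner loop variable, each read jumps a whole column ahead, which is typically much less cache-friendly for large matrices.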