I need to do in-place summations of a transposed matrix, but the operation allocates memory. For example, running the function `test` from the following code

```
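# In-place sum over the lazily transposed matrix a'
function test_sum_transpose(b, a)
    sum!(b, a')
end

# In-place sum over the original matrix
function test_sum(b, a)
    sum!(b, a)
end

function test()
    a = [1 2 3; 3 4 5]
    # a' is 3×2, so b needs length 3 (one entry per column of a)
    b = [0.0, 0.0, 0.0]; @time test_sum_transpose(b, a)
    # a is 2×3, so b needs length 2 (one entry per row of a)
    b = [0.0, 0.0]; @time test_sum(b, a)
    return nothing
end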
```

I get

```
0.000001 seconds (1 allocation: 16 bytes)
0.000001 seconds
```

Is there any way to perform this summation (sum the columns of `a`) and assign the result to `b` without allocations?