# Strange behaviors in Python and NumPy

NumPy just assumes that you want broadcasting here, so it starts adding stuff to your array.

```python
In [1]: import numpy as np

In [2]: a = np.array([1,2,3])

In [3]: sum(a,a)
Out[3]: array([7, 8, 9])

In [4]: ?sum
Signature: sum(iterable, /, start=0)
Docstring:
Return the sum of a 'start' value (default: 0) plus an iterable of numbers

When the iterable is empty, return the start value.
This function is intended specifically for use with numeric values and may
reject non-numeric types.
Type:      builtin_function_or_method

In [5]: a + 1 + 2 + 3
Out[5]: array([7, 8, 9])

In [6]: a + sum(a)
Out[6]: array([7, 8, 9])
```
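What the builtin is doing can be sketched in a few lines (the helper name `builtin_sum_sketch` is mine, not part of Python): `sum(iterable, start)` folds `+` over the iterable starting from `start`, and with an array as `start` every `+` is NumPy's broadcasting add.

```python
import numpy as np

a = np.array([1, 2, 3])

def builtin_sum_sketch(iterable, start=0):
    # Rough model of Python's builtin sum: fold + over the iterable,
    # starting from `start`. With a NumPy array as `start`, every +
    # is a broadcasting element-wise add.
    result = start
    for item in iterable:
        result = result + item
    return result

print(builtin_sum_sketch(a, a))  # [7 8 9], same as sum(a, a)
```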
5 Likes

can we start a sub-thread about Numpy?

Nobody ever said Numpy is too buggy

7 Likes

Are you saying numpy might have correctness issues?

1 Like

Thatâs not correctness issues. In `a in b` in Numpy means what `any(a .== b)` means in Julia. Itâs a bad definition to be sure, but not really a correctness problem.

1 Like

except it also has this behavior

```python
In [2]: np.array([1,2,3]) == np.array([1,2])
ValueError: operands could not be broadcast together with shapes (3,) (2,)

In [3]: np.array([1,2,3]) == np.array([2])
Out[3]: array([False,  True, False])

In [12]: np.array([1,2,3]) in np.array([2])
Out[12]: True
```
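A sketch of why the last line holds (the helper name is mine): `needle in haystack` for NumPy arrays behaves roughly like a broadcasted element-wise comparison reduced with `any()`.

```python
import numpy as np

def contains_sketch(needle, haystack):
    # Roughly what `needle in haystack` does for NumPy arrays:
    # broadcast an element-wise == and reduce with any().
    return bool((haystack == needle).any())

# [1,2,3] == [2] broadcasts to [False, True, False], so `in` reports True.
print(contains_sketch(np.array([1, 2, 3]), np.array([2])))  # True
```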
2 Likes

No idea whether this is correct or incorrect but it did catch me by surprise when I was porting some Python code to Julia:

```python
>>> import numpy as np
>>> x = np.float32(0)
>>> (x ** 2).dtype
dtype('float64')
```
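For what it's worth, whether `x ** 2` stays `float32` depends on the NumPy version: scalar promotion changed with NEP 50 in NumPy 2.0 (a Python-int exponent no longer upcasts). Operations between two `float32` values preserve the dtype either way, so one version-independent workaround is to keep both operands `float32`:

```python
import numpy as np

x = np.float32(0)

# float32 op float32 stays float32 on any NumPy version; `x ** 2`
# (a Python int exponent) gave float64 under the old value-based
# promotion rules, but gives float32 under NEP 50 (NumPy >= 2.0).
print((x * x).dtype)               # float32
print((x ** np.float32(2)).dtype)  # float32
```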

@jling That behaviour is completely consistent. Remember that in NumPy, `a == b` is the same as Julia's `a .== b`, and `a in b` is the same as Julia's `any(a .== b)`.

So, we get the exact same behaviour in Julia:

```julia
julia> [1,2,3] .== [1, 2]
ERROR: DimensionMismatch: arrays could not be broadcast to a common size; got a dimension with lengths 3 and 2
[ ... ]

julia> [1,2,3] .== [2]
3-element BitVector:
 0
 1
 0

julia> any([1,2,3] .== [2])
true
```
1 Like

oh wow I actually hate this.

3 Likes

Like NumPy, we broadcast arrays with length 1 across a given dimension. This is useful, because we can do stuff like

```julia
julia> x = rand(4, 3);

julia> x_normed = x ./ sum(x; dims=2)
4×3 Matrix{Float64}:
 0.0099922  0.907046   0.0829616
 0.628005   0.0108374  0.361158
 0.568802   0.249942   0.181256
 0.361719   0.220452   0.417829
```

This works even though `sum` returns a `4×1` matrix.
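For comparison, a sketch of the NumPy analogue: reductions drop the summed axis by default, so `keepdims=True` is what keeps the `4×1` shape that makes the division broadcast.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, values are arbitrary
x = rng.random((4, 3))

# keepdims=True keeps the reduced axis as length 1: shape (4, 1),
# which then broadcasts against x's shape (4, 3).
row_sums = x.sum(axis=1, keepdims=True)
x_normed = x / row_sums

print(x_normed.sum(axis=1))  # each row sums to 1.0
```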

1 Like

Isnât this basically the whole point of broadcasting?

1 Like

I would have thought in general

```julia
x != [x] != Ref(x)
```

for broadcasting, but idk, maybe I just don't have a basic understanding of the point of broadcasting

Yeah, this is what I call "implicit" broadcasting: dimensions that happen to have length one also "extrude out" (or repeat) to match other arguments, exactly like missing dimensions do. I like this visualization.

1 Like

It would be a pity if we couldn't do `matrix .* vec`, but had to resort to padding `vec` with `repeat`, like Matlab in the bad old days. (Matlab too broadcasts singleton dimensions these days, and I think NumPy does as well. In my mind, this is what "broadcasting" means.)

1 Like

In my opinion, broadcasting should apply scalars to each element of a container but should not conflate singleton containers with scalars.

How would you do broadcast multiply between a matrix and a vector, then? Use `repeat`?

Anyway, what are the drawbacks of this conflation? I can see only advantages.

Mixing containers with their values complicates reasoning and hides errors. This could be an error, for example:

```julia
julia> [10] .* [2, 3]
2-element Vector{Int64}:
 20
 30
```

I think you mean

```julia
julia> m = collect(reshape(1:6, 3, 2))
3×2 Matrix{Int64}:
 1  4
 2  5
 3  6

julia> v = [1, 10, 100]
3-element Vector{Int64}:
   1
  10
 100

julia> m .* v
3×2 Matrix{Int64}:
   1    4
  20   50
 300  600
```

We can retain that without inconsistency. I like to think of matching up each dim from each argument until one runs out; semantically,

```julia
julia> mapslices(col -> map(*, col, v), m; dims=1)
3×2 Matrix{Int64}:
   1    4
  20   50
 300  600
```

I think part of this confusion stems from the fact that scalars don't actually fit the view of "extruding dimensions of size 1". In that sense I have less of a problem accepting that `[1,2,3] .+ [1]` works than accepting that `[1,2,3] .+ 1` also works.
However, no one wants to wrap scalars in brackets, and I think this problem is at least part of the reason that numbers are iterable in Julia: because `5` also behaves as a container (`5[1] == 5`), broadcasting stays logically consistent.

In my view itâs the other way around: Numbers, Refs and 0-dimensional arrays are the ones that are behaving completely unobjectionably. They all have `ndims(x) == 0` and `axes(x) = ()`. The âproperâ way to index them is with `x[]`; `x[1]` is a convenience. They fit the view of âexpanding missing dimensionsâ. Data that has missing dimensions (or dimensions that are statically-enforced to be 1, like a transposed vectorâs first dimension) will never unexpectedly change how it broadcasts. You might get caught out, however, if your matrix suddenly only has one row on some nth iteration.

2 Likes

There are several ways to do broadcasting or extending operations across arrays:

1. Like Numpy or Julia with the following rules:

i. Dimensions are matched from the front (Julia, column-major) or the back (NumPy, row-major), and missing dimensions are filled in:

```julia
rand(2, 3) .+ 1       # zero-dimensional, i.e., always matches
rand(2, 3) .+ [1, 2]  # one-dimensional and matching the first dimension
```

ii. Dimensions with a single element are expanded. This is particularly handy when used together with aggregations that don't remove dimensions, e.g.,

```julia
X = rand(2, 3, 4)
sum(X; dims = 2) |> size
X ./ sum(X; dims = 2)
```

For consistency, this also implies that `[1, 2, 3] .+ [4]` works the same way.
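Both rules are easy to check in NumPy, keeping in mind that it aligns shapes from the back rather than the front (a sketch):

```python
import numpy as np

X = np.arange(6).reshape(2, 3)  # shape (2, 3)

# Rule i: a missing dimension is filled in. NumPy aligns from the back,
# so a length-3 vector matches X's last axis.
print((X + np.array([10, 20, 30])).shape)    # (2, 3)

# Rule ii: a length-1 dimension is expanded to match.
print((X + np.array([[100], [200]])).shape)  # (2, 1) vs (2, 3) -> (2, 3)
```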

2. Like APL or J with the following rules:

i. Each operation has a rank which specifies the dimension of the values it acts on, e.g., `+` and `*` have rank zero and act on elements, whereas aggregations such as `+/` (sum) have infinite rank and act on whole arrays by inserting `+` between their sub-arrays sliced along the first dimension (giving a result of lower dimension):

```
   X =. i. 2 3 4   NB. index array of size 2 x 3 x 4
   $ X             NB. $ means size
2 3 4
   $ +/ X          NB. sum collapses the first dimension
3 4
```

ii. When an operation has a lower rank than the array it's applied to, the array is sliced into sub-arrays, starting from the front and expanding missing dimensions, until the rank of the operation is reached:

```
   X + 1 2   NB. works as + has rank zero and the first dimension matches
   X * 3     NB. scalar is zero-dimensional
```

Thus, this looks like broadcasting in Julia, but works for a different reason, and length-one dimensions are not expanded; e.g., `X + i. 2 1` is a length error.

iii. The rank of an operation can be modified using `"`, making it possible to aggregate across dimensions other than the first and to correct for collapsed dimensions:

```
   (] %"1 +/)"2 X   NB. same as X ./ sum(X; dims = 2)
```
3. Always using explicit and possibly multiple mappings when applying scalar operations across arrays. Haskell with nested lists and `map . map` comes to mind, or something like
`mapslices(v -> let s = sum(v); map(x -> x / s, v) end, X; dims = 2)`.
This can get quite verbose quickly, and I will not expand on this here …
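For concreteness, a plain-Python sketch of this third, fully explicit style (the function name is mine), normalizing each row of a nested list without any broadcasting:

```python
X = [[1.0, 1.0, 2.0], [3.0, 1.0, 0.0]]

def normalize_rows(rows):
    # Explicit mapping instead of broadcasting: one loop level per
    # container level, with the row sum computed before the inner map.
    out = []
    for row in rows:
        s = sum(row)
        out.append([v / s for v in row])
    return out

print(normalize_rows(X))  # [[0.25, 0.25, 0.5], [0.75, 0.25, 0.0]]
```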

All three models are perfectly consistent, with the first being less general than the second regarding composed operations applied at different sub-dimensions (ranks). The first is probably easier to understand, though, and a good compromise to combat the verbosity of the third.

2 Likes