NumPy just assumes that you want broadcasting here, and starts adding stuff to your array.
In : import numpy as np
In : a = np.array([1,2,3])
In : sum(a,a)
Out: array([7, 8, 9])
In : ?sum
Signature: sum(iterable, /, start=0)
Return the sum of a 'start' value (default: 0) plus an iterable of numbers
When the iterable is empty, return the start value.
This function is intended specifically for use with numeric values and may
reject non-numeric types.
In : a + 1 + 2 + 3
Out: array([7, 8, 9])
In : a + sum(a)
Out: array([7, 8, 9])
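To spell out the surprise: this is Python's built-in sum, not NumPy. sum(a, a) is the left fold ((a + 1) + 2) + 3, where each + is NumPy's broadcasting addition. A quick sketch, nothing assumed beyond the transcript:

```python
import numpy as np

a = np.array([1, 2, 3])

# Python's built-in sum(iterable, start) folds + over the iterable,
# beginning from start. With start=a, every element of a is added
# to the array via NumPy's broadcasting +:
assert (sum(a, a) == np.array([7, 8, 9])).all()

# It is literally the left fold ((a + 1) + 2) + 3:
assert (((a + 1) + 2) + 3 == np.array([7, 8, 9])).all()

# np.sum, by contrast, reduces the array itself:
assert np.sum(a) == 6
```

np.sum(a) is the reduction most people actually want here.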
In : np.array([1,2,3]) == np.array([1,2])
ValueError: operands could not be broadcast together with shapes (3,) (2,)
In : np.array([1,2,3]) == np.array(2)
Out: array([False, True, False])
In : np.array([1,2,3]) in np.array(2)
Out: True
@jling That behaviour is completely consistent. Remember that in NumPy, a == b is the same as Julia’s a .== b, and a in b is the same as Julia’s any(a .== b).
So, we get the exact same behaviour in Julia:
julia> [1,2,3] .== [1, 2]
ERROR: DimensionMismatch: arrays could not be broadcast to a common size; got a dimension with lengths 3 and 2
[ ... ]
julia> [1,2,3] .== 2
3-element BitVector:
 0
 1
 0

julia> any([1,2,3] .== 2)
true
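For completeness, the same equivalence checked directly on the NumPy side (a sketch; as far as I know ndarray's in amounts to elementwise equality followed by any):

```python
import numpy as np

a = np.array([1, 2, 3])

# Elementwise comparison, like Julia's a .== b:
assert ((a == 2) == np.array([False, True, False])).all()

# x in arr asks whether any element equals x, like Julia's any(arr .== x):
assert 2 in a
assert 4 not in a
```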
Yeah, this is what I call “implicit” broadcasting — dimensions that happen to have length one also “extrude out” (or repeat) to match other arguments exactly like missing dimensions do. I like this visualization.
It would be a pity if we couldn’t do matrix .* vec, but had to resort to padding vec with repeat, like Matlab in the bad old days. (Matlab too broadcasts singleton dimensions these days, and I think NumPy does as well. In my mind, this is what ‘broadcasting’ means.)
I think part of this confusion stems from the fact that scalars actually don’t fit the view of “extruding dimensions of size 1”. In that sense I have no problem accepting that [1,2,3] .+ [1] works; it is more the fact that [1,2,3] .+ 1 also works.
However, no one wants to wrap scalars in brackets, and I think this problem is at least part of the reason that numbers are iterable in Julia: that way 5 also behaves as a container with 5[1] == 5, and broadcasting stays logically consistent.
In my view it’s the other way around: Numbers, Refs and 0-dimensional arrays are the ones that are behaving completely unobjectionably. They all have ndims(x) == 0 and axes(x) == (). The “proper” way to index them is with x[]; x[1] is a convenience. They fit the view of “expanding missing dimensions”. Data that has missing dimensions (or dimensions that are statically enforced to be 1, like a transposed vector’s first dimension) will never unexpectedly change how it broadcasts. You might get caught out, however, if your matrix suddenly only has one row on some nth iteration.
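The NumPy side of this distinction, as a sketch: a 0-d array (the analogue of a number or Ref here) has no dimensions at all, while a length-1 vector has one dimension that happens to have length one; both broadcast, but via the two different rules:

```python
import numpy as np

# 0-dimensional data: a 0-d array has ndim == 0 and an empty shape,
# so under "expand missing dimensions" it broadcasts against anything:
x = np.array(5)
assert x.ndim == 0 and x.shape == ()
assert (np.array([1, 2, 3]) + x == np.array([6, 7, 8])).all()

# A (1,)-shaped array is different data: it HAS a dimension, of length 1,
# and relies on singleton expansion rather than missing-dimension filling:
y = np.array([5])
assert y.ndim == 1 and y.shape == (1,)
assert (np.array([1, 2, 3]) + y == np.array([6, 7, 8])).all()
```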
There are several ways to do broadcasting or extending operations across arrays:
Like Numpy or Julia with the following rules:
i. Dimensions are matched from the front (Julia, column-major) or from the back (Python, row-major), and missing dimensions are filled:
rand(2, 3) .+ 1 # zero-dimensional, i.e., always matches
rand(2, 3) .+ [1, 2] # one-dimensional and matching first dim
ii. Dimensions with single elements are expanded. This is particularly handy when used together with aggregations which don’t remove dimensions, e.g.,
X = rand(2, 3, 4)
sum(X; dims = 2) |> size # (2, 1, 4)
X ./ sum(X; dims = 2)
For consistency, this also implies that [1, 2, 3] .+ [1] works the same way.
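Both rules translated to NumPy, as a sketch (note that NumPy matches dimensions from the back, so the filled dimension pairs with the last axis rather than the first):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((2, 3, 4))

# Rule i: missing dimensions are filled. NumPy matches from the back
# (row-major), so a length-4 vector pairs with X's last dimension:
assert (X + np.ones(4)).shape == (2, 3, 4)

# Rule ii: length-one dimensions are expanded. keepdims=True keeps the
# aggregated axis around as a singleton, so the division broadcasts:
s = X.sum(axis=1, keepdims=True)
assert s.shape == (2, 1, 4)

# Dividing through normalises along axis 1, so each slice sums to one:
assert np.allclose((X / s).sum(axis=1), 1.0)
```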
Like APL or J with the following rules:
i. Each operation has a rank which specifies at which dimension of values it acts, e.g., + and * have rank zero and act on elements, whereas aggregations such as +/ (sum) have infinite rank and act on whole arrays by inserting + between their sub-arrays sliced along the first dimension (giving a result of lower dimension):
X =. i. 2 3 4 NB. index array of size 2 x 3 x 4
$ X NB. $ means size
2 3 4
$ +/ X NB. sum collapses first dimension
3 4
ii. When an operation has a lower rank than the array it’s applied to, the array is sliced into sub-arrays – starting from the front and expanding missing dimensions – until the rank of the operation is reached:
X + 1 2 NB. works as + has rank zero and the first dimensions match
X * 3 NB. scalar is zero-dimensional
Thus, this looks like broadcasting in Julia, but works for a different reason, and dimensions of length one are not expanded, e.g., X + i. 2 1 is a length error.
iii. The rank of an operation can be modified using " making it possible to aggregate across other than the first dimension and correct for collapsed dimensions:
(] %"1 +/)"2 X NB. same as X / sum(X; dims = 2)
Always using explicit and possibly multiple mappings when applying scalar operations across arrays. Haskell with nested lists and map . map comes to mind, or something like mapslices(v -> let s = sum(v); map(x -> x / s, v) end, X; dims = 2).
This can get quite verbose quickly and I will not expand on this here …
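For concreteness, a sketch of this third, fully explicit model in Python: normalising the same kind of X along its second dimension with hand-written loops instead of broadcasting:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((2, 3, 4))

# No broadcasting at all: loop over the other axes by hand and map the
# division explicitly over each length-3 column:
out = np.empty_like(X)
for i in range(X.shape[0]):
    for k in range(X.shape[2]):
        col = X[i, :, k]
        s = col.sum()
        out[i, :, k] = [x / s for x in col]

# Same result as the one-line broadcasting version:
assert np.allclose(out, X / X.sum(axis=1, keepdims=True))
```

The verbosity speaks for itself, which is exactly the point being made above.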
All three models are perfectly consistent, with the first being less general than the second regarding composed operations that apply at different sub-dimensions (ranks). The first is probably easier to understand, though, and a good compromise to combat the verbosity of the third.