Numpy just assumes that you want to do broadcasting here, so it just starts adding stuff to your numpy array.

In [1]: import numpy as np
In [2]: a = np.array([1,2,3])
In [3]: sum(a,a)
Out[3]: array([7, 8, 9])
In [4]: ?sum
Signature: sum(iterable, /, start=0)
Docstring:
Return the sum of a 'start' value (default: 0) plus an iterable of numbers
When the iterable is empty, return the start value.
This function is intended specifically for use with numeric values and may
reject non-numeric types.
Type: builtin_function_or_method
In [5]: a + 1 + 2 + 3
Out[5]: array([7, 8, 9])
In [6]: a + sum(a)
Out[6]: array([7, 8, 9])

Thatâs not correctness issues. In a in b in Numpy means what any(a .== b) means in Julia. Itâs a bad definition to be sure, but not really a correctness problem.

In [2]: np.array([1,2,3]) == np.array([1,2])
ValueError: operands could not be broadcast together with shapes (3,) (2,)
In [3]: np.array([1,2,3]) == np.array([2])
Out[3]: array([False, True, False])
In [12]: np.array([1,2,3]) in np.array([2])
Out[12]: True

@jling That behaviour is completely consistent. Remember that in NumPy, a == b is the same as Juliaâs a .== b, and a in b is the same as Juliaâs any(a .== b).

So, we get the exact same behaviour in Julia:

julia> [1,2,3] .== [1, 2]
ERROR: DimensionMismatch: arrays could not be broadcast to a common size; got a dimension with lengths 3 and 2
[ ... ]
julia> [1,2,3] .== [2]
3-element BitVector:
0
1
0
julia> any([1,2,3] .== [2])
true

Yeah, this is what I call âimplicitâ broadcasting â dimensions that happen to have length one also âextrude outâ (or repeat) to match other arguments exactly like missing dimensions do. I like this visualization.

It would be a pity if we couldnât do matrix .* vec, but had resort to padding vec with repeat, like Matlab in the bad old days (Matlab too broadcasts singleton dimensions these days, and I think numpy as well. In my mind, this is what âbroadcastingâ means.)

I think part of this confusion stems from the fact that actually scalars donât fit the view of âextruding dimensions of size 1â. In that sense I have no problem accepting that [1,2,3] .+ [1] works and more the fact that [1,2,3] .+ 1also works.
However noone wants to wrap scalars in brackets and I think this problem is actually at least part of the reason that numbers are iterable in Julia because that means 5 also behaves as a container 5[1] == 5 and then broadcasting stays logically consistent.

In my view itâs the other way around: Numbers, Refs and 0-dimensional arrays are the ones that are behaving completely unobjectionably. They all have ndims(x) == 0 and axes(x) = (). The âproperâ way to index them is with x[]; x[1] is a convenience. They fit the view of âexpanding missing dimensionsâ. Data that has missing dimensions (or dimensions that are statically-enforced to be 1, like a transposed vectorâs first dimension) will never unexpectedly change how it broadcasts. You might get caught out, however, if your matrix suddenly only has one row on some nth iteration.

There are several ways to do broadcasting or extending operations across arrays:

Like Numpy or Julia with the following rules:

i. Dimensions are matched from the front (Julia column-major) or back (Python row-major) and missing dimensions are filled:

rand(2, 3) .+ 1 # zero-dimensional, i.e., always matches
rand(2, 3) .+ [1, 2] # one-dimensional and matching first dim

ii. Dimensions with single elements are expanded. This is particularly handy when used together with aggregations which donât remove dimensions, e.g.,

X = rand(2, 3, 4)
sum(X; dims = 2) |> size
X ./ sum(X; dims = 2)

For consistency, this also implies that [1, 2, 3] .+ [4] works the same way.

Like APL or J with the following rules:

i. Each operation has a rank which specifies at which dimension of values it acts on, e.g., +, * all have rank zero and act on elements whereas aggregations such as +/ (sum) has infinite rank and acts on whole arrays by inserting + between its sub-arrays sliced along the first dimension (giving a result of lower dimension):

X =. i. 2 3 4 NB. index array of size 2 x 3 x 4
$ X NB. $ means size
2 3 4
$ +/ X NB. sum collapses first dimension
3 4

ii. When an operation has a lower rank than the array its applied to, the array is sliced into sub-arrays â starting from the front and expanding missing dimensions â until the rank of the operation is reached:

X + 1 2 # Works as + has rank zero and first dimension matches
X * 3 # Scalar is zero-dimensional

Thus, this looks like broadcasting in Julia, but works for a different reason and one-dimensional dimensions are not expanded, e.g., X + i. 2 1 is a length error.

iii. The rank of an operation can be modified using " making it possible to aggregate across other than the first dimension and correct for collapsed dimensions:

(] %"1 +/)"2 X NB. same as X / sum(X; dims = 2)

Always using explicit and possible multiple mappings when applying scalar operations across arrays. Haskell with nested lists and map . map comes to mind or something like mapslices(v -> let s = sum(v); map(x -> x / s, v) end, X; dims = 2).
This can get quite verbose quickly and I will not expand on this here âŚ

All models are perfectly consistent with the first one being less general than the second one regarding composed operations applying at different sub-dimensions (ranks). It is probably easier to understand though and a good compromise to combat the verbosity of the third.