Rant: I hate that it's possible to iterate over an integer

ForceBru · March 9, 2022, 6:09pm

Maybe this’ll be helpful to new users like myself. I’ve been writing Julia every day for about a year now. (and still considering myself new because I still haven’t had the time to learn the type system properly, for example), but I’m still getting bitten by this every once in a while.

Take a look at this code:

const AM = AbstractMatrix{T} where T

function expectations(P::AM{<:Real}, M::AM{<:Real}, V::AM{<:Real})
	a = @views sum(
		@. P[k, :] * M[k, :]
		for k ∈ size(P, 1)
	)
	sigma = @views sum(
		@. P[k, :] * sqrt(V[k, :])
		for k ∈ size(P, 1)
	)

	(; a, sigma)
end;

This code is supposed to compute expectations of some random variables for each row in the matrices P (each column is a discrete probability distribution), M and V.

To me, it looks completely fine, and it also runs and outputs something that looks reasonable… sometimes.

Other times, however, it randomly returns some garbage that makes zero sense, depending on the order of rows in the input matrices, even though it’s supposed to be summing over them, so the order shouldn’t matter! Guess why?

Because for k ∈ size(P, 1) does not iterate over a range from one to size(P, 1), as I intended!!! In fact, it iterates over one single integer size(P, 1), which boils down to selecting just one last row of each matrix instead of summing over all rows. That’s super confusing!

Why would iterating over an integer (as opposed to an array or a tuple of one integer) work at all without producing any errors?! I guess it works like this so that code like Int(2.0) and Int.(2.0) (notice broadcasting) produces the same result. So, this kind of makes sense, but it also just made me waste several hours (seriously) debugging my code and surrounding libraries because I was absolutely sure I most definitely couldn’t have messed up this simple for loop.

Yet it was exactly what I messed up.

goerch · March 9, 2022, 6:19pm

I hate that it’s possible to iterate over an integer

Then you might be willing to do evil things:

import Base.iterate
iterate(t::T) where T<:Number = @assert false

v = 1
for i in size(v, 1)
end

ForceBru · March 9, 2022, 6:35pm

Right, and then you hit me with this:

Why would computing the size of a single number work? A size of something that is not a collection? Turns out, 1234[1] also “works” and evaluates to the number 1234 itself. Apparently, Julia treats numbers as arrays because numbers obey the collections “interface”:

Indexing works: 1234[1] == 1234
Computing the length works: length(1234) == 1
Computing the size works: size(1234, 1) == 1

However, a number is not a collection, IMO.

I just quickly tried some stuff out, and it looks like all primitive types behave as this kind of “collections of one element”:

julia> 1234[1], length(1234)
(1234, 1)

julia> true[1], length(true)
(true, 1)

julia> 'c'[1], length('c')
('c', 1)

julia> 1234.56[1], length(1234.56)
(1234.56, 1)

Okay, so Julia seems to be copying R here. Same thing in R:

> c(1234[1], length(1234))
[1] 1234    1
> c(TRUE[1], length(TRUE))
[1] 1 1
> c('c'[1], length('c'))
[1] "c" "1"
> c(1234.56[1], length(1234.56))
[1] 1234.56    1.00

Yeah, it sort of makes sense, but at the same time leads to very subtle bugs like this one…

Henrique_Becker · March 9, 2022, 6:42pm

Yes, it is not ideal. Since I tripped on this, I have been using (1,) instead of just 1 in broadcast, so as to be more explicit, and also because if the behavior is changed in Julia 2.0 (one can hope) then my code changes would be minimal.

lmiq · March 9, 2022, 6:46pm

As a side note, you can do this faster using:

[ sum(P[k, j] * M[k, j] for k ∈ 1:size(P, 1)) for j in 1:size(P,2) ]

because in the former you are allocating a new array in the bradcasting operation for every iteration of the 1:size(P,1) loop:

julia> P = rand(100,100); M = rand(100,100);

julia> function test1(P,M)
          @views sum(@. P[k, :] * M[k, :] for k ∈ 1:size(P, 1) )
       end
test1 (generic function with 1 method)

julia> function test2(P,M)
          [ sum(P[k, j] * M[k, j] for k ∈ 1:size(P, 1)) for j in 1:size(P,2) ]
       end
test2 (generic function with 1 method)

julia> test1(P,M) ≈ test2(P,M)
true

julia> @btime test1($P,$M);
  17.619 μs (199 allocations: 174.12 KiB)

julia> @btime test2($P,$M);
  9.282 μs (1 allocation: 896 bytes)

or much faster with:

julia> using LoopVectorization

julia> function test3(P,M)
          r = zeros(size(P,1))
          @turbo for j in axes(P,2)
              for k in axes(P,1)
                  r[j] += P[k,j]*M[k,j]
              end
          end
          return r
       end
test3 (generic function with 1 method)

julia> @btime test3($P,$M);
  1.530 μs (1 allocation: 896 bytes)

goerch · March 9, 2022, 7:00pm

Interestingly, VS Code linter shares your opinion:

JonasWickman · March 9, 2022, 8:35pm

Coming from a Matlab background I’ve never felt this was strange (though I admit I sometimes have made the same mistake)

For reference, using this pattern might help avoid future headaches:

P = [1 2 3
     4 5 6
     7 8 9]
k = 2
for i ∈ axes(P,2)
    println(P[k,i])
end

Outputs:

4
5
6

Mason · March 9, 2022, 9:02pm

Numbers are scalars. If someone were to ask me “how many dimensions does a scalar have?”, I would answer that “a scalar has 0 dimensions”, not that the question is ill-posed (i.e. an error). Would you agree with that?

I think if we can agree that scalars are zero dimensional and that numbers are scalars, then this behaviour under iteration is pretty unambiguous and natural.

The length of a container is product of the size of all of it’s dimensions. The product of an empty list (zero dimensions) is 1. Hence, zero dimensional objects should have one element. What is that element in the case of a number? The number itself!

Elrod · March 9, 2022, 9:08pm

Something else I want to highlight there is:

julia> using LoopVectorization

julia> function test3(P,M)
          r = zeros(size(P,1))
          @turbo for j in axes(P,2)
              for k in axes(P,1)

that you used axes instead of 1:size.
axes is much better and should be preferred as a rule. The two main advantages: support for OffsetArrays and avoiding bugs like the one this thread is about!
(Also pointed out by JonasWickman above.)

DNF · March 9, 2022, 9:13pm

While your main point stands, you should never use size in places like this. Always axes or eachindex or pairs.

ForceBru · March 9, 2022, 9:14pm

True, I actually didn’t know about axes

goerch · March 9, 2022, 9:19pm

Hm, corresponds to my intuition. But can we eliminate the special cases for

dims(t::T) where {T} = 0
dims(t::Array{T, N}) where {T, N} = N

@show dims(1)
@show dims([1;])
@show dims([1;;])
@show dims([1;;;])

then? Also

@show Array{Int, 0}(1)
@show Int <: Array{Int, 0}

are questionable…

ForceBru · March 9, 2022, 9:27pm

Yes, this looks totally reasonable. However, it seems strange from the point of view of… programming, I guess? I’m used to iterating over collections, but a single number isn’t really a collection, as opposed to an array that contains one number. At the same time, it kind of is a collection because of the scalar/vector/matrix/tensor analogy.

Other languages (excluding Julia and R as mentioned in one of my previous comments) don’t treat numbers or other primitive types as collections of one element, and this makes perfect sense to me.

Python:

>>> for i in 1:
...  ...
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable

It explicitly tells me that it doesn’t make sense to iterate over a single integer.

Rust:

fn main() {
    for i in 1 {
        println!("hey");
    }
}

Error message:

error[E0277]: `{integer}` is not an iterator
 --> src/main.rs:2:14
  |
2 |     for i in 1 {
  |              ^ `{integer}` is not an iterator
  |
  = help: the trait `Iterator` is not implemented for `{integer}`
  = note: if you want to iterate between `start` until a value `end`, use the exclusive range syntax `start..end` or the inclusive range syntax `start..=end`
  = note: required because of the requirements on the impl of `IntoIterator` for `{integer}`

It also explicitly tells me that it doesn’t make sense to iterate over an integer.

Mason · March 9, 2022, 9:27pm

goerch:

Hm, corresponds to my intuition. But can we eliminate the special cases for
dims(t::T) where {T} = 0
dims(t::Array{T, N}) where {T, N} = N

@show dims(1)
@show dims([1;])
@show dims([1;;])
@show dims([1;;;])

I don’t really understand the point you’re trying to make here. Could you elaborate?

Why would this be true? We have lots of zero dimensional containers. E.g. Ref:

julia> ndims(Ref(1))
0

julia> size(Ref(1))
()

julia> [x for x in Ref(1)]
0-dimensional Array{Int64, 0}:
1

yet it’s not an Array

julia> Ref{Int} <: Array{Int, 0}
false

Mason · March 9, 2022, 9:30pm

ForceBru:

Rust:

fn main() {
    for i in 1 {
        println!("hey");
    }
}

Error message:

error[E0277]: `{integer}` is not an iterator
 --> src/main.rs:2:14
  |
2 |     for i in 1 {
  |              ^ `{integer}` is not an iterator
  |
  = help: the trait `Iterator` is not implemented for `{integer}`
  = note: if you want to iterate between `start` until a value `end`, use the exclusive range syntax `start..end` or the inclusive range syntax `start..=end`
  = note: required because of the requirements on the impl of `IntoIterator` for `{integer}`

It also explicitly tells me that it doesn’t make sense to iterate over an integer.

Well, Rust doesn’t tell you that it doesn’t make sense, it tells you that this is an error because the Iterator trait hasn’t been implemented for {integer}. In julia, we have implemented that trait.

julia> Iterators.IteratorEltype(Int)
Base.HasEltype()

julia> iterate(1)
(1, nothing)

I guess I just don’t find the argument persuasive that we should do what other languages do here. If you multiply two matrices in Python and a bunch of other languages with * it does elementwise multiplication, which is frankly insane IMO.

goerch · March 9, 2022, 9:32pm

This doesn’t work without dims(t::T) where {T} = 0 so numbers are not zero dimensional arrays…
OTOH zero dimensional array are at least not syntactically invalid, so how to instantiate one and what does that mean?

Mason · March 9, 2022, 9:33pm

Numbers aren’t zero dimensional arrays because they’re not arrays. Arrays are not the only containers in the language.

They are zero dimensional objects though.

help?> ndims
search: ndims RoundingMode ENDIAN_BOM

  ndims(A::AbstractArray) -> Integer


  Return the number of dimensions of A.

  See also: size, axes.

  Examples
  ≡≡≡≡≡≡≡≡≡≡

  julia> A = fill(1, (3,4,5));

  julia> ndims(A)
  3


julia> ndims(1)
0

Mason · March 9, 2022, 9:40pm

I don’t think there’s any real intended way to make them, but here’s one thing you can do: make a list comprehension iterating over a number!

julia> A = [x for x in 1]
0-dimensional Array{Int64, 0}:
1

Now you have a zero dimensional array containing that number, and you can do whatever you want with it:

julia> A[1]
1

julia> ndims(A)
0

julia> A[1] = 2;

julia> A
0-dimensional Array{Int64, 0}:
2

PetrKryslUCSD · March 9, 2022, 9:48pm

But

julia> a = 1
1                                                                 
                                                                  
julia> axes(a, 1)
Base.OneTo(1)                                                     
                                                                  
julia>

PetrKryslUCSD · March 9, 2022, 9:51pm

Does axes really help?

julia> a = 1
1       
julia> for k in size(a, 1)
       println(k)
       end
1                                                                 
                                                                  
julia> for k in axes(a, 1)                                        
       println(k)                                                 
       end                                                        
1                                                                 
                                                                  
julia>

Topic		Replies	Views
How do you iterate over a vector of ranges of different lengths? General Usage	2	323	April 6, 2023
Why is the following behaviour not a bug? General Usage	19	1473	November 2, 2020
Iterating over a DataFrame New to Julia iterative , dataframes , function	2	708	May 26, 2021
Iterators, collections, arrays General Usage question	8	3097	July 16, 2017
Why is length of integers = 1? New to Julia	3	456	March 26, 2019

Rant: I hate that it's possible to iterate over an integer

Related topics