Consistency of eye vs. zeros, ones, rand functions

question

#1

Hi,
I’d like to understand better the design of these standard array creation functions, as they seem somewhat inconsistent to me.

Compare these examples:

a = rand(2);
sz = size(a);
# 1 #
o = ones(a); # 2-element Array{Float64,1} but...
id = eye(a); # ...an error here
# 2 #
o2 = ones(sz...); # 2-element Array{Float64,1} but...
id2 = eye(sz...); # ... 2×2 Array{Float64,2}

It is reasonable, that 2x1 identity matrix doesn’t exist, unless that:

b = rand(2,1);
# 3 #
o3 = eye(b); # 2×1 Array{Float64,2}:
# 4 #
o4 = eye(size(b)...); # 2×1 Array{Float64,2}:

So, 3 of 4 molding results of eye() are different, while the ones(), for instance, produces predictable and consistent results.
Is it a bug or a feature?


#2

When you call eye(sz...), you are calling eye(::Int), which gives you a square matrix. Withl eye(b) and eye(size(b)...), you provide the dimensions, and call eye(::Int,::Int) (indirectly in the first case, picking the dimensions from b).

Since eye(::Int) is a much more common use case, this interface is convenient. Note that ones / zeros can provide arbitrary arrays, while eye always gives a matrix. So the two interfaces don’t need to be consistent.


#4

In terms of matrices, I’d expect both eye(::Int) zeros(::Int) to produce the identity matrix and the all-zeros matrix, but not a matrix and a vector (to create block matrices, for instance).

And still. if eye() creates a 2x1 array, why doesn’t it make a 2-element array with the same contents, as rand() or ones() do?

I’d expect all common matrix generators to behave similarly, e.g. Matlab/Octave, where rand(n) creates an nxn matrix.

Moreover, it would make possible creating higher-order functions (to create the same block matrices as as example) without converting arguments for eye(), rand() etc.


#5

Letting zeros(n) etc. create a matrix would be wildly inconsistent with the rest of Julia. The odd man out here is eye, which seems to be special cased because it can only ever be a matrix.

It is bad enough that Matlab does this, but it can sort of be understood because everything is a matrix in Matlab.


#6

Then your expectations are incorrect. Read the manual. In particular, zeros and ones produce general arrays (as I said above).

But ones and zeros are not matrix generators. They generate arrays. The only one that generates matrices is eye, that’s why it is different.

For matrices, this is already possible, just call all of them with the (::Int, ::Int) signature. For other arrays, eye does not make sense, or to be more precise, there is no generally used analogue to the identity matrix.


#7

Could you explain or refer to a source, why is it inconsistent, does any function by default returns a column vector?

it can only ever be a matrix

That’s also not very clear from the content, not type, point of view. An n-element vector and an nx1 matrix are treated very similarly even in Julia: B*rand(5,1)-B*rand(5) works fine, but eye(rand(5)) throws an error, even not a square matrix.


#8

It’s inconsistent in the sense that Julia creates general arrays, with n dimensions.

The number of shape parameters supplied to the constructor, as in zeros(n1, n2, n3, ...) dictates the number of dimensions in the array. So zeros(4,3,5) is three-dimensional, zeros(4,3) is two-dimensional, and zeros(4) is one-dimensional. You are basically arguing that 2 should be the minimum number of dimensions, like in Matlab, which is something I disagree with. If zeros(4) should be a matrix, you are essentially removing the concept of a one-dimensional vector from the language.

This is just how it is in linear algebra. Matrix-vector products work the same as a matrix-(n-by-1) matrix product. (Edit: except, note that the former produces a matrix while the latter produces a vector.)

I think you might have a case for arguing that eye should behave in a slightly different way, but there’s no chance at all zeros(5) will ever return a matrix.


#9

Actually, I do not. Sorry, I’ve explained badly. rand(n) resulting an nxn matrix is just an example. Similarly, every generate_something(n) may produce an n-element vector.

What are the advantages of this design, if it behaves very similar to nx1 arrays (or can be substituted by them)?

This is what I mean, in fact. Something like that eye() should not throw exceptions and be more similar to the majority in this specific case just to generate e1 n-element vector, which consists of rand(), zeros() etc.


#10

I would expect that in some imaginary example, I do not expect it. I wanted to emphasize consistency, not the actual syntax.

According to the manual, a matrix is just an aliased array typealias Matrix{T} Array{T,2}, so I don’t understand the great difference between them because…

I can be defined formally for m dimensions as element(i,j,...l,m) = kronecher_delta(i,j,...,l,m) and eye() can return an nx1 matrix giver an n-elements vector, resulting the same signature as zeros() have.


#11

There are many discussions about this, and I’m not equipped to go into all of it. There are deep reasons having to do with the mathematical differences between vectors and matrices, but there are also practical reasons.

The most obvious reason, to me, is that when writing my code, I always know whether I am dealing with a matrix or a vector, in that they are distinguishable at the type level. If vectors were simply a special case of a matrix, then my code would always have to explicitely check if the matrix has exactly one column, while in Julia this is handled by dispatch.

In Matlab (which I use all the time) it’s a continual headache that vectors are sometimes columns and sometimes rows, and my functions always need boilerplate where I ensure correct and compatible orientation, how to handle matrices etc. Still Matlab frequently frustrates my efforts by throwing row-vectors at me when I let down my guard (linspace and 1:n do this for example, even though Matlab is supposedly column-major!?!!)

I can assure you that in everyday, practical work, it’s a Yuge relief to deal with real vectors, instead of the pseudo-vector matrices I’m unfortunately used to.


#12

I think he means that they are not specifically matrix generators. It’s like you were complaining that the Ferrari generator is not giving you the correct paint job, and then someone points out it’s not a Ferrari generator, but a generic car generator, so it will not automatically paint the car red.


#13

The core issue here seems to be that eye(rand(n)) doesn’t work. It could, of course, be made to work. However, there’s an ambiguity about what it should do:

  1. Since eye(m,n) produces an m×n matrix and eye(n) produces an n×n matrix, one could argue that by analogy eye(rand(n)) should produce an n×n identity matrix.

  2. Since vectors and column matrices are largely similarly behaved, one can also argue that eye(rand(n)) should behave the same as eye(rand(n,1)) and therefore produce a length n vector whose first element is one and the rest zeros.

Since someone might reasonably expect either behavior, it’s better to leave this method undefined and require the user to be explicit about what they want.


#14

Could you provide a code snippet to illustrate your idea?


#15

It is just a choice to make, as someone has selected, let’s say, “non-Matlab” behavior or that rand(n,1) creates a matrix, not a vector. This may also differ from expected.


#16

It seems most consisten, to me at least, that eye(n) and eye(rand(n)) behave one way, and eye(n,1) and eye(rand(n,1)) behave one way.

(Edit: Actually, that came out wrong, and I don’t really see an optimal solution to the above conundrum.)

It’s also sometimes an issue that

julia> zeros((2,2))
2×2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

while

julia> eye((2,2))
MethodError: no method matching eye(::Tuple{Int64,Int64})
Closest candidates are:
  eye{T}(::Type{Diagonal{T}}, ::Int64) at linalg/diagonal.jl:242
  eye(::Type{T}, ::Integer) at array.jl:189
  eye(::Type{T}, ::Integer, ::Integer) at array.jl:182

#17

eye is just awkward. It’s a pretty bad name, and this method signature awkwardness comes directly from it’s Matlabby origins. It’s worth noting that this exact inconsistency also exists in NumPy.

Julia has the special I constant that efficiently represents a uniform scaling diagonal matrix. It’s great… except that it cannot be used as an initial value for a matrix that is subsequently modified. It’d be great if we could use I in some manner for array initialization, and then deprecate eye in favor of that. A few options would be to implement I[1:5,1:5] or A .= I or maybe even I(5) (none of which currently work).


#18

Neither it can be a block in a matrix.


#19

It will be able to on 0.6!

julia> [2*ones(3)  I
        3*ones(3)' 4]
4×4 Array{Float64,2}:
 2.0  1.0  0.0  0.0
 2.0  0.0  1.0  0.0
 2.0  0.0  0.0  1.0
 3.0  3.0  3.0  4.0