Alternative proposal to getindex pun for typed array literals


#1

I can’t remember where this was discussed, but I do recall someone stating that Int32[1, 2, 3] was one of the “worst puns” in the Julia language (being getindex on a type). But now with the “dot-call” syntax, could we get

Int32.([1,2,3])

lowered to

Base.vect(Int32(1), Int32(2), Int32(3))

and similarly for the vcat, hcat and hvcat cases? (the broadcast would need to nest in those cases, I suppose).

I know it’s a bit more typing, but the meaning is clearer, IMO.
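
For reference, here is a small sketch comparing the three spellings mentioned above. It assumes a Julia 1.x session; Base.vect is an unexported Base function, and the variable names are only illustrative.

```julia
# Current typed-literal syntax (getindex on a type).
a = Int32[1, 2, 3]

# Dot-call spelling: builds a Vector{Int} first, then broadcasts Int32 over it.
b = Int32.([1, 2, 3])

# What the proposed lowering would produce directly, with no intermediate array.
c = Base.vect(Int32(1), Int32(2), Int32(3))

println(a == b == c)                                  # same values
println(typeof(a), " ", typeof(b), " ", typeof(c))    # all Vector{Int32}
```

All three give the same result here; the difference is only in how much work happens at run-time.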


#2

And how do you handle empty arrays? T[] was an easy thing to push! into. I guess Vector{T}() is fine.
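
A short sketch of the empty-array case, assuming Julia 1.x (the variable names are illustrative):

```julia
# Typed empty literal: an empty Vector{Int32} that is ready for push!.
v = Int32[]
push!(v, 1)          # the Int 1 is converted to Int32 on insertion

# The constructor spelling gives the same kind of container.
w = Vector{Int32}()
push!(w, 1)

println(v == w)                  # true
println(eltype(v), " ", eltype(w))
```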


#3

Right, so have the lowering of T.([]) go to Vector{T}()?


#4

Both Int32.([1,2,3]) and Int32.([]) already work, are you proposing an optimized code generation for those cases?

BTW, the syntax Int.[1,2,3] seems to be free, maybe it could be useful for something…


#5

Yes, they already work, and yes I am suggesting optimal code generation via lowering, but I am really suggesting to deprecate Int[1,2,3] entirely.

There are already two competing meanings for Int.[1,2,3]. One being a broadcasted getindex (it makes sense if you replace Int with an array of arrays), or another involving a new interesting indexing scheme which broadcasts the indices themselves (see https://github.com/JuliaLang/julia/issues/2591).
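
To make the two readings concrete, here is a sketch written with syntax that exists today (Julia 1.x assumed). The `X.[i]` form itself is not valid syntax, so both meanings are spelled out with getindex and a dot call; the arrays are made-up examples.

```julia
# An array of arrays standing in for `Int` in the hypothetical `Int.[1,2,3]`.
A = [[10, 20, 30], [40, 50, 60]]

# Reading 1: broadcast getindex over the outer array,
# applying the same index to every inner array.
second_elements = getindex.(A, 2)

# Reading 2: broadcast over the indices themselves;
# Ref(v) keeps v fixed while the index varies.
v = [10, 20, 30]
picked = getindex.(Ref(v), [1, 3])

println(second_elements)   # [20, 50]
println(picked)            # [10, 30]
```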


#6

Now do you change lowering?


#7

Now do you change lowering?

Sorry, @yuyichao, I don’t quite understand what your question is asking.

So far, I am only making proposals. If people are receptive to this idea, maybe I would have time to make some changes to lowering and open a PR, but really I have zero experience in Lisp or Julia’s lowering codebase.

Either way, to implement this for the standard typed-vector notation (Float64[1,2,3]) we would (a) add deprecation warnings to getindex{T}(::Type{T}, X...) in Base and (b) have the parsing/lowering steps perform the broadcast on literal vectors explicitly, so that we don’t first construct [1,2,3] at run-time and then perform broadcast(Float64, [1,2,3]). Step (b) is just an optimization to avoid performance regressions.

Similar steps would have to be done for Float64[1; 2; 3], Float64[1 2 3] and Float64[1 2; 3 4] (so the user syntax becomes Float64.([1; 2; 3]), etc).
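
As a sketch of what is involved, the four typed literal forms can be compared with the Base functions they currently correspond to and with the dot-call spelling. This assumes Julia 1.x; Base.typed_vcat, Base.typed_hcat and Base.typed_hvcat are unexported internals, so treat the exact calls as an illustration rather than a stable API.

```julia
# Typed vector: getindex on the type.
v1 = Float64[1, 2, 3]
v2 = getindex(Float64, 1, 2, 3)
v3 = Float64.([1, 2, 3])

# Vertical concatenation.
c1 = Float64[1; 2; 3]
c2 = Base.typed_vcat(Float64, 1, 2, 3)
c3 = Float64.([1; 2; 3])

# Horizontal concatenation (a 1x3 matrix).
h1 = Float64[1 2 3]
h2 = Base.typed_hcat(Float64, 1, 2, 3)
h3 = Float64.([1 2 3])

# 2x2 block form; (2, 2) gives the number of entries in each row.
m1 = Float64[1 2; 3 4]
m2 = Base.typed_hvcat(Float64, (2, 2), 1, 2, 3, 4)
m3 = Float64.([1 2; 3 4])

println(v1 == v2 == v3, " ", c1 == c2 == c3, " ", h1 == h2 == h3, " ", m1 == m2 == m3)
```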


#8

I mean are you going to change the meaning of f.([a, b, c])?


#9

One might argue that if

f.([a, b, c]) == broadcast(f, [a, b, c]) == [f(a), f(b), f(c)]

does not hold, something is broken anyway, hence this change of lowering might be an acceptable optimization.
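
A quick check of that identity for a concrete-valued function, assuming Julia 1.x (f, a, b and c are placeholders chosen for illustration):

```julia
f(x) = 2x + 1
a, b, c = 1, 2, 3

dotted     = f.([a, b, c])
broadcastd = broadcast(f, [a, b, c])
explicit   = [f(a), f(b), f(c)]

println(dotted == broadcastd == explicit)   # true
println(typeof(dotted) == typeof(explicit)) # true for this concrete f
```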


#10

I mean are you going to change the meaning of f.([a, b, c])?

I think the meaning will be preserved (as @martinholters points out), but the lowering will be different.

EDIT: and yes, f doesn’t have to be a type, nor do a, b, c have to be literal values - the transformation should hold in general.


#11

That argument does not apply to splatting and concatenation.


#12

True, splatting couldn’t be handled this way.

Could you expand what you mean by concatenation here?


#13

I mean

These are not simple array constructions, so it’s less clear whether this translation is always correct (i.e. you can’t easily lower f.([1; 2; 3]) to typed_vcat(f, 1, 2, 3)).

Similarly

is problematic.

Finally, now that I’m more awake… the argument doesn’t hold in general.

[f(a), f(b), f(c)] isn’t the correct thing to compare.

Examples of problematic cases include

julia> Integer.([1])
1-element Array{Int64,1}:
 1

julia> Any.([1])
1-element Array{Int64,1}:
 1

julia> Integer[1]
1-element Array{Integer,1}:
 1

julia> Any[1]
1-element Array{Any,1}:
 1
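
A sketch of why the element-type difference matters in practice, assuming Julia 1.x (variable names are illustrative): the container's element type decides what happens to values pushed into it later.

```julia
v = Integer[1]       # element type is the abstract type Integer
w = Integer.([1])    # broadcast narrows the element type to Int

push!(v, 0x01)       # stored as-is: a UInt8 is an Integer
push!(w, 0x01)       # converted to Int on insertion

println(eltype(v), " ", typeof(v[end]))
println(eltype(w), " ", typeof(w[end]))
```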

In general, I don’t really find using getindex for typed array construction problematic. For syntax, it’s consistent with other array construction/concatenation related syntax. For the implementation, it’s generic (i.e. it doesn’t have the issue of only having a vectorized version for a subset of functions) and doesn’t conflict with anything else (especially now that a Tuple type is not a Tuple anymore), so I don’t see a need to change it.


#14

Oh, yes, of course, I stand corrected.


#15

I agree, the proposed syntax does not behave like the current one for abstract types. My style has always been to use Vector{T}() or Vector{T}(N) when I care about T anyway…

These are not simple array constructions, so it’s less clear whether this translation is always correct (i.e. you can’t easily lower f.([1; 2; 3]) to typed_vcat(f, 1, 2, 3)).

For vcat, hcat and hvcat, as I said earlier, the broadcast would have to be recursive: e.g. T[a; b; c] would be lowered to vcat(broadcast(T, a), broadcast(T,b), broadcast(T,c)) (or similar). I suppose that this would involve extra copies… but AFAICT it should be possible to maintain the current behavior.
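
A minimal sketch of the recursive form for the vcat case, assuming Julia 1.x. The values of a, b and c are made up, and this only illustrates the idea, not an actual lowering:

```julia
T = Float64
a = [1, 2]      # an array block
b = 3           # a scalar block
c = [4, 5]      # another array block

# Current typed concatenation.
current = T[a; b; c]

# Recursive form: broadcast T over each block, then concatenate.
# broadcast(T, b) on a scalar returns a scalar, which vcat accepts.
recursive = vcat(broadcast(T, a), broadcast(T, b), broadcast(T, c))

# Applying T to each block directly, as in the plain vector case, fails
# because there is no Float64 constructor taking a vector.
naive_fails = try
    T(a)
    false
catch err
    err isa MethodError
end

println(current == recursive, " ", typeof(current) == typeof(recursive), " ", naive_fails)
```

The intermediate arrays created by broadcast(T, a) and broadcast(T, c) are the extra copies mentioned above.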

BTW I’m not convinced at all that this is a good idea - I was just thinking about it after thinking about my own pun on the pun of getindex on types here and was curious what the future of this syntax might be.