Alternative proposal to getindex pun for typed array literals

I can’t remember where this was discussed, but I do recall someone stating that Int32[1, 2, 3] was one of the “worst puns” in the Julia language (being getindex on a type). But now with the “dot-call” syntax, could we get

Int32.([1,2,3])

lowered to

Base.vect(Int32(1), Int32(2), Int32(3))

and similarly for the vcat, hcat and hvcat cases? (the broadcast would need to nest in those cases, I suppose).
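For reference, a minimal sketch comparing the three spellings side by side. It assumes Julia 1.x semantics; the variable names are only for illustration.

```julia
# Three ways to spell the same typed vector.
a = Int32[1, 2, 3]                           # getindex on a type (current syntax)
b = Int32.([1, 2, 3])                        # dot-call: builds a Vector{Int}, then broadcasts Int32 over it
c = Base.vect(Int32(1), Int32(2), Int32(3))  # the proposed lowering target

same_values = (a == b == c)
same_eltype = (eltype(a) == eltype(b) == eltype(c) == Int32)
```

The values and element types agree; the difference is that the dot-call form currently allocates the intermediate untyped literal first.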

I know it’s a bit more typing, but the meaning is clearer, IMO.

And how do you handle empty arrays? T[] was an easy thing to push! into. I guess Vector{T}() is fine.

Right, so have the lowering of T.([]) go to Vector{T}()?
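A small sketch of the empty-array case, assuming Julia 1.x (where the zero-argument Vector{T}() constructor is available). It only shows that the two spellings behave the same when pushed into.

```julia
# Empty typed vectors: current literal syntax vs. explicit constructor.
v = Int32[]            # getindex on a type with no values
w = Vector{Int32}()    # explicit constructor

# push! converts the Int literal to the element type in both cases.
push!(v, 1)
push!(w, 1)
```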

Both Int32.([1,2,3]) and Int32.([]) already work. Are you proposing optimized code generation for those cases?

BTW, the syntax Int.[1,2,3] seems to be free, maybe it could be useful for something…

Yes, they already work, and yes, I am suggesting optimized code generation via lowering, but what I am really suggesting is deprecating Int[1,2,3] entirely.

There are already two competing meanings for Int.[1,2,3]. One is a broadcasted getindex (it makes sense if you replace Int with an array of arrays); the other is a new indexing scheme that broadcasts the indices themselves (see https://github.com/JuliaLang/julia/issues/2591).

Now do you change lowering?

Now do you change lowering?

Sorry, @yuyichao, I don’t quite understand what your question is asking.

So far, I am only making proposals. If people are receptive to this idea, maybe I would have time to make some changes to lowering and open a PR, but really I have zero experience in Lisp or Julia’s lowering codebase.

Either way, to implement this for the standard (typed vector: Float64[1,2,3]) notation we would (a) add deprecation warnings to getindex{T}(::Type{T}, X...) in Base and (b) have the broadcast performed on literal vectors explicitly by the parsing/lowering steps, so that we don’t first construct [1,2,3] at run-time and then perform broadcast(Float64, [1,2,3]). Step (b) is just an optimization to avoid regressions.
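To make step (b) concrete, here is a rough sketch of the rewrite done as a macro rather than in the actual lowering pass. The macro name `@literal_broadcast` is hypothetical (nothing like it exists in Base), it assumes Julia 1.x, and it only handles a non-empty literal vector with no splatting; everything else is passed through unchanged.

```julia
# Hypothetical helper for illustration only; not part of Base.
# It mimics, at macro-expansion time, what the lowering step could do.
macro literal_broadcast(ex)
    # f.(x) parses as Expr(:., f, Expr(:tuple, x))
    if ex isa Expr && ex.head === Symbol(".") && length(ex.args) == 2
        fargs = ex.args[2]
        if fargs isa Expr && fargs.head === :tuple && length(fargs.args) == 1
            lit = fargs.args[1]
            # Only rewrite a non-empty literal vector [a, b, c] without splatting.
            if lit isa Expr && lit.head === :vect && !isempty(lit.args) &&
               !any(a -> a isa Expr && a.head in (Symbol("..."), :parameters), lit.args)
                f = esc(ex.args[1])
                # Build f(a), f(b), f(c) so no intermediate array is constructed.
                calls = [:($f($(esc(a)))) for a in lit.args]
                return :(Base.vect($(calls...)))
            end
        end
    end
    return esc(ex)  # anything else keeps its ordinary meaning
end

x = @literal_broadcast Int32.([1, 2, 3])   # expands to Base.vect(Int32(1), Int32(2), Int32(3))
y = @literal_broadcast sqrt.([4.0, 9.0])   # works for any callable, not only types
```

Note that this sketch repeats the expression for f once per element, so it assumes f is a plain name rather than an expression with side effects.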

Similar steps would have to be done for Float64[1; 2; 3], Float64[1 2 3] and Float64[1 2; 3 4] (so the user syntax becomes Float64.([1; 2; 3]), etc).

I mean are you going to change the meaning of f.([a, b, c])?

One might argue that if

f.([a, b, c]) == broadcast(f, [a, b, c]) == [f(a), f(b), f(c)]

does not hold, something is broken anyway, hence this change of lowering might be an acceptable optimization.

I mean are you going to change the meaning of f.([a, b, c])?

I think the meaning will be preserved (as @martinholters points out), but the lowering will be different.

EDIT: and yes, f doesn’t have to be a type nor does a, b, c have to be literal values - the transformation should hold in general.

That argument does not apply to splatting and concatenation.

True, splatting couldn’t be handled this way.
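A quick way to see why splatting is out of reach for a purely syntactic rewrite: in the parsed expression the number of elements simply isn’t visible. The sketch below assumes Julia 1.x expression layout; `f` and `xs` are placeholder names that are quoted and never evaluated.

```julia
# Inspect the parsed form of f.([xs...]).
ex = :(f.([xs...]))
literal = ex.args[2].args[1]   # the [xs...] part, an Expr with head :vect

# The literal holds a single splat node; how many values it expands to
# is only known at run time.
has_splat = any(a -> a isa Expr && a.head === Symbol("..."), literal.args)
n_syntactic_args = length(literal.args)
```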

Could you expand what you mean by concatenation here?

I mean

These are not simple array constructions, so it’s less clear whether this translation is always correct (i.e. you can’t easily lower f.([1; 2; 3]) to typed_vcat(f, 1, 2, 3)).

Similarly

is problematic.

Finally, now that I’m more awake… the argument doesn’t hold in general.

[f(a), f(b), f(c)] isn’t the correct thing to compare.

Examples of problematic cases include

julia> Integer.([1])
1-element Array{Int64,1}:
 1

julia> Any.([1])
1-element Array{Int64,1}:
 1

julia> Integer[1]
1-element Array{Integer,1}:
 1

julia> Any[1]
1-element Array{Any,1}:
 1
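The same difference can be checked programmatically. This is a sketch assuming Julia 1.x (which prints these types as Vector{...} rather than Array{...,1}); it uses Int instead of Int64 so it also holds on 32-bit machines.

```julia
# Calling an abstract type on a value returns the value with its concrete type,
# so the broadcast and Base.vect forms lose the abstract element type.
broadcasted = Integer.([1])           # element type is the concrete Int
via_vect    = Base.vect(Integer(1))   # same: Integer(1) is just 1::Int
typed       = Integer[1]              # getindex on the type keeps eltype Integer

# Only the abstractly typed vector can later hold other Integer subtypes.
push!(typed, big(2))
```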

In general, I don’t really find using getindex for typed array construction problematic. Syntactically, it’s consistent with the other array construction/concatenation syntaxes. Implementation-wise, it’s generic (i.e. it doesn’t have the issue of only having vectorized versions for a subset of functions) and doesn’t conflict with anything else (especially now that a Tuple type is no longer a Tuple), so I don’t see a need to change it.

Oh, yes, of course, I stand corrected.

I agree, the proposed syntax does not behave like the current one for abstract types. My style has always been to use Vector{T}() or Vector{T}(N) when I care about T anyway…

These are not simple array constructions, so it’s less clear whether this translation is always correct (i.e. you can’t easily lower f.([1; 2; 3]) to typed_vcat(f, 1, 2, 3)).

For vcat, hcat and hvcat, as I said earlier, the broadcast would have to be recursive: e.g. T[a; b; c] would be lowered to vcat(broadcast(T, a), broadcast(T, b), broadcast(T, c)) (or similar). I suppose that this would involve extra copies… but AFAICT it should be possible to maintain the current behavior.
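A small sketch of what I mean for the vcat case, assuming Julia 1.x and a concrete T (Float64 here); the values of a, b and c are made up for illustration. Base.typed_vcat is what T[a; b; c] currently lowers to.

```julia
# Mixed vector and scalar blocks, as allowed in concatenation syntax.
a, b, c = [1, 2], 3, [4, 5]

current  = Float64[a; b; c]                      # typed concatenation (current syntax)
explicit = Base.typed_vcat(Float64, a, b, c)     # what the line above lowers to
# Recursive form: broadcast T over each block, then concatenate.
# broadcast over the scalar b returns a scalar, which vcat accepts.
proposed = vcat(broadcast(Float64, a), broadcast(Float64, b), broadcast(Float64, c))
```

The results match here, but the recursive form allocates a converted copy of each block before concatenating, which is the extra-copies cost mentioned above.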

BTW I’m not convinced at all that this is a good idea - I was just thinking about it after thinking about my own pun on the pun of getindex on types here and was curious what the future of this syntax might be.