Hi; I’m a Julia beginner myself, so this is also more like a question if this macro is any good…?
macro ntplaccess(sym, idx)
elements = []
for i in (@eval $idx)
push!(elements, :($sym[$i]))
end
return :( ($(elements...),) )
end
f(y) = @ntplaccess y 1:4
And if you want this to work for arbitrary iterable sequences of indices (not just ranges like i:j), then I don’t know how a macro could evaluate this sequence (assuming it is const) to generate the code, without resorting to using eval or the like.
Base.@propagate_inbounds function load_ntup(x, offset, ::Val{N}) where N
@boundscheck checkbounds(x, offset:offset+N)
return ntuple(i -> (@inbounds x[i+offset]), N)
end
That can become a single vector-load instruction (also for extracting a tuple from an Array). Boundschecks inside the loop prevent simd, and hence it is better to do them outside.
Use esc(sym). You can see in your code_warntype that it’s not using the y that’s the argument to f but instead using Main.y. This is macro hygiene.
Don’t @eval idx. Instead, manually dig into the AST of 1:4 and grab out the endpoints manually. Of course, you’ll want to add error handling to ensure callers don’t give you want you don’t want. The Meta module can be helpful as you explore what the AST looks like:
Thanks everyone! Playing around with the answers, I tried:
f1(x) = (x[4], x[5], x[6], x[7])
f2(x) = ntuple(i -> x[i + 3], 4)
Base.@propagate_inbounds function f3(x, offset, ::Val{N}) where N
# this line causes a runtime error
# @boundscheck checkbounds(x, offset:offset+N)
return ntuple(i -> (@inbounds x[i+offset - 1]), N)
end
f4(x, from, to) = ntuple(i -> x[i + from - 1], to - from + 1)
f5(x, from, to) = ntuple(i -> (@inbounds x[i + from - 1]), to - from + 1)
f6(x, ::Val{from}, ::Val{to}) where from where to = ntuple(i -> x[i + from - 1], to - from + 1)
macro ntplaccess(sym, idx)
elements = []
for i in (idx.args[2]:idx.args[3])
push!(elements, :($(esc(sym))[$i]))
end
return :( ($(elements...),) )
end
@ntplaccess(x, 1:4)
f7(x) = @ntplaccess x 4:7
where f1, f2, f6, and f7 give the best @code_warntype:
I like the macro version. Why? Because you can also check (in the macro) which type of indexing is used. That is, one could call the macro with a single index, range indexing, an array of indices, …
Then I was thinking; why specify the symbol and the indexing as two arguments? There’s the @view macro, which you can also use as @view x[1:3]. I didn’t look up how it is implemented but wanted to come up with my own solution (as an exercise), and it works!
macro ntplaccess(expr)
sym = expr.args[1] # the symbol
idx = expr.args[2] # the index, or the indexing expression
if typeof(idx) <: Int
# single index
return :($(esc(sym))[$idx])
else
# range, array, tuple
elements = []
if idx.head == :call && idx.args[1] == :(:)
# range
for i in (idx.args[2]:idx.args[3])
push!(elements, :($(esc(sym))[$i]))
end
elseif idx.head == :tuple || idx.head == :vect
# tuple or array
for i in idx.args
push!(elements, :($(esc(sym))[$i]))
end
end
return :( ($(elements...),) )
end
end
# can do...
f(y) = @ntplaccess y[1]
f(y) = @ntplaccess y[[3,5]]
f(y) = @ntplaccess y[(1,4,7)]
f(y) = @ntplaccess y[1:4]
I would not worry too much about enclosing (from, to) indices in value types, since I would expect that these constructs would always (or at least most of the time) be used in larger functions, with possibly variable data, but always the same indices.
Say you define f4 in the same way as above:
f4(x, from, to) = ntuple(i -> x[i + from - 1], to - from + 1)
Of course a top-level call is not type stable:
julia> @code_warntype f4(x, 1, 4)
Body::Tuple{Vararg{UInt8,N} where N}
[...]
But as soon as you enclose it in a larger function, if indices arguments are constant (as is the case above and in all examples in this thread so far), then constant propagation simplifies everything, and you’re type-stable again:
From there, regular compiler optimizations further simplify the code and essentially remove all forms of integer operations for indices so that what you get in the end mostly performs as good as the by-hand tuple construction:
I tend to like macros too. Still, this is probably not what I would do here: the polymorphism you introduce in your macro is essentially a manual implementation of some sort of dispatching depending on the index type.
Therefore, I would probably define a function to perform what I need, with multiple methods depending on the index type. For example, this is what a minimal implementation for unit ranges could look like:
function ntpl(x, idx::UnitRange)
from = idx.start
to = idx.stop
ntuple(i->x[i+from-1], to - from + 1)
end
julia> f(x) = ntpl(x, 2:4)
f (generic function with 1 method)
julia> @code_warntype f(x)
Body::Tuple{UInt8,UInt8,UInt8}
[...]
julia> @btime f($x)
0.024 ns (0 allocations: 0 bytes)
(0x02, 0x03, 0x04)
For tuples of indices, the implementation could be:
function ntpl(x, idx::Tuple)
ntuple(i->x[idx[i]], length(idx))
end
julia> g(x) = ntpl(x, (2,3,4))
g (generic function with 1 method)
julia> @code_warntype g(x)
Body::Tuple{UInt8,UInt8,UInt8}
[...]
julia> @btime g($x)
0.024 ns (0 allocations: 0 bytes)
(0x02, 0x03, 0x04)
And a new implementation could be added for each useful type of indices set (StepRanges and so on).
If you want to have a bracket-like syntax (without overloading the standard Base.getindex) you can add then implement only the syntax-to-syntax transformation in a macro:
macro ntpl(expr)
sym = expr.args[1]
idx = expr.args[2]
quote
ntpl($(esc(sym)), $(esc(idx)))
end
end
julia> @macroexpand @ntpl x[2:4]
quote
#= REPL[25]:5 =#
(Main.ntpl)(x, 2:4)
end
julia> h(x) = @ntpl x[2:4]
h (generic function with 1 method)
julia> @code_warntype h(x)
Body::Tuple{UInt8,UInt8,UInt8}
[...]
julia> @btime h($x)
0.024 ns (0 allocations: 0 bytes)
(0x02, 0x03, 0x04)
This has the advantage (IMO) that specific implementations of the NTuple extraction can be added for many types of indices, without having to change the macro itself.
julia> using Unrolled
[ Info: Precompiling Unrolled [9602ed7d-8fef-5bc8-8597-8f21381861e8]
julia> @code_warntype (1,2,3)[@fixed_range 2:3]
Body::Tuple{Int64,Int64}
1 ─ %1 = (Base.getfield)(seq, 2, true)::Int64
│ %2 = (Base.getfield)(seq, 3, true)::Int64
│ %3 = (Unrolled.tuple)(%1, %2)::Tuple{Int64,Int64}
└── return %3
help?> @fixed_range
@fixed_range 3:10 behaves like the standard range 3:10, but is stored within
the type system, so that some_tuple[@fixed_range 3:10] is type-stable. Also
supports some_tuple[@fixed_range 3:end-5]
Note that timings this low are benchmarking artifacts. It occurs when the expressions are replaced with constants during compilation, and you end up benchmarking nothing at all. You can tell that this is the case, since one CPU clock cycle is ~0.3 ns on a 3 GHz computer, so 0.017 ns would be a tiny fraction of a clock cycle, which is not realistic.