# N-dimensional implementations

I am trying to generalize an algorithm for N-dimensional arrays, but there are many code patterns for which I don’t have a solution yet. I will use this thread to ask a few questions.

How do you generalize the following code to work with N-dimensional arrays as opposed to just 3D arrays?

```julia
activated[gridsize[1]+1:padsize[1],:,:] .= false
```

I tried to make use of the functionality in `Base.Cartesian`, like `@nref`, but without success.

Any program is a bunch of (parsed) strings!
Edited: sloppy solution that no one should use, but it’s fun to write.

```julia
N = 8
for i in 1:N
    println("activated[" * ":,"^(i-1) * "gridsize[$i]+1:padsize[$i]" * ",:"^(N-i) * "] .= false")
end
```

prints (first of the eight lines shown)

```julia
activated[gridsize[1]+1:padsize[1],:,:,:,:,:,:,:] .= false
```

So you only need to “make it code”:

```julia
N = 8
your_code = []
for i in 1:N
    push!(your_code, Meta.parse("activated[" * ":,"^(i-1) * "gridsize[$i]+1:padsize[$i]" * ",:"^(N-i) * "] .= false"))
end
```
```julia
function foo(gridsize, padsize, activated::Array{Bool,N}) where {N}
    for k = 1:N
        g = gridsize[k]
        pref = ntuple(i -> Colon(), k-1)
        suff = ntuple(i -> Colon(), N-k)
        v = view(activated, pref..., g+1:padsize[k], suff...)
        v .= false
    end
end
```

If this is too slow, your function might need to become `@generated`.
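A sketch of what such a `@generated` variant might look like (`foo_gen!` is a made-up name, not from this thread): the loop over dimensions runs at compile time, emitting one `view(...) .= false` statement per dimension.

```julia
# Hypothetical @generated version: the loop over k runs when the method is
# compiled for a given N, so the generated body is fully unrolled.
@generated function foo_gen!(activated::AbstractArray{Bool,N}, gridsize, padsize) where {N}
    body = Expr(:block)
    for k in 1:N
        ix = Vector{Any}(undef, N)
        fill!(ix, :(Colon()))                  # a colon in every slot...
        ix[k] = :(gridsize[$k]+1:padsize[$k])  # ...except dimension k
        push!(body.args, :(view(activated, $(ix...)) .= false))
    end
    push!(body.args, :(return activated))
    return body
end
```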


Even if you are doing code generation, don’t use strings. Work directly with ASTs.


It’s on my todo list to learn how to do that!

`CartesianIndices` is what you want here. That allows you to specify a slice of an arbitrary multidimensional array and iterate over it very efficiently.

```julia
function myfunc(activated::AbstractArray{Bool,N}, gridsize, padsize) where {N}
    ax = axes(activated)
    for d = 1:N
        axd = ntuple(i -> i == d ? (gridsize[d]+1:padsize[d]) : ax[i], Val{N}())
        activated[CartesianIndices(axd)] .= false
    end
end
```

Note that by making `N` a type parameter in this way and using `Val{N}` in the `ntuple` constructor, I ensure that the dimensionality is known at compile-time, allowing the compiler to emit highly optimized code.


Thank you @stevengj, will `Val{N}` be needed in future versions of the language? Shouldn’t the compiler be able to infer `N` directly?

That kind of code is wonderful but also kind of hard to read. Any good syntactic sugar that could be used here? Something like `@NDColon activated[1:d, ind, d+2:N]` or similar?

edit: hmm, that’s really bad notation in case `ind` also has colons. `..` is available and has a nice precedence, so `@.. activated[1..d-1, gridsize[d]+1:padsize[d], d..N] .= false` could plausibly work.


+1 for syntactic sugar; code can get quite unreadable quickly in N dimensions.

So I’d like to know:

It is not so rare that I have loops over a compile-time-known number of elements, but where the types of local variables change across iterations, like `axd` in your case.

Is there a sensible way of telling Julia that a loop should be statically unrolled before inference? My current standard approach is to make it all `@generated`, but that is really, really ugly.

In your code with `N=4` we infer `axd::NTuple{4,Union{OneTo{Int64}, UnitRange{Int64}}}`, which is afaik heap-allocated. Whereas in statically unrolled loops, it would be type-stable and stack-allocated.

Afterwards, constant propagation is normally good enough to not need `Val{N}()`, but you always need to read `@code_warntype`.

```julia
julia> function myfunc(activated::AbstractArray{Bool,N}, gridsize, padsize) where {N}
           ax = axes(activated)
           for d = 1:N
               axd = ntuple(i -> i == d ? (gridsize[d]+1:padsize[d]) : ax[i], Val{N}())
               activated[CartesianIndices(axd)] .= false
           end
           nothing
       end

julia> function myfunc_ur(activated::AbstractArray{Bool,4}, gridsize, padsize)
           # manually unrolled version of myfunc (body omitted here)
           nothing
       end

julia> function foo(activated::Array{Bool,N}, gridsize, padsize) where {N}
           for k = 1:N
               g = gridsize[k]
               pref = ntuple(i -> Colon(), k-1)
               suff = ntuple(i -> Colon(), N-k)
               v = view(activated, pref..., g+1:padsize[k], suff...)
               v .= false
           end
       end

julia> activated = rand(Bool, (2,2,2,2)); gridsize = [1,1,1,1]; padsize = [2,2,2,2];

julia> using BenchmarkTools

julia> @btime myfunc_ur($activated, $gridsize, $padsize)
  164.119 ns (4 allocations: 320 bytes)

julia> @btime myfunc($activated, $gridsize, $padsize)
  4.207 μs (36 allocations: 1.25 KiB)

julia> @btime foo($activated, $gridsize, $padsize)
  5.275 μs (37 allocations: 1.31 KiB)
```

But the fast version needs to be `@generated`, which is in practice too much of a hassle. Some way of static unrolling would be pretty nice for that.


An `ntuple` loop with a `Val` length is statically unrolled, and the heterogeneous tuple type should be statically inferred and not heap allocated.

For example:

```julia
julia> foo(n) = ntuple(i -> i < 4 ? i : string(i), n)
foo (generic function with 1 method)

julia> foo(Val{7}())
(1, 2, 3, "4", "5", "6", "7")

julia> using Test

julia> @inferred foo(Val{7}())   # succeeds: type is statically inferred
(1, 2, 3, "4", "5", "6", "7")
```

versus a non-`Val` loop where `n` is not statically inferred:

```julia
julia> @inferred foo(7)
ERROR: return type Tuple{Int64,Int64,Int64,String,String,String,String} does not match inferred return type Tuple
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] top-level scope at none:0
```

Note constant propagation though:

```julia
julia> foo(n) = ntuple(i -> i < 4 ? i : string(i), n)
foo (generic function with 1 method)

julia> g() = foo(7)
g (generic function with 1 method)

julia> @inferred g()
(1, 2, 3, "4", "5", "6", "7")
```

@stevengj in your example is it ok to replace `ax[i]` by `Colon()`? Any performance penalty in using `Colon()`? I personally think it is more readable.

EDIT: nevermind, `CartesianIndices` does not work with `Colon()`

But, as far as I gathered, the problem is the `for d = 1:N` loop that needs to be unrolled. Each heterogeneous tuple is fine, but we should not put all of them into the same slot.
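One way to get that unrolling without `@generated` is to recurse on a `Val` dimension counter, so each "iteration" becomes its own method specialization and `axd` never shares a slot across dimensions. A sketch (`clear_pad!` and `_clear!` are made-up names, not from the thread):

```julia
# Recursion on Val(d) replaces the type-unstable `for d = 1:N` loop:
# each recursion depth compiles to its own specialized method.
clear_pad!(activated, gridsize, padsize) =
    _clear!(activated, gridsize, padsize, Val(ndims(activated)))

# base case: all dimensions handled
_clear!(activated::AbstractArray{Bool,N}, gridsize, padsize, ::Val{0}) where {N} = activated

function _clear!(activated::AbstractArray{Bool,N}, gridsize, padsize, ::Val{d}) where {N,d}
    ax = axes(activated)
    axd = ntuple(i -> i == d ? (gridsize[d]+1:padsize[d]) : ax[i], Val(N))
    activated[CartesianIndices(axd)] .= false
    _clear!(activated, gridsize, padsize, Val(d-1))
end
```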

That looks pretty cool, I’ll play with that and see how well it works!

Do you mean to create a macro that generates the expressions that I created with strings + `Meta.parse`, but without using strings? I read the metaprogramming section in the manual, and I think I grasp some of the ideas, but it still feels a bit odd (probably because I only use macros from other people and never went to the trouble of building one).

It doesn’t have to be with a macro, e.g. you can just use `@eval`. The point is, if you want to do code generation in Julia, do it by generating/manipulating ASTs (`Expr` nodes) and not by manipulating strings and then parsing them.
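For instance, the string-building loop from earlier in the thread could build `Expr` objects directly via interpolation (a sketch with small made-up sizes; `@eval` works on globals):

```julia
activated = trues(2, 2, 2)
gridsize = (1, 1, 1)
padsize = (2, 2, 2)

N = ndims(activated)
for d in 1:N
    ix = Any[:(Colon()) for _ in 1:N]      # a colon in every slot...
    ix[d] = :(gridsize[$d]+1:padsize[$d])  # ...except the padded dimension d
    ex = :(activated[$(ix...)] .= false)   # build the Expr directly, no strings
    @eval $ex
end
```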

I feel like @StefanKarpinski wrote up a long explanation somewhere of why manipulating strings is brittle, and I don’t feel like repeating it, but I’m not sure where to find it.


Maybe this stack overflow answer?


I was thinking more of examples like this:

```julia
x = "a + b"
y = "$x * c"  # probably not what you want!

x = :(a + b)
y = :($x * c) # good: gives :((a + b) * c)
```

Doing code generation with strings is extremely brittle because you are dealing with surface syntax. Working with `Expr` objects (which is especially easy in Julia because of quoting and interpolation syntax), on the other hand, is composable with few surprises because you are directly encoding the abstract meaning of the expressions rather than their spelling.
