Getting CartesianIndices to propagate

RainerHeintzmann · March 16, 2021, 4:34pm

Julia (here 1.6RC1 on Windows 10) has a number of mechanism to calculate with positions inside an array such as ranges like (1:10) or generators or via CartesianIndices(x). Yet they all seem to not automatically propagate in larger pointwise array operations involving the array x and such index positions:

x = zeros(1000,1000);
xx(x) = (i[1] for i in CartesianIndices(x))
y = collect(xx(x))
@time y .= xx(x) .+ 0.0

Note that about 7.8 MB are allocated, which means that the “.+” operation forces an evaluation of xx(x) into an array instead of propagating the pointwise operation through the generator.
A similar effect is observed when encapsulating ranges in functions:

g(x) = (1:size(x,1)) .+ reshape((1:size(x,2)),(1,size(x,2)))
yy = g(x)
@time yy .= g(x)

Also here 7,8 Mb are allocated.
For longer expressions on arrays that involve mathematics using the index positions, I think it would be quite nice, if they could be written down in high-level notation without causing any intermediate array evaluations with associated memory and data transport overhead.

The only way, I was able to achieve this was via using a macro:

macro xx(x)
    return :((1:size($x,1)) .+ reshape((1:size($x,2)),(1,size($x,2))))
end
yy = @xx(x)
@time yy .= @xx(x)

Note that there is no memory overhead this time, but writing macros sounds like not the ideal solution to me here.

So my question: Is there a way that one can prevent ranges or CartesianIndices from prematurely generating arrays when they are encapsulated in functions and used in point-wise expressions?

Alternatively: Is there a way to obtain access to index positions (such as a CartesianIndex) from inside scalar functions, if they are called with the dor (“.”) notation to be applied to an array. This would be extremely useful, as then one could easily write nice functions that not only involve values of arrays put also the local position inside the array.

Thanks for any help on this matter!

mbauman · March 16, 2021, 4:49pm

It’s a little unclear to me what your end goal is — what do you want to get after your computation on the range/CartesianIndices?

By and large, if it’s possible to represent the answer as a re-computed range or CartesianIndices, then we do so. For example:

julia> ((1:3) .+ 4) .* 3
15:3:21

julia> CartesianIndices((-1:1, -2:2)) .+ CartesianIndex(4, 6)
3×5 CartesianIndices{2,Tuple{UnitRange{Int64},UnitRange{Int64}}}:
 CartesianIndex(3, 4)  CartesianIndex(3, 5)  CartesianIndex(3, 6)  CartesianIndex(3, 7)  CartesianIndex(3, 8)
 CartesianIndex(4, 4)  CartesianIndex(4, 5)  CartesianIndex(4, 6)  CartesianIndex(4, 7)  CartesianIndex(4, 8)
 CartesianIndex(5, 4)  CartesianIndex(5, 5)  CartesianIndex(5, 6)  CartesianIndex(5, 7)  CartesianIndex(5, 8)

In cases where that’s not possible we’ll create a dense Array.

We don’t have a builtin object that represents a “range matrix” like (1:3) .+ reshape((1:4),1,:), but it’s pretty easy to create one.

RainerHeintzmann · March 16, 2021, 5:23pm

Thanks! As for the end goal: I would like to be able to easily write down functions such as a propagator in Fourier optics in the following way:

field .= field .* exp.(1im .* sqrt.(k2 .- rr2.(field)))

with rr2 being the square of the pixel distance to the Fourier-center of the array. Ideally this should evaluate without any extra array allocation. “rr2” would be a function which does NOT operate on the values of field but simple on the indices. In all my attempts so far, I was able to obtain range matrices, but as soon as I even just add “.+ 0.0”, they would evaluate thus causing the allocation of an extra array.

mbauman · March 16, 2021, 5:37pm

Something like this?

julia> field = rand(5,5);

julia> rr2(index, center) = sum((Tuple(index) .- center).^2)
rr2 (generic function with 1 method)

julia> rr2.(CartesianIndices(field), (size(field)./2 .+ 0.5,))
5×5 Array{Float64,2}:
 8.0  5.0  4.0  5.0  8.0
 5.0  2.0  1.0  2.0  5.0
 4.0  1.0  0.0  1.0  4.0
 5.0  2.0  1.0  2.0  5.0
 8.0  5.0  4.0  5.0  8.0

The key with broadcasting in general is to build the scalar function first. I think where you’re having difficulty is that you were trying to get rr2 to operate on the array. Instead, figure out the computation for one pixel, then pass what you need to it (in this case, the CartesianIndices and the center). When you combine this with other broadcast expressions like you have there, it won’t allocate intermediaries.

DNF · March 16, 2021, 5:46pm

You should also be aware that benchmarking in global scope, with non-const globals, is likely to give highly unreliable results.

Wrap your code in a function, and use BenchmarkTools instead of @time. It may give a clearer picture.

RainerHeintzmann · March 16, 2021, 5:56pm

Thanks. Yes, this is probably the sensible solution, and that we already put into our code. Yet, it feels wrong that the “field” needs to be supplied multiple times. It would be nice to be able to call it by the field as the only argument. The trouble is that there seems to be no way (except for macros) to encapsulate this into a single function with the mid position being the default. E.g. the individual coordinate scaling may also depend on the size of the array.

RainerHeintzmann · March 16, 2021, 6:00pm

… thinking about it a little more. Maybe one could modify the “CartesianIndices” such that they include the offset and scaling of the field.

mbauman · March 16, 2021, 6:05pm

See OffsetArrays to embed this directly into your array — they are great for things like Fourier transforms.

RainerHeintzmann · March 20, 2021, 11:43am

After some rethinking, the solution was to create a new type derived from AbstractArray. What looks and feels like an array should be an array after all, even if it does not contain memory but just stores a revised way of index evaluation.
Initially we wanted to call it GeneratorArrays, but since this name was already taken, it is now called IndexFunArray. @roflmaostc helped me in the implementation of this idea and we did some performance tests, which are promising. See (GitHub - bionanoimaging/IndexFunArrays.jl: Fun with indices (and functions on them))
Example Usage: y .= x .* window_hanning(x) which does not cause any extra allocation of the array.
The only current downside is that we did not yet find (or implement) a way to prevent evaluation into an array (with memory) upon assignment. Another thing worth discussing may be how to deal with set_index() attempts. On views, subindexing works like a charm, but indexing an IndexFunArray directly currently raises an error. But then instead of writing window_hanning(x)[:,10:20] you can also use a view and then subindex: @view window_hanning(x)[end:1:-1,:]

Topic		Replies	Views
[ANN] IndexFunArrays.jl - Fun with indices (and functions on them) Package Announcements package , array	21	1330	April 2, 2021
How to CartesianIndex array comprehension? New to Julia question	4	632	January 13, 2021
CartesianIndices syntax General Usage indexing	6	685	April 9, 2019
Cheapest way for creating a CartesianIndex General Usage question , cartesianindices	9	473	November 6, 2022
Indexing multidimensional arrays of arbitrary dimension Performance indexing , array	12	1559	July 19, 2019

Getting CartesianIndices to propagate

Related topics