Yet another broadcast question

Hi,
I want to perform a given computation f1D on all the “rows” of 3D arrays x,y:

# two 3D arrays
x=ones(Float32,10,10,5)
y=ones(Float32,10,10,5)
# a serial (not parallel) function
function f1D!(x1d,y1d)
    for k=2:length(x1d)
        x1d[k]+=y1d[k]+x1d[k-1]
    end
end
# apply f1D to all rows i,j
s=size(x)
for i=1:s[1]
    for j=1:s[2]
        f1D!(@view(x[i,j,:]),@view(y[i,j,:]))
    end
end

I wonder if there is a way to express this (efficiently) via a broadcast operator or a map.

The goal is to apply f1D! to all the rows in parallel on a GPU with CuArrays via a single kernel.

You will gain a lot more speed, especially on the GPU, by permuting your dims, since Julia is column major. You may also have to write your own CUDAnative kernel to do this in one go. It’s not that hard.

Thank you for the advice.
Switching the dims is no problem. Let us assume that I want to perform f1D! along the first dim for all j,k rows.
I know how to write the CUDA kernel, but I wonder whether it can be done with broadcast.
Conceptually I want to apply a given function to a collection of ny×nz rows… so I guess it is a map operation, but I don't know how it should be written.
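To make the question concrete, here is the kind of map-style one-liner I have in mind, using only Base (just a sketch: `eachslice` with a tuple of dims needs Julia ≥ 1.9, and I don't know whether this would fuse into a single GPU kernel):

```julia
# the same serial recurrence as above
function f1D!(x1d, y1d)
    for k = 2:length(x1d)
        x1d[k] += y1d[k] + x1d[k-1]
    end
end

x = ones(Float32, 10, 10, 5)
y = ones(Float32, 10, 10, 5)

# one map-like call instead of the nested i,j loops:
# eachslice(...; dims=(1, 2)) yields the views x[i, j, :]
foreach(f1D!, eachslice(x; dims=(1, 2)), eachslice(y; dims=(1, 2)))
```

Since the slices are views, `f1D!` mutates `x` in place, exactly like the double loop.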

How large is the dimension that you want to apply f1d! on? If it is small, consider reinterpreting your array as a matrix of static vectors.

typically 100x100x100.
If I were able to reinterpret a 3D array as a 2D array of 1D arrays (or a 1D array of 2D arrays), the problem would be solved.

# two 3D arrays
x=ones(Float32,100,100,100)
y=ones(Float32,100,100,100)
# a serial (not parallel) function
function f1D!(x1d,y1d)
    for i=2:length(x1d)
        x1d[i]+=y1d[i]+x1d[i-1]
    end
end
# map f1D to all rows j,k
s=size(x)
for j=1:s[2]
    for k=1:s[3]
        f1D!(@view(x[:,j,k]),@view(y[:,j,k]))
    end
end
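For this column-major layout, one map-style formulation I could imagine (a Base-only sketch, not GPU-tested) is to flatten the trailing dims and map over the columns:

```julia
function f1D!(x1d, y1d)
    for i = 2:length(x1d)
        x1d[i] += y1d[i] + x1d[i-1]
    end
end

x = ones(Float32, 100, 100, 100)
y = ones(Float32, 100, 100, 100)

# reshape shares memory with x and y, so mutating
# the columns mutates the original 3D arrays
xm = reshape(x, size(x, 1), :)
ym = reshape(y, size(y, 1), :)
foreach(f1D!, eachcol(xm), eachcol(ym))
```

This is equivalent to the j,k double loop above, written as a single `foreach` over the 100×100 columns.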

@LaurentPlagne someone with more knowledge than me can explain
https://docs.julialang.org/en/v1/devdocs/subarrays/index.html

Also, it was pointed out to me here once that reshaping an array does not copy the data: the reshaped array refers to the same values in the same memory locations. So you can reshape an array without changing the original, and have different views of the same array data.
Sorry if I have not explained that correctly.
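For example, a quick check that `reshape` shares the underlying data rather than copying it:

```julia
a = collect(1:6)      # 6-element Vector{Int}
b = reshape(a, 2, 3)  # 2×3 array backed by the same memory
b[1, 1] = 99
a[1]                  # == 99: writing through b changed a
```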

This does the trick. Replace all 3s with your desired size.

julia> using StaticArrays

julia> x = ones(Float32,3,3,3);

julia> r_x = dropdims(reinterpret(SVector{3, Float32}, x), dims=1);

julia> s = sum.(r_x);

julia> s
3×3 Array{Float32,2}:
 3.0  3.0  3.0
 3.0  3.0  3.0
 3.0  3.0  3.0

Thank you for the link. I will read (again) that part.

This gets me to a more general point about improving Julia. The Julia documentation is a very good reference manual, but, to me, it is not a good programming guide. It contains a lot of well-described technical details but very few examples. I think that is entirely normal, because it is written by (brilliant) language designers and developers. Their state of mind is basically that if you know how the language works and its rules then, theoretically, you have all the information required to work. It is a very reductionist point of view: you can know the fundamental laws of physics in detail and still not be able to design a good bridge :wink:
Of course there exist blogs (e.g. the excellent @ChrisRackauckas blog) that give very useful Julia programming advice (design patterns?), but not yet (AFAIK) a complete programming guide. The Rust programming book Foreword - The Rust Programming Language is a striking counterexample to what I wrote about language designers :slight_smile:

At this time my best source of inspiration (apart from the Discourse forum) is to read the source code of well-established Julia packages, and also Julia's Base and Core, although that is most of the time out of reach for my limited skills. I doubt that the majority of mathematicians and scientists will have the reflex to start reading package sources carefully.

P.S. Sorry for my English level.


Thank you Mohamed,
is it OK to use StaticArrays with size 100?

Not an expert but benchmark it and see :man_shrugging:


Note that Julia’s performance/compilation model is experimental in the sense that it is constantly being improved, with occasional incompatible changes. While some things are stabilizing, it is easy to find examples of particular suggestions that get outdated within a few months or a year. There is a trade-off between working on updating tutorial-like docs and improving the language, and unfortunately the number of contributors to both is limited.

I think that reading the source of Base and some packages is the best way to learn. I understand this may be unusual for some people, especially if they are used to very mature languages that come with a detailed manual and tutorials. But currently this is a price that one pays for using Julia, in exchange for a lot of benefits.
