When using a pattern like that, I am always unsure if some customization can be done. For example:
If I had to preallocate this buffer, meaning @init buf = pre_buf[threadid()], would that possibly work? (I am not asking about the threadid() use specifically, but I would not know how to perform such a preallocation, since I don’t see exactly what the macro does with that instruction.)
Also, I can imagine that if I preallocate the buffer, I may go against the rationale of the @floop macro, because I guess it does something smarter than placing all buffers next to each other in the same array and distributing them among threads.
Another pattern I am facing which I have trouble trying to “translate” to the Folds syntax is something like the one below. This is a toy example in which I want to build a list of the numbers smaller than 0.5, but it captures some of the characteristics of the true problem:
using Base.Threads: @threads, nthreads

struct List
    n::Int
    l::Vector{Float64}
end

add_to_list(x,list) = x < 0.5 ? List(list.n+1,push!(list.l,x)) : list

# Serial version
function build_list(x)
    list = List(0,zeros(0))
    for i in eachindex(x)
        list = add_to_list(x[i],list)
    end
    return list
end

# Parallel version
append_lists!(list1,list2) = List(list1.n+list2.n,append!(list1.l,list2.l))

function build_list_threads(x)
    list = List(0,zeros(0))
    list_threaded = [ deepcopy(list) for _ in 1:nthreads() ]
    @threads for ithread in 1:nthreads()
        local_list = list_threaded[ithread]
        for i in ithread:nthreads():length(x)
            local_list = add_to_list(x[i],local_list)
        end
        list_threaded[ithread] = local_list
    end
    # reduce
    for lst in list_threaded
        list = append_lists!(list,lst)
    end
    return list
end
Uhm… now that I think about it, this one may not be very different from the one above, and with the “new syntax” it could be something like:
list = List(0,zeros(0))
@floop begin
    @init buf = List(0,zeros(0))
    for i in eachindex(x)
        buf = add_to_list(x[i],buf)
    end
    @combine append_lists!(list,buf) # ???
end
My doubts there would be roughly the same: 1) Could I preallocate the bufs if I were to call this many times? 2) Is there anything special needed for the @combine syntax, since the combination of two List objects is not simply an addition?
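
To make question 1 concrete, here is a rough sketch of what I mean by preallocating, written with plain Base.Threads rather than the macro, so it says nothing about what @floop does internally. The names make_buffers and build_list_prealloc! are hypothetical, made up for this example. The buffers are indexed by chunk of work instead of by threadid(), and each one is emptied and refilled on every call. I am assuming that empty! keeps the capacity of the vector, so repeated calls mostly avoid reallocating.

```julia
using Base.Threads: @spawn, nthreads

struct List
    n::Int
    l::Vector{Float64}
end

add_to_list(x, list) = x < 0.5 ? List(list.n + 1, push!(list.l, x)) : list
append_lists!(list1, list2) = List(list1.n + list2.n, append!(list1.l, list2.l))

# Hypothetical helper: one reusable buffer per chunk of work (not per thread id).
make_buffers(nchunks) = [List(0, Float64[]) for _ in 1:nchunks]

# Hypothetical function: fills the preallocated buffers in parallel and
# then combines them serially, in chunk order.
function build_list_prealloc!(bufs, x)
    nchunks = length(bufs)
    # Split the indices into at most `nchunks` contiguous chunks.
    chunksize = max(1, cld(length(x), nchunks))
    chunks = collect(Iterators.partition(eachindex(x), chunksize))
    tasks = map(enumerate(chunks)) do (ichunk, chunk)
        @spawn begin
            # Reset the buffer for this chunk; its storage is reused.
            local_list = List(0, empty!(bufs[ichunk].l))
            for i in chunk
                local_list = add_to_list(x[i], local_list)
            end
            local_list
        end
    end
    # Combine in chunk order, so the result has the same order as the serial loop.
    # Note that the combined list itself is still freshly allocated here.
    list = List(0, Float64[])
    for t in tasks
        list = append_lists!(list, fetch(t))
    end
    return list
end

# Usage: allocate the buffers once, then call as many times as needed.
x = rand(1000)
bufs = make_buffers(nthreads())
result = build_list_prealloc!(bufs, x)
```

Since each task only touches its own buffer, there is no dependence on which thread runs which chunk. Whether something equivalent can be expressed through @init and @combine is exactly what I am unsure about.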