Hello, please help to parallelize the task, maybe with CUDA?

There are two point clouds A and B, the clouds have a different number of points.

In cloud A there can be several Y along the X axis,

In cloud B, there is only one Y for every X.

Data is ordered along the x-axis.

For each point of cloud A on the X-axis within [a_x; a_x + 10]

find the nearest point from cloud B whose sum of y-values is more than zero: (a_y + b_y) >= 0

I do it but it’s so slow:

```
##
using Random
limit = 10
# points number
A_n = 100_000_000
B_n = 30_000_000
#
func(a_y, b_y) = (a_y + b_y) >= 0
# cloud A
A_x = Int32.(cumsum(rand(0:3, A_n)))
A_y = Int32.(rand(-2:2, A_n))
# cloud B
B_x = Int32.(cumsum(rand(1:9, B_n)))
B_y = Int32.(rand(-2:2, B_n))
# for results
R = zeros(Int32, A_n)
## calculations
b = 1
bb = 1
for a = 1 : A_n
while B_x[b] < A_x[a]
b += 1
b == B_n && break
end
bb = b
while B_x[bb] <= (A_x[a] + limit)
if func(A_y[a], B_y[bb])
R[a] = B_x[bb]
break
end
bb == B_n && break
bb += 1
end
a % 1_000_000 == 0 && println(a) # progress
b == B_n && break
end
```