I am announcing this package mostly because it might be useful outside the original purpose for which I implemented it, and also because feedback will be mostly welcome, as this will be part of an important project I am developing. The package is an implementation of cell lists for computing properties dependent on the distances between pairs of particles.
Currently the package works with euclidean and three-dimensional systems, and accepts orthorhombic periodic boundary conditions. I will surely be extending it to general PBCs and, probably, to generalized dimensions as well (not sure about other distance metrics, I don’t see any application of that in my field at the moment).
The package allows the definition of custom functions to be evaluated pairwse, and with such it is possible, for example, to compute neighbor lists, which seems to be something widespread in other fields.
Within the limitations of the type of variable and metric that the package accepts, it seems that be sometimes faster than ball tree NN searches (from NearestNeighbors, for example), and maybe this is useful for someone in another context than that of particle simulations.
For example, below I compute the list of neighbours of a set of particles relative to another set, considering a maximum distance cutoff (
inrange function). There is a custom function for that in NN packages, but here one can implement that on top of the cell list search.
Searching the neighbours of a only a few particles is much faster with the cell lists, because the cost of building the ball tree is too high for a single particle search:
N1 = 1 N2 = 1000000 nn: 1.265 s (39 allocations: 67.16 MiB) CL serial: 50.415 ms (65 allocations: 22.91 MiB) CL parallel: 24.160 ms (262 allocations: 22.93 MiB) naive (N1*N2): 2.600 ms (10 allocations: 16.39 KiB)
For performing many similar searches the fixed cost of course will be diluted, but both work well (parallelization of the NN method here is trivial as well):
N1 = 1000 N2 = 1000000 nn: 1.363 s (10566 allocations: 84.68 MiB) CL serial: 357.926 ms (74 allocations: 27.89 MiB) CL parallel: 81.272 ms (379 allocations: 34.92 MiB) naive (N1*N2): 3.035 s (19 allocations: 5.00 MiB)
These times can change depending on the point distribution, number of points and cutoff, and one method or the other may be faster. It can be sometimes much faster, sometimes much slower.
using NearestNeighbors function nn(points,data,r) balltree = BallTree(data) return inrange(balltree, points, r, true) end using CellListMap, StaticArrays function findpairs(x,y,cutoff;parallel=true) N1 = length(x) N2 = length(y) # Boundaries xmin = [ +Inf, +Inf, +Inf ] xmax = [ -Inf, -Inf, -Inf ] for v in x @. xmin = min(xmin,v) @. xmax = max(xmax,v) end for v in y @. xmin = min(xmin,v) @. xmax = max(xmax,v) end # Define box sides sides = (xmax - xmin) .+ cutoff box = Box(sides,cutoff) # Initialize auxiliary linked lists (largest set!) lc = LinkedLists(N2) # Initializing linked cells with these positions (largest set!) initlists!(y,box,lc,parallel=parallel) # Function adds pair to the list function f(j,d2,cutoff,pairs) if sqrt(d2) < cutoff push!(pairs,j) end return pairs end # We have to define our own reduce function here (for the parallel version) function reduce_pairs(pairs,pairs_threaded) pairs = pairs_threaded for i in 2:Threads.nthreads() append!(pairs,pairs_threaded[i]) end return pairs end # Initialize pairs = Int # Run pairwise computation pairs = map_pairwise!( (x,y,i,j,d2,pairs) -> f(j,d2,cutoff,pairs), pairs,x,y,box,lc, reduce=reduce_pairs, parallel=parallel ) return pairs end function naive(x,y,cutoff) pair_list = Int for vx in x for (i,vy) in pairs(y) if CellListMap.distance(vx,vy) < cutoff push!(pair_list,i) end end end pair_list end using BenchmarkTools, Test N1 = 1000; N2 = 1_000_000 points = rand(3,N1) data = rand(3, N2) r = 0.05 x = [ SVector(points[:,i]...) for i in axes(points,2) ] y = [ SVector(data[:,i]...) for i in axes(data,2) ] @test sort(vcat(nn(points,data,r)...)) == sort(findpairs(x,y,r,parallel=false)) #@test sort(findpairs(points,y,r)) == sort(naive(points,y,r)) println("N1 = $N1 N2 = $N2") print("nn: "); @btime nn($points,$data,$r) print("CL serial: "); @btime findpairs($x,$y,$r,parallel=false) print("CL parallel: "); @btime findpairs($x,$y,$r,parallel=true) #print("naive (N1*N2): "); @btime naive($x,$y,$r) nothing