CellListMap: Go from list of pairs to list of indices

Ahmed_Salih · December 21, 2022, 8:23pm

Hello!

I am using CellListMap to generate a neighbour list of the type:

ListOfPairs =
(1,4,0.04)
(1,3,0.02)
(1,8,0.09)
(2,1,0.01)
(2,4,0.05)

Where first index is “i”, second index is “j” and third index is the euclidean distance between “i” and “j”. My questions is then, how do I go in the fastest way from an vector of pairs to vector of indices:

ListOfIndices=
[1 4 3 8]
[2 1 4]

Where the array “[1 4 3 8]” would be understood as “particle with index 1 has neighbours 4 3 8” and the second array as “particle with index 2 has neighbours 1 4”

For now I can live with “losing” the information about euclidean distance, I just want to extract the indices in the second format. Currenly the code I use in a slightly pseudo code format:

i == 1
ListOfIndices_  = list[map(x->x[1]==i || x[2]==i,ListOfPairs)]
ListOfIndices[i] = [i;unique([getfield.(l,1);getfield.(l,2)])]

Which for a ListOfPairs of ~17000 elements roughly takes 0.10 seconds. I wish to get it down further.

@lmiq sorry if not okay to tag you. Since you are the developer of the package I imagined you might already have such a function - and thank you for making the package, it really helps me out

Kind regards

lmiq · December 21, 2022, 9:06pm

I would of course first ask if you really need that conversion. Usually running over that list of tuples is faster to compute properties for the set than running over a list of lists.

Concerning the conversion, I don’t think you can get much faster than something like this:

julia> function convert(x, list)
          out = [ Int[] for _ in x ]
          for (i,j,d) in list
              push!(out[i], j)
              push!(out[j], i)
          end
          return out
       end

Yet, if you need that kind of output, you could implement it using CellListMap lower level interface directly. Something like this, for example, such that you don’t need the conversion at all:

using CellListMap
using StaticArrays

const T = Vector{Vector{Int}}

function add_pair!(i,j,list::T)
    push!(list[i], j)
    push!(list[j], i)
    return list
end

function reduce_lists(list::T, list_threaded::Vector{T})
    for it in eachindex(list_threaded)
        for i in eachindex(list)
            append!(list[i], list_threaded[it][i])
        end
    end
    return list
end

function run()

  sides = [250,250,250]
  cutoff = 10.
  box = Box(sides,cutoff)

  # positions
  x = [ SVector{3,Float64}(sides .* rand(3)) for i in 1:1500 ]

  # Initialize auxiliary linked lists
  cl = CellList(x,box)

  # Initial, empty, list
  list = [ Int[] for _ in x ]

  # Run pairwise computation
  map_pairwise( 
      (x,y,i,j,d2,list) -> add_pair!(i,j,list),
      list,box,cl;
      reduce=reduce_lists
  )
  return list

end

(there are some additional optimizations that can be used if your calling this within an iterative procedure, to preallocate the threaded output and auxiliary arrays for threading).

Finally, if I’m not mistaken, NearestNeighbors.jl default output is the list of lists you want.

The reason CellListMap.jl provides the output as a list of tuples is exactly because building that list is faster, and running over it is faster as well, if one need to run over all pairs. But of course that may not be the best structure for every application.

Ahmed_Salih · December 21, 2022, 9:12pm

Thank you very much!

Unfortunately I do “need” the conversion, since I am not clever enough (yet!) to write the code so it works based on ListOfPairs instead of ListOfIndices… I am working towards that, but for now I have to stick to this to be able to discretize my equations how I need

The reason I don’t use NearestNeighbours (which is also a great package!) is that CellListMap is simply too good for what I need to use it for. It outputs all neighbours and distances between them so fast. Also you have developed code for smooth updates of the neighbour list i.e. update!(system,x_new) which makes it so easy to develop transient simulations with.

I hope that gave some answers as to why I am purposefully “bottle-necking” my self for now

Kind regards

lmiq · December 21, 2022, 9:14pm

In that case the optimal solution would be to write a custom version of the neighborlist function, returning what you need directly, as I mentioned above. If you need that at some point and want some help, let me know, it is not much different from what I posted there above, even with all optimizations.

Ahmed_Salih · December 21, 2022, 9:21pm

lmiq:

function convert(x, list)
          out = [ Int[] for _ in x ]
          for (i,j,d) in list
              push!(out[i], j)
              push!(out[j], i)
          end
          return out
       end

First, your solution blew my map approach out of the water!

Time to generate list of indices: 0.0004447 s

Compared to 0.1 s from before So thank you very much!

And thank you! I will let you know if I end up needing that or hopefully get the clearly better approach working

Kind regards

lmiq · December 21, 2022, 9:24pm

The good thing about Julia is that frequently the literal solution, meaning, just write what you need, is very good. Note that I just did the “obvious” there, created the empty list and added the elements to it. Nothing fancy.

rocco_sprmnt21 · December 21, 2022, 9:27pm

function celllistmap(p)
    pp=[Pair(f,s) for (f,s,l) in p]
    d=Dict{Int64,Array{Int64}}()
    for e in pp
        push!(get!(()->Int[],d,first(e)),last(e))
    end
    return d
end

function celllistmap1(p)
    d=Dict{Int64,Array{Int64}}()
    for e in p
        push!(get!(()->Int[],d,e[1]),e[2])
    end
    return d
end

Ahmed_Salih · December 26, 2022, 10:25am

Yep, I had totally disregarded “push” because I was thinking that constantly expanding a vector would lead to a very slow code (as in for example Matlab), but obviously not - I will try to keep that piece of advice with me

But on to the “big news”!

Thank God, I was finally able to wrap my head around how to rewrite the code from focusing on a particle to performing the math over pairs where CellListMap really shines. So now I have gotten an even bigger speed up than before and learned a lot in the process.

A lot of frustration was incurred in the process, but also so much relief now, finally having a working version.

I have some other questions in regards to CellListMap, but will post these in different threads to make it easier for others to benefit too in the future

Kind regards

lmiq · December 26, 2022, 11:30am

Yeah, push! is quite efficient, because it preallocates space such that not every push needs an actual allocation.

Nice that you could rework your code. Please mark me on other threads about CellListMap, so I don’t miss them. I’m not sure I’ll be able to answer quickly, because of the time of the year, but at least the notification os there.

Topic		Replies	Views
[ANN] CellListMap.jl: short-ranged particle pairwise computations (lennard-jones, gravitational, etc) Package Announcements	38	2999	June 24, 2022
Cell list algorithm is slower than double for loop Performance	18	2273	April 2, 2020
Get distances of many points within a max range, any performance improvements? Performance	8	402	November 3, 2021
Accelerate pairwise Lennard-Jones force computation Performance	87	3309	February 12, 2022
How to speed up value updating in an array? Performance	19	571	February 15, 2023

CellListMap: Go from list of pairs to list of indices

Related topics