CellListMap: Go from list of pairs to list of indices


I am using CellListMap to generate a neighbour list of the type:

ListOfPairs =

Where first index is “i”, second index is “j” and third index is the euclidean distance between “i” and “j”. My questions is then, how do I go in the fastest way from an vector of pairs to vector of indices:

[1 4 3 8]
[2 1 4]

Where the array “[1 4 3 8]” would be understood as “particle with index 1 has neighbours 4 3 8” and the second array as “particle with index 2 has neighbours 1 4”

For now I can live with “losing” the information about euclidean distance, I just want to extract the indices in the second format. Currenly the code I use in a slightly pseudo code format:

i == 1
ListOfIndices_  = list[map(x->x[1]==i || x[2]==i,ListOfPairs)]
ListOfIndices[i] = [i;unique([getfield.(l,1);getfield.(l,2)])]

Which for a ListOfPairs of ~17000 elements roughly takes 0.10 seconds. I wish to get it down further.

@lmiq sorry if not okay to tag you. Since you are the developer of the package I imagined you might already have such a function - and thank you for making the package, it really helps me out :slight_smile:

Kind regards

I would of course first ask if you really need that conversion. Usually running over that list of tuples is faster to compute properties for the set than running over a list of lists.

Concerning the conversion, I don’t think you can get much faster than something like this:

julia> function convert(x, list)
          out = [ Int[] for _ in x ]
          for (i,j,d) in list
              push!(out[i], j)
              push!(out[j], i)
          return out

Yet, if you need that kind of output, you could implement it using CellListMap lower level interface directly. Something like this, for example, such that you don’t need the conversion at all:

using CellListMap
using StaticArrays

const T = Vector{Vector{Int}}

function add_pair!(i,j,list::T)
    push!(list[i], j)
    push!(list[j], i)
    return list

function reduce_lists(list::T, list_threaded::Vector{T})
    for it in eachindex(list_threaded)
        for i in eachindex(list)
            append!(list[i], list_threaded[it][i])
    return list

function run()

  sides = [250,250,250]
  cutoff = 10.
  box = Box(sides,cutoff)

  # positions
  x = [ SVector{3,Float64}(sides .* rand(3)) for i in 1:1500 ]

  # Initialize auxiliary linked lists
  cl = CellList(x,box)

  # Initial, empty, list
  list = [ Int[] for _ in x ]

  # Run pairwise computation
      (x,y,i,j,d2,list) -> add_pair!(i,j,list),
  return list


(there are some additional optimizations that can be used if your calling this within an iterative procedure, to preallocate the threaded output and auxiliary arrays for threading).

Finally, if I’m not mistaken, NearestNeighbors.jl default output is the list of lists you want.

The reason CellListMap.jl provides the output as a list of tuples is exactly because building that list is faster, and running over it is faster as well, if one need to run over all pairs. But of course that may not be the best structure for every application.

1 Like

Thank you very much!

Unfortunately I do “need” the conversion, since I am not clever enough (yet!) to write the code so it works based on ListOfPairs instead of ListOfIndices… I am working towards that, but for now I have to stick to this to be able to discretize my equations how I need :slight_smile:

The reason I don’t use NearestNeighbours (which is also a great package!) is that CellListMap is simply too good for what I need to use it for. It outputs all neighbours and distances between them so fast. Also you have developed code for smooth updates of the neighbour list i.e. update!(system,x_new) which makes it so easy to develop transient simulations with.

I hope that gave some answers as to why I am purposefully “bottle-necking” my self for now :slight_smile:

Kind regards

In that case the optimal solution would be to write a custom version of the neighborlist function, returning what you need directly, as I mentioned above. If you need that at some point and want some help, let me know, it is not much different from what I posted there above, even with all optimizations.

1 Like

First, your solution blew my map approach out of the water!

Time to generate list of indices: 0.0004447 s

Compared to 0.1 s from before :sweat_smile: So thank you very much!

And thank you! I will let you know if I end up needing that or hopefully get the clearly better approach working :slight_smile:

Kind regards

1 Like

The good thing about Julia is that frequently the literal solution, meaning, just write what you need, is very good. Note that I just did the “obvious” there, created the empty list and added the elements to it. Nothing fancy.

1 Like
function celllistmap(p)
    pp=[Pair(f,s) for (f,s,l) in p]
    for e in pp
    return d
function celllistmap1(p)
    for e in p
    return d
1 Like

Yep, I had totally disregarded “push” because I was thinking that constantly expanding a vector would lead to a very slow code (as in for example Matlab), but obviously not - I will try to keep that piece of advice with me :slight_smile:

But on to the “big news”!

Thank God, I was finally able to wrap my head around how to rewrite the code from focusing on a particle to performing the math over pairs where CellListMap really shines. So now I have gotten an even bigger speed up than before and learned a lot in the process.

A lot of frustration was incurred in the process, but also so much relief now, finally having a working version.

I have some other questions in regards to CellListMap, but will post these in different threads to make it easier for others to benefit too in the future :slight_smile:

Kind regards

1 Like

Yeah, push! is quite efficient, because it preallocates space such that not every push needs an actual allocation.

Nice that you could rework your code. Please mark me on other threads about CellListMap, so I don’t miss them. I’m not sure I’ll be able to answer quickly, because of the time of the year, but at least the notification os there.

1 Like