LAMMPS interface to Julia vs. Python


So LAMMPS has a Python interface which I have been trying to use within Julia, with some success. However, I have two points I wanted to query the community about.

First, when I wish to access the positions of the atoms within the environment, I run the following commands:

using PyCall
@pyimport lammps
@pyimport numpy.ctypeslib as ctlib
lmp = lammps.lammps()
lmp[:file]("setup.lammps") # raw lammps scripting set up file

# n = number of atoms, something I would set beforehand
Xobj = @pycall ctlib.as_array(lmp[:extract_atom]("x",3)[:contents],shape=(n,3))::Any
X = unsafe_wrap(Array{Float64,2}, Ptr{Float64}(Xobj[:__array_interface__]["data"][1]), reverse(Xobj[:__array_interface__]["shape"]))
Fobj = @pycall ctlib.as_array(lmp[:extract_atom]("f",3)[:contents],shape=(n,3))::Any
F = unsafe_wrap(Array{Float64,2}, Ptr{Float64}(Fobj[:__array_interface__]["data"][1]), reverse(Fobj[:__array_interface__]["shape"]))

At this point I have (3, n) Julia arrays, X and F, which give me access to the underlying positions and forces of the atoms. Clearly, it takes two commands for each such array. Is there a way to condense this down to a single command? Is there a way to do this more efficiently?

Second, suppose I want to do a very elementary simulation, running overdamped Langevin dynamics, as:

# set nsteps number of time steps, Dt the step size, and kBT the temperature
for j = 1:nsteps
    X .+= Dt .* F .+ sqrt(2 * kBT * Dt) .* randn(3, n)
    lmp[:command]("run 1 pre yes post no") # forces the LAMMPS environment to update forces as atoms move
    # other commands go here
end

While this works, I find that the analogous code within Python is about twice as fast. Are there any obvious mistakes and/or ways to get back the Python speed?


Is the limiting factor the X .+= line or the lmp call? If it’s the latter, I don’t have much to say since I don’t know anything about LAMMPS. The X .+= line can probably be made faster, though, in two ways:

  • Don’t run performance-critical code in global scope (or using global variables): put your performance-critical code into a function. (It’s not clear whether you are already doing this.) For example, write a function timestep(X, F, kBT, Dt, nsteps, lmp) that performs your loop.

  • Your randn call is allocating a temporary array on each iteration. You can instead do X .+= Dt .* F .+ sqrt(2 * kBT * Dt) .* randn.(), or just set c = sqrt(2kBT * Dt) and do @. X += Dt*F + c*randn() to avoid writing all of the dots.
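Putting the two suggestions together, a minimal sketch might look like the following. The LAMMPS force update is stood in for by `update_forces!`, a hypothetical callback, so the sketch runs without LAMMPS installed:

```julia
# Sketch of both suggestions: the time-stepping loop lives in a function,
# the noise amplitude is hoisted out of the loop, and randn.() is broadcast
# so no temporary (3, n) array is allocated per step.
function timestep!(X, F, kBT, Dt, nsteps, update_forces!)
    c = sqrt(2 * kBT * Dt)            # noise amplitude, computed once
    for j = 1:nsteps
        @. X += Dt * F + c * randn()  # fused in-place update, no temporaries
        update_forces!(F, X)          # stands in for lmp[:command]("run 1 pre yes post no")
    end
    return X
end

# Usage with a dummy harmonic force F = -X in place of LAMMPS:
n = 4
X, F = randn(3, n), zeros(3, n)
timestep!(X, F, 1.0, 1e-3, 100, (F, X) -> (F .= .-X))
```

In your case `update_forces!` would issue the `lmp[:command]` call, so the only allocation left per step is whatever LAMMPS does internally.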

Rather than X = unsafe_wrap(Array{Float64,2}, Ptr{Float64}(...), ...), by the way, have you tried just using:

X = @pycall ctlib.as_array(lmp[:extract_atom]("x",3)[:contents],shape=(n,3))::PyArray

since PyArray is a copy-free abstract-array wrapper around a NumPy array? (Possibly PyArray could still use some performance tuning, though, since there have been a lot of array updates since it was first implemented.)
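That also answers the earlier question about condensing the two commands into one: a small helper (hypothetical name, assuming the same PyCall setup as in the question) can wrap the pattern for any per-atom quantity:

```julia
using PyCall
@pyimport numpy.ctypeslib as ctlib

# Hypothetical convenience wrapper: extract a per-atom (n, 3) quantity
# ("x", "f", ...) from a running LAMMPS instance as a copy-free PyArray
# in a single call.
function extract2d(lmp, name::AbstractString, n::Integer)
    @pycall ctlib.as_array(lmp[:extract_atom](name, 3)[:contents], shape=(n, 3))::PyArray
end

# X = extract2d(lmp, "x", n)
# F = extract2d(lmp, "f", n)
```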


The suggestions did seem to give a bit of speedup, so I’ll keep playing with that.

Regarding your suggestion about using

X = @pycall ctlib.as_array(lmp[:extract_atom]("x",3)[:contents],shape=(n,3))::PyArray

While this certainly works, it introduces a new question. The output is now an (n, 3) array. Suppose I have a problem that I wish to constrain to 2D by ignoring the 3rd coordinate. I do not seem to be able to call X[:, 1:2] like I would with a regular Julia array. Is this kind of indexing not possible with PyArrays?


It is possible, but slice-indexing hasn’t been implemented yet for PyArray. However, realize that expressions like X[:,1:2] make a copy, even with Array, unlike NumPy. If you want slicing to create a view, use e.g. @view(X[:,1:2]), and this will work with PyArray since it just wraps a SubArray around it.
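To make the copy-vs-view distinction concrete with a plain Julia Array (the same semantics a PyArray gets once wrapped in a SubArray):

```julia
X = reshape(collect(1.0:12.0), 3, 4)

S = X[:, 1:2]        # slice indexing makes an independent copy
V = @view X[:, 1:2]  # @view makes a zero-copy SubArray into X

V[1, 1] = 99.0       # writing through the view mutates X:
# X[1, 1] is now 99.0
# S[1, 1] is still 1.0, since the copy is unaffected
```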


Probably by making a direct Julia wrapper (at which point it might be seriously faster than using the Python wrapper in Python).

If there is a decent C API, that’s pretty easy to do; otherwise, there are a couple of ways of wrapping C++ code directly.
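For a sense of why wrapping a C API is easy: calling into a shared library from Julia is just a ccall. Here is the mechanism demonstrated on the C standard library’s strlen (not LAMMPS itself, whose function names and signatures vary by version):

```julia
# ccall((:function, "library"), ReturnType, (ArgTypes...,), args...)
# Wrapping a C API such as LAMMPS's follows this same pattern, with the
# shared library name and the actual function signatures substituted in.
len = ccall(:strlen, Csize_t, (Cstring,), "hello")
# len == 5
```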