Optimizing a simple random walk simulation

I was assuming that was just a toy aspect of the model, and that in practice the walkers are coupled. (Otherwise you wouldn’t organize the loops this way.)

I just need to track the number of walkers alive at each time step. The walkers do not see each other.

Then you should definitely just run each walker separately, i.e. swap the order of your loops. You don’t even need arrays of walkers, just an array to store when they died that is updated at the end of their loop.

e.g.

using StaticArrays, LinearAlgebra # for @SVector and normalize

function foo3()
    n_walkers = 1000
    n_steps = 1000
    step_size = 4
    n_died = zeros(Int, n_steps) # number that died on step i

    for _ in 1:n_walkers
        xyz = @SVector zeros(3)
        for i in 1:n_steps
            xyz += normalize(@SVector(randn(3))) * step_size
            if any(xyz .> 10)
                n_died[i] += 1
                break
            end
        end
    end
    return n_died
end

(note that the number alive on each step is simply n_walkers .- cumsum(n_died) at the end). This is about 2x faster than foo2() on my machine. (Update: sorry, not 1000x, I misread the numbers.)

(To parallelize this, which would only be worthwhile if you vastly increased the number of walkers, you would need to synchronize the n_died[i] += 1 statement. This isn’t too expensive since it only occurs at the end of the inner loop. You could either have a single n_died array and put a mutex lock around the increment, or have per-thread n_died arrays and then sum them together at the end. Or use MPI.jl and do an allreduce on n_died at the end.)
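For instance, a rough sketch of the per-task n_died variant using Julia’s built-in task threading (Threads.@spawn) — foo3_threaded and the way walkers are split across tasks are made up here, and the per-walker logic is copied verbatim from foo3 above:

using StaticArrays, LinearAlgebra

function foo3_threaded()
    n_walkers = 1000
    n_steps = 1000
    step_size = 4
    nt = Threads.nthreads()
    # one private n_died array per task, summed at the end, so no lock is needed
    tasks = map(1:nt) do t
        Threads.@spawn begin
            n_died = zeros(Int, n_steps)
            for _ in t:nt:n_walkers # this task's share of the walkers
                xyz = @SVector zeros(3)
                for i in 1:n_steps
                    xyz += normalize(@SVector(randn(3))) * step_size
                    if any(xyz .> 10)
                        n_died[i] += 1
                        break
                    end
                end
            end
            n_died
        end
    end
    return sum(fetch, tasks) # elementwise sum of the per-task arrays
end

Each task owns its own n_died array, so there is nothing to lock; the arrays are only combined once at the end.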


Thank you very much, that makes a lot more sense indeed.
Looks like I was a bit tunnel-visioned while translating “vectorized” MATLAB code.

Turns out moving to the real example involves some extra detail I’m not 100% sure how to deal with.
Please allow me to post something closer to the original problem (which I should have done in the first place).

The walkers are moving within a “lattice”, which is represented as a 3D array of 1’s (which is the lattice itself) and 0’s (which are the gaps in which the walkers can reside and move).
The steps should be smaller than the size of the lattice’s voxels, and the condition for a walker dying would be a probability check every time it hits a wall (otherwise the previous step is undone and the walker continues).

Below I’ve modified (or perhaps butchered) @stevengj's function to include the actual problem.

Code
lattice = rand(Bool, 100,100,100)

# Make edges solid, so that the walkers cannot escape
lattice[1,:,:].=true;
lattice[:,1,:].=true;
lattice[:,:,1].=true;
lattice[end,:,:].=true;
lattice[:,end,:].=true;
lattice[:,:,end].=true;

function random_walk(lattice)

    n_walkers = 1000
    n_steps = 1000
    n_died = zeros(Int, n_steps) # number that died on step i
    step_size = 1e-6
    voxel_size = 1e-5
    hit_wall = false
    step = @SVector rand(3)
    kill_probability = 0.1
    
    # Find all indices of "false" in the lattice, and select some of them to 
    # use as starting points, one for each walker
    starting_point_indx = rand(findall(x -> x == false, lattice), n_walkers);

    # Convert to actual xyz coordinates 
    # (which places walkers on the edge of their respective voxel)
    xyz = [collect(Tuple(starting_point_indx[i])) for i in 1:n_walkers] .* voxel_size

    # move to random initial positions within voxel and convert to SVectors
    xyz -= ([@SVector rand(3) for _ in 1:n_walkers] .* voxel_size)

    for w in 1:n_walkers # loop over walkers, not steps

        for i in 1:n_steps
            xyz[w] += (step = normalize(@SVector(randn(3))) * step_size)

            hit_wall = lattice[Int.(cld.(xyz[w], voxel_size))...]

            if hit_wall
                if rand() < kill_probability
                    n_died[i] += 1
                    break
                else # take a step back and continue
                    xyz[w] -= step
                end
            end
        end
    end

    return n_walkers .- cumsum(n_died)
end

I was mainly wondering whether there’s any better way to deal with indexing the lattice, without so many conversions from Cartesian indices to floats and back to Int.

  1. Memory access is slower than on-CPU calculation. If the lattice wall is as simple as in the code above, then instead of keeping the array it is faster to calculate hit_wall on the fly (see the sketch after this list).

  2. If only the counts matter (which is all the function returns), then simulating each walker separately, without keeping a memory structure for all the walkers, seems faster.
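For example (purely hypothetical, and only applicable if the solid phase were a simple analytic shape such as the outer box of the domain, rather than a measured image), the array lookup could be replaced by a computation:

# hypothetical: true if the walker sits in the one-voxel-thick solid shell
# of an n_voxels^3 domain with voxels of size voxel_size
in_wall(xyz, n_voxels, voxel_size) =
    any(xyz .<= voxel_size) || any(xyz .>= (n_voxels - 1) * voxel_size)

# inside the step loop, instead of indexing the lattice:
# hit_wall = in_wall(xyz, 100, voxel_size)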

Thank you for the reply Dan,

The lattice array would have to be in memory unfortunately, as it would be some 3D tomographic image. I was just wondering whether there’s a more elegant or efficient way to index it.

As for your second point, I’m not sure what “memory structure” refers to in this case. At the end, only the count would matter indeed, but I don’t see how the code above is tracking the memory of each walker.
The only reason I preallocate all the starting positions (xyz array) at the beginning is because I wanted to avoid searching for a valid starting point at the beginning of each iteration in the loop, which in my understanding would require generating something like the starting_point_indx array many times.

This array:
xyz = [collect(Tuple(starting_point_indx[i])) for i in 1:n_walkers] .* voxel_size

Yes, that is just the starting positions of each walker.
I’m not sure how to skip creating this array while searching for valid starting points efficiently.

Because you are only interested in time-to-boundary, you can simulate each starting point and its walk inside a single for loop and never remember older starting points; just make a new one on the fly, as in stevengj’s code.


You’re right, I was just a bit confused regarding implementation.
I think I’ve got it now:

Code
function random_walk2(lattice)

    n_walkers = 1000
    n_steps = 1000
    n_died = zeros(Int, n_steps) # number that died on step i
    step_size = 1e-6
    voxel_size = 1e-5
    hit_wall = false
    step = @SVector rand(3)
    xyz = @SVector rand(3)
    kill_probability = 0.1
    
    starting_point_indx = rand(findall(x -> x == false, lattice), n_walkers);

    for cart_ind in starting_point_indx

        xyz = SVector{3, Float64}(Tuple(cart_ind) .* voxel_size)
        xyz -= (@SVector rand(3)) .* voxel_size

        for i in 1:n_steps
            xyz += (step = normalize(@SVector(randn(3))) * step_size)

            hit_wall = lattice[Int.(cld.(xyz, voxel_size))...]

            if hit_wall 
                if rand() < kill_probability
                    n_died[i] += 1
                    break
                else # take a step back and continue
                    xyz -= step
                end
            end
        end
    end

    return n_walkers .- cumsum(n_died)
end

@time random_walk2(lattice);

Thanks for the advice!

It isn’t necessary to initialize these variables outside the loop; they are only used inside the loop body.

Instead you can do

valid_starting_points = findall((!), lattice)

and then do

for _ = 1:n_walkers
    starting_point = rand(valid_starting_points)
    ...
end

to generate a starting point on each loop iteration rather than allocating them all in advance.


Thank you @stevengj!

I thought this would be effectively the same as the code above, but it actually shaves off a few more allocations.

It’s different because it doesn’t allocate an array of n_walkers starting points. This would become more important if you made n_walkers huge in order to get better statistics.


Also note that accepting a discretization cost on the walkers’ domain can allow you to calculate the exact probabilities and expected exit times in a reasonable time. Have you considered looking at Fast Marching methods? (Or using the common-sense logic of ‘backtracking’ from the boundaries to the middle of the domain.) Anyway, I’m not sure what the actual problem details are, so I don’t know if this is applicable.


Are you actually interested in the limit stepsize → 0?

I’m a little rusty with that, but this looks suspiciously like \partial_t u = \Delta u with a Robin boundary condition, for the probability density of finding a particle at position x at time t. You only care about survival times for an initially uniform density, i.e. \int u(t, x) \, dx, so you don’t need to solve the heat equation, just the elliptic eigenvalue problem for the first handful of eigenvalues. Might be faster than simulation?


Thank you @Dan and @foobar_lv2 for the extra suggestions.

This is about simulating the results of nuclear magnetic resonance experiments of fluids diffusing inside porous materials.

Surely both methods you propose above could be theoretically better, but there’s already the limiting factor of the CT image resolution, which will always be quite a few orders of magnitude larger than the scale of molecular movements. Therefore it seems that as long as the step size is a lot smaller than the voxel size, the results are pretty good. The aforementioned random-walk method is generally tried and tested in the literature (since the 80s, I believe), so I wouldn’t be keen to invest much more energy in exploring the “best” method, since this is only a side project to complement a mostly-experimental paper.

This paper takes a PDE approach to tackle the problem, you might find it interesting.

My other point is:

Even if you don’t want to touch the PDE model, you may consider, for the free particle (away from the boundary), u_{n+1} = u_n + \mu N, where N is a unit normal random vector. That is, skip your weird normalization step (where each step is normalized onto the unit sphere, i.e. is uniform on the unit sphere).

Note that this is a different system than yours! But for the limit of small timesteps/large times, both are the same, and if you really look at a time-discretization of a time-continuous stochastic process / stochastic differential equation, then it’s much more faithful to skip the weird normalization.

In fact, that formula is exact away from the boundary.

That gives rise to a nice trick: if you are very far away from the boundary, you can “skip ahead” and compute u_{n+2} = u_n + \mu \sqrt{2} N (the sum of two independent unit normal variables is \sqrt{2} times a unit normal variable). This can save you a lot of time.
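A small sketch of what that could look like in the code above (the names are made up; μ is the per-step standard deviation, and k free steps are merged into a single draw):

using StaticArrays

# one free-particle step: u_{n+1} = u_n + μ N, with N a standard normal vector
free_step(xyz, μ) = xyz + μ * @SVector(randn(3))

# merge k free steps into one: the sum of k independent unit normals is
# √k times a unit normal, so this is exact as long as no wall can be reached
free_steps(xyz, μ, k) = xyz + μ * sqrt(k) * @SVector(randn(3))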

But you haven’t confirmed yet whether the problem you stated is your actual problem, or a specific time-discretization of your actual problem (which you should then also state!).

The SDE people here may have some insight on that.

PS. There are other tricks that immediately come to mind. Given your particle at time t, draw a sphere of maximal radius that doesn’t intersect the boundary. Skip ahead / teleport the particle to the next point where it leaves this ball.

Each point on this sphere has the same probability of being the exit point. Then you have some random distribution on the exit time t_{exit} that you look up in a book. Ta-da, you massively accelerated your evolution far away from the boundary, and this is even exact (no discretization error at all!).
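A rough sketch of that teleport step, assuming a hypothetical helper distance_to_wall(xyz) (not shown) that returns the radius of the largest wall-free sphere around the current position; the exit point is uniform on that sphere, while the exit time would still have to be drawn from the first-exit-time distribution (also not shown):

using StaticArrays, LinearAlgebra

# teleport the walker to a uniformly distributed point on the sphere of
# radius R around xyz; valid whenever that sphere does not touch a wall
sphere_jump(xyz, R) = xyz + R * normalize(@SVector(randn(3)))

# usage sketch (distance_to_wall and sample_exit_time are hypothetical):
# R = distance_to_wall(xyz)
# xyz = sphere_jump(xyz, R)
# t += sample_exit_time(R) # drawn from the known first-exit-time law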


Thanks again for the advice.

Probably it would be the latter in this case.

Pretty sure I’ve seen this approach in some old paper, but if I remember correctly, handling the probability of the walker dying (which would be proportional to the step size in the crude model above) might become a bit more complicated.

This one seems interesting. To be honest I don’t understand much but I’d be happy to read a bit more about it when I find some time.

Ok, I took a very short look at the background. I think you’re doing something like https://www.wellesu.com/10.1016/j.jmr.2007.05.024 (Random-walk technique for simulating NMR measurements and 2D NMR maps of porous media with relaxing and permeable boundaries, Toumelin, Torres-Verdin, Sun, Dunn 2006)?

These people use a similar model to yours, including the imo batshit insane normalization of the random walk offset. They in turn (section 3, 3.1) cite

  1. L.M. Schwartz, J.R. Banavar, Transport properties of disordered continuum systems, Phys. Rev. B 39 (1989) 11965–11970, which does not contain the weird normalization and instead considers only lattice positions.
  2. D.J. Wilkinson, D.L. Johnson, L.M. Schwartz, Nuclear magnetic relaxation in porous media: the role of the mean lifetime τ(ρ,D), Phys. Rev. B 44 (1991) 4960–4973, which does not talk about random walks at all.
  3. D.J. Bergman, K.-J. Dunn, L.M. Schwartz, P.P. Mitra, Self-diffusion in a periodic porous medium: a comparison of different approaches, Phys. Rev. E 51 (1995) 3393–3400, which does contain the normalization in an off-hand way, but crucially also contains the correct PDE.

So I’d say, trust the continuous-time model by [Bergman Dunn Schwartz Mitra 95], but don’t do the stupid normalization they included in a throwaway “maximally naive” time-discretization without any explanation. They probably goofed.

Do your own literature sleuthing, then ask somebody who knows a little bit about SDEs to confirm that normalization of the random walker offsets is, in fact, batshit insane, and then use a better discretization.

PS. Ah, I think they normalize in the stupid way because they don’t want to tunnel across grains?

That’s still bad. Use instead an adaptive scheme:

  1. When far away from the boundary, use the trick with exit time to a sphere that touches the lattice, and look up the exact integrals for the time-to-exit and the magnetic field component (you presumably want that integral as well?).
  2. When close to the boundary, ask an SDE person for a good local step. If you don’t have an SDE person to ask, you can still use the old random walker thing with normalization. But I’m pretty sure that there are far better local boundary steps you could use!

Give [Bergman Dunn Schwartz Mitra 95] to the SDE person as an intro. That paper is really readable and un-obfuscated!
