I have recently tried a new approach to detect potential collisions in a 2D space, just for the broad phase (In this phase we just try to quickly reduce the number of potential collision to check) so I have come up with a new idea but I don’t want to miss something.
So my idea is simple, you use a grid to subdivide the plane into small cells, each cells has a given index.
Now each object is (just for this phase) represented as a rectangle.
This way we can map each object to a set of cell given by his dimensions.
x0,y0,w0,h0 = get_dimensions(obj)
Next we map each of these data to integer values to represent cells
x, w = _map_to_range(x0,1:size,grid.size[1]), _map_to_range(w0,1:size,grid.size[1])
y, h = _map_to_range(y0,1:size,grid.size[2]), _map_to_range(h0,1:size,grid.size[2])
So x being 1 means cells number 1 on the x axis.
size
is the size of the grid, we assume the grid is a square matrix.
Then what we will do now is encoding the size of the rectangle in a UInt.
traits = (h << BIT_WIDTH) | (w)
We will encode each dimensions on 5 bits so BIT_WIDTH = 5
Now what we will do is that we will use a PEXT bitboard to directly get all the cells covered by this object.
I have used Arceus.jl
for that.
zones::Vector{UInt64} = grid.bitboard[traits]
bitboard is a MagicBitboard
, you ccan check Arceus.jl for more information but basically, you give it a Uint
with each trait encoded as a bit add it return the corresponding value in less than 30 ns.
So for each possible combination of dimensions (that suffice on 10 bits), we will get all the matching cells from the origin of the grid.
That’s why we need to offset the result:
data = grid.zones # All the possible zones for collisions
offset = (y-1) * size + x
L = length(data)
@inbounds for i in zones # For all the cells index the object is covering
idx = i+offset
# We will ignore collisions out of the grid
idx <= L && push!(data[idx], obj) # we add the object in it
end
Next we only have to do this with every objects and just do a filter for for cells with more than 1 object in it.
As for the bitboard, we precompute it with every possible traits combination, so for each trait combination, we do this
"""
MakeRoute(x::UInt64)
This is our lookup function, it takes a bit bounding box as input and output all the zones he is in.
Using magic bitboard, foreach possible bounding box, a collection of zones are directly mapped to it
"""
function MakeRoute(x::UInt64)
# We collect our data just by decoding them from `x`.
# I guess it's not so hard to understand.
# We first use the mask to put to zero all irrelevants data with a AND
# Then if necessary we use shift to put the data at their correct position,
sz = BIT_WIDTH
w, h = (x & DEFAULT_MASK) + 1, ((x & (DEFAULT_MASK << sz)) >> sz) + 1
result::Vector{UInt64} = UInt64[]
size = 1 << sz # Faster than doing 2^sz
for i in 1:w
for j in 1:h
push!(result, (j-1)*size+i)
end
end
return result
end
result
will be assigned in the lookup table to the index corresponding to this set of traits.
Now I tested this and here are the results on a Intel Pentium T4400 with just 1 thread and a software PEXT function (10 times slower than the harware version)
# Uniform repartition
# 28.931 μs (2 allocations: 16.25 KiB) for 100 objects
# 117.588 μs (2 allocations: 16.25 KiB) for 1k objects
# 1.030 ms (2 allocations: 16.25 KiB) for 10k objects
## Concentration
# 38.263 μs (2 allocations: 16.25 KiB) for 100 objects
# 212.312 μs (2 allocations: 16.25 KiB) for 1k objects
# 1.960 ms (2 allocations: 16.25 KiB) for 10k objects
What do you think about this, does it seems cool or I missed something important ?
Anyways I find this pretty fast, on a modern computer with BMI2, we can have < 1ms for 10k object, even when concentrated.