Possible to speed up this function for calculating cartesian indices?

Does the precision of fld matter? If not, then plain floating-point division followed by floor is much faster:

julia> x = rand(); y = 0.5;

julia> @btime Int(fld($x, $y));  # sometimes only 9ns
  19.523 ns (0 allocations: 0 bytes)

julia> @btime floor(Int, $x/$y);
  4.090 ns (0 allocations: 0 bytes)
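
To make the precision question concrete, here is a small sketch of a case where the two approaches disagree. The helper names floored_fld and floored_float are made up for illustration. As far as I understand it, fld works from the remainder of the stored values, while x/y is rounded to the nearest Float64 before floor sees it, so a point sitting right on a cell boundary can end up one cell higher:

```julia
# Two ways of computing a floored quotient (helper names are hypothetical).
floored_fld(x, y)   = Int(fld(x, y))      # floor of the quotient of the stored values
floored_float(x, y) = floor(Int, x / y)   # x/y is rounded to a Float64 first, then floored

x, y = 1.0, 0.1   # 0.1 is stored as slightly more than 1/10

println(floored_fld(x, y))    # 9:  the stored 0.1 fits into 1.0 only nine whole times
println(floored_float(x, y))  # 10: 1.0/0.1 rounds to exactly 10.0
```

For cell lists this should only matter for coordinates that are (nearly) exact multiples of the cutoff, so it is probably harmless as long as the neighbor search looks at adjacent cells anyway.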

I tried to integrate it directly into the loop inside BatchExtractCells!, but that slowed things down for some reason. Mapping over the coordinates of each point as a tuple works much better.
With

function BatchExtractCells3!(Cells, Points, CutOff)
    # @batch per=thread
    for i ∈ eachindex(Cells)
        t = map(Tuple(Points[i])) do x
            floor(Int, x/CutOff)+2
        end
        Cells[i] = CartesianIndex(t)
    end
end

this gives:

julia> @btime BatchExtractCells!($Cells, $Points, $CutOff)
  5.172 ms (0 allocations: 0 bytes)

julia> @btime BatchExtractCells3!($Cells, $Points, $CutOff)
  318.036 μs (0 allocations: 0 bytes)
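
If you want to check how often the faster version actually picks a different cell on your data, something like the following sketch could be used. The test data here is made up (random 3D points stored as plain tuples, a cutoff of 0.1), and cells_floor! and cells_fld! are hypothetical names; cells_floor! has the same loop body as BatchExtractCells3!, and cells_fld! is only my guess at an fld-based reference, not the original BatchExtractCells!:

```julia
using Random

# Hypothetical stand-in data: 3D points in [0, 1)^3 stored as plain tuples.
Random.seed!(1)
CutOff   = 0.1
Points   = [ntuple(_ -> rand(), 3) for _ in 1:10_000]
Cells    = Vector{CartesianIndex{3}}(undef, length(Points))
CellsRef = similar(Cells)

# floor-based version, same loop body as BatchExtractCells3!
function cells_floor!(Cells, Points, CutOff)
    for i in eachindex(Cells)
        t = map(x -> floor(Int, x / CutOff) + 2, Tuple(Points[i]))
        Cells[i] = CartesianIndex(t)
    end
    return Cells
end

# fld-based reference (an assumption about what the slower version computes)
function cells_fld!(Cells, Points, CutOff)
    for i in eachindex(Cells)
        t = map(x -> Int(fld(x, CutOff)) + 2, Tuple(Points[i]))
        Cells[i] = CartesianIndex(t)
    end
    return Cells
end

cells_floor!(Cells, Points, CutOff)
cells_fld!(CellsRef, Points, CutOff)

# Count the points that land in a different cell under the two methods.
nmismatch = count(i -> Cells[i] != CellsRef[i], eachindex(Cells))
println("mismatching cells: ", nmismatch, " of ", length(Cells))
```

Running the same comparison on the real Points and CutOff would show whether the boundary cases ever occur in practice.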