Hi there. I’m trying to do the following, given a vector of elements, I wish to get the vector of tuples containing the elements choose 2, with the order of the tuples obviously not mattering.
Importantly, I am trying to not use Combinatorics. The clear approach is to use two for loops, which is what I’ve been doing, but I need to do it faster. Is this the fastest way to conceivably do this? For context, this is using Float32s.
Say the length of the input vector is n. Given the index of a spot in the length n-choose-2 vector that is being returned, even inefficiently its around ~1.5ns to calculate the indexes for the two element in the input vector that should be present at that index in the output tuple.
For some actual times with for-loops, for a 500 length vector input, it takes ~815us. Here is a profile.
using LinearAlgebra
using BenchmarkTools
using GeometryBasics
using StatProfilerHTML
function emptyvec(T, n)
return Vector{T}(undef, n)
end
function choose2(vec::Vector{Line{2, Float32}})
n = length(vec)
pairs = emptyvec(Tuple{typeof(vec[1]), typeof(vec[1])}, Int32(n*(n-1)/2))
index = 1
for i in 1:n
for j in i+1:n
pairs[index] = (vec[i], vec[j])
index += 1
end
end
return pairs
end
random_vector(n) = [ GeometryBasics.Line(GeometryBasics.Point(Float32(rand()), Float32(rand())), GeometryBasics.Point(Float32(rand()), Float32(rand()))) for _ in 1:n]
vec = random_vector(500)
The resulting vector of pairs is 124750 elements long. Using the time to calc the indices along with this, the time spent calculating is at least on the order of ~190us, which suggested to me that I could maybe do it faster than what I have above with another methodology. I tried to use this calculation for the indices and just loop over the indices of the output vector in order to make the pairs, but it was much slower. I tried map, and replace!, but I haven’t been able to do any better than the original way with for loops.
I don’t think you can do it significantly faster. There are ~120.000 elements in the final vector and your CPU can do ~3 instructions per ns. If each element took just 1 instruction it would still take ~40.000ns = 40μs, so 190μs for seems pretty reasonable to me.
Ofc if you don’t need an actual Vector of these results you can do much faster by just constructing a generator instead:
julia> function choose2_generator(vec::Vector)
return ((vec[i], vec[j]) for i in 1:length(vec) for j in 1:i)
end
julia> v = rand(500);
julia> @btime choose2_generator($v);
3.702 ns (0 allocations: 0 bytes)
The speed of choose2 is bounded below by the amount of memory that needs to be read and written.
The following ‘trick’ will take about 0 time, by not materializing the output vector, but allowing lazy on-the-fly calculation of the tuples:
julia> using MappedArrays
julia> @inline function iuppert(k::Integer,n::Integer)
i = n - 1 - floor(Int,sqrt(-8*k + 4*n*(n-1) + 1)/2 - 0.5)
j = k + i + ( (n-i+1)*(n-i) - n*(n-1) )÷2
return i, j
end
iuppert (generic function with 1 method)
julia> function choose22(vec)
n = length(vec)
mappedarray(1:n*(n-1)÷2) do I
i,j = iuppert(I,n)
(vec[i], vec[j])
end
end
choose22 (generic function with 1 method)
basically an “iterator” is just a fuzzy concept for types that implement a few iteration related methods (including iterate)
a “generator” here refers to the type Base.Generator which has implemented all those methods, and is also privileged in the sense that it is the type of the object constructed from comprehensions (i for i in ...)
Wow. That’s really interersting. Read-only is fine in my case, exactly what I want I think. But, I’m guessing that this is not possible without the use of MappedArrays? My initial goal was to remove external dependencies, but that seems quite powerful.
Ahh okay. Thank you, that makes sense. I feel like the Julia documentation could do a better job of being explicit with that kind of information and detail. The section on them is pretty minimal for some reason.
But can one do that like in a full-multi line if statement? Say I had a f(i) that I had to call two times in the if statement in the example you provided. In a multiline statement I could assign it to a variable and use that, but how can one do such a thing in a single line if statement without calling something twice? Sorry if thats a dumb question lol
julia> f(x) = rand()
f (generic function with 1 method)
julia> collect( let u = f(i); i%2==1 ? u : u*3; end for i=1:10)
10-element Vector{Float64}:
0.07940946494560674
1.6035072917117752
0.4815119250120512
1.381748619911222
0.9124211172296329
1.665817797002973
0.49427982995996966
1.7073952286259733
0.06165322397334305
0.05246104092133874
As an example: right after getting the pairs choose 2, I had used another vector in the same manner to store information obtained from these. I tried a couple different forms but none worked. It has variables being declared and also conditionals from these, so I don’t know how I could conceivably produce a generator with the same logic and effect.
intersections = Vector{Tuple{Tuple{GeometryBasics.Line{2, Float32}, GeometryBasics.Line{2, Float32}}, GeometryBasics.Point{2,Float32}}}()
for (seg1, seg2) in choose2_generator(segments)
x1, y1, x2, y2, x3, y3, x4, y4, t, u = line_data1(seg1, seg2,)
if is_intersecting(t, u)
point = lineintersection(x1, y1, x1-x2, y1-y2, x3, y3, x3-x4, y3-y4, t, u)
if point != false && not_on_endpoint(seg1, seg2, point)
push!(intersections, ((seg1, seg2) , point) )
end
end
end
return intersections
Thank you. I tried this and, and it is much much faster itself. But, trying the generator out now with other functions, in graphing the points, getting the number of points, getting the number of unique points, in creating a matrix etc, the generator seems to allocate much more, and it takes longer such that the gain from these new functions is outdone by the large increase in time when manipulating them. Is there a tradeoff with the generators for using them in this regard?
I assume that the allocations and increased execution time are due to the functions being called in intercept_p. The generator puts off calling them until it is iterated, but sooner or later you have to pay for the calculations you are doing. You could try to optimize these functions for memory and speed, perhaps with help from this forum. You could also multi thread the loop over the generator.