Here is an attempt to use FlexiJoins by first finding a bounding box for each geometry. This reduces the quadratic complexity of the query (depending on how annoying the geometries are):
# prep data & environment
using DataFrames, GeoDataFrames, ArchGDAL, FlexiJoins, IntervalSets
dfa = GeoDataFrames.read("a.csv")
dfb = GeoDataFrames.read("b.csv")
getbb(g) = begin
gg = ArchGDAL.boundingbox(g)
gg1 = ArchGDAL.getgeom(gg,0)
x1, y2 = ArchGDAL.getpoint(gg1,0)
x2, y1 = ArchGDAL.getpoint(gg1,2)
(x1..x2, y1..y2)
end
getxy(p) = begin
x,y = ArchGDAL.getpoint(p,0)
(x, y)
end
# process data
dfa2 = transform(dfa,
:geometry => ByRow(getbb) => [:xinterval, :yinterval])
dfb2 = transform(dfb,
:geometry => ByRow(getxy) => [:x, :y])
subset!(
innerjoin(
(innerjoin((dfb2,dfa2), by_pred(:x, ∈, :xinterval)),dfa2),
by_pred(:y, ∈, :yinterval)
), [:geometry, :geometry_1] => (a, b) -> intersects.(a, b))
The results are similar to the ones from a crossjoin. The bounding boxes and intervals may be of further use possibly.