I ran some tests, and the outcome is that FlexiJoins typically performs better than SortMerge, but the latter is faster when the catalogs are pre-sorted.
MWE (see documentation here):
using SkyCoords, AstroLib, SortMerge
using FlexiJoins, Unitful, UnitfulAstro
# Generate some random coordinates
nn = 10_000_000
c1 = ICRSCoords.(rand(nn) .* 2pi, rand(nn) .* pi .- pi/2);
c2 = ICRSCoords.(rand(nn) .* 2pi, rand(nn) .* pi .- pi/2);
# Cross match using SortMerge:
lt(v, i, j) = (v[i].dec < v[j].dec)
function sd(v1, v2, i1, i2, threshold_arcsec)
    threshold_rad = threshold_arcsec / 3600. * pi / 180.
    d = (v1[i1].dec - v2[i2].dec) / threshold_rad
    (abs(d) >= 1) && (return sign(d))
    maxd = max(abs(v1[i1].dec), abs(v2[i2].dec))
    if pi/2. - maxd > pi / 180. # avoid this optimization in regions below 1 deg from the poles
        d = abs(v1[i1].ra - v2[i2].ra)
        (d > pi) && (d = 2pi - d)
        d *= cos(maxd) / threshold_rad
        (d >= 1) && (return 999)
    end
    dd = gcirc(0, v1[i1].ra, v1[i1].dec, v2[i2].ra, v2[i2].dec)
    (dd < threshold_rad) && (return 0)
    return 999
end
@time jj = sortmerge(c1, c2, lt1=lt, lt2=lt, sd=sd, 1.);
# Cross match using FlexiJoins:
@time result = innerjoin((c1, c2), by_distance(identity, separation, <(1u"arcsecond")));
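For readers who want to try the matching logic without installing anything, here is a minimal sketch of what the `sd` function above does, as I read it: two cheap rejection tests (declination first, then an approximate right-ascension test away from the poles) before the full great-circle distance is computed. It uses only Base Julia; `haversine` and `sd_sketch` are hypothetical names introduced for illustration, and the haversine formula stands in for `gcirc`.

```julia
# Hypothetical helper: great-circle separation in radians (haversine formula),
# used here in place of AstroLib's gcirc so the sketch has no dependencies.
function haversine(ra1, dec1, ra2, dec2)
    a = sin((dec2 - dec1) / 2)^2 + cos(dec1) * cos(dec2) * sin((ra2 - ra1) / 2)^2
    return 2 * asin(min(1.0, sqrt(a)))
end

# Hypothetical scalar version of `sd`. All angles are in radians.
# Returns -1 or 1 when the declinations alone rule out a match (the sign says
# which catalog is "behind"), 0 for a match, and 999 for any other non-match.
function sd_sketch(ra1, dec1, ra2, dec2, threshold_rad)
    d = (dec1 - dec2) / threshold_rad
    abs(d) >= 1 && return sign(d)
    maxd = max(abs(dec1), abs(dec2))
    if pi / 2 - maxd > deg2rad(1)      # skip the RA shortcut within 1 deg of the poles
        dra = abs(ra1 - ra2)
        dra > pi && (dra = 2pi - dra)  # handle the wrap-around at RA = 0 / 2pi
        dra * cos(maxd) / threshold_rad >= 1 && return 999
    end
    return haversine(ra1, dec1, ra2, dec2) < threshold_rad ? 0 : 999
end

threshold = deg2rad(1 / 3600)          # 1 arcsec in radians
sd_sketch(0.0, 0.0, 2pi - 1e-7, 0.0, threshold)  # neighbours across the RA wrap-around
sd_sketch(0.0, 0.1, 0.0, 0.0, threshold)         # rejected on declination alone
```

Because the first test depends only on declination, a merge over declination-sorted inputs can discard most pairs without ever computing a full angular distance.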
To cross-match two catalogs, each with 10 million entries, SortMerge takes ~33 s on average, while FlexiJoins is ~30% faster, with an average of ~24 s.
On the other hand, when the input catalogs are sorted by declination:
cs1 = c1[sortperm([r.dec for r in c1])];
cs2 = c2[sortperm([r.dec for r in c2])];
# Cross match using SortMerge:
@time jj = sortmerge(cs1, cs2, sd=sd, 1., sorted=true);
# Cross match using FlexiJoins:
@time result = innerjoin((cs1, cs2), by_distance(identity, separation, <(1u"arcsecond")));
SortMerge takes ~9.6 s, while FlexiJoins requires ~16 s.
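If you want to reproduce averages like these, keep in mind that a single `@time` on the first call also includes compilation. Below is a small sketch of one way to average several runs using only Base Julia; `mean_time` is a hypothetical helper written for this post, not part of either package.

```julia
# Hypothetical helper: mean wall-clock time (seconds) of `f()` over `n` runs,
# after one untimed warm-up call so that compilation is excluded.
function mean_time(f; n=5)
    f()                      # warm-up call, not timed
    total = 0.0
    for _ in 1:n
        total += @elapsed f()
    end
    return total / n
end

# Stand-in workload so the snippet runs on its own:
mean_time(() -> sort(rand(100_000)); n=3)
```

With the catalogs above it could be called as `mean_time(() -> sortmerge(cs1, cs2, sd=sd, 1., sorted=true))`. BenchmarkTools' `@btime` or `@belapsed` is an alternative that also handles warm-up.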
As discussed during the meeting, I will submit a PR to list SortMerge in the JuliaAstro ecosystem.
Thanks (and sorry for the slightly off-topic post).