Hi @zsz00, I worked on standardizing the API, on several performance improvements, and on an auto-tuning feature. In any case, Faiss is great when you can use GPUs or when you have a huge database (applying PQ).
IMHO, the SimilaritySearch package should be used on medium-sized datasets (e.g., a few million points) with high dimensionality. Among its nice features: it is written in pure Julia, supports many distance functions, is multithreaded, and has a novel auto-configuration feature.
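To make the workflow concrete before the full benchmark, here is a minimal, self-contained sketch of building and querying a SearchGraph (assuming the same v0.8 API used in the script below; the toy sizes are mine):

```julia
using SimilaritySearch

X = rand(Float32, 8, 10_000)               # 10k points in 8 dimensions, one per column
db = MatrixDatabase(X)
G = SearchGraph(; db, dist=SqL2Distance())  # squared Euclidean distance
index!(G)                                   # build the index (auto-configured)
Q = MatrixDatabase(rand(Float32, 8, 10))    # 10 query points
I, D = searchbatch(G, Q, 5)                 # 5 nearest neighbors per query
size(I)                                     # k × m matrix: one column of ids per query
```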
I made a few changes to the benchmark to make it work with SimilaritySearch (it is quite long, so there may well be some bugs):
using Pkg, Faiss, SimilaritySearch, LinearAlgebra, JLD2

function create_benchmark(dim, n, m)
    X = rand(Float32, dim, n)
    Q = rand(Float32, dim, m)
    for c in eachcol(X) normalize!(c) end
    for c in eachcol(Q) normalize!(c) end
    X, Q
end
function create_gold(X, Q, k, dist)
    db = MatrixDatabase(X)
    queries = MatrixDatabase(Q)
    ex = ExhaustiveSearch(; db, dist)
    searchbatch(ex, queries, k; parallel=true)
end
function simsearch_test(X, Q, k, dist)
    db = MatrixDatabase(X)
    queries = MatrixDatabase(Q)
    G = SearchGraph(; db, dist)
    index!(G; parallel_block=512)
    searchbatch(G, queries, k; parallel=true)
end
function faiss_index(X, Q, k)
    dim, n = size(X)
    str = "HNSW,Flat"
    # str = "Flat"
    metric = "IP"
    gpus = ""
    idx = Faiss.Index(dim; str, metric, gpus)  # init Faiss index
    show(idx)  # show index info
    add(idx, permutedims(X))  # Faiss expects row-major (n × dim) input
    D, I = Faiss.search(idx, permutedims(Q), k)
    I, D = permutedims(I), permutedims(D)
    I .+= 1  # Faiss ids are 0-based; Julia arrays are 1-based
    I, D
end
function faiss_test(X, Q, k)
    vs_gallery = permutedims(X)
    vs_query = permutedims(Q)
    D, I = local_rank(vs_query, vs_gallery, k=k, metric="IP", gpus="")
    I, D = permutedims(I), permutedims(D)
    I .+= 1  # convert 0-based Faiss ids to 1-based
    I, D
end
function main()
    dim = 128
    k = 100
    dbfile = "benchmark-$dim.jld2"
    goldfile = "gold-$dim.jld2"
    faissfile = "faiss-$dim.jld2"
    faissindexfile = "faissindex-$dim.jld2"
    simfile = "sim-$dim.jld2"
    dist = NormalizedCosineDistance()

    X, Q = if isfile(dbfile)
        load(dbfile, "X", "Q")
    else
        X, Q = create_benchmark(dim, 10^6, 10^3)
        jldsave(dbfile, X=X, Q=Q)
        X, Q
    end

    @info "== gold"
    gI, gD, exhaustivetime = if isfile(goldfile)
        load(goldfile, "gI", "gD", "exhaustivetime")
    else
        exhaustivetime = @elapsed gI, gD = create_gold(X, Q, k, dist)
        jldsave(goldfile; gI, gD, exhaustivetime)
        gI, gD, exhaustivetime
    end
    @show exhaustivetime

    @info "== faiss"
    fI, fD, faisstime = if isfile(faissfile)
        load(faissfile, "fI", "fD", "faisstime")
    else
        faisstime = @elapsed fI, fD = faiss_test(X, Q, k)
        jldsave(faissfile; fI, fD, faisstime)
        fI, fD, faisstime
    end
    @show faisstime

    @info "== faiss index"
    fiI, fiD, faissindextime = if isfile(faissindexfile)
        load(faissindexfile, "fiI", "fiD", "faissindextime")
    else
        faissindextime = @elapsed fiI, fiD = faiss_index(X, Q, k)
        jldsave(faissindexfile; fiI, fiD, faissindextime)
        fiI, fiD, faissindextime
    end
    @show faissindextime

    @info "== SearchGraph"
    sI, sD, simtime = if isfile(simfile)
        load(simfile, "sI", "sD", "simtime")
    else
        simtime = @elapsed sI, sD = simsearch_test(X, Q, k, dist)
        jldsave(simfile; sI, sD, simtime)
        sI, sD, simtime
    end
    @show simtime

    @show faisstime => macrorecall(gI, fI)
    @show faissindextime => macrorecall(gI, fiI)
    @show simtime => macrorecall(gI, sI)
    Pkg.status()  # prints directly; no need to wrap it in println
end
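For reference, macrorecall above is SimilaritySearch's quality measure: conceptually, the average per-query fraction of gold-standard neighbors that the approximate index recovered. A rough sketch of that idea (an illustration only, not the package's actual implementation; the function name is mine):

```julia
# Macro-averaged recall between two id matrices of size k × m,
# one column of neighbor ids per query.
function recall_sketch(goldI::AbstractMatrix, resI::AbstractMatrix)
    k, m = size(goldI)
    s = 0.0
    for j in 1:m
        # fraction of this query's gold neighbors found by the index
        s += length(intersect(view(goldI, :, j), view(resI, :, j))) / k
    end
    s / m  # average over queries
end

recall_sketch([1 2; 3 4], [1 2; 5 4])  # → 0.75 (0.5 for query 1, 1.0 for query 2)
```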
For instance, I ran it with julia -t 64 on a 32-core workstation with hyperthreading (without GPUs). Note that multithreading must also be requested explicitly in the function calls, as seen in the code (parallel=true).
The results are:
faisstime => macrorecall(gI, fI) = 39.756926338 => 0.99999
faissindextime => macrorecall(gI, fiI) = 89.104171865 => 0.061250000000000054
simtime => macrorecall(gI, sI) = 114.653610227 => 0.4511400000000003
However, ExhaustiveSearch computes the entire batch in 3.06 s, so the best approach for this setup and benchmark is brute force; larger query sets will behave differently. Note that SimilaritySearch achieves a pretty bad 0.45 recall (recall ranges from 0 to 1, higher is better); nonetheless, Faiss's HNSW (with the default setup) achieves only ~0.06 recall. Both perform poorly because the dimensionality of the dataset is quite high. The SearchGraph index performs better than HNSW because, even with the default setup, auto-tuning is among its features (by default, it aims for a good tradeoff between search speed and quality).
It is possible to improve the recall by adding the following line after the index! call:
optimize!(G, OptimizeParameters(kind=MinRecall(0.9)))
This will adjust the hyperparameters to try to reach the specified recall, and the result is:
simtime => macrorecall(gI, sI) = 93.36208313 => 0.8652799999999999
The small speedup is also a nice, unexpected improvement due to the optimized hyperparameters.
I used SimilaritySearch v0.8.12 and Faiss v0.3.1 for this comparison.
Best regards,
Eric