Hi all,
I’m pleased to announce OptimalTransport.jl, a Julia package (currently still a work-in-progress) offering various algorithms for computational optimal transport, such as the celebrated Sinkhorn scaling algorithm for entropically regularised optimal transport.
Many algorithms are implemented natively in Julia, whilst other functionality is currently offered as a wrapper to the excellent POT (Python OT) library [1]. We will be progressively adding more Julia-implemented algorithms in the future.
Any feedback or suggestions are welcome!
Check out the documentation here:
http://zsteve.phatcode.net/OptimalTransportDocs
Example
using OptimalTransport
using Seaborn
μ_spt = ν_spt = LinRange(-2, 2, 100)
C = pairwise(Euclidean(), μ_spt', ν_spt').^2
μ = exp.((-(μ_spt).^2)/0.5^2)
μ /= sum(μ)
ν = ν_spt.^2 .*exp.((-(ν_spt).^2)/0.5^2)
ν /= sum(ν)
γ = OptimalTransport.sinkhorn_stabilized(μ, ν, C, 1e-4, max_iter = 5000)
using Random, Distributions
F = DiscreteNonParametric(1:prod(size(γ)), reshape(γ, prod(size(γ))))
t = rand(F, 5*10^3)
μ_idx = t .% size(γ, 1)
ν_idx = t .÷ size(γ, 2)
Seaborn.figure()
Seaborn.jointplot(x = [μ_spt[i] for i in μ_idx], y = [ν_spt[i] for i in ν_idx], kind = "kde")
Seaborn.gcf()
Seaborn.savefig("example.png")
Links
[1] POT Python Optimal Transport library, Flamary, Remi and Courty, Nicolas
14 Likes
Nice
I have some optimal-transport related algorithms like finding barycenters and barycentric coordinates implemented in
Do you know how performance and stability of your julia implementations of sinkhorn compare against POT?
5 Likes
Good question! In terms of performance, comparing Julia sinkhorn to the POT implementation on 1000x1000 problem gives me the following:
Julia:
using OptimalTransport
using Distances
using LinearAlgebra
N = 1000; M = 1000
μ_spt = rand(N)
ν_spt = rand(M)
μ = ones(size(μ_spt, 1))
ν = ones(size(ν_spt, 1))
C = pairwise(Euclidean(), μ_spt', ν_spt').^2
ϵ = 0.01
using BenchmarkTools
@benchmark γ = sinkhorn(μ, ν, C, ϵ)
Output:
julia> @benchmark sinkhorn(μ, ν, C, ϵ)
BenchmarkTools.Trial:
memory estimate: 27.24 MiB
allocs estimate: 572
--------------
minimum time: 259.265 ms (0.00% GC)
median time: 286.214 ms (0.20% GC)
mean time: 299.338 ms (2.71% GC)
maximum time: 355.400 ms (4.59% GC)
--------------
samples: 17
evals/sample: 1
POT:
import ot
import numpy as np
N = 1000
mu = np.random.rand(N, 1)
nu = np.random.rand(N, 1)
C = ot.utils.dist(mu, nu)
eps = 0.01
%timeit ot.sinkhorn(np.ones(N), np.ones(N), C, eps)
Output:
In [8]: %timeit ot.sinkhorn(np.ones(N), np.ones(N), C, eps)
577 ms ± 12.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Edits: realised there was a bug in some of the code I ran my brain obviously not working on Sunday
4 Likes
I’m delighted to see this! For mostly personal philosophical reasons I’m a big optimal transport fan and hope to see the tools become more widespread.
2 Likes