[ANN] OptimalTransport.jl - Optimal transport algorithms for Julia

zsteve · May 14, 2020, 6:30am

Hi all,

I’m pleased to announce OptimalTransport.jl, a Julia package (currently still a work-in-progress) offering various algorithms for computational optimal transport, such as the celebrated Sinkhorn scaling algorithm for entropically regularised optimal transport.

Many algorithms are implemented natively in Julia, whilst other functionality is currently offered as a wrapper to the excellent POT (Python OT) library [1]. We will be progressively adding more Julia-implemented algorithms in the future.

Any feedback or suggestions are welcome!

Check out the documentation here:

http://zsteve.phatcode.net/OptimalTransportDocs

Example

using OptimalTransport
using Seaborn

μ_spt = ν_spt = LinRange(-2, 2, 100)
C = pairwise(Euclidean(), μ_spt', ν_spt').^2
μ = exp.((-(μ_spt).^2)/0.5^2)
μ /= sum(μ)
ν = ν_spt.^2 .*exp.((-(ν_spt).^2)/0.5^2)
ν /= sum(ν)
γ = OptimalTransport.sinkhorn_stabilized(μ, ν, C, 1e-4, max_iter = 5000)
using Random, Distributions

F = DiscreteNonParametric(1:prod(size(γ)), reshape(γ, prod(size(γ))))
t = rand(F, 5*10^3)
μ_idx = t .% size(γ, 1)
ν_idx = t .÷ size(γ, 2)


Seaborn.figure()
Seaborn.jointplot(x = [μ_spt[i] for i in μ_idx], y = [ν_spt[i] for i in ν_idx], kind = "kde")
Seaborn.gcf()
Seaborn.savefig("example.png")

Links

[1] POT Python Optimal Transport library, Flamary, Remi and Courty, Nicolas

baggepinnen · May 14, 2020, 8:48am

Nice
I have some optimal-transport related algorithms like finding barycenters and barycentric coordinates implemented in

Do you know how performance and stability of your julia implementations of sinkhorn compare against POT?

zsteve · May 18, 2020, 3:18am

Good question! In terms of performance, comparing Julia sinkhorn to the POT implementation on 1000x1000 problem gives me the following:

Julia:

using OptimalTransport
using Distances
using LinearAlgebra

N = 1000; M = 1000
μ_spt = rand(N)
ν_spt = rand(M)

μ = ones(size(μ_spt, 1))
ν = ones(size(ν_spt, 1))

C = pairwise(Euclidean(), μ_spt', ν_spt').^2
ϵ = 0.01

using BenchmarkTools
@benchmark γ = sinkhorn(μ, ν, C, ϵ)

Output:

julia> @benchmark sinkhorn(μ, ν, C, ϵ)
BenchmarkTools.Trial:
  memory estimate:  27.24 MiB
  allocs estimate:  572
  --------------
  minimum time:     259.265 ms (0.00% GC)
  median time:      286.214 ms (0.20% GC)
  mean time:        299.338 ms (2.71% GC)
  maximum time:     355.400 ms (4.59% GC)
  --------------
  samples:          17
  evals/sample:     1

POT:

import ot
import numpy as np

N = 1000
mu = np.random.rand(N, 1)
nu = np.random.rand(N, 1)
C = ot.utils.dist(mu, nu)
eps = 0.01

%timeit ot.sinkhorn(np.ones(N), np.ones(N), C, eps)

Output:

In [8]: %timeit ot.sinkhorn(np.ones(N), np.ones(N), C, eps)
577 ms ± 12.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Edits: realised there was a bug in some of the code I ran my brain obviously not working on Sunday

evanfields · May 18, 2020, 11:23pm

I’m delighted to see this! For mostly personal philosophical reasons I’m a big optimal transport fan and hope to see the tools become more widespread.

Topic		Replies	Views
Computing discrete Wasserstein (EMD) distance in Julia? Machine Learning question	16	4640	December 20, 2021
Performance help for this short matrix function? Performance linearalgebra	11	781	January 30, 2019
How to use Wasserstein distance in OptimalTransport.jl General Usage question	1	455	May 15, 2023
[ANN, NEW] OptimalTransportNetworks.jl - Optimal transport networks in spatial equilibrium Package Announcements package , announcement , optimization	3	235	September 20, 2024
Optimization tips for my julia code. Can I make it even faster and/or memory efficient? Performance question , python	24	4212	February 15, 2020

[ANN] OptimalTransport.jl - Optimal transport algorithms for Julia

Check out the documentation here:

Example

Links

Related topics