Manipulating vectors with logsumexp

samerb · July 8, 2022, 3:20pm

I have an n-dimensional vector \delta and I would like to compute the following vector expression.

\left[ \log \left( \sum_{i=1}^n \exp(\delta_i) \right) - \log \left( \sum_{i \neq j} \exp(\delta_i) \right) \right]_{j=1}^n

What’s the best way to do this compactly (hopefully without loops)?

I have access to the function logsumexp from StatsFuns, so ideally I would use that, but it isn’t necessary. Thanks!

skleinbo · July 8, 2022, 4:22pm

logsumexp takes an iterator as its argument, so this works:

f(d) = logsumexp(d) .- ( logsumexp( (d[i] for i in eachindex(d) if i!=j) ) for j in eachindex(d) )

mikmoore · July 8, 2022, 5:28pm

With some algebra, the expression you are trying to evaluate is equal to
x_j = \log\left(1+\frac{\exp(\delta_j-c)}{\sum_{i\ne j} \exp(\delta_i-c)}\right)
for any choice of c. I recommend the conventional logsumexp choice c=\max_i \delta_i. This version can use the special function log1p, which is more accurate for small arguments.
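For readers who want the intermediate steps, the algebra can be written out as follows. This is a reconstruction of the omitted steps, using the same symbols as above; the only facts used are that the full sum splits into the j-th term plus the leave-one-out sum, and that multiplying numerator and denominator by \exp(-c) leaves the ratio unchanged.

```latex
\begin{aligned}
x_j &= \log\left(\sum_{i=1}^n \exp(\delta_i)\right) - \log\left(\sum_{i\ne j} \exp(\delta_i)\right) \\
    &= \log\left(\frac{\exp(\delta_j) + \sum_{i\ne j} \exp(\delta_i)}{\sum_{i\ne j} \exp(\delta_i)}\right) \\
    &= \log\left(1+\frac{\exp(\delta_j)}{\sum_{i\ne j} \exp(\delta_i)}\right) \\
    &= \log\left(1+\frac{\exp(\delta_j-c)}{\sum_{i\ne j} \exp(\delta_i-c)}\right)
\end{aligned}
```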

Note that we can equivalently (although risking some numerical cancellation in the implementation) write \sum_{i\ne j} \exp(\delta_i-c) = -\exp(\delta_j-c)+\sum_{i=1}^n \exp(\delta_i-c), where the right-side version allows the precomputation of the sum. Weigh that tradeoff of efficiency versus precision. Below, I have opted for the efficient version that risks that cancellation error in some edge cases:

expscale = let c=maximum(deltas); z->exp(z-c); end
sumexpdeltas = sum(expscale, deltas)
x = [log(sumexpdeltas) - log(sumexpdeltas - expscale(d)) for d in deltas] # initial version
x = [(ed = expscale(d); log1p(ed/(sumexpdeltas-ed))) for d in deltas] # with log1p transform

EDIT: The cancellation error of the sum transformation I used is most significant when the sum is dominated by a single term (one \delta_i is much bigger than all the others). If this is a possibility, you’ll want to use the leave-one-out version of the summation instead. The log1p version avoids a numerical error when \delta_j is much smaller than \max_i \delta_i.

EDIT 2: For the purposes of my previous statement and Float64 precision, a difference of 20 or 30 would be the threshold to consider a number “much bigger”. That isn’t necessarily a lot, so I suspect you’ll want to use both leave-one-out summation and the transformed log1p equation to be safe.
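To make the cancellation concrete, here is a small sketch comparing the two summation strategies on a vector with one dominant entry. The function names x_precomputed and x_leave_one_out are hypothetical, chosen for illustration only. Both use the shift c = maximum(deltas) and the log1p form, and the leave-one-out variant assumes a plain O(n^2) recomputation of each sum.

```julia
# Hypothetical helper names; both variants shift by c = maximum(deltas).

# Precompute the total once, then subtract the j-th term (fast, but can cancel).
function x_precomputed(deltas)
    c = maximum(deltas)
    e = exp.(deltas .- c)
    total = sum(e)
    return [log1p(ej / (total - ej)) for ej in e]
end

# Recompute the leave-one-out sum for every j (O(n^2) work, no subtraction).
function x_leave_one_out(deltas)
    c = maximum(deltas)
    e = exp.(deltas .- c)
    return [log1p(e[j] / sum((e[i] for i in eachindex(e) if i != j); init=0.0))
            for j in eachindex(e)]
end

deltas = [0.0, 40.0]      # the second entry dominates by 40
x_precomputed(deltas)     # second component is Inf: total - e[2] rounds to 0.0
x_leave_one_out(deltas)   # second component is close to 40
```

In the precomputed variant, exp(-40) is lost when it is added to 1.0 in Float64, so the subtraction returns exactly zero and the ratio blows up. The leave-one-out variant never forms that difference.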

EDIT 3:
Just a teensy bit more algebra yields
x_j = \log\left(1+\frac{1}{\sum_{i\ne j} \exp(\delta_i-\delta_j)}\right)
Given my previous numerical concerns, this is the version I’ll recommend you implement.
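A minimal sketch of this last formula, which corresponds to choosing c = \delta_j so that the numerator becomes 1. The function name loo_logratio is hypothetical, and the sketch uses only Base Julia. It recomputes the leave-one-out sum for each j, so it costs O(n^2) rather than O(n).

```julia
# Hypothetical helper name; implements x_j = log1p(1 / sum_{i != j} exp(d_i - d_j)).
function loo_logratio(deltas::AbstractVector{<:Real})
    idx = eachindex(deltas)
    T = float(eltype(deltas))
    return map(idx) do j
        dj = deltas[j]
        # Leave-one-out sum of exp(d_i - d_j); init keeps a length-1 input from erroring.
        s = sum((exp(deltas[i] - dj) for i in idx if i != j); init=zero(T))
        log1p(inv(s))
    end
end

deltas = [0.0, 1.0, 2.0]
x = loo_logratio(deltas)
```

One caveat with this direct form: if \delta_j exceeds every other entry by more than roughly 745 in Float64, every exp term underflows to zero and the result is Inf. If inputs that extreme are possible, it may be worth handling that case separately.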
