# Manipulating vectors with logsumexp

I have an n-dimensional vector \delta and I would like to compute the following vector expression.

\left[ \log \left( \sum_{i=1}^n \exp(\delta_i) \right) - \log \left( \sum_{i \neq j} \exp(\delta_i) \right) \right]_{j=1}^n

What’s the best way to do this in a compact way (hopefully no loops).

I have access to the function logsumexp from StatsFuns so ideally I could use that but not necessary. Thanks!

logsumexp takes an iterator as its argument, so this works

f(d) = logsumexp(d) .- ( logsumexp( (d[i] for i in eachindex(d) if i!=j) ) for j in eachindex(d) )

2 Likes

With some algebra, the expression you are trying to evaluate is equal to
x_j = \log\left(1+\frac{\exp(\delta_j-c)}{\sum_{i\ne j} \exp(\delta_i-c)}\right)
for any choice c. I’ll recommend the conventional logsumexp choice c=\max_i \delta_i. This version can take advantage of the special function log1p to be more accurate for small arguments.

Note that we can equivalently (although risking some numerical cancellation in the implementation) write \sum_{i\ne j} \exp(\delta_i-c) = -\exp(\delta_j-c)+\sum_{i=1}^n \exp(\delta_i-c), where the right-side version allows the precomputation of the sum. Weigh that tradeoff of efficiency versus precision. Below, I have opted for the efficient version that risks that cancellation error in some edge cases:

expscale = let c=maximum(deltas); z->exp(z-c); end
sumexpdeltas = sum(expscale, deltas)
x = [log(sumexpdeltas) - log(sumexpdeltas - expscale(d)) for d in deltas] # initial version
x = [(ed = expscale(d); log1p(ed/(sumexpdeltas-ed))) for d in deltas] # with log1p trasform


EDIT: the cancellation error of the sum transformation I used is most-significant when the sum is dominated by a single term (one \delta_i is much bigger than all others). If this is a possibility, you’ll want to use the leave-one-out version of the summation instead. The log1p version avoids a numerical error when \delta_j is much smaller than \max_i \delta_i.

EDIT 2: For the purposes of my previous statement and Float64 precision, a difference of 20 or 30 would be the threshold to consider a number “much bigger”. That isn’t necessarily a lot, so I suspect you’ll want to use both leave-one-out summation and the transformed log1p equation to be safe.

EDIT 3:
Just a teensy bit more algebra yields
x_j = \log\left(1+\frac{1}{\sum_{i\ne j} \exp(\delta_i-\delta_j)}\right)
Given my previous numerical concerns, this is the version I’ll recommend you implement.

1 Like