I have an n-dimensional vector \delta and I would like to compute the following vector expression.

\left[ \log \left( \sum_{i=1}^n \exp(\delta_i) \right) - \log \left( \sum_{i \neq j} \exp(\delta_i) \right) \right]_{j=1}^n

What’s the best way to do this compactly (hopefully without loops)?

I have access to the function `logsumexp` from StatsFuns, so ideally I could use that, but it’s not necessary. Thanks!

`logsumexp` takes an iterator as its argument, so this works:

```
using StatsFuns  # provides logsumexp

f(d) = logsumexp(d) .- (logsumexp(d[i] for i in eachindex(d) if i != j) for j in eachindex(d))
```
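For a quick sanity check, you can compare this against a direct (less numerically careful) evaluation of the definition; the input vector here is just an illustration:

```julia
using StatsFuns  # third-party package providing logsumexp

f(d) = logsumexp(d) .- (logsumexp(d[i] for i in eachindex(d) if i != j) for j in eachindex(d))

d = [0.5, -1.2, 2.0]  # arbitrary example input
direct = [log(sum(exp, d)) - log(sum(exp(d[i]) for i in eachindex(d) if i != j))
          for j in eachindex(d)]
f(d) ≈ direct  # true for moderate inputs like this one
```

The broadcast over the outer generator works because Julia collects iterators with a known length before broadcasting.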


With some algebra, the expression you are trying to evaluate is equal to

x_j = \log\left(1+\frac{\exp(\delta_j-c)}{\sum_{i\ne j} \exp(\delta_i-c)}\right)

for any choice of c. I’ll recommend the conventional `logsumexp` choice c=\max_i \delta_i. This version can take advantage of the special function `log1p` to be more accurate for small arguments.

Note that we can equivalently (albeit risking some numerical cancellation in the implementation) write \sum_{i\ne j} \exp(\delta_i-c) = -\exp(\delta_j-c)+\sum_{i=1}^n \exp(\delta_i-c), where the right-hand version allows the sum to be precomputed once rather than n times. Weigh that tradeoff of efficiency versus precision. Below, I have opted for the efficient version, which risks cancellation error in some edge cases:

```
expscale = let c=maximum(deltas); z->exp(z-c); end
sumexpdeltas = sum(expscale, deltas)
x = [log(sumexpdeltas) - log(sumexpdeltas - expscale(d)) for d in deltas] # initial version
x = [(ed = expscale(d); log1p(ed/(sumexpdeltas-ed))) for d in deltas] # with log1p transform
```

EDIT: the cancellation error of the sum transformation I used is most significant when the sum is dominated by a single term (one \delta_i is much bigger than all the others). If this is a possibility, you’ll want to use the leave-one-out version of the summation instead. The `log1p` version avoids a numerical error when \delta_j is much smaller than \max_i \delta_i.
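To make that cancellation concrete, here is a tiny sketch (the values are chosen purely for illustration) where the precomputed-sum version collapses to -Inf even though the true answer is finite:

```julia
deltas = [100.0, 0.0, 0.0]   # one delta dominates all the others
c = maximum(deltas)
s = sum(exp(d - c) for d in deltas)  # == 1.0 exactly: the e^{-100} terms vanish below eps(1.0)
s - exp(deltas[1] - c)               # == 0.0 by cancellation, so log(...) returns -Inf
# true value of the leave-one-out sum: 2 * exp(-100) ≈ 7.4e-44, whose log is ≈ -99.3
```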

EDIT 2: For the purposes of my previous statement and `Float64` precision, a difference of 20 or 30 would be the threshold to consider a number “much bigger”. That isn’t necessarily a lot, so I suspect you’ll want to use both the leave-one-out summation and the transformed `log1p` equation to be safe.
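For a rough sense of that threshold (my own illustration, not from the numbers above): a term e^{-d} stops registering next to 1.0 once d exceeds roughly 36, because eps(1.0) ≈ 2.2e-16 in `Float64`:

```julia
1.0 + exp(-30.0) == 1.0  # false: exp(-30) ≈ 9.4e-14 still affects the sum
1.0 + exp(-37.0) == 1.0  # true:  exp(-37) ≈ 8.5e-17 is below eps(1.0)/2 and is rounded away
```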

EDIT 3:

Just a teensy bit more algebra yields

x_j = \log\left(1+\frac{1}{\sum_{i\ne j} \exp(\delta_i-\delta_j)}\right)

Given my previous numerical concerns, this is the version I’ll recommend you implement.
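For completeness, a direct implementation of this recommended form might look like the sketch below (the function name is mine; it assumes `deltas` has at least two elements, and the leave-one-out sums make it O(n²)):

```julia
# x_j = log1p(1 / sum_{i != j} exp(delta_i - delta_j)), evaluated per entry.
# If some delta_i - delta_j is huge, exp overflows to Inf, giving 1/Inf == 0
# and log1p(0) == 0.0, which matches the true (underflowed) value anyway.
function loo_logsumexp_diff(deltas)
    [log1p(1 / sum(exp(deltas[i] - deltas[j]) for i in eachindex(deltas) if i != j))
     for j in eachindex(deltas)]
end

loo_logsumexp_diff([0.0, 0.0])  # == [log(2), log(2)]
```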
