Softmax applied to arrays?

Hi everybody,

I would like to repeatedly apply (a variation of) the softmax function to parts of an array.

I would like to know whether there are ways to leverage packages like Tullio or Yeppp, or other Julian features, to improve this.

For example, for each individual i there are J vectors x_{i,j} of K choice characteristics. The probability of choosing option j among 1,...,J is given by the softmax function:

s_{i,j}(\theta) = \frac{\exp(x_{i,j}^\prime \theta)}{1 + \sum_{h=1}^J \exp(x_{i,h}^\prime \theta)}

The probability of not choosing anything (j=0) is

s_{i,0}(\theta) = \frac{1}{1 + \sum_{h=1}^J \exp(x_{i,h}^\prime \theta)}

Let y_{i,j} = 1 if i purchases j and zero otherwise. The log-likelihood of observing a sample of N individuals is

\frac{1}{N}\sum_{i=1}^N \sum_{j=0}^J y_{i,j} \log\left(s_{i,j}(\theta)\right).
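In code, these formulas for a single individual look something like this (a minimal sketch; xi is a placeholder for that individual's K × J matrix of characteristics, and the max-shift is only for numerical stability):

v  = xi' * θ              # J utilities x_{i,j}'θ
u  = maximum(v)
e  = exp.(v .- u)         # shifted exponentials
d  = exp(-u) + sum(e)     # the "1 + Σ exp" denominator, shifted by u
s  = e ./ d               # s_{i,j}, j = 1,...,J
s0 = exp(-u) / d          # s_{i,0}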

An MWE where I try to apply the softmax! function repeatedly over the dataset is:

using TensorCore, BenchmarkTools, Tullio

# setup
N = 1_000_000   # observations
J = 4           # options (there's an extra "outside option")
K = 20          # product characteristics

# data
θ = randn(K)            # parameters
y = rand(0:10,J+1,N)    # purchases
x = rand(K,J,N)         # product characteristics

# softmax over a vector of utilities, with the outside option included in the denominator
function softmax!(
    r::AbstractVector,
    xθ::AbstractVector
)
    J = length(xθ)
    u = maximum(xθ)
    s = exp(-u)                 # the "1 +" term of the denominator, shifted by u
    @inbounds for j = 1:J
        s += (r[j] = exp(xθ[j] - u))
    end
    invs = inv(s)
    @inbounds for j = 1:J
        r[j] *= invs
    end
    return r
end

# log-likelihood function
function llk(
    y::AbstractMatrix,  # (J+1) × N
    x::AbstractArray,   # K × J × N
    θ::AbstractVector   # K-vector
)
    K,J,N = size(x)
    r  = zeros(J)
    ll = 0.0
    xθ = θ ⊡ x                                  # J × N matrix of utilities
    for n = 1:N
        softmax!(r, view(xθ, :, n))             # a view avoids allocating the slice
        for j = 1:J
            ll += y[j,n] * log(r[j])            # likelihood of choices 1,...,J
        end
        ll += y[J+1,n] * log(1.0 - sum(r))      # likelihood of the outside good
    end
    return ll/N
end

@btime llk($y,$x,$θ)

Is there any way to leverage Tullio to obtain the choice probabilities? I was thinking about something that involves the entire array…

# softmax over entire array using Tullio?
xθ = θ ⊡ x
@tullio num[j,n] := exp(xθ[j,n])
@tullio denom[n] := num[j,n]
@tullio shares[j,n] := num[j,n] / (1.0 + denom[n])
@tullio shares0[n] := 1.0 / (1.0 + denom[n])
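
A variant that keeps the max-shift from softmax! might look like this (untested sketch; m[n] is the column-wise maximum, and after shifting the 1 in the denominator becomes exp(-m[n])):

xθ = θ ⊡ x
@tullio (max) m[n] := xθ[j,n]                              # column-wise max for stability
@tullio num[j,n] := exp(xθ[j,n] - m[n])
@tullio denom[n] := num[j,n]
@tullio shares[j,n] := num[j,n] / (exp(-m[n]) + denom[n])
@tullio shares0[n] := exp(-m[n]) / (exp(-m[n]) + denom[n])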

Thank you all, the Julia community is very helpful!


BTW, how can I write centered, typeset equations here?

You may want NNlib.softmax, or more specifically something like dot(logsoftmax(θ ⊡ x; dims=1), @view y[2:end,:])/N. That’s not quite right, as you have this extra zeroth case; perhaps you can adapt it to keep the “denominator”. But working with solid arrays not slices, and directly working out the log, are likely to help.
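
Something along these lines might work (an untested sketch of that idea; here the outside option gets an explicit row of zero utilities, and y is reordered so that row comes first):

using NNlib, TensorCore, LinearAlgebra

xθ = θ ⊡ x                                   # J × N utilities
v  = vcat(zeros(1, size(xθ, 2)), xθ)         # row 1 is the outside option (utility 0)
y0 = vcat(y[end:end, :], y[1:end-1, :])      # put the outside-good counts first, matching v
ll = dot(logsoftmax(v; dims=1), y0) / size(xθ, 2)   # average log-likelihood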
