Hi everybody,
I would like to repeatedly apply (a variation of) the softmax function to parts of an array.
I would like to know whether there are ways to leverage packages like Tullio or Yeppp, as well as other idiomatic Julia features, to speed this up.
For example, for each individual i there are J vectors x_{i,j} of K choice characteristics each. The probability of choosing alternative j out of 1, …, J is given by the softmax function:
s_{i,j}(\theta) = \frac{\exp(x_{i,j}^\prime \theta)}{1 + \sum_{h=1}^J \exp(x_{i,h}^\prime \theta)}
The probability of not choosing anything (j=0) is
s_{i,0}(\theta) = \frac{1}{1 + \sum_{h=1}^J \exp(x_{i,h}^\prime \theta)}
Let y_{i,j} = 1 if i purchases j and y_{i,j} = 0 otherwise. The (average) log-likelihood of observing a sample of N individuals is
\frac{1}{N}\sum_{i=1}^N \sum_{j=0}^J y_{i,j} \log\left(s_{i,j}(\theta)\right).
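(For concreteness, here is a direct transcription of the two share formulas for a single individual i, with xi a K × J slice of the data; this is just a naive reference I use to check the faster code below against, not part of the MWE itself.)
# naive reference for one individual: xi is K × J, θ is length-K
# returns the J inside shares s_{i,j} and the outside share s_{i,0}
function shares_naive(xi, θ)
    v = exp.(xi' * θ)        # exp(x_{i,j}'θ) for j = 1, …, J
    d = 1 + sum(v)           # 1 + Σ_h exp(x_{i,h}'θ)
    return v ./ d, 1 / d
end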
An MWE where I try applying the softmax! function repeatedly over the dataset:
using TensorCore, BenchmarkTools, Tullio
# setup
N = 1_000_000  # observations
J = 4          # options (plus an extra "outside option")
K = 20         # product characteristics
# data
θ = randn(K)            # parameters
y = rand(0:10,J+1,N)    # purchase counts (the binary y_{i,j} above, relaxed to counts for the MWE)
x = rand(K,J,N)         # product characteristics
# softmax over J options plus an outside option whose utility is normalized to 0
function softmax!(
    r::AbstractVector,   # output: J choice probabilities
    xθ::AbstractVector   # input: utilities x_{i,j}'θ, j = 1, …, J
)
    J = length(xθ)
    u = max(zero(eltype(xθ)), maximum(xθ))  # include the outside option's 0 in the shift for stability
    s = exp(-u)                             # the "1 +" term in the denominator, shifted by u
    @inbounds for j = 1:J
        s += (r[j] = exp(xθ[j] - u))
    end
    invs = inv(s)
    @inbounds for j = 1:J
        r[j] *= invs
    end
    return r
end
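As a quick sanity check (mine, not part of the MWE): the J inside shares plus the implied outside share should sum to one.
# sanity check: inside shares plus the outside share sum to 1
vtest = randn(3)
rtest = zeros(3)
softmax!(rtest, vtest)
sum(rtest) + inv(1 + sum(exp, vtest)) ≈ 1.0   # should be true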
# log-likelihood function
function llk(
    y::AbstractMatrix,  # (J+1) × N
    x::AbstractArray,   # K × J × N
    θ::AbstractVector   # K
)
    K, J, N = size(x)
    r = zeros(J)
    ll = 0.0
    xθ = θ ⊡ x                              # J × N matrix of utilities x_{i,j}'θ
    for n = 1:N
        softmax!(r, view(xθ, :, n))         # view avoids copying a column per iteration
        for j = 1:J
            ll += y[j,n] * log(r[j])        # likelihood of choices 1,...,J
        end
        ll += y[J+1,n] * log(1.0 - sum(r))  # likelihood of the outside good
    end
    return ll / N
end
@btime llk($y,$x,$θ)
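As an aside, here is a sketch of mine (llk_lse is a made-up name, and it reuses ⊡ from TensorCore above): since log s_{i,j} = x_{i,j}'θ - log(1 + Σ_h exp(x_{i,h}'θ)), the likelihood can be accumulated straight from xθ with one log-sum-exp per individual, never forming the shares at all.
# sketch: accumulate the log-likelihood via one log-sum-exp per individual
function llk_lse(y::AbstractMatrix, x::AbstractArray, θ::AbstractVector)
    K, J, N = size(x)
    ll = 0.0
    xθ = θ ⊡ x                               # J × N utilities
    @inbounds for n = 1:N
        u = max(0.0, maximum(view(xθ, :, n)))
        s = exp(-u)                          # outside option's exp(0 - u)
        ytot = float(y[J+1, n])              # total choices, starting with the outside good
        for j = 1:J
            s += exp(xθ[j, n] - u)
            ll += y[j, n] * xθ[j, n]         # numerator terms Σ_j y_{i,j} x_{i,j}'θ
            ytot += y[j, n]
        end
        ll -= ytot * (u + log(s))            # shared log-denominator for all J+1 terms
    end
    return ll / N
end
Besides dropping the per-observation division, this sidesteps log(1.0 - sum(r)), which loses precision when the inside shares are close to 1.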
Is there any way to leverage Tullio to obtain the choice probabilities? I was thinking about something that involves the entire array…
# softmax over entire array using Tullio?
xθ = θ ⊡ x
@tullio num[j,n] := exp(xθ[j,n])
@tullio denom[n] := num[j,n]
@tullio shares[j,n] := num[j,n] / (1.0 + denom[n])
@tullio shares0[n] := 1.0 / (1.0 + denom[n])
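If it helps, the shares above could also be folded into the log-likelihood with two more scalar reductions (again a sketch of mine; note that, unlike softmax!, this Tullio version skips the max-subtraction, so it can overflow for large utilities):
# sketch: assembling the (average) log-likelihood from the Tullio shares above
yj = y[1:J, :]                             # purchases of the inside goods
y0 = y[J+1, :]                             # purchases of the outside good
@tullio ll  := yj[j,n] * log(shares[j,n])  # choices 1, …, J
@tullio ll0 := y0[n] * log(shares0[n])     # outside good
(ll + ll0) / N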
Thank you all; the Julia community is very helpful!