Saving Probabilities in Log Domain (Constructor Issues)

I am working with probabilities that can get extremely small, so I want to (optionally) store them in the log domain and operate there. I do not want my function code to change depending on whether I am using the log domain or not. Note that plain probabilities are faster, whereas the log domain is more numerically stable. I want my loss function to be able to handle both.

I first define log domain numbers.

import Base: +, *, log, convert

struct LogNum{F<:AbstractFloat} <: AbstractFloat
    v::F                   # stores log(p), not p itself
end
*(a::LogNum, b::LogNum) = LogNum(a.v + b.v)                 # log(p*q) = log p + log q
+(a::LogNum, b::LogNum) = LogNum(log(exp(a.v) + exp(b.v)))  # naive log-sum-exp
log(a::LogNum) = a.v
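
As a quick check of why this helps (using only the definitions above): multiplying two tiny probabilities underflows to zero in the plain domain, but stays finite in the log domain.

1e-300 * 1e-300          # 0.0, underflows in Float64
a = LogNum(log(1e-300))  # stores -690.77...
b = LogNum(log(1e-300))
log(a * b)               # -1381.55..., i.e. log(1e-600), still finite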

Now I have a function that works on probabilities. I want the second argument to be the Type in which I want the computations done.

function loss(p::AbstractMatrix{<:AbstractFloat}, ::Type{T}) where T   # T can be Float64 or LogNum{Float64}
    s = Array{T}(p[:, 1])  # When T === LogNum{Float64},
                           # this line should automatically take logarithms and save s as Vector{LogNum{Float64}}.
                           # How do I write constructors to do that???
    for i in 2:size(p, 2)
        # Do some operations on s based on p[:, i] using *, +
    end
    log(s[end])
end

Array{LogNum}([0., 1., 2.]) should give me

3-element Array{LogNum,1}:
 LogNum{Float64}(-Inf)
 LogNum{Float64}(0.0)
 LogNum{Float64}(0.693)

How do I write the constructors so that my code in loss does not change?
How do I gracefully convert F<:AbstractFloat to LogNum{F}?

I think my problem is that the original number is Float64 and so is the stored value v, so I am unable to use dispatch properly to construct a LogNum.

Thanks.

Possibly, but this depends on the distribution (some calculations need logs), and exp is rather cheap anyway. FWIW, I would just use logs everywhere.

I would approach this differently, and define a version of loss that operates on LogNums, and then wrap that in a function that converts.


@Tamas_Papp, I am trying to reduce code duplication, since I can use Julia’s powerful multiple dispatch system to run the exact same code in two domains, based on type alone.
I say this jokingly: write two pieces of code? Like a poor Python programmer? Nah!
My main intention is to get better at Julia, so this is kind of also an academic question.

My question is really simple. Forget the use case.
How do I write a good constructor for problems like this?

struct ExpNum{F<:AbstractFloat} <: AbstractFloat
    v::F
end
ExpNum(a::AbstractFloat) = ExpNum(exp(a))
ExpNum(0.)  # Need this to store the number 1.0 as `v`
# But this overflows the stack: exp(a) is itself an AbstractFloat,
# so this method keeps calling itself

I don’t think you understood what I suggested, which is along the lines of

function loss(p::AbstractMatrix{<:LogNum})
    # actual implementation goes here
end

function loss(p::AbstractMatrix{<:Real})
    loss(LogNum.(log.(p)))
end

For robustness, I would not make LogNum <: AbstractFloat, and always require explicit conversion.

You can distinguish an inner constructor somehow, eg

struct ExpNum{F<:AbstractFloat} <: AbstractFloat
    v::F
    # Inner constructor with a distinct signature: stores x as-is.
    function ExpNum(x::T, ::Val{:already_exp}) where {T<:AbstractFloat}
        new{T}(x)
    end
end
# The outer constructor exponentiates and then dispatches to the inner
# one, so it never calls itself.
ExpNum(a::AbstractFloat) = ExpNum(exp(a), Val{:already_exp}())
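
With the two signatures separated, a plain float goes through exp exactly once:

ExpNum(0.0)                       # ExpNum{Float64}(1.0), no recursion
ExpNum(1.0, Val{:already_exp}())  # ExpNum{Float64}(1.0), stored as-is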

Another solution is to use Base.convert for conversion to ExpNum and state in the documentation that the ExpNum constructor takes a number that has already been exponentiated.

Expressions like ExpNum[0., 1., 2.] should then work as expected, because Base always uses convert rather than constructors for conversion.

(I think this is the reason why convert exists in the first place.)

Example:

struct ExpNum{F<:AbstractFloat} <: AbstractFloat
    v::F
end

Base.convert(T::Type{<:ExpNum}, x) = T(exp(x))

Array{ExpNum{Float64}}([0., 1., 2.]) 
# Returns [ExpNum(exp(0)), ExpNum(exp(1)), ExpNum(exp(2))]

Thanks @Tamas_Papp, this is what I was looking for.

As for the use case, I want to be able to run the algorithm (ideally the same code) in both the plain domain (not logged) and the log domain. The plain domain is much faster in my (very complicated) loss function, and the user does not always need the extra robustness, so they should not have to give up speed. I want to keep that option open.

Now I have achieved that and it works wonderfully well. Same code! The magic of Julia!
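
For anyone who lands here later, a minimal sketch of the full pattern (the loop body is just a stand-in; my real loss is far more complicated):

import Base: +, *, log

struct LogNum{F<:AbstractFloat} <: AbstractFloat
    v::F
end
*(a::LogNum, b::LogNum) = LogNum(a.v + b.v)
+(a::LogNum, b::LogNum) = LogNum(log(exp(a.v) + exp(b.v)))
log(a::LogNum) = a.v

# convert takes the logarithm, so Array{LogNum{F}} construction does the
# right thing element-wise; plain floats pass through Base's identity convert.
Base.convert(T::Type{<:LogNum}, x::Real) = T(log(x))

function loss(p::AbstractMatrix{<:AbstractFloat}, ::Type{T}) where T
    s = Array{T}(p[:, 1])              # logs are taken iff T <: LogNum
    for i in 2:size(p, 2)
        s = s .* convert.(T, p[:, i])  # stand-in recurrence using only *
    end
    log(s[end])
end

p = rand(3, 4)
loss(p, Float64) ≈ loss(p, LogNum{Float64})  # true, up to rounding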

I am skeptical about both claims: log / exp are not that costly, and if you do not use log probabilities, then likelihoods will degrade (usually over/underflow) even for trivial models (unless you employ a host of tricks which are orders of magnitude more costly than log / exp). Intuitively, it is very similar to using e.g. Float16 for calculations: occasionally worth it, but very rarely.

Did you profile your code and find that log / exp and related transformations are a significant bottleneck?