How to handle "optional data"?


#1

Hi everyone,

Let’s say I have a package that provides some functions. Of these functions, only a few depend on some global data that is downloaded from the internet. The package should be able to download the data automatically but should not force this on the user (e.g. by triggering the download in __init__), because they might already have the data elsewhere. Also, the rest of the package should work even if the data is not available.

The question is: what should the type of this “optional data” be?

module SomeModule

# What am I?
const data = ?

# I don't need the data
f(x) = 1.0x

# I don't need the data
g(x) = 2.0x

# I need the data and will error if it's not here
h(x) = data*x

end

I have experimented with several solutions and would like to know which one you would prefer.

  1. Nullable

    Works in principle but gives a constant redefinition warning when the data is made available.

    const data = Nullable{TypeOfData}()
    
    function download()
        global data
        # Qualified call: a bare download(...) would resolve to this module's own download()
        file = Base.download(...)
        # Will give a constant redefinition warning
        data = Nullable(TypeOfData(file))
    end
    
    function h(x)
        if isnull(data)
            error()
        end
        get(data)*x
    end
    
  2. Wrap Nullable in a mutable type

    type OptionalData{T}
        data::Nullable{T}
    end
    
    OptionalData{T}(::Type{T}) = OptionalData(Nullable{T}())
    
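    # Extend the Base functions; unqualified definitions would create new
    # module-local functions, and get(opt.data) would then find no method.
    function Base.push!{T}(opt::OptionalData{T}, data::T)
        opt.data = Nullable(data)
        opt
    end
    
    Base.get(opt::OptionalData) = get(opt.data)
    Base.isnull(opt::OptionalData) = isnull(opt.data)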
    
    const data = OptionalData(TypeOfData)
    
    function download()
        # Qualified call: a bare download(...) would resolve to this module's own download()
        file = Base.download(...)
        push!(data, TypeOfData(file))
    end
    
    function h(x)
        if isnull(data)
            error()
        end
        get(data)*x
    end
    
  3. Just use a Ref

    const data = Ref{TypeOfData}()
    
    function download()
        # Qualified call: a bare download(...) would resolve to this module's own download()
        file = Base.download(...)
        data[] = TypeOfData(file)
    end
    
    function h(x)
        if !isassigned(data)
            error()
        end
        data[]*x
    end
    

#2

I would prefer the Ref solution. Alternatively, wrapping the Nullable in a Ref gives you the same as 2.


#3

Why do you want to declare data as “constant” when it is not?
data = Nullable{TypeOfData}() will work fine, as long as you don’t prefix it with const.

Put a type assertion in h() to get efficient code.

h(x) = get(data::Nullable{TypeOfData})*x
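
The reasoning here is that a non-const global can be rebound to a value of any type, so the compiler cannot specialize code that reads it; the assertion at the point of use restores a known type. A minimal sketch of the same idea, assuming a plain Float64 in place of Nullable{TypeOfData} (the module name AssertSketch and the setter set_data! are hypothetical, for illustration only):

```julia
module AssertSketch

# Non-const global: it may be rebound to any type, so the compiler
# cannot assume one when compiling functions that read it.
data = nothing

# Hypothetical setter standing in for the download step.
function set_data!(value::Float64)
    global data = value
    return nothing
end

# The assertion fixes the type used in the multiplication and throws a
# TypeError if the data has not been set yet.
h(x) = (data::Float64) * x

end # module
```

Calling AssertSketch.h before AssertSketch.set_data! fails at the assertion, which plays the role of the explicit "data is missing" check in the other variants.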

#4

A trick I have used several times, which may not be idiomatic or efficient, is to define a global const data = TypeOfData[]. You can then check isempty(data) and work on data[1].
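
A rough sketch of how this vector trick could look in a module, assuming Float64 as a stand-in for TypeOfData (the module name VectorSketch and the loader load! are hypothetical and not part of any package mentioned in this thread):

```julia
module VectorSketch

# Float64 stands in for TypeOfData, purely for illustration.
# The binding is const, but the vector's contents can still change.
const data = Float64[]

# Hypothetical loader; a real package would download and parse a file here.
function load!(value)
    empty!(data)          # keep at most one element
    push!(data, value)
    return nothing
end

# Needs the data and errors if it has not been loaded.
function h(x)
    isempty(data) && error("data not loaded; call load! first")
    return data[1] * x
end

end # module
```

Because the vector itself is never rebound, there is no constant redefinition warning, and the element type stays known to the compiler.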


#5

I initially used the Ref pattern but found myself repeating it quite often. So in the end I went with approach 2 and turned it into a package: https://github.com/helgee/OptionalData.jl