Proper declaration of function accepting Dicts with mixed number types

I have a function that accepts a dictionary as input, where the keys are always strings and the values are vectors of real numbers. In practice the values may be integers, floats, or any other real type. What matters is that the function does not work if the values are not real numbers, so a vector of strings or of complex numbers, for example, is not a valid input.

julia> function foo(input_dict::Dict{String, <:AbstractVector{<:Real}})
           println(input_dict)
       end
foo (generic function with 1 method)

If I declare a dictionary like so

julia> a = Float32.(rand(3));
julia> b = Float64.(rand(3));
julia> c = Integer.(ceil.(rand(3).*10));
julia> my_dict = Dict("key1"=>a, "key2"=>b, "key3"=>c)
Dict{String, Vector} with 3 entries:
  "key2" => [0.223777, 0.394267, 0.0128682]
  "key3" => [6, 3, 10]
  "key1" => Float32[0.788629, 0.146732, 0.93549]

then

julia> typeof(my_dict)
Dict{String, Vector}

and the call to foo fails, apparently because my_dict has the type Dict{String, Vector} (without any qualifier on the element type of the vectors).

julia> foo(my_dict)
ERROR: MethodError: no method matching foo(::Dict{String, Vector})
Closest candidates are:
  foo(::Dict{String, <:AbstractVector{<:Real}}) at REPL[5]:1
Stacktrace:
 [1] top-level scope
   @ REPL[6]:1

On the other hand, this will work fine if all the value entries in the dict have the same type

julia> a = rand(3);
julia> b = rand(3);
julia> my_other_dict = Dict("a"=>a, "b"=>b)
Dict{String, Vector{Float64}} with 2 entries:
  "b" => [0.700047, 0.765836, 0.603907]
  "a" => [0.623392, 0.476285, 0.500553]

julia> foo(my_other_dict)
Dict("b" => [0.7000467111406179, 0.7658355327841554, 0.6039073311484552], "a" => [0.6233916490131216, 0.4762846111664425, 0.500552988508849])

This function is supposed to be part of a package, so ideally the fix involves changing the definition of foo rather than requiring the dictionary to be typed during creation.

What I’m wondering about is the “correct” way to fix this, in terms of the most Julia-esque way of doing it. I’m not terribly concerned about efficiency, as this isn’t going to be handling high volumes of data.

My first thought was to do something with overloading, i.e.

function foo(key::String, value::AbstractVector{<:Real})
...
end
# or...
function foo(input::Dict{String, <:AbstractVector{<:Real}})
...
end
# Then...
function foo(input::Dict{String, <:AbstractVector})
    for (key, value) in input
        foo(key, value)
    end
    # Or perhaps could also construct a dict for each entry?
    for (key, value) in input
        foo(Dict(key=>value))
    end
end

My only problem with this approach is that foo ultimately deals with writing to a file that already exists. It wouldn’t be great to have it write the first few keys and then fail halfway through because one key has the wrong type, which was why I was hoping to structure foo in a way that it fails before trying to write anything at all.

What about

foo(input::Dict)

That would ultimately still allow “bad” inputs though, wouldn’t it?

The file format can only support String keys and Vector{<:Real} values, so I’d want to avoid, for example, Dict{Integer, String}. Perhaps the solution is some sort of sanity check within foo rather than trying to type foo such that it can’t be called with bad inputs?


Because the values are abstractly typed, anything using this dictionary will be type-unstable. You might as well just use foo(input::AbstractDict) and duck type.

The basic issue here is that:

julia> typejoin(Vector{Int}, Vector{Float64})
Vector

and not any more specific type.

If you want to avoid bad inputs at the type level, then you have to error on

julia> Vector
Vector (alias for Array{T, 1} where T)

You can’t have it both ways.

The most Julian way is to allow bad inputs, and let it error when those non-reals are used in a real context.
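
To make the mismatch concrete, here is a small sketch of the subtype relations involved. It only uses the types already shown in this thread; the comments state what each expression evaluates to.

```julia
# The value type inferred for the mixed dictionary is the bare `Vector`
# (i.e. `Array{T, 1} where T`), which also covers Vector{String}, so it
# is not a subtype of the constraint in foo's signature.
@show Vector <: AbstractVector{<:Real}           # false
@show Vector{Int} <: AbstractVector{<:Real}      # true
@show Vector{Float32} <: AbstractVector{<:Real}  # true

# The Dict type only matches when its value type satisfies the bound.
@show Dict{String, Vector} <: Dict{String, <:AbstractVector{<:Real}}           # false
@show Dict{String, Vector{Float64}} <: Dict{String, <:AbstractVector{<:Real}}  # true
```

This is why dispatch rejects `Dict{String, Vector}` even when every vector it happens to hold contains only real numbers: the method signature is matched against the container's declared type, not against its contents.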


I would like to add that arguments should be checked at the call site, not inside the function.

You can provide a helper function to validate arguments.

For example, avoid putting the check inside the function like this:


function foo_bad(a) 
   if isa(a, String)
       throw("a cannot be a string")
   end
   a * 3
end

because then every caller pays the price of the check. Instead, keep the check separate:


julia> check_a(a) = isa(a, String) && throw("a cannot be a string") || true
julia> foo_good(a) = a * 3
julia> check_a(1) && foo_good(1)
3

Note that if you’re checking a type there in order to disallow it, having the check in the function is perfectly fine. Type-based checks are static information, so the compiler can eliminate them in most cases. Putting that check in another function is dangerous, since you may forget to check the invariant in some places.
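
A minimal sketch of such an in-function type check, assuming a hypothetical function name `foo_checked`. When the method is compiled for a concrete argument type such as `Int`, the result of `a isa String` is known at compile time, so the branch can be dropped from the specialized code. The typed code can be inspected with `code_typed` to confirm this on a given Julia version.

```julia
using InteractiveUtils  # provides code_typed outside the REPL

# Hypothetical example: reject strings inside the function itself.
function foo_checked(a)
    a isa String && throw(ArgumentError("a cannot be a string"))
    return a * 3
end

foo_checked(1)      # returns 3
foo_checked(1.5)    # returns 4.5

# Inspect the code specialized for Int to see what remains of the check.
code_typed(foo_checked, (Int,))
```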

Obviously this is just opinion. I’m from the old school, where writing functions that assume the validity of their arguments is a design principle.

I’ve always taken this in the context of an internal function, where you’ve validated the arguments already. When writing a user-facing API, checking invariants is usually a good idea.

This is still a valid principle, it’s just not enforced at function definition by the language. Like Python, Julia is a dynamic language.

I’m also an “old timer” and struggled with this for a while. But types in Julia are meant for dispatching and specialization, not for structural invariant checking. You can, however, add checks to the function to assert your expectations as needed.

Actually, the Julian way is duck typing - to have as little type information as possible in function definitions to allow composability.

This is all great discussion! Thank you for the replies!

My main issue is that letting the wrong stuff get through to the file could result in corruption (or rather, non-conformity with the spec; strictly speaking, I think the format itself can handle other datatypes as long as the keys are strings). It sounds like the best approach is to add a validity check that the write function calls before any writes take place, to make sure everything to be written has a supported type, so that’s the route I will go.
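
A minimal sketch of that validate-first approach. The helper name `validate_input` is hypothetical, and this `foo` is only a stand-in that collects strings rather than writing to a real file; the point is that every entry is checked before the first write would happen.

```julia
# Hypothetical helper: throws before anything is written if any entry is unsupported.
function validate_input(input::AbstractDict)
    for (key, value) in input
        key isa AbstractString ||
            throw(ArgumentError("key $(repr(key)) is not a string"))
        value isa AbstractVector{<:Real} ||
            throw(ArgumentError("value for key $(repr(key)) has type $(typeof(value)); expected a vector of real numbers"))
    end
    return nothing
end

# Stand-in for the real write function; it collects lines instead of touching a file.
function foo(input::AbstractDict)
    validate_input(input)      # fail here, before any write happens
    written = String[]
    for (key, value) in input
        push!(written, string(key, " => ", value))
    end
    return written
end

# Accepted even though typeof(mixed) is Dict{String, Vector}.
mixed = Dict("key1" => Float32[0.5, 1.5], "key2" => [1.0, 2.0], "key3" => [6, 3, 10])
foo(mixed)

# Rejected as a whole: foo(bad) throws an ArgumentError and nothing is written.
bad = Dict("key1" => [1.0, 2.0], "key2" => ["a", "b"])
```

One design choice to be aware of: the check looks at each vector's declared element type, so a `Vector{Any}` that happens to contain only real numbers is rejected too. Checking `all(x -> x isa Real, value)` instead would accept it, at the cost of scanning every element.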

Thank you for all the responses!


It is a shame we don’t have domains applied to types too.

e.g. something like a::Int(range=1:100) which would restrict a to the values 1:100 inclusive

You can make your own type that does this. I don’t see how this could be made a part of the native machine integer type…
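
A rough sketch of such a user-defined type, assuming a hypothetical name `BoundedInt` and the fixed range 1:100 from the example above. The inner constructor is the only way to create a value, so the range is checked once at construction.

```julia
# Hypothetical wrapper type: an Int that is checked against 1:100 on construction.
struct BoundedInt
    value::Int
    function BoundedInt(value::Integer)
        value in 1:100 || throw(DomainError(value, "expected a value in 1:100"))
        return new(value)
    end
end

# Functions that take a BoundedInt can rely on the range without re-checking it.
fraction(b::BoundedInt) = b.value / 100

fraction(BoundedInt(25))    # returns 0.25
# BoundedInt(250) throws a DomainError
```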
