Are generic types worth the complexity?

When writing methods for a function in my package, I used abstract type declarations for two reasons:

  1. Type declarations will ensure a user gets MethodError rather than some other internal error if they use the wrong type.
  2. I heard generic types are appreciated in packages, so people can use your function with their own subtypes.

However, now I am having trouble remembering how to use my own function, let alone making it user-friendly for others.

VSCode's pop-up help is confusing.

[Screenshot: VSCode pop-up help]

The output of `methods` is unreadable.
julia> methods(polyfit)
# 12 methods for generic function "polyfit":
[1] polyfit(x::AbstractArray{var"#s38", N} where {var"#s38"<:Number, N}, y::AbstractArray{var"#s37", N} where {var"#s37"<:Number, N}) in ASME_Section8_Division3_Edition2019 at C:\Users\nboyer.AIP\.julia\packages\ASME_Section8_Division3_Edition2019\aIvuO\src\fracture.jl:193
[2] polyfit(x::AbstractArray{var"#s11", N} where {var"#s11"<:Number, N}, y::AbstractArray{var"#s10", N} where {var"#s10"<:Number, N}, knots::AbstractArray{Int64, N} where N) in ASME_Section8_Division3_Edition2019 at C:\Users\nboyer.AIP\.julia\packages\ASME_Section8_Division3_Edition2019\aIvuO\src\fracture.jl:102
[3] polyfit(x::AbstractArray{var"#s8", N} where {var"#s8"<:Number, N}, y::AbstractArray{var"#s7", N} where {var"#s7"<:Number, N}, knots::AbstractArray{Int64, N} where N, polyorder::Int64) in ASME_Section8_Division3_Edition2019 at C:\Users\nboyer.AIP\.julia\packages\ASME_Section8_Division3_Edition2019\aIvuO\src\fracture.jl:102
[4] polyfit(x::AbstractArray{var"#s4", N} where {var"#s4"<:Number, N}, y::AbstractArray{var"#s2", N} where {var"#s2"<:Number, N}, knotsets::AbstractArray{var"#s1", N} where {var"#s1"<:(AbstractArray{Int64, N} where N), N}) in ASME_Section8_Division3_Edition2019 at C:\Users\nboyer.AIP\.julia\packages\ASME_Section8_Division3_Edition2019\aIvuO\src\fracture.jl:140
[5] polyfit(x::AbstractArray{var"#s18", N} where {var"#s18"<:Number, N}, y::AbstractArray{var"#s19", N} where {var"#s19"<:Number, N}, knotsets::AbstractArray{var"#s20", N} where {var"#s20"<:(AbstractArray{Int64, N} where N), N}, polyorder::Int64; fitchoice) in ASME_Section8_Division3_Edition2019 at C:\Users\nboyer.AIP\.julia\packages\ASME_Section8_Division3_Edition2019\aIvuO\src\fracture.jl:140
[6] polyfit(x::AbstractArray{var"#s36", N} where {var"#s36"<:Number, N}, y::AbstractArray{var"#s35", N} where {var"#s35"<:Number, N}, npolys::Int64) in ASME_Section8_Division3_Edition2019 at C:\Users\nboyer.AIP\.julia\packages\ASME_Section8_Division3_Edition2019\aIvuO\src\fracture.jl:193
[7] polyfit(x::AbstractArray{var"#s34", N} where {var"#s34"<:Number, N}, y::AbstractArray{var"#s33", N} where {var"#s33"<:Number, N}, npolys::Int64, polyorder::Int64; fitchoice) in ASME_Section8_Division3_Edition2019 at C:\Users\nboyer.AIP\.julia\packages\ASME_Section8_Division3_Edition2019\aIvuO\src\fracture.jl:193
[8] polyfit(xsets::AbstractArray{var"#s36", N} where {var"#s36"<:(AbstractArray{var"#s35", N} where {var"#s35"<:Number, N}), N}, ysets::AbstractArray{var"#s34", N} where {var"#s34"<:(AbstractArray{var"#s33", N} where {var"#s33"<:Number, N}), N}) in ASME_Section8_Division3_Edition2019 at C:\Users\nboyer.AIP\.julia\packages\ASME_Section8_Division3_Edition2019\aIvuO\src\fracture.jl:218
[9] polyfit(xsets::AbstractArray{var"#s18", N} where {var"#s18"<:(AbstractArray{var"#s11", N} where {var"#s11"<:Number, N}), N}, ysets::AbstractArray{var"#s10", N} where {var"#s10"<:(AbstractArray{var"#s8", N} where {var"#s8"<:Number, N}), N}, knotsets::AbstractArray{var"#s7", N} where {var"#s7"<:(AbstractArray{Int64, N} where N), N}) in ASME_Section8_Division3_Edition2019 at C:\Users\nboyer.AIP\.julia\packages\ASME_Section8_Division3_Edition2019\aIvuO\src\fracture.jl:159
[10] polyfit(xsets::AbstractArray{var"#s6", N} where {var"#s6"<:(AbstractArray{var"#s5", N} where {var"#s5"<:Number, N}), N}, ysets::AbstractArray{var"#s4", N} where {var"#s4"<:(AbstractArray{var"#s2", N} where {var"#s2"<:Number, N}), N}, knotsets::AbstractArray{var"#s1", N} where {var"#s1"<:(AbstractArray{Int64, N} where N), N}, polyorder::Int64; sortby, fitchoice) in ASME_Section8_Division3_Edition2019 at C:\Users\nboyer.AIP\.julia\packages\ASME_Section8_Division3_Edition2019\aIvuO\src\fracture.jl:159
[11] polyfit(xsets::AbstractArray{var"#s32", N} where {var"#s32"<:(AbstractArray{var"#s31", N} where {var"#s31"<:Number, N}), N}, ysets::AbstractArray{var"#s30", N} where {var"#s30"<:(AbstractArray{var"#s29", N} where {var"#s29"<:Number, N}), N}, npolys::Int64) in ASME_Section8_Division3_Edition2019 at C:\Users\nboyer.AIP\.julia\packages\ASME_Section8_Division3_Edition2019\aIvuO\src\fracture.jl:218
[12] polyfit(xsets::AbstractArray{var"#s28", N} where {var"#s28"<:(AbstractArray{var"#s26", N} where {var"#s26"<:Number, N}), N}, ysets::AbstractArray{var"#s25", N} where {var"#s25"<:(AbstractArray{var"#s24", N} where {var"#s24"<:Number, N}), N}, npolys::Int64, polyorder::Int64; sortby, fitchoice) in ASME_Section8_Division3_Edition2019 at C:\Users\nboyer.AIP\.julia\packages\ASME_Section8_Division3_Edition2019\aIvuO\src\fracture.jl:218

Moreover, I realized that to be truly generic, all of the AbstractArray types should be replaced by Union{AbstractArray, Tuple}, which would make the readability problem much worse.
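One possible way to keep such signatures readable is to name each union once with a type alias. This is only a minimal sketch: the alias names NumberCollection and NestedCollection and the placeholder function kind are hypothetical and do not come from the package above.

```julia
# Hypothetical aliases: name each union once instead of repeating it in every signature.
const NumberCollection = Union{AbstractArray{<:Number}, Tuple{Vararg{Number}}}
const NestedCollection = Union{AbstractArray{<:NumberCollection}, Tuple{Vararg{NumberCollection}}}

# Placeholder methods that only report which signature matched.
kind(x::Number) = :number
kind(x::NumberCollection) = :collection
kind(x::NestedCollection) = :nested

kind(1.5)                 # matches the Number method
kind((1, 2.0))            # a Tuple of numbers matches NumberCollection
kind([[1, 2], [3]])       # a Vector of Vectors matches NestedCollection
kind(([1, 2], (3, 4)))    # a Tuple mixing inner container types also matches
```

One caveat with this design: an empty tuple satisfies both aliases, so a call like kind(()) may be reported as ambiguous and would need its own method.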

Simplifying: I have a function f(x) where x could be a number, a collection of numbers, or a collection of collections of numbers. Each method does more than just broadcast the previous one. The collections will probably be Vectors. The numbers will probably be Floats. How would you define the methods?

TLDR: Is it better to declare an expected readable type signature for a method than to make any usable type work?

I think a good related reference could be Style Guide · The Julia Language.

Personally, I think that the “role” of the input should be clear, so allowing f to accept either numbers, list of numbers, or list of lists of numbers (which all do different things) may cause unnecessary complexity. It’s natural to think “of course it’s clear what the user means when writing this”, but supporting all those cases may complicate things a lot.

If your function requires lists of lists of numbers (I imagine that’s the more general case), then that should be the input. If all iterators work, you should probably not restrict the type; otherwise, restrict only as much as the function f needs. In other words, if the most natural way of writing f forces the user to pass inputs of a certain type, then IMO that’s a fair restriction.


It’s worth it to at least annotate argument types as AbstractVector or the like, because it won’t be long before someone wants to pass an array view to a function that only accepts Vectors.

I would not recommend constraining the element type of collection at all because e.g. units from Unitful.jl do not subtype Number but may be used as number-like values in many circumstances.
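To illustrate the difference, here is a small sketch. The functions total_abs and total_abs_strict are hypothetical examples, not part of any package mentioned in this thread.

```julia
# Hypothetical example function: sum of absolute values.
# AbstractVector accepts any one-dimensional array; the element type is left open.
total_abs(v::AbstractVector) = sum(abs, v)

# The same computation restricted to one concrete type, for comparison.
total_abs_strict(v::Vector{Float64}) = sum(abs, v)

data = [1.0, -2.0, 3.0, -4.0]

total_abs(data)               # plain Vector{Float64}
total_abs(view(data, 2:3))    # SubArray (a view, no copy)
total_abs(1:4)                # UnitRange{Int}

# The strict version rejects the view with a MethodError.
rejected = try
    total_abs_strict(view(data, 2:3))
    false
catch err
    err isa MethodError
end
```

With the abstract annotation, wrong inputs such as a plain number still produce a MethodError, but views and ranges work without the caller having to copy them into a Vector first.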

I agree that it’s a bit disturbing, but couldn’t find a better solution :man_shrugging:


I have an open issue related to this:


To explain my current situation a little more:

  • f(x::Number) - do a pre-processing step then call f() normally. (The variable is not actually called x in this case, but sits in the same position.)
  • f(x::AbstractArray{<:Number}) - the basic function
  • f(x::AbstractArray{<:AbstractArray{<:Number}}) - call the basic function repeatedly, find the best case, then apply the best case parameters to all inputs.

The third case is the most common and general, but I think it would be strange to force the user to wrap their x in brackets to get the basic function too.
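The three cases above could be sketched roughly as follows. The bodies are placeholders chosen only to make the sketch runnable (a mean as the "basic function", the smallest result as the "best case"); they are assumptions, not the real computations.

```julia
# Basic case: one collection of numbers. Placeholder computation: the mean.
f(x::AbstractArray{<:Number}) = sum(x) / length(x)

# Scalar case: a placeholder pre-processing step (wrap the number in a Vector),
# then call the basic method.
f(x::Number) = f([x])

# Nested case: call the basic method on every inner collection, pick a "best"
# result (placeholder: the smallest), then apply it to all inputs.
function f(xsets::AbstractArray{<:AbstractArray{<:Number}})
    results = map(f, xsets)      # each element dispatches to the basic method
    best = minimum(results)
    return fill(best, length(xsets))
end

f(4)                         # scalar method
f([1, 2, 3])                 # basic method
f([[1, 2, 3], [10, 20]])     # nested method
```

Dispatch here depends on the element type of the outer array, so an untyped container such as Any[[1, 2]] matches none of these methods.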

How can I not restrict the type of the container or elements, but still allow different methods based on how many containers exist?

I could be wrong, of course, since I don’t know the details of your use case, but the third method in your list kind of sounds like it should be a function with a different name. For example, your second and third bullets sound kind of like this:

  • fit : fit one model.
  • cross_validate : fit multiple models and compare results.

Sometimes I write functions like this:

  • run_once : the basic function.
  • run : run the basic function multiple times and do other things (i.e. more complicated than just broadcasting run_once over a collection).

Of course the base function name is usually more specific than run.
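As a rough sketch of that split, with hypothetical names fit_once and fit_best and a deliberately simple model (a least-squares line through the origin, chosen only for illustration):

```julia
# Fit one model: least-squares slope of a line through the origin.
function fit_once(x::AbstractVector, y::AbstractVector)
    slope = sum(x .* y) / sum(abs2, x)
    residual = sum(abs2, y .- slope .* x)
    return (slope = slope, residual = residual)
end

# Fit every data set, then return the fit with the smallest residual.
# This does more than broadcast fit_once, so it gets its own name.
function fit_best(xsets, ysets)
    fits = map(fit_once, xsets, ysets)
    return fits[argmin([fit.residual for fit in fits])]
end

xsets = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
ysets = [[2.0, 4.0, 6.0], [1.0, 3.0, 2.0]]

fit_once(xsets[1], ysets[1])   # one model
fit_best(xsets, ysets)         # best of several models
```

Because the two functions have different names, neither needs element-type constraints to tell a flat collection from a nested one.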


Yeah, that makes sense. I wanted to make it easy on the user, so they only had to remember one function name. They could pass it anything and get something sensible in return, but maybe that’s not how multiple dispatch is used in practice.

My two cents: first, I agree with @CameronBieganek; it seems the functions could have different names.

But signatures will always be limited in the information they can provide, so documentation is always your friend. With a proper docstring, hovering over the function call in VSCode shows the documentation, which can explain clearly what each method does.

Actually it is. Take the + function in Base, for example, or *. Each dispatches to a whole lot of different methods, and it is from the documentation that you learn how to use them in each case. Note that * has 328 methods, yet documenting the few cases with notably different behavior is all that is needed.
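For example, a single docstring attached to the generic function can list the signatures that behave differently. This is only a sketch with a hypothetical function named area:

```julia
"""
    area(r::Number)
    area(w::Number, h::Number)

Return the area of a circle of radius `r`, or of a `w` by `h` rectangle.
"""
function area end    # declares the generic function and carries the docstring

area(r::Number) = pi * r^2
area(w::Number, h::Number) = w * h

area(1.0)     # circle
area(2, 3)    # rectangle
```

Typing ?area in the REPL then shows one description covering both methods, regardless of how the individual signatures print.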

Another point of view is that you do not need to rely on multiple dispatch for everything. You can use conditionals as well:

function f(x)
    if x isa Number
        # ...
    elseif x isa AbstractArray || x isa Tuple
        # ...
    else
        error("Input of f must be of types .... ")
    end
end

Custom error messages help users a lot in many cases.
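A variation on the same idea is to branch on how many container layers wrap the values, which avoids constraining element types in the signature. This is a sketch under assumptions: nesting and g are hypothetical names, and only the first element of each container is inspected.

```julia
# Hypothetical helper: count the container layers around the values.
# Anything that is not an array or tuple counts as a scalar (depth 0).
nesting(x) = 0
nesting(x::Union{AbstractArray,Tuple}) = isempty(x) ? 1 : 1 + nesting(first(x))

function g(x)
    d = nesting(x)
    if d == 0
        return "scalar"
    elseif d == 1
        return "collection"
    elseif d == 2
        return "collection of collections"
    else
        throw(ArgumentError("g expects at most two levels of nesting, got $d"))
    end
end

g(1.0)               # no container
g((1, 2))            # one layer
g([[1, 2], [3]])     # two layers
```

Since the check looks at values rather than declared element types, it also handles containers such as Any[[1, 2], [3]] that a signature like AbstractArray{<:AbstractArray{<:Number}} would reject.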

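You can use collect to turn tuples, ranges, and vectors into Vectors. Note that collect on a scalar returns a zero-dimensional array rather than a Vector; vec(collect(x)) gives a one-element Vector instead. There’s also RecursiveArrayTools.jl for dealing with arrays of arrays.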
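A quick sketch of what collect returns for each kind of input (as_vector is a hypothetical helper name, not a Base function):

```julia
collect((1, 2, 3))    # Tuple -> Vector{Int}
collect(1:3)          # range -> Vector{Int}

a = collect(4.0)      # scalar -> zero-dimensional Array{Float64, 0}
ndims(a)              # number of dimensions of that array
vec(a)                # reshaped into a one-element Vector{Float64}

# Hypothetical helper: always return a one-dimensional Vector.
as_vector(x) = vec(collect(x))

as_vector(4.0)
as_vector((1, 2, 3))
```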