Type stable return types

Hello, I have a question regarding type-stability and good practices.

Let’s say I have a function/constructor that takes a params::Dict which has some optional fields and returns a Session struct instance with the following fields:

struct Session
    previous::Union{Nothing, State}
    language::Symbol
    complete::Bool
end

And then I realise I don’t want to be manually checking fields on my types later and I would rather encode this in the type system:

abstract type AbstractSession end

struct NewSession <: AbstractSession
    language::Symbol
    complete::Bool
end

# complete = true
struct CompleteSession <: AbstractSession
    previous::State
    language::Symbol
end

# complete = false
struct ActiveSession <: AbstractSession
    previous::State
    language::Symbol
end

Then I can have a constructor sort of like this, analysing the params:

function Session(params::Dict)::AbstractSession
    if haskey(params, :previous)
        if params[:complete] == true
            CompleteSession(params)
        else
            ActiveSession(params)
        end
    else
        NewSession(params)
    end
end

which is obviously type-unstable due to the if-else branching: the concrete return type depends on the runtime contents of params.
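One way to check what the compiler actually infers for a branching constructor like this is to ask for its inferred return type. The following is a minimal sketch, not the original code: the `State` fields, the function name `make_session`, and the way the `Dict` is unpacked are all assumptions made only so the example runs on its own.

```julia
# Hypothetical stand-in for State, which the post does not define.
struct State
    turn::Int
end

abstract type AbstractSession end

struct NewSession <: AbstractSession
    language::Symbol
    complete::Bool
end

struct CompleteSession <: AbstractSession
    previous::State
    language::Symbol
end

struct ActiveSession <: AbstractSession
    previous::State
    language::Symbol
end

# Hypothetical factory mirroring the branching constructor above.
# How the Dict is unpacked is an assumption; the post omits that step.
function make_session(params::Dict{Symbol,Any})
    if haskey(params, :previous)
        if get(params, :complete, false) == true
            CompleteSession(params[:previous], params[:language])
        else
            ActiveSession(params[:previous], params[:language])
        end
    else
        NewSession(params[:language], get(params, :complete, false))
    end
end

# Ask the compiler what it can infer for this argument type.
# The result is a vector with one entry per matching method.
inferred = Base.return_types(make_session, (Dict{Symbol,Any},))
println(inferred)

# A non-concrete inferred type means callers cannot know the result type statically.
println(isconcretetype(only(inferred)))
```

In the REPL, `@code_warntype make_session(params)` reports the same information and highlights the non-concrete return type.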

Note: For simplicity I have omitted the step of unpacking the Dict to construct each object.

So, a few questions:

  • Is this an anti-pattern? Using the type system to exactly model your data seems like a desirable thing to do. On the other hand, having more types will lead to more inference down the line, so I suppose there is always a compromise.
  • Is there any way around this?
  • Am I overthinking this, and is type instability something you only need to care about when it causes a problem?

I could imagine a helper macro that generates constructors from a NamedTuple with exactly the right arguments for each type, but this seems like just moving where the instability happens: the Dict would still need to be transformed into a NamedTuple.

As I said, maybe I am overthinking this and type instability is sometimes a necessary compromise. The improvement in code readability and safety probably outweighs the performance cost in my use case.

Still, it would be nice to know what the good practices and workarounds are, or whether I have any misconceptions about performance and type stability.

Thank you in advance!

If you later pass this structure as an argument to another function, and that function is the performance-critical one, then in my opinion what you are doing is completely fine. Constructing the problem/system is a place where instabilities can occur. After that step, Julia can specialize functions on the types chosen by the user (and avoid branches by using multiple dispatch), which can bring performance benefits in general.
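As a rough illustration of that point, the sketch below replaces field checks with one method per session type. The `State` fields and the `respond` function are hypothetical and exist only to make the example self-contained.

```julia
# Hypothetical stand-in for State, which the post does not define.
struct State
    turn::Int
end

abstract type AbstractSession end

struct NewSession <: AbstractSession
    language::Symbol
    complete::Bool
end

struct CompleteSession <: AbstractSession
    previous::State
    language::Symbol
end

struct ActiveSession <: AbstractSession
    previous::State
    language::Symbol
end

# Hypothetical handlers: one method per session type instead of
# `if session.complete ... elseif session.previous === nothing ...`.
respond(s::NewSession) = "greeting ($(s.language))"
respond(s::ActiveSession) = "continuing after turn $(s.previous.turn)"
respond(s::CompleteSession) = "already complete after turn $(s.previous.turn)"

# The element type is abstract, so each call dispatches at runtime,
# but every method body is compiled for a known concrete type.
sessions = AbstractSession[
    NewSession(:en, false),
    ActiveSession(State(3), :en),
    CompleteSession(State(7), :en),
]

for s in sessions
    println(respond(s))
end
```

Each `respond` method can access `s.previous` without a `nothing` check, because the type already guarantees the field exists.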

Well, I think my use case is fundamentally type-unstable: a conversational agent that executes different code paths based on natural language processing and state. So it’s kind of a pipeline where a request comes in and goes through different stages before producing an answer.

I don’t have any performance problems, so this was more of a theoretical question / desire to learn about good practices / see if I had any fundamental misconceptions.

A little bit of type instability can be fine (and very convenient) even in performance-sensitive code. The key is to understand how you can structure your code to reduce the cost of type instability. For a general outline, see the section on “function barriers” in the Performance Tips chapter of the Julia manual.

Basically, if your code looks something like this:

session = Session(params)
do_something_expensive(session)

then even if Session(params) is type-unstable, you’ll only pay for that once per call to do_something_expensive. Within the body of do_something_expensive, the type of session is known, because Julia compiles a specialized version of the function for the concrete argument type it is called with, so there’s no impact on performance at all within that function. That means that if do_something_expensive takes more than a few microseconds, you may not notice any slowdown from type instability at all.
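A minimal sketch of this function-barrier pattern follows. Everything here is an assumption made for illustration: the `State` fields, the `pick_session` factory, the `score` methods, and the extra `n` argument are not from the original code.

```julia
# Hypothetical stand-in for State, which the post does not define.
struct State
    turn::Int
end

abstract type AbstractSession end

struct NewSession <: AbstractSession
    language::Symbol
    complete::Bool
end

struct ActiveSession <: AbstractSession
    previous::State
    language::Symbol
end

# Type-unstable step: the concrete return type depends on runtime data.
function pick_session(params::Dict{Symbol,Any})
    if haskey(params, :previous)
        return ActiveSession(params[:previous], params[:language])
    else
        return NewSession(params[:language], false)
    end
end

# Hypothetical per-type work, selected by dispatch.
score(s::NewSession, i::Int) = i
score(s::ActiveSession, i::Int) = i + s.previous.turn

# The barrier: Julia compiles this function separately for each concrete
# session type it is called with, so the loop body is fully inferred.
function do_something_expensive(session::AbstractSession, n::Int)
    total = 0
    for i in 1:n
        total += score(session, i)
    end
    return total
end

params = Dict{Symbol,Any}(:language => :en, :previous => State(2))
session = pick_session(params)                   # instability lives here
result = do_something_expensive(session, 1_000)  # one dynamic dispatch, then a fast loop
println(result)
```

The single dynamic dispatch happens at the call to `do_something_expensive`; the thousand calls to `score` inside it are resolved at compile time for the concrete session type.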

On the other hand, if your code looks something like this:

for i in 1:many
  session = Session(params)
  do_something_cheap(session)
end

then you’ll pay the price of Session’s type instability on every iteration, and you may see a significant slowdown.
