Issues with incomplete parameterized types

I thought I would build up a data structure with intermediate values I needed for a computation, to which I would later add the final results and return the whole thing. I’ve attempted to add type information, but some of the types are themselves incomplete, and some of the types involve data that is created after I create the results structure.

The first issue is whether I should be trying to do this at all. An alternative would be to create one type that holds intermediate data and another that holds the later results, including a reference to the intermediate data. This seems a little clunky because the consumer of the result then gets some values by going 2 levels down and some by going 1 level down. Of course, that could be hidden with accessor functions (tedious to program) or the flattening could be done at the final creation of the result object (also tedious, and creating considerable redundancy).
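To make the two-type alternative concrete, here is a minimal sketch. All names (`Intermediate`, `FinalResult`, `inter`) are hypothetical, and plain `Vector`/`Matrix` fields stand in for the named arrays. It assumes that overloading `Base.getproperty` is an acceptable way to hide the extra level, so the consumer does not need hand-written accessors.

```julia
# Hypothetical names throughout; a sketch of the two-level alternative.
struct Intermediate{C}
    mm::Matrix{Float64}   # design matrix
    allcombo::C           # combinations of categorical variables
end

struct FinalResult{C}
    β::Vector{Float64}        # coefficients
    inter::Intermediate{C}    # reference to the intermediate data
end

# Forward property names that are not fields of FinalResult to the nested
# object, so consumers can write `res.mm` instead of `res.inter.mm`.
function Base.getproperty(r::FinalResult, name::Symbol)
    if name in fieldnames(FinalResult)
        return getfield(r, name)
    else
        return getfield(getfield(r, :inter), name)
    end
end

# Keep `propertynames` (and tab completion) consistent with the flattened view.
Base.propertynames(r::FinalResult) =
    (fieldnames(FinalResult)..., fieldnames(Intermediate)...)

inter = Intermediate([1.0 2.0; 3.0 4.0], [(:a, :b)])
res = FinalResult([0.1, 0.2], inter)
res.mm   # reaches through to inter.mm
```

Both types remain available for dispatch, and nothing is copied when the final object is built.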

I tried

abstract type AbstractResult end

mutable struct SimpleResult{T <: Real, AllComboType} <: AbstractResult
    β::NamedVector{T} # coefficients
    se::NamedVector{T}  # std error of coefficients
    vcv::NamedMatrix{T} # variance covariance matrix
    H::NamedMatrix{T} # Hessian
    r::Optim.OptimizationResults  # results of optimization
    nms  # names of individual terms, likely strings, possibly symbols
    ### below here are intermediate values for the calculation
    mm  # rhs model (design) matrix, though may not literally be a Matrix
    allcombo::AllComboType  # all combinations of categorical variables in outcome
    s  # schema
    ts # terms after applying schema
    w::Vector{T}  # weights, normalized to sum to sample size
    function IntermediateSimpleResult{T}(mm, allcombo::AllComboType, s, ts)
        a = new{T, AllComboType}()
        a.mm = mm
        a.allcombo = allcombo
        a.s = s
        a.ts = ts
        return a
    end
end

The compiler says
LoadError: UndefVarError: IntermediateSimpleResult not defined

Raising a lot of questions:

  1. Does the inner constructor need to have the same name as the type?
  2. How should I handle the relation between type information for the inner constructor and the type? Should I repeat the types for the inner constructor? Is the implicit usage in the above code OK, i.e., AllComboType is inferred from the argument to the inner constructor? Can I omit the types entirely from the inner constructor, as they are implicitly inherited from the struct? There are quite a few spots where type information could go (the struct definition, the inner constructor definition, the invocation of new inside the inner constructor, and the invocation of the inner constructor from somewhere out in the program) and I’m not sure which are required/allowed/prohibited.
  3. Does it matter if the type names are the same or different between the type and the inner constructor? “Parametric inner constructor inherits parameter from global scope” indicates some surprises with the namespace used for type parameters.*
  4. I’m concerned that I don’t have the specifics of, e.g., the OptimizationResults at hand (it’s an abstract type), so that even if I get past these initial problems that will be a hang-up. Does the system require concrete types for everything before creating a compound object?

By the way, one reason I don’t want to pass around tuples is that I want to dispatch on the type of the object for my intermediate calculations.

* I think I would expect that each parameter in a parametric type, e.g., T, is treated as a new symbol within the scope for which it applies. So inside mutable struct SimpleResult, T is a new type, unrelated to any uses outside the scope. Apparently that’s not how it works. The rule I was expecting would also imply that the T used in the function IntermediateSimpleResult would in turn hide the T used in the structure definition. That would make it hard to link the two T’s, which may be one reason it doesn’t work as I expect.

Probably not. Type information benefits the compiler when it is concrete, or a small Union of such types. Anything else is fine with Any unless you use it for dispatch.

In order of your questions:

  1. No, the constructor can have a different name (though in practice this is not very common). Your error comes from the f{T}(...) function syntax, which is now obsolete; use f(...) where {T} instead. See the manual chapter on functions.

  2. Think of the constructor as a function or a callable. Your interface design should determine how much type information the user should or would want to provide, and how much can be inferred from arguments. Parametric constructors are a special case because you can include type parameters in the callable, but with IntermediateSimpleResult this is not possible: there is no such type, so it’s just a plain function (with access to new).

  3. No, it does not, e.g.

    struct Foo{T}
        Foo(::Type{Z}) where Z = new{Z}()
        Foo{S}() where S = new{S}()
    end
    

    will work fine. But it is better style to be consistent.

  4. Yes, that could be tricky. Type calculations always are. If possible, redesign your interface so that it does not preallocate; if not, write a small auxiliary function to do it for you. Concrete type parameters are not required, but without them you lose some compiler optimizations.
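Putting points 1 and 2 together, a constructor along the lines of the one in the question could be sketched as below. This is a cut-down illustration, not the full struct: `SketchResult` is a hypothetical name, and plain `Vector` fields replace the NamedArrays and Optim types so the example runs on its own. The constructor carries the type's own name and declares its parameters with `where`.

```julia
abstract type AbstractResult end

mutable struct SketchResult{T<:Real, C} <: AbstractResult
    β::Vector{T}     # coefficients, filled in later
    se::Vector{T}    # standard errors, filled in later
    mm               # design matrix; untyped, so effectively ::Any
    allcombo::C      # combinations of categorical variables
    # T is supplied by the caller; C is inferred from the argument.
    function SketchResult{T}(mm, allcombo::C) where {T<:Real, C}
        a = new{T, C}()        # incomplete initialization
        a.mm = mm
        a.allcombo = allcombo
        return a
    end
end

r = SketchResult{Float64}(ones(3, 2), [(1, 2)])
isdefined(r, :β)    # false until the final results are assigned
r.β = [0.5, 1.5]    # fill in later
```

Reading a reference-typed field such as `se` before it has been assigned throws an `UndefRefError`, so code that consumes a partially built object has to know which fields are ready.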

Keep in mind that Julia’s parametric type system is extremely rich, and it is easy to get carried away with it when encountering it for the first time. Avoiding abstract type parameters helps with performance; the rest should be kept simple when possible.
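As a rough illustration of the abstract-versus-concrete point, the sketch below compares a field declared with an abstract type against the same field declared through a type parameter. `AbstractFit`, `NewtonFit`, `LooseHolder` and `TightHolder` are hypothetical stand-ins; whether this pattern suits a field like Optim.OptimizationResults is an assumption, not something tested here.

```julia
abstract type AbstractFit end      # hypothetical stand-in for an abstract results type

struct NewtonFit <: AbstractFit
    iterations::Int
end

# The field is declared abstract: the compiler cannot know its concrete layout.
struct LooseHolder
    r::AbstractFit
end

# The field type is a parameter: each instance records the concrete type it holds.
struct TightHolder{R<:AbstractFit}
    r::R
end

loose = LooseHolder(NewtonFit(3))
tight = TightHolder(NewtonFit(3))   # R is inferred as NewtonFit

isconcretetype(fieldtype(typeof(loose), :r))   # false
isconcretetype(fieldtype(typeof(tight), :r))   # true
```

Both versions accept any subtype of the abstract type. The parametric one only needs the concrete type to be known when the object is constructed, not when the struct is defined.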
