How to join existing and new parametric types under a common abstract type?

I do not know how types work internally, so I’d like some advice on how to write high-performance code. My script already follows the general performance tips: my helper functions _my_kernel_function(...) only receive Vectors and Matrixes of a specific type, these types never change within such functions, and I try to minimize allocations in each function.

I have existing types Matrix{tw} and SparseMatrixCSC{tw,tp}. I create new ones, like:

abstract type AbstractMat{tw} end; 

struct MatDPW{tu<:Integer, tv<:Integer, tw} <: AbstractMat{tw}
    #= A sparse matrix data-structure in the dictionary-of-positions-and-weights format. Element dct[(u,v)]=w represents a matrix entry X[u,v]=w. =#
    size::Tuple{Int,Int} 
    dct::Dict{Tuple{tu,tv},tw} 
end;
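For concreteness, constructing such a matrix could look like this (the values here are just illustrative):

```julia
X = MatDPW((3, 3), Dict((1, 2) => 0.5, (3, 1) => -1.0))
# a MatDPW{Int64, Int64, Float64}: dct[(1,2)] = 0.5 represents entry X[1,2] = 0.5
```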

I’d like Matrix{tw} and SparseMatrixCSC{tw,tp} to also be subtypes of AbstractMat{tw}. How can this be achieved? I do not wish to wrap those two in a new struct just to be able to subtype AbstractMat{tw}, since I’d then have to redefine all existing methods on Matrix{tw} and SparseMatrixCSC{tw,tp} for the two new structs.

I’d like to minimize any performance penalties from passing my AbstractMat{tw}s to functions. Probably the following is not a good idea?

const global AbstractMat ::Type = Union{Matrix,SparseMatrixCSC, MatDPW}

julia> function my_func(X::AbstractMat{tw})  where {tw}
           return size(X)
       end
ERROR: TypeError: in Type{...} expression, expected UnionAll, got Type{AbstractMat}
Stacktrace:
 [1] top-level scope
   @ REPL[30]:1

Also, how do I create new functions that take as input Vectors of AbstractMats (e.g. Matrix and MatDPW can be in the same Vector)?

function SparseArrays.nnz(XX::Vector{<:AbstractMat{tw}})  where {tw}
    return sum(nnz(X) for X in XX)
end
function SparseArrays.nnz(XXX::Vector{Vector{<:AbstractMat{tw}}})  where {tw}
    return sum(nnz(XX[end]) for XX in XXX)
end

julia> nnz([X,Y])
48

julia> nnz([[X],[Y]])
ERROR: MethodError: no method matching nnz(::Vector{Vector})

I think this may be of help: Types · The Julia Language

They already have their own supertypes in the type tree, so what you’re suggesting would make it not a tree, which is impossible.

You’re on the right track with making a type alias for a type union, which lives outside the type tree. But don’t use the global and ::Type annotation like that; they don’t do what you think, nor anything good here.

The reason for this error is that a parametric method needs AbstractMat to be a parametric type, but it isn’t, because there is no parameter in the type alias or the type union. This is what the parameterized version looks like:

const AbstractMat = Union{Matrix{T}, SparseMatrixCSC{T}, MatDPW{<:Integer,<:Integer,T}} where T

or with the shorthand:

AbstractMat{T} = Union{Matrix{T}, SparseMatrixCSC{T}, MatDPW{<:Integer,<:Integer,T}}

(In your MatDPW the element type tw is the third parameter, so it needs to be written MatDPW{<:Integer,<:Integer,T} for T to line up with the element type in all three cases.)

Hard to check since you didn’t provide an MWE, but you can manually specify the element type of an Array if the automatic promotion doesn’t work out, e.g. Vector{AbstractMat}[[X],[Y]]. [X] and [Y] should have the proper type too, of course.
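Putting the pieces together, a runnable sketch (the struct no longer subtypes anything, since AbstractMat is now an alias for a Union, and X and Y are placeholder matrices):

```julia
using SparseArrays

struct MatDPW{tu<:Integer, tv<:Integer, tw}
    size::Tuple{Int,Int}
    dct::Dict{Tuple{tu,tv},tw}
end

# T is the element (weight) type in all three cases:
const AbstractMat{T} = Union{Matrix{T}, SparseMatrixCSC{T}, MatDPW{<:Integer,<:Integer,T}}

my_func(X::AbstractMat{tw}) where {tw} = tw  # dispatches on the shared element type

X = rand(3, 3)                           # Matrix{Float64}
Y = MatDPW((3, 3), Dict((1, 2) => 0.5))  # MatDPW{Int64, Int64, Float64}
my_func(X)  # Float64
my_func(Y)  # Float64

# With the element type written out, nested vectors get a uniform type:
XX = Vector{AbstractMat{Float64}}[[X], [Y]]
```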


Note that you don’t need an abstract supertype for performance reasons. For performance you always need concrete types; the abstract supertype you seek is only useful for making dispatch easier to define. If you ever end up with a Vector{AbstractMat}, you have a performance problem that’s basically the same as if you had a Vector{Any}. In a method signature you might want to dispatch on Vector{<:AbstractMat}, and that makes sense, but it has no performance implications by itself.
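To illustrate the difference between the container’s element type (which matters for performance) and the signature annotation (which doesn’t), a small sketch using Base’s AbstractMatrix in place of the custom union:

```julia
using SparseArrays

# Signature annotation: purely for dispatch. The compiler still
# specializes on the concrete type of whatever is actually passed in.
total(xs::Vector{<:AbstractMatrix}) = sum(sum, xs)

xs_good = [rand(2, 2), rand(2, 2)]                        # Vector{Matrix{Float64}}
xs_bad  = AbstractMatrix[rand(2, 2), sparse(rand(2, 2))]  # Vector{AbstractMatrix}

total(xs_good)  # fast path: concrete element type
total(xs_bad)   # same answer shape, but each element access is dynamic

isconcretetype(eltype(xs_good))  # true:  element accesses are type-stable
isconcretetype(eltype(xs_bad))   # false: every access is dynamically typed
```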


Perhaps this would solve your problem:

You would still have to wrap your objects at the time of construction. But there would be no need to redefine methods.

To be clear, this has no impact on performance in most situations. Type annotations on function arguments are purely for dispatch reasons. When a function is called, the compiler produces (if it has not already) a version custom-built for the precise set of input types regardless of your annotations (except in a few notable cases).

This is not strictly required. Although it’s the term we use, “type stability” is perhaps not the best name for what is required. What is required is type predictability: that at every point in the code, the compiler knows the types of the objects involved.

For example, there’s nothing wrong with

function foo(x::Number)
  y = 1
  x = x + y
  y = 3//2
  x = x * y
  return x
end

even though x and y may be bound to multiple types throughout the lifetime of this function, at every line the compiler can figure it out. Some common places where one might introduce type instability include

  • if statements where the type of a variable depends on which branch is taken, when the branch cannot be decided at compile time
  • for/while loops where the type gets promoted during the loop, which means that it might be one type for the first loop and another for the remaining loops
  • accessing elements from non-concretely-typed containers, such as Vector{Number}, Vector{Any}, or a custom struct with a non-concretely-typed field

By the last point, it is still very important for the field types of a struct to be concrete (or parametrically typed, with those parameters set to concrete types).
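For instance, a minimal comparison of an abstractly typed field versus a parametric one:

```julia
# Abstract field type: every access to the field is dynamically typed.
struct Slow
    x::Real
end

# Parametric field type: concrete as soon as the parameter is fixed.
struct Fast{T<:Real}
    x::T
end

isconcretetype(fieldtype(Slow, :x))           # false (Real is abstract)
isconcretetype(fieldtype(Fast{Float64}, :x))  # true  (Float64 is concrete)
```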

Your definitions for SparseArrays.nnz are piracy and might break things (not just your things, but other far-away code too) in confusing and awful ways. I would suggest you define your own function like

nnz_recursive(x::AbstractArray{<:AbstractArray}) = sum(nnz_recursive, x) # on arrays of arrays, count nonzeros recursively
nnz_recursive(x::AbstractArray) = nnz(x) # on other arrays, use nnz
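For instance, with a couple of small sparse matrices (values are just illustrative; the two definitions are repeated so the snippet runs on its own):

```julia
using SparseArrays

nnz_recursive(x::AbstractArray{<:AbstractArray}) = sum(nnz_recursive, x)
nnz_recursive(x::AbstractArray) = nnz(x)

X = sparse([1, 2], [1, 2], [1.0, 2.0], 3, 3)  # 2 stored entries
Y = sparse([1], [3], [5.0], 3, 3)             # 1 stored entry

nnz_recursive([X, Y])      # 3
nnz_recursive([[X], [Y]])  # 3: recurses through the nesting
```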

This is not strictly required. Although we use it, “type stability” is perhaps not the best name for what is required. What is required is type predictability: …

This section and the example was quite useful, much appreciated!

Some common places where one might introduce type instability include…
So generally, low-level inner helper/kernel functions should have all types known, and the outer wrapper functions may have some types unknown at compile time? I’m looking at SparseArrays.jl and LinearAlgebra.jl specifically, trying to model my code after them.

When a function is called, the compiler produces (if it has not already) a version custom-built for the precise set of input types regardless of your annotations

By this logic, if I define function f(x) return x^2 end and call it with f(1) and f(1.0), that should create 2 methods, right? I’m seeing just 1:

julia> methods(f)
# 1 method for generic function "f" from Main:
 [1] f(x)
     @ REPL[1]:1

Your definitions for SparseArrays.nnz are piracy and might break things (not just your things, but other far-away code too) in confusing and awful ways. I would suggest you define your own function

Hmm, if I define new types of matrices, isn’t it better style to extend existing functions like nnz to those types (while of course making sure I don’t overwrite any existing methods)? Otherwise I’d just be polluting the namespace with many new names that do the same thing as an existing function. Also, AFAIK, the method nnz(::Vector) does not exist yet.

Two method instances AFAIK.

Perhaps, but that’s not what happened above - you didn’t create Vector.


As the previous commenter noted: this creates one method (as you saw) but two method instances. A method has one set of source code, but can be compiled to as many method instances as are required.

julia> methods(f)[1].specializations
svec(MethodInstance for f(::Int64), MethodInstance for f(::Float64), nothing, nothing, nothing, nothing, nothing, nothing)

I can’t tell you exactly how this svec is organized (it appears to be filled in order of use, and I assume it would be lengthened if necessary), but you can see that there are two method instances. You can call it with additional input types (e.g., f(1//2)) and see additional instances added to the list.


While SparseArrays.nnz(::Vector) is undefined by SparseArrays.jl, that doesn’t mean someone else couldn’t try to define it in some other package. If you and that other person both redefined it, which one should Julia use? It will pick one (as I recall, it will usually use whichever one is defined most recently). But in any case, it is likely it won’t do precisely what both you and the other definition wanted so problems may emerge (loudly or silently). You can get lucky with piracy much of the time, but there’s no telling when an update to some far-away code might break things.

It is not piracy if you own the function and/or one of the dispatched types. Because you own at least one of those, nobody else can accidentally shadow your methods without violating the “own at least one” rule. But SparseArrays owns SparseArrays.nnz and Base owns Base.Vector, so defining the methods you did is piracy.

This is why I suggest a new function (such as MyModule.nnz) that you control. That function can have the fallback MyModule.nnz(x::Any) = SparseArrays.nnz(x) so that it behaves exactly like SparseArrays.nnz except where you override it. Since you own MyModule.nnz, you can create whatever methods you want without the risk of breaking other people’s code (assuming they don’t attempt to pirate your function).
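A minimal sketch of that pattern (MyModule and the matrix below are placeholders):

```julia
module MyModule

import SparseArrays  # qualified access only, so nnz below is our own function

# Fallback: behaves exactly like SparseArrays.nnz everywhere else.
nnz(x) = SparseArrays.nnz(x)

# Our own extension -- safe, because we own MyModule.nnz:
nnz(x::AbstractArray{<:AbstractArray}) = sum(nnz, x)

end # module

using SparseArrays
X = sparse([1, 2], [1, 2], [1.0, 2.0], 3, 3)
MyModule.nnz(X)           # 2, via the fallback
MyModule.nnz([[X], [X]])  # 4, via the recursive method
```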

By these rules, and as the above commenter noted, you could safely define SparseArrays.nnz(::MyArrayType) without piracy if you own MyArrayType.


Good point about method and struct ownership, I hadn’t thought about potential clashes so far ahead. I will rename my method as suggested.

The info on method instances is also most useful. Appreciate the advice!
