Making type stable code

Hello,

I am struggling with making NamedArrays more type stable. I need to implement getindex() for the same range of slices that AbstractArray can handle. For the backing array that NamedArrays wraps, this is easy. However, I also need to compute the names of the indices of the slices. The types of the names that survive the sliced dimensions determine the type of the resulting NamedArray. In my current implementation, the @inferred type of such a NamedArray-slice is Any, which is not good.

I don’t really know what operations I can use that help the type-inference, and which operation I should definitely avoid. Is there some guidance toward this?

Thanks,

—david

Can you give an MWE?

Ehm, a NWE, the problem being namedgetindex()

## uniqnames computes the names of the indices, ensuring they are unique, because they are used as an index
## single index
uniquenames(dict::Associative, i::Integer) = names(dict)[i] ## inefficient
uniquenames(dict::Associative{K,V}, i::K) where {K,V} = i

## multiple indices
uniquenames(dict::Associative, i::AbstractArray{<:Integer}) = makeunique(names(dict)[i])
uniquenames(dict::Associative{K,V}, i::AbstractArray{K}) where {K,V} = makeunique(i)

function makeunique(i::AbstractArray{T}) where T
    length(i) == length(Set(i)) || error("Cannot make elements unique")
    return i
end
function makeunique(index::AbstractArray{S}) where S<:AbstractString
    counts = DefaultDict{S,Int}(0)
    for i in index
        counts[i] += 1
    end
    for (k, v) in counts
        if v == 1
            delete!(counts, k)
        else
            counts[k] = 0
        end
    end
    res = similar(index)
    for (k, i) in enumerate(index)
        if i in keys(counts)
            res[k] = i * "." * string(counts[i])
        else
            res[k] = i
        end
        counts[i] += 1
    end
    return res
end
makeunique(index::AbstractArray{Symbol}) = Symbol.(makeunique(String.(index)))

## namedgetindex collects the elements from the array, and takes care of the index names
## `index` is an integer now, or an array of integers, or a cartesianindex
## and has been computed by `indices()`

dimkeepingtype(x) = false
dimkeepingtype(x::AbstractArray) = true
dimkeepingtype(x::Range) = true
dimkeepingtype(x::BitArray) = true

## Slices etc.
function namedgetindex(n::NamedArray, index...)
    a = getindex(n.array, index...)
    newnames = [uniquenames(n.dicts[k], i) for (k, i) in enumerate(tuple(index...)) if dimkeepingtype(i)]
    newdimnames = [dimnames(n, k) for (k, i) in enumerate(tuple(index...)) if dimkeepingtype(i)]
    return NamedArray(a, tuple(newnames...), tuple(newdimnames...))
end

Your namedgetindex function returns NamedArray(…).
Consider specifying its type explicitly. An educated guess…

return NamedArray{eltype(T),ndims(a),typeof(tuple(newnames...)),typeof(tuple(newdimnames...))}(
     NamedArray(a, tuple(newnames...), tuple(newdimnames...)) )

Thanks,

I think I had a variant of your suggestion working but I can’t reproduce that any more…

But I don’t really understand why the original construction (without the explicit type parameters) doesn’t give type stability. This is the constructor that was originally called (I checked by println() debugging)

function NamedArray{T,N}(array::AbstractArray{T,N}, names::NTuple{N,Vector}, dimnames::NTuple{N, Any}=defaultdimnames(array))
    dicts = defaultnamesdict(names)
    NamedArray{T, N, typeof(array), typeof(dicts)}(array, dicts, dimnames)
end

That constructor should make types explicit, right? I tested that defaultnamesdict() is type stable using @inferred.

I made this suggestion because I encountered the following in my code.

Base.round( ::Type{T}, x::MyType) where {T<:Integer} =
    round(T, my_convert(MyOtherType, x)) #my_convert always returns a Real number

The inferred type of my_convert is Any and the inferred type of this round method is also Any.
I expected the inferred type of round(T,y) to be T.

I then tried

  T(round(T, my_convert(MyOtherType, x)))  # T(round(T... makes this type stable.

My guess is that the compiler gave up on inferring the type of round(::Int64,::Any), but was happy with
T(::Any).

In your case you may also have to give the compiler this extra hint as to the return type.

If the return type of your function depends on the data, that is, depends on which slice you take, then all bets are off. But if you pass that result to a function that can be specialized based on the type of the result, you may make it work after all.

Thanks,

In certain ways the type does depend on which slice you take: even Array itself has that property (e.g., the type parameter N depends on the way you slice. But obviously Array has found a way to deal with that properly. For NamedArray it is a bit harder, because the type of the dicts that do the name–> index conversion are a parameter of the type. This is necessary, I believe, to make fast name–>index conversion possible, and also to disambiguate in the case of weird names (such as Ints).

The original question was really about: are there any constructions I should definitely avoid? For instance, I can imagine that comprehensions are hard, and that ifs are even harder. Currently I debug a bit by trial and error, using @inferred as my judge. Obviously, writing type-stable code is not something that comes naturally to me, so I am trying to find a way of operation that helps me write better code.

Cheers,

—david