Hello,
I am struggling with making NamedArrays
more type stable. I need to implement getindex()
for the same range of slices that AbstractArray
can handle. For the backing array that NamedArrays wraps, this is easy. However, I also need to compute the names of the indices of the slices. The types of the names that survive the sliced dimensions determine the type of the resulting NamedArray. In my current implementation, the @inferred
type of such a NamedArray-slice is Any
, which is not good.
I don’t really know what operations I can use that help the type-inference, and which operation I should definitely avoid. Is there some guidance toward this?
Thanks,
—david
Ehm, a NWE, the problem being namedgetindex()
## uniqnames computes the names of the indices, ensuring they are unique, because they are used as an index
## single index
uniquenames(dict::Associative, i::Integer) = names(dict)[i] ## inefficient
uniquenames(dict::Associative{K,V}, i::K) where {K,V} = i
## multiple indices
uniquenames(dict::Associative, i::AbstractArray{<:Integer}) = makeunique(names(dict)[i])
uniquenames(dict::Associative{K,V}, i::AbstractArray{K}) where {K,V} = makeunique(i)
function makeunique(i::AbstractArray{T}) where T
length(i) == length(Set(i)) || error("Cannot make elements unique")
return i
end
function makeunique(index::AbstractArray{S}) where S<:AbstractString
counts = DefaultDict{S,Int}(0)
for i in index
counts[i] += 1
end
for (k, v) in counts
if v == 1
delete!(counts, k)
else
counts[k] = 0
end
end
res = similar(index)
for (k, i) in enumerate(index)
if i in keys(counts)
res[k] = i * "." * string(counts[i])
else
res[k] = i
end
counts[i] += 1
end
return res
end
makeunique(index::AbstractArray{Symbol}) = Symbol.(makeunique(String.(index)))
## namedgetindex collects the elements from the array, and takes care of the index names
## `index` is an integer now, or an array of integers, or a cartesianindex
## and has been computed by `indices()`
dimkeepingtype(x) = false
dimkeepingtype(x::AbstractArray) = true
dimkeepingtype(x::Range) = true
dimkeepingtype(x::BitArray) = true
## Slices etc.
function namedgetindex(n::NamedArray, index...)
a = getindex(n.array, index...)
newnames = [uniquenames(n.dicts[k], i) for (k, i) in enumerate(tuple(index...)) if dimkeepingtype(i)]
newdimnames = [dimnames(n, k) for (k, i) in enumerate(tuple(index...)) if dimkeepingtype(i)]
return NamedArray(a, tuple(newnames...), tuple(newdimnames...))
end
Your namedgetindex
function returns NamedArray(…).
Consider specifying its type explicitly. An educated guess…
return NamedArray{eltype(T),ndims(a),typeof(tuple(newnames...)),typeof(tuple(newdimnames...))}(
NamedArray(a, tuple(newnames...), tuple(newdimnames...)) )
Thanks,
I think I had a variant of your suggestion working but I can’t reproduce that any more…
But I don’t really understand why the original construction (without the explicit type parameters) doesn’t give type stability. This is the constructor that was originally called (I checked by println()
debugging)
function NamedArray{T,N}(array::AbstractArray{T,N}, names::NTuple{N,Vector}, dimnames::NTuple{N, Any}=defaultdimnames(array))
dicts = defaultnamesdict(names)
NamedArray{T, N, typeof(array), typeof(dicts)}(array, dicts, dimnames)
end
That constructor should make types explicit, right? I tested that defaultnamesdict()
is type stable using @inferred
.
I made this suggestion because I encountered the following in my code.
Base.round( ::Type{T}, x::MyType) where {T<:Integer} =
round(T, my_convert(MyOtherType, x)) #my_convert always returns a Real number
The inferred type of my_convert
is Any
and the inferred type of this round
method is also Any
.
I expected the inferred type of round(T,y) to be T.
I then tried
T(round(T, my_convert(MyOtherType, x))) # T(round(T... makes this type stable.
My guess is that the compiler gave up on inferring the type of round(::Int64,::Any)
, but was happy with
T(::Any)
.
In your case you may also have to give the compiler this extra hint as to the return type.
If the return type of your function depends on the data, that is, depends on which slice you take, then all bets are off. But if you pass that result to a function that can be specialized based on the type of the result, you may make it work after all.
Thanks,
In certain ways the type does depend on which slice you take: even Array
itself has that property (e.g., the type parameter N
depends on the way you slice. But obviously Array
has found a way to deal with that properly. For NamedArray
it is a bit harder, because the type of the dicts that do the name–> index conversion are a parameter of the type. This is necessary, I believe, to make fast name–>index conversion possible, and also to disambiguate in the case of weird names (such as Int
s).
The original question was really about: are there any constructions I should definitely avoid? For instance, I can imagine that comprehensions are hard, and that if
s are even harder. Currently I debug a bit by trial and error, using @inferred
as my judge. Obviously, writing type-stable code is not something that comes naturally to me, so I am trying to find a way of operation that helps me write better code.
Cheers,
—david