Please assist with AbstractArray child struct overload

As may be apparent by my use of the word “child”, I come from an OOP background. I want to replicate some code I have in Python in Julia as I find this language amazing.

I created a simplified example below. I want to create a struct array type that contains a multidimensional array of integers along with some fields for metadata. I originally wrote out every function overload that was required (some 500 lines) before hitting a dead end where dispatch wasn’t selecting the functions I had created.

I then found reference to AbstractArray subtyping and went down that path. While there are two very informative examples there, I’m still struggling.

Here is my example: I accept any set of data (numbers); if it is signed I store it as is, and if it is unsigned I store its absolute value (a silly example, but it represents what I need solved):

struct SignedArray{T,N} <: AbstractArray{T,N}
    data :: Array{T,N}
    sign :: String
    function SignedArray(data :: Array{T, N}, sign :: String) where {T, N}
        if sign == "signed"
            return new{typeof(data), ndims(data)}(data, sign)
        elseif sign == "unsigned"
            return new{typeof(data), ndims(data)}(abs.(data), sign)
        end
    end
end
Base.size(S :: SignedArray) = size(S.data)
Base.getindex(S :: SignedArray{T, N}, I::Vararg{Int, N}) where {T, N} = get(S.data, I, zero(Int))
Base.similar(S :: SignedArray, ::Type{T}, dims::Dims) where {T} = SignedArray{T}([],"signed")
Base.setindex!(S :: SignedArray{T, N}, v, I::Vararg{Int, N}) where {T, N} = (S.data[I] = v)
Base.length(S :: SignedArray{T, N}) where {T, N} = prod(size(S.data))

I’m pretty certain that I am declaring the similar function wrong as I don’t quite understand its function. Furthermore, if I now go about trying to instantiate a struct i.e.

R = SignedArray([1,2,3,4], "unsigned")

I get the following error:

MethodError: Cannot `convert` an object of type Int64 to an object of type Vector{Int64}

If I look at the dictionary example for sub-typing an AbstractArray in the documentation (see link above), it would seem that none of the constructors actually store the data in the data field i.e:

SparseArray(::Type{T}, dims::NTuple{N,Int}) where {T,N} = SparseArray{T,N}(Dict{NTuple{N,Int}, T}(), dims);

which makes little sense to me, but perhaps that is the flaw in my understanding?

Any help with this problem or my lack of understanding of it would be very very helpful.
Thank you all.

First, welcome to this forum!

I think your issue comes from the way you retrieve your T parameter in the constructor: if you get a data::Array{T,N} argument, the T parameter is retrieved via eltype(data) (it’s the element type of data, not the type of data itself).
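To see the difference concretely, here is a quick check on a plain vector. Nothing in it is specific to SignedArray, and the variable names are only for illustration:

```julia
data = [1, 2, 3, 4]

# typeof gives the container type; eltype gives the type of each element
container_type = typeof(data)   # Vector{Int}
element_type = eltype(data)     # Int

# new{typeof(data), ndims(data)} therefore asks for a field of type
# Array{Vector{Int}, 1}, so Julia tries to convert each Int element
# into a Vector{Int}, which is the MethodError reported above.
println(container_type, " vs ", element_type)
```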

So your struct definition would read like:

struct SignedArray{T,N} <: AbstractArray{T,N}
    data :: Array{T,N}
    sign :: String
    function SignedArray(data :: Array{T, N}, sign :: String) where {T, N}
        if sign == "signed"
            return new{eltype(data), ndims(data)}(data, sign)
        elseif sign == "unsigned"
            return new{eltype(data), ndims(data)}(abs.(data), sign)
        end
    end
end

The goal of b = similar(a, new_eltype, new_size) is to create a new container b of the same type as a, possibly changing the element type (if new_eltype is provided) and/or the size (if new_size is provided). Be aware that the data in b is usually not initialized!
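As a rough illustration of those three call forms, here is how similar behaves on a plain Array (a sketch only; the variable names are arbitrary):

```julia
a = [1, 2, 3, 4]

b = similar(a)                 # same eltype and size as a
c = similar(a, Float64)        # new eltype, same size
d = similar(a, Float64, 2, 3)  # new eltype and new size

# Only the types and sizes are guaranteed; the contents are arbitrary
# until you write to them.
fill!(d, 0.0)
println(typeof(b), " ", size(b))
println(typeof(c), " ", size(c))
println(typeof(d), " ", size(d))
```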

In your case, an implementation of similar could look like:

Base.similar(sa::SignedArray, ::Type{T}, dims::Dims) where {T} = SignedArray(similar(sa.data, T, dims), sa.sign)

I also went ahead and made some adjustments to your functions where I thought that made sense. Note in particular that there is no need to always specify the parameters {T,N} when they are not actually used in the implementation of the method:

struct SignedArray{T,N} <: AbstractArray{T,N}
    data :: Array{T,N}
    sign :: String
    function SignedArray(data :: Array{T, N}, sign :: String) where {T, N}
        if sign == "signed"
            return new{eltype(data), ndims(data)}(data, sign)
        elseif sign == "unsigned"
            return new{eltype(data), ndims(data)}(abs.(data), sign)
        end
    end
end
Base.size(sa::SignedArray) = size(sa.data)
Base.getindex(sa::SignedArray, I) = sa.data[I]
Base.similar(sa::SignedArray, ::Type{T}, dims::Dims) where {T} = SignedArray(similar(sa.data, T, dims), sa.sign)
Base.setindex!(sa::SignedArray, v, I) = (sa.data[I] = v)
Base.length(sa::SignedArray) = length(sa.data)

With this implementation, the new SignedArray type can be used in (what I think is) a non-surprising way:

julia> sa = SignedArray([1,2,3,4], "signed")
4-element SignedArray{Int64, 1}:
 1
 2
 3
 4

# A new SignedArray containing uninitialized data
# with the prescribed eltype and size
julia> cumsum_sa = similar(sa, Float64, 1+length(sa))
5-element SignedArray{Float64, 1}:
 0.0
 6.93874527930976e-310
 6.93874527930976e-310
 0.0
 0.0

julia> cumsum_sa[1] = zero(eltype(cumsum_sa))
       for i in eachindex(sa)
           cumsum_sa[i+1] = cumsum_sa[i] + sa[i]
       end

julia> cumsum_sa
5-element SignedArray{Float64, 1}:
  0.0
  1.0
  3.0
  6.0
 10.0

julia> cumsum_sa[2:end] == cumsum(sa)
true

Ah, what an amazing community this is! Thank you for this @ffevotte.

I had one qualm which may or may not matter but I am curious. When I slice from a SignedArray i.e:

R = SignedArray([1,2,3,4], "unsigned")
q = R[2]
println(typeof(q))
s = R[1:3]
println(typeof(s))

Then where I might expect q and s to be of type SignedArray, they appear as Tuples:

Tuple{Int64, String}
Tuple{Vector{Int64}, String}

I altered the struct definition to try to ensure that getindex returns a SignedArray:

struct SignedArray{T,N} <: AbstractArray{T,N}
    data :: Array{T,N}
    sign :: String
    function SignedArray(data :: Array{T, N}, sign :: String) where {T, N}
        if sign == "signed"
            return new{eltype(data), ndims(data)}(data, sign)
        elseif sign == "unsigned"
            return new{eltype(data), ndims(data)}(abs.(data), sign)
        end
    end
    function SignedArray(data :: Number, sign :: String)
        if sign == "signed"
            return new{eltype(data), 1}([data], sign)
        elseif sign == "unsigned"
            return new{eltype(data), 1}(abs.([data]), sign)
        end
    end
end
Base.size(sa::SignedArray) = size(sa.data)
Base.getindex(sa::SignedArray, I) = SignedArray(sa.data[I], sa.sign)
# Base.getindex(sa::SignedArray, I) = (sa.data[I], sa.sign)
Base.similar(sa::SignedArray, ::Type{T}, dims::Dims) where {T} = SignedArray(similar(sa.data, T, dims), sa.sign)
Base.setindex!(sa::SignedArray, v, I) = (sa.data[I] = v)
Base.length(sa::SignedArray) = length(sa.data)

Which now means that q and s have the expected types:

SignedArray{Int64, 1}
SignedArray{Int64, 1}

But my setindex! definition fails so that:

V = SignedArray(zeros(Int64, 10), "unsigned")
V[3:6] = R

Yields the error: MethodError: Cannot convert an object of type SignedArray{Int64, 1} to an object of type Int64.
I can understand why that error might appear so I thought to create a convert overload like so:
Base.convert(T::Type, sa::SignedArray) where {T} = convert(T, sa.data)
But that does not seem to fix things.

Can you help me any further please? :slight_smile:

Just FYI, the inner constructor that calls new{eltype(data), ndims(data)} can also be written like this:

function SignedArray(data :: Array{T, N}, sign :: String) where {T, N}
    if sign == "signed"
        return new{T, N}(data, sign)
    elseif sign == "unsigned"
        return new{T, N}(abs.(data), sign)
    end
end

making use of the type parameters directly.

Also, be aware that abs can’t give a positive value for every input - e.g. abs(Int8(-128)) is still Int8(-128).
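A small sketch of that edge case and two possible ways around it. Which one fits depends on the application, so treat both as options rather than a recommendation:

```julia
x = typemin(Int8)            # Int8(-128)
wrapped = abs(x)             # wraps around: still Int8(-128)

# Option 1: widen first so the result fits
widened = abs(widen(x))      # Int16(128)

# Option 2: ask for an error instead of silent wraparound
overflowed = try
    Base.Checked.checked_abs(x)
    false
catch err
    err isa OverflowError
end
```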

Design wise, I’d also store the sign as a Bool instead of a string.
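A minimal sketch of what the Bool variant might look like. FlaggedArray and the field name issigned are hypothetical names chosen for this example. One side benefit: with a String, any value other than "signed" or "unsigned" makes the inner constructor fall through both branches and return nothing, whereas a Bool has no third state.

```julia
# Hypothetical variant of SignedArray: the flag is a Bool, not a String
struct FlaggedArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
    issigned::Bool
    function FlaggedArray(data::Array{T,N}, issigned::Bool) where {T,N}
        # A Bool has only two states, so no branch can fall through
        return new{T,N}(issigned ? data : abs.(data), issigned)
    end
end
Base.size(fa::FlaggedArray) = size(fa.data)
Base.getindex(fa::FlaggedArray, I::Int...) = fa.data[I...]

fa = FlaggedArray([-1, 2, -3], false)
```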


Oh really? That is cool. Looks quite neat too! I’ll give it a try.

Right of course, this is just a struct I quickly cooked up to illustrate my problem but I am noting your design suggestions for elsewhere.

Thanks.

I copy-pasted your recent version and ran the snippet

V = SignedArray(zeros(Int64, 10), "unsigned")
V[3:6] = R

But this gave me a StackOverflowError, which is caused by

Base.getindex(sa::SignedArray, I) = SignedArray(sa.data[I], sa.sign)

The problem is that for <:AbstractArrays Julia uses getindex to retrieve all the elements of that array to print them nicely formatted. But since getindex returns a SignedArray it does that over and over and …
This is fixed with

Base.getindex(sa::SignedArray, I) = sa.data[I]
Base.getindex(sa::SignedArray, I...) = sa.data[I...] # that's the ::Vararg version from the AbstractArray page you linked

On the other hand, your setindex! method is totally fine. Just remember that you also have to broadcast the assignment = when you want to set multiple values at once. Whether R is a number or an <:AbstractArray, you should call it with

V[3:6] .= R
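For reference, this is how broadcast assignment behaves on plain Vectors (a sketch with ordinary arrays rather than SignedArray). On a plain Array, V[3:6] = R also works when the lengths match, but assigning a single scalar to several positions needs the dotted form:

```julia
V = zeros(Int, 10)
R = [1, 2, 3, 4]

V[3:6] .= R    # element-wise copy of R into positions 3 to 6
V[8:9] .= 7    # a scalar is broadcast to every selected position

println(V)
```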

Okay thanks, but the reason I had the getindex function return an instance of SignedArray is that I wanted the type to be that of SignedArray… not Vector{Int64} or Int64 as I get when using your suggestion.

Would there be a function overload that “tells” Julia how to display a SignedArray so that when the getindex function returns a SignedArray, it is able to display it rather than recursively dive down?

Oh, I see. My bad.
But I am not 100% sure if returning a SignedArray when indexed by a scalar is how getindex is assumed to work.

E.g. consider

julia> x = [1,2,3]
3-element Vector{Int64}:
 1
 2
 3

julia> x[1] # scalar index gives you something of eltype(x)
1

julia> x[1:2] # gives you a new vector
2-element Vector{Int64}:
 1
 2

To implement the above you need to dispatch on scalar indices, e.g.

Base.getindex(sa::SignedArray, I) = SignedArray(sa.data[I], sa.sign)
# special case for scalar index
Base.getindex(sa::SignedArray, I::Integer) = sa.data[I]

If you want to extract a 1-element sub array then you could do that with V[1:1].
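Putting the scalar and range cases side by side, here is a minimal sketch. TaggedArray is a hypothetical stand-in for SignedArray (no abs handling), used only to keep the example self-contained:

```julia
# Hypothetical stand-in for SignedArray, kept small for the example
struct TaggedArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
    sign::String
end
Base.size(ta::TaggedArray) = size(ta.data)
# Scalar indices return a plain element, as generic array code expects
Base.getindex(ta::TaggedArray, I::Int...) = ta.data[I...]
# A range returns a new wrapper that carries the metadata along
Base.getindex(ta::TaggedArray, r::AbstractUnitRange{<:Integer}) =
    TaggedArray(ta.data[r], ta.sign)

V = TaggedArray([10, 20, 30], "signed")
element = V[1]     # plain Int
sub = V[1:1]       # 1-element TaggedArray that keeps the sign field
```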

Hmm… yeah I see what you mean. I need to bear in mind that this is different to the OOP way. Your reasoning makes sense. Let me see what I can do with it and how I progress in my work. Thanks @fatteneder.

It would perhaps be nice to have a scalar type for SignedArray, so that indexing with a single Integer does not return a plain Integer and lose the associated information in the sign field.

So what I did was create a struct Signed like so:

struct Signed <: Number
    data :: Number
    sign :: String
    function Signed(data :: Number, sign :: String)
        if sign == "signed"
            return new(data, sign)
        elseif sign == "unsigned"
            return new(abs(data), sign)
        end
    end
end

and then change
Base.getindex(sa::SignedArray, I::Integer) = sa.data[I] to Base.getindex(sa::SignedArray, I::Integer) = Signed(sa.data[I], sa.sign)
which kind of works i.e:

R = SignedArray([1,2,3,4], "unsigned")
q = R[2]
println(q)
s = R[1:3]
println(s)

gives

Signed(2, "unsigned")
[Signed(1, "unsigned"), Signed(2, "unsigned"), Signed(3, "unsigned")]

but breaks my setindex! function and I get the following error from running the following code :frowning: :

V = SignedArray(zeros(Int64, 10), "unsigned")
V[3:6] = R
>> MethodError: no method matching Int64(::Signed)

Is this a simple fix or have I opened a can of worms :frowning: ?

Oh wait, think I might have solved that issue with:
Base.setindex!(sa::SignedArray, v, I) = (sa.data[I] = v.data)
which appears to work well giving V being equal to:

10-element SignedArray{Int64, 1}:
 Signed(0, "unsigned")
 Signed(0, "unsigned")
 Signed(1, "unsigned")
 Signed(2, "unsigned")
 Signed(3, "unsigned")
 Signed(4, "unsigned")
 Signed(0, "unsigned")
 Signed(0, "unsigned")
 Signed(0, "unsigned")
 Signed(0, "unsigned")

Note that this will only work when v is of type Signed or for any other type that has a field called data.
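One possible way to accept both wrapped and plain values is a small unwrapping helper that setindex! could call instead of reading v.data directly. This is only a sketch: rawvalue is a hypothetical helper, and the wrapper is called SignedValue here because Base already exports an abstract type named Signed.

```julia
# Hypothetical scalar wrapper; named SignedValue because Base already
# exports an abstract type called Signed
struct SignedValue{T<:Number} <: Number
    data::T
    sign::String
end

# Hypothetical helper: strip the wrapper if there is one
rawvalue(v::SignedValue) = v.data
rawvalue(v::Number) = v

storage = zeros(Int, 4)
storage[1] = rawvalue(SignedValue(7, "unsigned"))  # wrapped value
storage[2] = rawvalue(3)                           # plain number
```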

Base.getindex(sa::SignedArray, I::Integer) = Signed(sa.data[I], sa.sign)

I fear this is not really what you want.
Consider

R = SignedArray([1,2,3,4], "unsigned") # has eltype Int
R[1:3] # is now also a SignedArray but with eltype `Signed`

Don’t know what your real world problem is, but if you want to retain the signedness info per element then you could just go with a plain Vector{Signed} (or Array).

If you want to keep your SignedArray struct then you should make the interior data field an Array{Signed,N} to keep consistency with your setindex! method. But obviously this then duplicates the signedness info. Or you will have to pick some function other than getindex to retrieve values from a SignedArray.
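The last option could be sketched roughly as follows. MetaArray and slice are hypothetical names for this example: getindex keeps its usual array semantics, while slice always hands back a wrapper with the metadata attached.

```julia
# Hypothetical stand-in for SignedArray
struct MetaArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
    sign::String
end
Base.size(ma::MetaArray) = size(ma.data)
Base.getindex(ma::MetaArray, I::Int...) = ma.data[I...]
Base.setindex!(ma::MetaArray, v, I::Int...) = (ma.data[I...] = v)

# Hypothetical accessor: always returns a MetaArray, even for one index
slice(ma::MetaArray, i::Integer) = MetaArray([ma.data[i]], ma.sign)
slice(ma::MetaArray, r::AbstractVector{<:Integer}) = MetaArray(ma.data[r], ma.sign)

R = MetaArray([1, 2, 3, 4], "unsigned")
q = slice(R, 2)      # 1-element MetaArray
s = slice(R, 1:3)    # 3-element MetaArray
plain = R[1:3]       # ordinary indexing still gives a plain Vector
```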

Thanks @fatteneder I understand your qualms. I’m trying to create a FixedPoint Number Type that allows for variable bitwidths and fraction along with rounding schemes etc. This type is then intended to be used in DSP algorithms for simulating FixedPoint precision.
For every step of the algorithms, one usually needs to have the metadata (bitwidths etc) on hand, because it will govern how the operations are performed on the integer data fields.
In Python, I had created a class with an integer numpy array for the data field and been able to alter get/setindex to return new FixedPoint arrays with altered metadata.

Anyway, long story short is that I’ll need to make some adjustments to retain the metadata of the struct when doing slicing etc so that it is not lost.
As this metadata grows, it will mean that we probably want to have this metadata as a second struct which is a field inside of SignedArray and revert to the original design we had of:

struct SignedScheme
    sign :: String
end

struct SignedArray{T,N} <: AbstractArray{T,N}
    data :: Array{T,N}
    scheme :: SignedScheme
    function SignedArray(data :: Array{T, N}, scheme :: SignedScheme) where {T, N}
        if scheme.sign == "signed"
            return new{T, N}(data, scheme)
        elseif scheme.sign == "unsigned"
            return new{T, N}(abs.(data), scheme)
        end
    end
    function SignedArray(data :: T, scheme :: SignedScheme) where {T}
        if  scheme.sign == "signed"
            return new{T, 1}([data], scheme)
        elseif scheme.sign == "unsigned"
            return new{T, 1}(abs.([data]), scheme)
        end
    end
end
Base.size(sa::SignedArray) = size(sa.data)
Base.getindex(sa::SignedArray, I) = sa.data[I]
Base.getindex(sa::SignedArray, I...) = sa.data[I...] # that's the ::Vararg version from the AbstractArray page you linked
Base.similar(sa::SignedArray, ::Type{T}, dims::Dims) where {T} = SignedArray(similar(sa.data, T, dims), sa.scheme)
Base.setindex!(sa::SignedArray, v, I) = (sa.data[I] = v.data)
Base.length(sa::SignedArray) = length(sa.data)

So that now every time we getindex, I can manually retain the SignedScheme and create a new SignedArray from it :slight_smile:
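A hedged sketch of one way to get that behaviour without hand-writing the range case: if getindex and setindex! are restricted to integer indices, range indexing falls back to Base's generic implementation, which allocates its result with similar. Returning a wrapper from similar then keeps the scheme on slices. Scheme and SchemeArray are hypothetical stand-ins for SignedScheme and SignedArray, and the abs handling is left out for brevity.

```julia
struct Scheme
    sign::String
end

struct SchemeArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
    scheme::Scheme
end
Base.size(sa::SchemeArray) = size(sa.data)
# Scalar access only; ranges are handled by Base's generic fallbacks
Base.getindex(sa::SchemeArray, I::Int...) = sa.data[I...]
Base.setindex!(sa::SchemeArray, v, I::Int...) = (sa.data[I...] = v)
# Generic range indexing allocates its result with similar, so
# returning a SchemeArray here carries the scheme over to slices
Base.similar(sa::SchemeArray, ::Type{T}, dims::Dims) where {T} =
    SchemeArray(Array{T}(undef, dims), sa.scheme)

R = SchemeArray([1, 2, 3, 4], Scheme("unsigned"))
s = R[1:3]                                    # still a SchemeArray
V = SchemeArray(zeros(Int, 10), Scheme("unsigned"))
V[3:6] = R                                    # element-wise copy
```

Note that setindex! here stores v as given, so it accepts plain numbers; it does not read a data field from the value.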

Will work with this structure for now.