Efficient structure of Unitful arrays

I’m experiencing performance issues when using Unitful arrays inside a structure. I don’t know how to specify the type so well, so the problem may be from the type definition.
Here is a simple example using a struct and a function to compute the sum of an array. The version with Unitful is a lot slower and has many more allocations:

struct MyData1{T <: AbstractFloat}
    arr1::Array{T, 3}
end

struct MyData2{T <: AbstractFloat}
    arr1::Array{<: Unitful.Temperature{T}, 3}
end

function my_sum(data)
    nx, ny, nz = size(data.arr1)
    tmp = data.arr1[1] * 0
    for k in 1:nz, j in 1:ny, i in 1:nx
        tmp += data.arr1[i, j, k]
    end
    return tmp
end

function do_compute1()
    nx, ny, nz = 10, 10, 200
    arr1 = ones(Float64, (nx, ny, nz))
    data = MyData1(arr1)
    my_sum(data)
end

function do_compute2()
    nx, ny, nz = 10, 10, 200
    arr1 = ones(Float64, (nx, ny, nz))u"K"
    data = MyData2(arr1)
    my_sum(data)
end

Benchmarking do_compute1() (no Unitful) and do_compute2() (with Unitful) gives:

@btime do_compute1()
  27.280 μs (2 allocations: 156.33 KiB)

@btime do_compute2()
  1.801 ms (64409 allocations: 1.66 MiB)

Perhaps the problem is that I’m specifying Array{<: Unitful.Temperature{T}, 3}, but how to specify a specific type of array (in this case, array of K, for a given type T)? If I use Array{Unitful.Temperature{T}, 3} it won’t work. Since

julia> typeof(zeros(3, 3, 3)u"K")
Array{Quantity{Float64,𝚯,Unitful.FreeUnits{(K,),𝚯,nothing}},3}

I also tried

Array{Quantity{Float64,Unitful.𝚯,Unitful.FreeUnits{(Unitful.K,),Unitful.𝚯,nothing}},3}

but that doesn’t seem to match the type either. Bizarrely these types are not the same:

julia> typeof(zeros(3, 3, 3)u"K") == Array{Quantity{Float64,Unitful.𝚯,Unitful.FreeUnits{(Unitful.K,),Unitful.𝚯,nothing}},3}
false

Even though if I print them out they seem to be about the same.

Back to the actual question: how to declare Unitful arrays in struct so that the type inference is not some generic array?

1 Like

Here’s the “easy” solution, which also reveals the problem. The type of your Array needs to be a type parameter of the struct to store it concretely.

struct MyData2{T}
    arr1::Array{T, 3}
end

To actually enforce your type constraint I think requires an inner constructor, like this

struct MyData4{T}
    arr1::Array{T, 3}

    function MyData4(a::AbstractArray{T, 3}) where {T <: Quantity{<:Float64}}
        new{T}(a)
    end
end
2 Likes

Thank you for the suggestion. Indeed it fixes the problem in this simple example. But generally speaking, the point of having a struct is to combine several arrays (not necessarily with the same units). Your suggestion works only if all the arrays have the same units. Is there no hope to have a struct with a concrete type but having different types of the members? This seems to be possible without Unitful (e.g. mixing a struct with Int’s, Floats, Arrays, etc.).

If you want an Array to have elements of multiple distinct types inside it you will always have to pay a performance cost. Not only because the Array will have to wrap each element in a way it remember its types, but because of type instability. Each function that access the Array elements will not be able to generate specialized code, because there many possible paths.

1 Like

The function signature will get long, but there’s no reason you can’t scale this up to multiple type parameters for arrays of multiple types.

struct MyData7{T, T2, T3}
    arr1::Array{T, 3}
    arr2::Array{T2, 3}
    arr3::Array{T3, 3}

    function MyData7(a::AbstractArray{T, 3}, b::AbstractArray{T2, 3}, c::AbstractArray{T3, 3}) where {
                     T <: Quantity{<:Float64}, T2 <: Quantity{<:Int64}, T3 <: Quantity{<:Float64} }
        new{T, T2, T3}(a, b, c)
    end
end

# works 
# julia> MyData7(ones(Float64, (5, 5, 5))u"K", ones(Int64, (5, 5, 5))u"K", ones(Float64, (5, 5, 5))u"K")

# correctly errors 
# julia> MyData7(ones(Int64, (5, 5, 5))u"K", ones(Int64, (5, 5, 5))u"K", ones(Float64, (5, 5, 5))u"K")
4 Likes

I think the problem he is facing is that Unitful.Temperature{Float64} is an abstract type. Probably what the OP wants is to have the flexibility of having multiple different types (one type on each array), but all using the same Float type. The solution above of course works, but is verbose, if one wanted to only annotate on the type of Float used.

I think his question raises one doubt that I have as well. There is not simple syntax (could there be?) a syntax to define a vector that accepts one type of any of the Real types? We can do this:

julia> struct A{T<:Real}
         x::Union{Array{T,1},Array{T,2}}
       end

x is type unstable. I would like something as:

struct A{T<:Real}
       x::Xor{Array{T,1},Array{T,2}}  
end

or more generally

struct A{T<:Real}
  x::Xor{<:AbstractArray{T}}
end

where the type of any instance of A would be parsed as if it belonged to, for example:

struct A{Float64}
  x::Vector{Float64}
end

Seems that a macro could expand to specific constructors for each type of type. Anyone thought of that?

This is what I would do:

struct MyData2{A}
    arr1::A
end

There is rarely a need to specify types like that, just let the compiler handle it and you wont experience this kind of problem unless you use mixed types (which you aren’t). If you need to enforce what A is, do it in an inner constructor, or like:

struct MyData2{A<:AbstractArray{<:Unitful.Temperature,3}}
    arr1::A
end
1 Like

Thank you for all your input. @chris-b1’s solution works, although getting cumbersome to code. Indeed what @lmiq mentions is what I would like: having a struct with arrays of the same primitive type (e.g. Float64), but for different Unitful quantities (e.g. temperature, length, etc.). Unitful arrays are supposed to be “native” arrays in some type plus some metadata. Julia can already optimise if I create a struct with different arrays of Float64, and it is disappointing that it cannot do the same for a bunch of Unitful arrays that share the same primitive type.

Just to be clear, you can do that with arrays if the only parameter missing from the arrays is the Float64. Otherwise you would have to write a more verbose annotation (with dimensions). The problem is not that it cannot specialize, the solutions above work. It is a problem of convenience construction and, perhaps, pretty output.

1 Like