How to trick Array{T} into behaving covariantly?


#1

I am currently stuck trying to get the benefits of mutable lists while maintaining type covariance logic.

Essentially I would like to be able to do the following:

abstract VTKData
typealias VTKDataGroup{T} Array{T<:VTKData, 1}
type VTKUnstructuredData <: VTKData
end
type VTKStructuredData <: VTKData
end
type VTKMultiblockData <: VTKData
    blocks::VTKDataGroup{VTKData}
end
type VTKTimeSeriesData <: VTKData
    data::VTKDataGroup{VTKData}
end

Additionally, I would like VTKMultiblockData to hold data from different VTKData subtypes while imposing a constraint on VTKTimeSeriesData to have the same VTKData subtype throughout the array.

Of course I am omitting most of the fields. But the main point is that I want to be able to define generic behaviour on arrays of abstract types that also applies to arrays of their subtypes, often with an additional homogeneity constraint. Is there any way to do this that does not involve tuple types, a ton of assertions, and/or metaprogramming?

Here is a little attempt that uses Union almost abusively and achieves homogeneity.

julia> isa([1,"a"], Union{Vector{Int},Vector{String}})
false

julia> isa(["a","a"], Union{Vector{Int},Vector{String}})
true

julia> isa([1,1], Union{Vector{Int},Vector{String}})
true

It also supports non-homogeneity only when types can be promoted to some common type, which kind of defeats the purpose of using Union.

julia> isa([1,2.], Vector{Union{Int,Float64}})
false
julia> isa([1,2], Vector{Int})
true
julia> isa([1,2.], Union{Vector{Int},Vector{Float64}})
true
julia> isa([1,2.], Vector{Float64})
true
julia> isa([1,2.], Vector{Int})
false

Here is another weird but related example that I don’t understand. I would appreciate it if someone could explain it.

julia> typealias RealArray{T} Array{T<:Real,1}
Array{false,1}
julia> isa([1,2.],RealArray{Float64})
false

#2

How important is performance accessing the vectors of “mixed” data?
Do the different types have common fields?

If the performance of accessing the “blocks” field (which will have to essentially hold pointers to boxed VTKData types) isn’t so important, then the following (using the new syntax) should work:

abstract type AbstractVTK end

mutable struct VTKUnstructuredData <: AbstractVTK
end

mutable struct VTKStructuredData <: AbstractVTK
end

mutable struct VTKMultiblockData <: AbstractVTK
    blocks::Vector{AbstractVTK}
end

mutable struct VTKTimeSeriesData{T<:AbstractVTK} <: AbstractVTK
    data::Vector{T}
end

The VTKTimeSeriesData is parameterized so the “data” field will be a homogeneous vector of a single subtype.
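
To tie this back to the original question: a method can accept vectors of any subtype by constraining the element type with `<:`, which gives covariant-looking behaviour without changing how arrays themselves work. Below is a minimal sketch, assuming Julia 0.6 or later. The helper name `nitems` is made up for illustration, and the type definitions are repeated only so the block runs in a fresh session.

```julia
abstract type AbstractVTK end

mutable struct VTKUnstructuredData <: AbstractVTK end
mutable struct VTKStructuredData <: AbstractVTK end

mutable struct VTKMultiblockData <: AbstractVTK
    blocks::Vector{AbstractVTK}      # mixed subtypes allowed
end

mutable struct VTKTimeSeriesData{T<:AbstractVTK} <: AbstractVTK
    data::Vector{T}                  # one subtype throughout
end

# Hypothetical helper: matches Vector{AbstractVTK} as well as
# Vector{VTKStructuredData}, Vector{VTKUnstructuredData}, ...
nitems(v::Vector{<:AbstractVTK}) = length(v)

mixed  = VTKMultiblockData([VTKStructuredData(), VTKUnstructuredData()])
series = VTKTimeSeriesData([VTKStructuredData(), VTKStructuredData()])

nitems(mixed.blocks)   # called with a Vector{AbstractVTK}
nitems(series.data)    # called with a Vector{VTKStructuredData}
```

The same method covers both fields, while the type parameter on VTKTimeSeriesData keeps its vector homogeneous.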


#3

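Please note the Array{false,1} in your output: in the typealias, T<:Real was evaluated as an ordinary expression that returned false, rather than acting as a constraint on T.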

On 0.6 you would write this as:

julia> RealArray{T} = Array{T,1} where T <: Real
Array{T,1} where T<:Real

julia> isa([1,2.],RealArray{Float64})
true
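
The same `where` form can be used to check how vectors with concrete, promoted, or abstract element types relate to each other. This is a small sketch, assuming Julia 0.6 or later; the alias name `AnyRealVector` is invented here and is not part of the thread's code.

```julia
# Alias covering every vector whose element type is some subtype of Real
const AnyRealVector = Vector{T} where T<:Real

ints   = [1, 2]          # Vector{Int}
floats = [1, 2.0]        # the literal promotes, giving Vector{Float64}
mixed  = Real[1, 2.0]    # Vector{Real}: each element keeps its own type

# Invariance: Vector{Int} is not a subtype of Vector{Real} ...
int_is_sub = Vector{Int} <: Vector{Real}

# ... but all three vectors match the `where` alias
all_match = all(v -> v isa AnyRealVector, (ints, floats, mixed))
```

Here `int_is_sub` is false and `all_match` is true, so the alias accepts both homogeneous vectors and a vector explicitly typed as Real.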

#4

Thanks for your suggestions. How is a “mutable struct” different from “type”? Also, I would like to be able to add and read new data efficiently, including, in certain cases, modifying existing blocks; adding new blocks is much less common. I guess for blocks I could get away with a tuple type of mutables, as it allows me to modify each block efficiently, but adding a new block requires making a new tuple.

I like this part about VTKTimeSeriesData.

About VTKMultiblockData, does the code below allow mixed types in the “blocks” field?

mutable struct VTKMultiblockData <: AbstractVTK
    blocks::Vector{AbstractVTK}
end

I am using this as a reference:

julia> isa([1,2.],Vector{Real})
false

#5

Interesting. But does that only work for data that can be promoted to a common concrete type? Or does it support data of mixed concrete types that have a common abstract type e.g. isa([1,2.],RealArray{Real})?


#6

First, when you write [1, 2.] you actually just create a Float64 vector:

julia> [1,2.]
2-element Array{Float64,1}:
 1.0
 2.0

Anyway, the implicitly defined convert constructor will help you:

julia> immutable Foo
         v::Vector{Real}
       end

julia> Foo([1, 2.])
Foo(Real[1.0, 2.0])

#7

This is getting slightly confusing. I remember in the documentation, it was mentioned that only concrete types can be instantiated, so how does Real[1,2] work exactly given the above claim?


#8

Ok I think I get it. Real is abstract, but Array{Real,1} is concrete, which is a nice side effect of non-covariance I guess.
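
This can be checked directly. A short sketch, assuming Julia 0.7/1.0 or later, where `isabstracttype` and `isconcretetype` are available (on 0.6 the closest check was `isleaftype`):

```julia
real_is_abstract   = isabstracttype(Real)          # no value has exactly type Real
vector_is_concrete = isconcretetype(Vector{Real})  # the container type is fully specified
no_dims_concrete   = isconcretetype(Array{Real})   # dimension left open, so not concrete

v = Real[1, 2.0]
element_types = map(typeof, v)   # each stored element still has its own concrete type
```

The first two results are true and the third is false: the array type only becomes concrete once both the element type and the number of dimensions are fixed.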


#9

Yep!


#10

In v0.6 (and onwards), type is now mutable struct and immutable is struct.

How many blocks do you typically have? Using tuples would probably be rather inefficient if more than a dozen (benchmarking would be required to see where the actual breakeven point is).

Yes, you can store anything that is of type AbstractVTK in blocks (but remember, it’s storing references to the instances, i.e. pointers to boxed values under the hood, so not as efficient, and when you use them, you’ll be doing dynamic dispatching - i.e. slow).
That’s why you might do better to have a different sort of architecture, with a common type, that uses a byte (or `@enum`) value, to distinguish which of the underlying types the instance actually represents.
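
A rough sketch of what such a tagged architecture could look like, assuming Julia 0.7/1.0 or later. The names `VTKKind`, `VTKBlock`, `label` and the `values` payload are placeholders invented for illustration, not a proposal for the actual field layout:

```julia
# Tag identifying which kind of data a block holds (names are illustrative)
@enum VTKKind Structured Unstructured

# One concrete type for every block; `kind` replaces the subtype distinction
struct VTKBlock
    kind::VTKKind
    values::Vector{Float64}   # placeholder payload
end

# Branch on the tag instead of relying on dynamic dispatch
label(b::VTKBlock) = b.kind == Structured ? "structured" : "unstructured"

blocks = [VTKBlock(Structured, [1.0]), VTKBlock(Unstructured, [2.0, 3.0])]
labels = map(label, blocks)
```

Because every element has the same concrete type, `blocks` is a Vector{VTKBlock} rather than a vector of abstractly typed references. The trade-off is that all kinds must share one field layout.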

Are there only these 4 types? Would more be added in the future?

Here is an example of working with the types I provided: (I made abbreviations vs, vu, vt, vm for each of the types)

julia> abstract type AbstractVTK end

julia> mutable struct VTKStructuredData <: AbstractVTK ; a::Int ; b::String ; end

julia> mutable struct VTKUnstructuredData <: AbstractVTK ; c::Float64 ; end

julia> mutable struct VTKTimeSeriesData{T<:AbstractVTK} <: AbstractVTK ;  data::Vector{T} ; end

julia> mutable struct VTKMultiblockData <: AbstractVTK ; blocks::Vector{AbstractVTK} ; end

julia> vs = VTKStructuredData
VTKStructuredData

julia> vu = VTKUnstructuredData
VTKUnstructuredData

julia> vm = VTKMultiblockData
VTKMultiblockData

julia> vt = VTKTimeSeriesData
VTKTimeSeriesData

julia> a = vs(1,"foo")
VTKStructuredData(1, "foo")

julia> b = vu(1.5)
VTKUnstructuredData(1.5)

julia> c = vt{vu}([vu(1.0),vu(2.0),vu(3.0)])
VTKTimeSeriesData{VTKUnstructuredData}(VTKUnstructuredData[VTKUnstructuredData(1.0), VTKUnstructuredData(2.0), VTKUnstructuredData(3.0)])

julia> typeof(c)
VTKTimeSeriesData{VTKUnstructuredData}

julia> d = vm([a,b,c])
VTKMultiblockData(AbstractVTK[VTKStructuredData(1, "foo"), VTKUnstructuredData(1.5), VTKTimeSeriesData{VTKUnstructuredData}(VTKUnstructuredData[VTKUnstructuredData(1.0), VTKUnstructuredData(2.0), VTKUnstructuredData(3.0)])])

julia> d.blocks[1]
VTKStructuredData(1, "foo")

julia> typeof(d.blocks[1])
VTKStructuredData

julia> typeof(d.blocks[2])
VTKUnstructuredData

julia> typeof(d.blocks[3])
VTKTimeSeriesData{VTKUnstructuredData}

#11

I see, thanks a lot. I am basically making two modules, ReadVTK.jl and VTKDataTypes.jl, that extract data from any vtk file to its corresponding struct. There are at least 2 more types I am planning to support, up to a maximum of 4. The number of blocks should be up to O(10) at the very most, per time step. The number of time steps can be pretty large, so the total number of blocks will be fairly large. Accessing them efficiently is important, but is generally not the bottleneck for my intended applications.

I am using Paraview and PyCall under the hood in ReadVTK.jl which will enable me to add some useful visualization and format translation capabilities in the future perhaps in ParaviewLimited.jl. Ideally, I would like to connect the existing world of CAD modeling and visualization to Julia’s scientific computing world to have a reliable platform for making and visually testing mechanical simulation and design modules in Julia. A nice side product of this will be a little geometry file translation genie that takes in any input format and translates it to any other compatible output format desired.


#12

If the number of blocks is only in the 10s, and you are not changing the number of blocks (frequently),
then maybe something like the StaticArrays.jl package might be useful. It has a lot less overhead than
the Julia Vector type, and (for small arrays, I believe up to ~100) is faster.

BTW, join us on the Gitter chat room, https://gitter.im/JuliaLang/julia, there’s lots of people happy to help out with questions like this (it’s kind of a game there, “Let’s optimize this code!” :smile:)


#13

It’s concrete, but it won’t be fast.


#14

Optimizations are not exactly my priority right now as this is my first Julia module. I will try to get the functionality out first, to have something that can then be optimized. I appreciate all the suggestions though and I will see how to incorporate them somehow.