Special purpose subtypes (of arrays)

Alex_Ellison · January 31, 2019, 5:31pm

Hey there,
I am trying to leverage Julia’s type system to get as many compile time protections as I can while still being performant. I’m using static arrays of reals for a project. Say I have a function f1 which can take an array a, and perform some operation on it once a has been sorted. Is it possible to create a subtype of Array which represents a sorted array? Then f1 in use might look like

function sorted(a :: Array{Real}) :: SortedArray{Real}
...
end

function f1(a :: SortedArray{Real})
...
end

# so that this fails since it's not guaranteed sorted
f1(a)
# but this works
f1(sorted(a))

In a similar spirit, I am working with 3D vectors represented like SVector{3, N} where N, it’d be awesome if I could define functions which only accepted normalized vectors (you can imagine the analogy in code).

For both cases, I could probably write a struct/wrapper around the array, but that brings two problems for me:

Will there be a performance hit?
Is there a neat way to ensure all functions which worked with the original type also work for the new type?

So, can I create a direct subtype of Array without changing the data representation underneath? Thanks for bearing with, I’m still learning Julia’s type system.

Cheers,
Alex

Tamas_Papp · January 31, 2019, 5:35pm

I have an (unregistered) package

for this purpose.

If they original functions work for an <: AbstractArray via its interface, they should keep working.

Alex_Ellison · January 31, 2019, 6:09pm

Right on @Tamas_Papp that’s pretty nearly exactly what I was looking for! Thanks!

I notice that the implementation does put the array containing your data inside a struct. I was interested to see what the impact would be on performance so I wrote the following test which measures time it takes to operate/allocate them (so the actual sort time is excluded). For very small collections (size 2-4), it’s at least 4x slower, and for larger ones (size 1024) the performance is closer (30% slower).

using SortedVectors

for log_size in 1:10
    size = 1 << log_size
    println("size = ", size)
    # standard vectors
    collection_1 :: Vector{Vector{Float64}} = []
    for i in 1: size
        push!(collection_1, Vector(rand(size)))
    end
    @time reduce(+, collection_1)

    # SortedVectors
    collection_2 :: Vector{SortedVector{Float64}} = []
    for i in 1: size
        push!(collection_2, SortedVector(SortedVectors.AssumeSorted(), <, sort!(rand(size))))
    end
    @time reduce(+, collection_2)
    println()
end

Results:

size = 2
  0.000000 seconds (1 allocation: 96 bytes)
  0.000003 seconds (1 allocation: 96 bytes)

size = 4
  0.000000 seconds (3 allocations: 336 bytes)
  0.000002 seconds (3 allocations: 336 bytes)

size = 8
  0.000001 seconds (7 allocations: 1008 bytes)
  0.000004 seconds (7 allocations: 1008 bytes)

size = 16
  0.000002 seconds (15 allocations: 3.047 KiB)
  0.000006 seconds (15 allocations: 3.047 KiB)

size = 32
  0.000004 seconds (31 allocations: 12.109 KiB)
  0.000009 seconds (31 allocations: 12.109 KiB)

size = 64
  0.000007 seconds (63 allocations: 38.391 KiB)
  0.000020 seconds (63 allocations: 38.391 KiB)

size = 128
  0.000027 seconds (127 allocations: 144.859 KiB)
  0.000051 seconds (127 allocations: 144.859 KiB)

size = 256
  0.000116 seconds (255 allocations: 541.875 KiB)
  0.000207 seconds (255 allocations: 541.875 KiB)

size = 512
  0.000529 seconds (511 allocations: 2.058 MiB)
  0.000454 seconds (511 allocations: 2.058 MiB)

size = 1024
  0.003511 seconds (1.02 k allocations: 8.117 MiB, 55.97% gc time)
  0.004643 seconds (1.02 k allocations: 8.117 MiB, 57.17% gc time)

With that in mind, is there a way to achieve this such a subtyping without a container? Trying to work with normalized 3D vectors with an analogous implementation would be a pretty big penalty.

rdeits · January 31, 2019, 6:17pm

These benchmarks aren’t particularly precise, since you’re only running each function once. If you want meaningful results, try using GitHub - JuliaCI/BenchmarkTools.jl: A benchmarking framework for the Julia language and running @benchmark reduce(+, $collection_1), etc.

But to answer your last question, no: you cannot subtype a concrete type in Julia. Fortunately, the cost of a container should be infinitesimal or even zero in many cases, so I would try benchmarking more thoroughly before you give up on encapsulation.

Finally, if you actually have 3D vectors, you should definitely check out https://github.com/JuliaArrays/StaticArrays.jl . The performance benefit of using SVectors from StaticArrays can be immense, and they are ideal for representing 3D vectors.

Alex_Ellison · January 31, 2019, 7:58pm

@rdeits thanks for pointing out the better instrumentation - @btime did confirm a performance gap in this case.

As for using containers I’ll play around a bit more - whatever I make will definitely be based on StaticArrays, they clearly are the right tool for the job.

Topic		Replies	Views
Need for explicitly defining subtypes in function arguments? New to Julia	30	1300	October 14, 2023
Alternate ways to deal with arrays of subtypes of abstract type Performance type , type-stability	11	1168	April 19, 2022
StaticArrays: Sub-typing from FieldVector New to Julia question , staticarrays	2	520	July 1, 2022
Why [1, 2, 3] is not a Vector{Number}? New to Julia question , parametric-types	41	6950	December 9, 2022
Sorting custom types New to Julia sort	14	3259	March 23, 2018

Special purpose subtypes (of arrays)

Related topics