Sorting custom types

sort

#1

I have two types (FooType and BarType) which implement the AbstractArray interface. The former is just a vector of float numbers, the latter is a vector of FooType.
When I try to call sort() on them, it works smoothly for FooType objects but doesn’t for BarType.
The error is:

ERROR: MethodError: Cannot `convert` an object of type Int64 to an object of type BarType
This may have arisen from a call to the constructor BarType(...),
since type constructors fall back to convert methods.
Stacktrace:
 [1] #sort#8(::Array{Any,1}, ::Function, ::BarType) at .\sort.jl:546
 [2] sort(::BarType) at .\sort.jl:546
 [3] eval(::Module, ::Any) at .\boot.jl:235
 [4] eval(::Any) at .\boot.jl:234
 [5] macro expansion at C:\Users\leona\AppData\Local\JuliaPro-0.6.2.2\pkgs-0.6.2.2\v0.6\Atom\src\rep
l.jl:117 [inlined]
 [6] anonymous at .\<missing>:?

It’s important to note that sort(b.data) works, but I don’t really want to specify “.data” every time.
Both types are subtypes of AbstractArray (proof is “BarType <: AbstractArray” which returns true), so I expect to be able to simply call sort(b) with no frills.
A workaround is redefining Base.sort() like this:

Base.sort(b::BarType) = sort(b.data)

but then I should do the same for sort!(), sortperm() and so on…
So, I have two questions:

  1. Why does sort() work for FooType but not BarType?
  2. Is there a solution which doesn’t need the redefinition of every single array function?

Note: the following code is just an example and some methods may not make sense logically (see isless()).

using OffsetArrays

struct FooType <: AbstractVector{Float64}
    data::OffsetArray{Float64,1}
    dim::Int64
end

FooType(n::Int) = FooType(OffsetArray(Float64, 1:n), n)
Base.size(s::FooType) = (s.dim,)
Base.IndexStyle(::Type{<:FooType}) = IndexLinear()
Base.similar(s::FooType, ::Type, d::Dims) = FooType(d[1])
Base.getindex(s::FooType, i::Int) = s.data[i]
Base.setindex!(s::FooType, v, i::Int) = (s.data[i] = v)
Base.show(io::IO, s::FooType) = print(io, s.dim)
Base.indices(s::FooType) = indices(s.data)
Base.:(==)(a::FooType, b::FooType) = a.data == b.data
Base.isless(a::FooType, b::FooType) = a[1]<b[1]

struct BarType <: AbstractVector{FooType}
    data::Array{FooType,1}
    dim::Int64
end

function build_bar(m::Int, n::Int)
    bar = BarType(Array{FooType,1}(m), m)
    for i in 1:m
        bar.data[i] = FooType(n)
    end
    bar
end

BarType(m::Int, n::Int) = build_bar(m,n)
Base.size(b::BarType) = (b.dim,)
Base.IndexStyle(::Type{<:BarType}) = IndexLinear()
Base.similar(b::BarType, ::Type, d::Dims) = BarType(d[1])
Base.getindex(b::BarType, i::Int) = b.data[i]
Base.setindex!(b::BarType, val, i::Int) = (b.data[i] = val)
Base.show(io::IO, b::BarType) = print(io, b.dim)
Base.indices(b::BarType) = indices(b.data)

f = FooType(2)
b = BarType(3,2)
sort(f)     # -> works
sort(b)     # -> doesn't work (but I want it to work!)
sort(b.data)  #->works

#2

it appears that you have yet to define Base.isless for the second type


#3

But it makes no sense logically (imo), since the comparisons are between elements of BarType (FooType) and not BarType themselves!
To make a concrete example, when you sort an array of integer numbers, you need to define the comparison operator (isless) for integer numbers, not arrays of integers.
Perhaps I’m misunderstanding the meaning of the operator isless() ?


#4

By the way, please see: PSA: how to quote code with backticks.


#5

Thanks for the tip, I’ve reformatted the text, hope it’s more readable now.

Just in case, I tried to define Base.isless for FooType too as suggested by @JeffreySarnoff, and it doesn’t work indeed.

Base.isless(a::BarType, b::BarType) = a[1]<b[1]

That is the code I used, I’m still convinced that it doesn’t make sense though (see my reply above).


#6

I’m a bit surprised sort(b.data) works actually, if i understand correctly it is sorting a vector whos elements are also vectors?
Base.isless(a::BarType, b::BarType) = sum(a.data[1]) < sum(b.data[1]) would be one kind-of-sensible way of sorting them (sum of first vector of BarType), but you may want something different
** probably should be sum(a.data[1].data)

struct FooType <: AbstractVector{Float64}
    data::Array{Float64,1}
    dim::Int64
end

FooType(n::Int) = FooType(Array{Float64}(collect(n:-1:1)), n)
Base.size(s::FooType) = (s.dim,)
Base.getindex(s::FooType, i::Int) = s.data[i]
Base.setindex!(s::FooType, v, i::Int) = (s.data[i] = v)
Base.show(io::IO, s::FooType) = print(io, "dim:", s.dim, ", data:", s.data);
Base.isless(a::FooType, b::FooType) = a[1]<b[1] ##? defines sort(Bartype) sort type

struct BarType <: AbstractVector{FooType}
    data::Array{FooType,1}
    dim::Int64
end

function build_bar(m::Int, n::Int)
    bar = BarType(Array{FooType,1}(m), m)
    for i in 1:m
        bar.data[i] = FooType(n+m-i)
    end
    bar
end

BarType(m::Int, n::Int) = build_bar(m,n)
Base.size(b::BarType) = (b.dim,)
Base.getindex(b::BarType, i::Int) = b.data[i]
Base.setindex!(b::BarType, val, i::Int) = (b.data[i] = val)

f = FooType(2)
b = BarType(3,2)
sort(f)     # 
sort(b)     # working
sort(b.data) #

#7

(its not the way you are looking at it)

a < b is an element-centric operation (is the element a less than the element b? where you get to define the meaning of “less than” should a, b be elements of your own structured type.

[a1, a2, a3] < [b1, b2, b3] is not directly obainable in general given the element-centric definition of < and nothing more … are you comparing the lengths of those arrays or are you comparing a1<b1, a2<b2 and then tallying the boolean results to obtain a measure (are their more trues than falses) or are you taking the norm of each and determining which less …

For the built-in types, there has been much attention given these sort of questions. So you should not be surprised that you can operate with e.g. Float64 values without needing to define supportive functors.

When you introduce a type and you desire that it behave well in response to, here, queries on relative magnitude as it may be sensible to define that, then you should not be surprised that this occurs only after you define supportive functors.

isless, when used as an ordering-projective operator, is always initially considered pairwise. There are good mathematical and computer science reasons for this. isless is doing the hard work when sorting algorithms are invoked on data from an orderable domain (letting go of the distinction that total orders and partial orders bring – though Julia does work appropriately to serve both needs).


#8

It might be confusing the sorting algorithm, foo(A) != foo(B) where !(foo(A) < foo(B)) & !(foo(B) < foo(A))

Base.:(==)(a::FooType, b::FooType) = a.data == b.data
Base.isless(a::FooType, b::FooType) = a[1]<b[1] 

#9

Let my clarify again: Base.isless(a::FooType, b::FooType) = a[1]<b[1] may seem a nonsense way to compare two vectors, but it doesn’t matter here, the problem is different. The issue is not how elements are ordered, but the fact that I get an error when trying to sort a BarType (which is an array of FooType, and a order relationship exists between FooType objects because of isless).

@y4lu : your solution is very similar to my original code, it just replaces OffsetArray with Array in FooType. Your code works, mine doesn’t.
I have to assume that the issue is strictly related to the use of OffsetArray?
Can you provide a working code using OffsetArray or, at least, the reason why it doesn’t work?

@JeffreySarnoff Again, I think we’re talking about two different things. In reference to your last example, let’s make a parallelism between your a, [a1, a2, a3] and my FooType and BarType.

a    <->  FooType
b    <->  FooType
[a1,a2] <-> BarType

sort(my_bar) corresponds to sort([a1,a2]), which ultimately requires an order relationship to be established between a1 and a2. These two are instances of FooType, so the isless() operator must be defined for FooType. And this is exactly what I did in my code.
Probably the confusion is due to FooType being an array too, but as long as I define a comparison operator for FooType objects, it behaves like a scalar from the BarType’s point of view.
(end of my digression with Jeffrey)

I suspect the problem has to do with OffsetArray, hope someone can give an explanation.


#10

I’ve another example which is emblematic:

x = [FooType(2), FooType(3)]
sort(x)    #-> works

This is the proof that the problem isn’t comparing objects of FooType, but is something else.
The question remains open


#11

I haven’t played with Offset Arrays at all, but from the description they probably have their own way of doing getindex. You may need to pass in a set of index values?

There is a possible shortcut in that
Base.getindex(b::BarType, i::Int) = b.data[i].data[1]
should be enough to run sort(b::BarType)


#12

What happens is that this fails:

julia> similar(b, Float64,  (3,))
ERROR: MethodError: Cannot `convert` an object of type Int64 to an object of type BarType
This may have arisen from a call to the constructor BarType(...),
since type constructors fall back to convert methods.
Base.similar(b::BarType, ::Type, d::Dims) = BarType(d[1])

There is no constructor for BarType that accepts this.

I don’t know what the correct definition is for you but defining it something like:

julia> Base.similar(b::BarType, ::Type, d::Dims) = BarType(d[1], 1)

julia> sort(b)     # -> doesn't work (but I want it to work!)
3-element BarType:
 2
 2
 2

makes the sort work.


#13

The stacktrace on 0.7 is much better.

julia> sort(b)     # -> doesn't work (but I want it to work!)
ERROR: MethodError: no method matching BarType(::Int64)
Closest candidates are:
  BarType(::Int64, ::Int64) at REPL[15]:1
  BarType(::Any, ::Any) at REPL[13]:2
  BarType(::Any) where T<:AbstractArray at abstractarray.jl:22
  ...
Stacktrace:
 [1] Type at ./abstractarray.jl:22 [inlined]
 [2] similar at ./REPL[18]:1 [inlined]
 [3] similar at ./abstractarray.jl:555 [inlined]
 [4] similar at ./abstractarray.jl:554 [inlined]
 [5] copymutable at ./abstractarray.jl:803 [inlined]
 [6] #sort#8 at ./sort.jl:678 [inlined]
 [7] sort(::BarType) at ./sort.jl:678
 [8] top-level scope

#14

Thank you very much! I totally agree that stacktrace is better! How do I enable it? I need to download one of those nightly builds of Julia?


#15

Yes, this is with the latest master compiled. You can download one of the nightly builds from the julilang page but note that most packages are not updated yet so they might fail or dump a lot of deprecation warnings.