Custom iterators over custom types

juliohm · December 27, 2016, 1:22pm

I am exploring the type system of Julia seriously for the first time and I got stuck with a practical problem I have in one of my packages. Suppose I have various grid types following an interface:

@doc doc"""
  A data point in an `N`-dimensional space represented by
  `N` scalars (a.k.a. coordinates) of type `T`, holding a
  value of type `V`.
""" ->
abstract AbstractDataPoint{N,T,V}

@doc doc"""
  An iterable collection of points of type `P`.
""" ->
abstract AbstractPointCollection{P}

@doc doc"""
  A grid of points of type `P`.
""" ->
abstract AbstractGrid{P}

@doc doc"""
  Valid representations of a spatial domain.
""" ->
AbstractDomain = Union{AbstractPointCollection,AbstractGrid}

@doc doc"""
  A path in a spatial domain of type `D`.
""" ->
abstract AbstractPath{D}

In this interface, I have various representations of a spatial domain AbstractDomain, either grids types or point collections. I would like to traverse such domains with different types of paths. For achieving this goal I tried to introduce the AbstractPath type.

Now consider one possible implementation of the interface:

immutable DataPoint{N} <: AbstractDataPoint{N,Real,Real}
  coords::NTuple{N,Real}
  value::Real
end
coords(p::DataPoint) = p.coords
value(p::DataPoint) = p.value

type PointCollection{N} <: AbstractPointCollection{DataPoint{N}}
  points::AbstractVector{DataPoint{N}}
end
npoints(pc::PointCollection) = length(pc.points)

type ImageGrid{N} <: AbstractGrid{DataPoint{N}}
  data::AbstractArray{Real,N}
  dx::Real
  dy::Real
  dz::Real

  ImageGrid(data) = ImageGrid(data,1.,1.,1.)
end
npoints(g::ImageGrid) = prod(size(g.data))

Finally, the question. Suppose there is a natural order for traversing spatial domains. For example, an image grid is naturally traversed in column-major order, a point collection is traversed in forward order, etc. I would like to define a DefaultPath for all AbstractDomain:

type DefaultPath{AbstractDomain} <: AbstractPath{AbstractDomain}
  # TODO
end

How would you approach this problem? Should I define a start, next and done for all new types, make them part of the interface specification and then call these methods inside of the DefaultPath? How to do that in a Julian way?

Besides the question, please feel free to give feedback on the interface, if there is anything that can be done better in terms of performance, etc.

ScottPJones · December 27, 2016, 3:38pm

There’s one issue here for performance, you are using abstract types (i.e. Real) in your concrete types, which is a “real” performance killer. Also, does ImageGrid need to be a type instead of an immutable? (do you need to modify the dx/dy/dz values, or replace the data value (which is another case of having an abstract type in a concrete type).

juliohm · December 27, 2016, 3:46pm

Thank you @ScottPJones I was thinking about this Real issue too, I will probably replace it by Float64 in the concrete types. Regarding the type versus immutable issue, I don’t plan to change the array to which the type points to, but the content of the array instead (data[i,j,k] = foo), can I use immutable safely in this case?

stevengj · December 27, 2016, 4:13pm

Normally you would do parameterize the type, so that it can be both generic and concrete:

immutable DataPoint{N,T<:Real,S<:Real} <: AbstractDataPoint{N,T,S}
  coords::NTuple{N,T}
  value::S
end

If you have an array of DataPoint and only need data[...] = foo, then it being immutable is not a problem.

ScottPJones · December 27, 2016, 4:18pm

Yes, Steven is correct, parameterization of types is your friend in Julia, will let you avoid the performance problems of run-time dispatching, and allow the structures themselves to be much more efficient as far as memory and avoiding double indirection to get to the values.

juliohm · December 27, 2016, 5:52pm

Thank you @stevengj , in that case the type is concrete? Could you please elaborate? I am still trying to grasp the syntax for parametric types in Julia, could you please explain the meaning of the expressions below?

abstract Base{T}

# expression 1
type Foo{T} <: Base{T} end

# expression 2
type Foo{T<:Real} <: Base{T} end

# expression 3
type Foo <: Base{Real} end

What are the implications of using each?

sdanisch · December 27, 2016, 6:28pm

T is able to take any type it got constructed from.
T restrict to subtypes of Real
Base{Real} will alway be Base{Real} independent of what Foo is doing.
More about parametric types:
http://docs.julialang.org/en/release-0.5/manual/types/#man-parametric-types

juliohm · December 27, 2016, 6:33pm

Thank you @sdanisch, that is very clear.

juliohm · December 27, 2016, 6:52pm

@stevengj how can the type be concrete? Also, in my case I want to have DataPoint depending only on the dimension N. I want to fix T and V to be Float64 for example.

juliohm · December 28, 2016, 9:51am

Does anyone have a suggestion for the implementation of DefaultPath? What I have in mind is something like:

immutable DefaultPath{D<:AbstractDomain} <: AbstractPath{D} # is this the best way to write it?
  domain::D # pointer to domain object?
end
function start{D<:AbstractDomain}(p::DefaultPath{D}) # do we need this boilerplate?
  if isa(p.domain, PointCollection) # any cleaner way of doing this?
    return start(p.domain.points)
  elseif isa(p.domain, ImageGrid)
    return start(p.domain.data)
  else
    error("not implemented")
  end
end

As you can see, this code isn’t very elegant and I actually don’t know if it works, just brainstorming and learning the syntax.

kristoffer.carlsson · December 28, 2016, 11:48am

You can do

start{D <: PointCollection}(p::DefaultPath{D}) = start(p.domain.points)
start{D <: ImageGrid}(p::DefaultPath{D})       = start(p.domain.data)
start(p::DefaultPath)                          = error("not implemented")

instead of type checking inside a function. I would argue this is typically more “Julian” way of programming.

stevengj · December 28, 2016, 4:21pm

If you declare e.g. immutable Foo{T<:Real}; x::T; end, then it effectively defines an abstract type Foo and a family of concrete types Foo{T} for different types T. (Foo{Int}, Foo{Float64}, etcetera).

If, in turn, you use a concrete type T, e.g. Foo{Int}, then the resulting Foo{Int} type can have all of the benefits of a type where you just declared the fields explicitly to be Int (i.e. it is a “pointer-free” type and x is stored “inline” in the structure rather than as a pointer to some “boxed” heap object, and Array{Foo{Int}} is as efficient as Array{Int}). That’s why parameterized types allow you to write type-generic data-structures that still have the efficiency of fully concrete types.

stevengj · December 28, 2016, 4:22pm

Also, you would normally not bother with this method; just omit it, and it will throw a MethodError if you try to call it.

juliohm · December 28, 2016, 6:40pm

Thank you @kristoffer.carlsson, what do you think of doing:

start(p::DefaultPath{PointCollection}) = start(p.domain.points)
start(p::DefaultPath{ImageGrid}) = start(p.domain.data)
start(p::DefaultPath) = error("not implemented")

Is there an advantage in restricting D <: PointCollection or D <: ImageGrid?

kristoffer.carlsson · December 28, 2016, 6:55pm

It depends if ImageGrid and PointCollection are abstract types or not. If they are concrete type, then doing your way works just well. If it is abstract (or parameterized) then this will happen:

julia> immutable A{T}
       x::T
       end

julia> f(x::Vector{A}) = print("a")
f (generic function with 1 method)

julia> f([A(1), A(2)])
ERROR: MethodError: no method matching f(::Array{A{Int64},1})
Closest candidates are:
  f(::Array{A,1}) at REPL[19]:1

julia> f{T <: A}(x::Vector{T}) = print("b")
f (generic function with 2 methods)

julia> f([A(1), A(2)])
b

juliohm · December 28, 2016, 10:58pm

@kristoffer.carlsson that is tricky, why the first attempt fails? Shouldn’t it work just fine?

stevengj · December 28, 2016, 11:05pm

See the section about “invariant types” in the Julia manual. This is very important to understand when working with parametric types.

juliohm · December 28, 2016, 11:43pm

@stevengj I understand the “invariant types” property of Julia, but I don’t understand how defining a method for the abstract type A is not enough for catching the case with A{Int64}. I understood that A{T} <: A always.

kristoffer.carlsson · December 28, 2016, 11:49pm

Yes but Vector{A} is not <: Vector{A{Int64}}

Evizero · December 28, 2016, 11:54pm

you mean the other way around for this particular situation (?)

Topic		Replies	Views
Constructors a la AbstractArray General Usage question , type	12	1030	July 27, 2017
Proper definition of eltype for custom iterator General Usage	2	931	March 15, 2019
Can't iterate over an AbstractArray with custom indexing? General Usage	2	343	May 11, 2020
Generically iterating over arrays of points General Usage staticarrays	3	634	July 25, 2021
How to create non-abstract subtypes of built-in abstract types? What is the Julia's substitute for Interfaces? General Usage	3	759	September 10, 2018

Custom iterators over custom types

Related topics