Creating custom DataFrame types

I am working on adding a special DataFrame type to my package that tracks which columns represent spatial coordinates:

struct GeoData
  data::AbstractDataFrame
  coordnames::Vector{Symbol}
end

I can then define simple methods for it that just forward the arguments to the underlying DataFrame method:

# Read the table with DataFrames.readtable, then attach the coordinate names
function readtable(args...; coordnames=[:x,:y,:z], kwargs...)
  data = DataFrames.readtable(args...; kwargs...)
  GeoData(data, coordnames)
end

It works fine, but I was wondering if there is a better way of achieving this “inheritance” of methods automatically instead of typing them one by one. Is it possible to copy the whole DataFrame API to the GeoData type in a safe manner?

In other words, if we adopt composition over inheritance, is there no hope for straightforward reuse of APIs?
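For reference, the one-by-one forwarding described above can at least be generated in a loop with `@eval`. The following is only a minimal sketch: `ToyTable`, `nrows`, `ncols`, and `colnames` are hypothetical stand-ins so that the example runs without DataFrames.jl, and the list of forwarded functions still has to be maintained by hand. To extend functions owned by another module, the definition would need the module prefix (for example `Base.$f`).

```julia
# Hypothetical stand-in for a table type, so the sketch runs without DataFrames.jl.
struct ToyTable
  columns::Dict{Symbol,Vector{Float64}}
end

# A tiny hypothetical "API" for the stand-in table.
nrows(t::ToyTable) = length(first(values(t.columns)))
ncols(t::ToyTable) = length(t.columns)
colnames(t::ToyTable) = sort!(collect(keys(t.columns)))

# Composition: the wrapper holds the table plus the extra metadata.
struct GeoData
  data::ToyTable
  coordnames::Vector{Symbol}
end

# Generate one forwarding method per function name in the list.
for f in (:nrows, :ncols, :colnames)
  @eval $f(g::GeoData, args...; kwargs...) = $f(g.data, args...; kwargs...)
end

geo = GeoData(ToyTable(Dict(:x => [0.0, 1.0], :y => [2.0, 3.0], :v => [10.0, 20.0])), [:x, :y])

nrows(geo)     # 2
ncols(geo)     # 3
colnames(geo)  # [:v, :x, :y]
```

This only forwards the functions that are listed explicitly, so it does not copy the whole API automatically.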

Have you tried struct GeoData <: AbstractDataFrame? Then all you need to do (assuming there’s a well-defined one) is to implement the basic DataFrame interface for your GeoData type, and all functions that 1) take an AbstractDataFrame and 2) use only that interface should automatically work.

This is how we did it in MetaGraphs.
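A rough sketch of that idea, using a made-up abstract type instead of the real AbstractDataFrame so it runs without any packages: `AbstractToyFrame`, `columnnames`, `column`, `ncolumns`, and `nrecords` are all hypothetical names, and the actual interface that DataFrames.jl expects is not shown here.

```julia
# Hypothetical abstract type playing the role of AbstractDataFrame.
abstract type AbstractToyFrame end

# Generic functions written only against the (hypothetical) interface
# `columnnames(df)` and `column(df, name)`.
ncolumns(df::AbstractToyFrame) = length(columnnames(df))
nrecords(df::AbstractToyFrame) = length(column(df, first(columnnames(df))))

# A plain concrete frame that implements the interface.
struct PlainFrame <: AbstractToyFrame
  cols::Dict{Symbol,Vector{Float64}}
end
columnnames(df::PlainFrame) = sort!(collect(keys(df.cols)))
column(df::PlainFrame, name::Symbol) = df.cols[name]

# The wrapper subtypes the abstract type and implements the same two methods
# by forwarding them to the wrapped frame.
struct GeoFrame{T<:AbstractToyFrame} <: AbstractToyFrame
  data::T
  coordnames::Vector{Symbol}
end
columnnames(g::GeoFrame) = columnnames(g.data)
column(g::GeoFrame, name::Symbol) = column(g.data, name)

# Extra behavior that only the wrapper has.
coordinates(g::GeoFrame) = [column(g, c) for c in g.coordnames]

geo = GeoFrame(PlainFrame(Dict(:x => [0.0, 1.0], :y => [2.0, 3.0], :v => [10.0, 20.0])), [:x, :y])

ncolumns(geo)     # 3, generic function works without a GeoFrame-specific method
nrecords(geo)     # 2
coordinates(geo)  # [[0.0, 1.0], [2.0, 3.0]]
```

Only the two interface methods are written for the wrapper; everything defined on the abstract type in terms of them comes for free.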

Also: you might reconsider your GeoData struct:

struct GeoData{T<:AbstractDataFrame} <: AbstractDataFrame
  data::T
  coordnames::Vector{Symbol}
end

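This avoids having an abstract type as a field. With the type parameter, the field type is concrete for each instance, so the compiler can generate specialized code, which generally improves performance.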
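The difference can be checked directly with `fieldtype` and `isconcretetype`. This is a small sketch with hypothetical stand-in types (`AbstractToyTable`, `ToyColumns`, `LooseGeoData`, `TightGeoData`) rather than the real DataFrames types:

```julia
# Hypothetical stand-ins for AbstractDataFrame and a concrete DataFrame.
abstract type AbstractToyTable end
struct ToyColumns <: AbstractToyTable
  values::Vector{Float64}
end

# Field declared with an abstract type: its concrete type is only known at run time.
struct LooseGeoData
  data::AbstractToyTable
  coordnames::Vector{Symbol}
end

# Field declared through a type parameter: concrete for every instance.
struct TightGeoData{T<:AbstractToyTable}
  data::T
  coordnames::Vector{Symbol}
end

t = ToyColumns([1.0, 2.0])
loose = LooseGeoData(t, [:x])
tight = TightGeoData(t, [:x])

isconcretetype(fieldtype(typeof(loose), :data))  # false
isconcretetype(fieldtype(typeof(tight), :data))  # true
```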


Thank you @anon94023334, I think in this case I really need composition. It would be awesome to have the goodies of inheritance, though, and basically mimic the behavior of a DataFrame with extra metadata automatically. I am still getting used to these paradigms and need to read more about them.

A similar topic was discussed here


Thank you @baggepinnen, scanning that thread I found TypedDelegation.jl, which defines a set of very useful macros for this task. I will probably use it now. I hope it will be part of Base at some point.
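To give an idea of what such a delegation macro does, here is a hand-rolled sketch. It is not the TypedDelegation.jl API: the macro name `@delegate` and the wrapper type `GeoColumns` are made up for illustration, and a plain `Dict` stands in for the DataFrame so the example runs without any packages.

```julia
# Hypothetical macro: for each function `f`, define
#   f(x::T, args...; kwargs...) = f(getfield(x, :field), args...; kwargs...)
macro delegate(T, field, funcs...)
  defs = map(funcs) do f
    :($f(x::$T, args...; kwargs...) =
        $f(getfield(x, $(QuoteNode(field))), args...; kwargs...))
  end
  # Escape the whole block so the methods are defined in the caller's scope.
  esc(Expr(:block, defs...))
end

# Composition: a Dict of columns plus the coordinate metadata.
struct GeoColumns
  data::Dict{Symbol,Vector{Float64}}
  coordnames::Vector{Symbol}
end

# Forward a chosen subset of the Dict API to the `data` field.
@delegate GeoColumns data Base.length Base.haskey Base.getindex

geo = GeoColumns(Dict(:x => [0.0, 1.0], :v => [10.0, 20.0]), [:x])

length(geo)      # 2
haskey(geo, :x)  # true
geo[:v]          # [10.0, 20.0]
```

The forwarded functions still have to be named explicitly, which keeps the wrapper's API under control but does not copy everything automatically.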
