Defining `getindex` for `Tables.columntable` objects

pdeffebach · December 6, 2019, 9:15pm

Let’s say you are working with tabular objects and, for convenience, want to use t[i, j] indexing (and not many other functions) throughout your code.

On the other hand you want to support the Tables.jl interface. Therefore when you receive a function argument t, you wrap it in Tables.columntable(t).

As far as I can tell, the Tables interface does not support indexing. But many implementations of Tables.columntable do support that, for instance Tables.columntable(df::DataFrame) = df.

Therefore it makes sense to do the following in your code

struct IndexableColumnTable{T}
    table::T
end

function Base.getindex(t::IndexableColumnTable, i, j)
    # Look up the j-th column by name, then take its i-th element.
    p = propertynames(t.table)[j]
    getproperty(t.table, p)[i]
end

Then in your function do

function foo(t)
    t = Tables.columntable(t)
    # Wrap only when two-integer indexing is not already defined for this type.
    if hasmethod(getindex, Tuple{typeof(t), Int, Int})
        return t
    else
        return IndexableColumnTable(t)
    end
end

Is this a reasonable approach?

nalimilan · December 11, 2019, 9:28am

I’m afraid this isn’t right. First, I think you are confusing Tables.columntable and Tables.columns: the former always returns a named tuple of vectors, so if you do t = columntable(t) you never get an object which can be indexed with two integers (even if the input was e.g. a DataFrame).
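To make the first point concrete, here is a minimal sketch, assuming Tables.jl is installed and using a made-up vector of named tuples as the input table. It only illustrates that the result of Tables.columntable is a NamedTuple, which is indexed by column first and then by row.

```julia
using Tables

# Hypothetical input: a row table (vector of named tuples).
rows = [(a = 1, b = "x"), (a = 2, b = "y")]

ct = Tables.columntable(rows)

# The result is a named tuple of vectors, whatever the input type was.
is_named_tuple = ct isa NamedTuple

# Indexing is column first, then row; there is no two-integer ct[i, j].
second_a = ct.a[2]
second_a_positional = ct[1][2]
```

So the hasmethod branch in foo would not be taken for the value that columntable returns.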

Second, supposing you did t = Tables.columns(t), the Tables.jl interface does not guarantee that getproperty(t.table, p) returns a vector, only an iterator. So I think you have to check whether the returned iterators are AbstractVector, and if not call collect on them. Then you can store the result as a named tuple of vectors (which is the most basic type of table) or use Tables.materializer(t)(cols) to create a table object of the same type as t.
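A rough sketch of that check, under some assumptions: vector_columns and cell are hypothetical helper names, not part of Tables.jl, and the accessors Tables.columnnames and Tables.getcolumn come from Tables.jl releases later than this thread (propertynames and getproperty play the same role in the code above).

```julia
using Tables

# Hypothetical helper (not part of Tables.jl): return a named tuple of vectors,
# collecting any column that is not already an AbstractVector.
function vector_columns(t)
    cols = Tables.columns(t)
    names = Tuple(Tables.columnnames(cols))
    vecs = map(names) do name
        col = Tables.getcolumn(cols, name)
        col isa AbstractVector ? col : collect(col)
    end
    return NamedTuple{names}(vecs)
end

# Hypothetical positional lookup on the result: row i of column j.
cell(nt::NamedTuple, i::Integer, j::Integer) = nt[j][i]

nt = vector_columns((a = 1:3, b = ["x", "y", "z"]))
value = cell(nt, 2, 1)  # row 2 of column :a
```

Columns that are already vectors are passed through without copying, so the cost of collect is only paid for sources that hand back plain iterators.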

Maybe Tables.jl could provide a function to do this more easily. Feel free to file an issue.

pdeffebach · December 11, 2019, 6:15pm

Thanks for the answer. It’s probably best just to use Tables.matrix in this context, even if it copies.
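For reference, a small sketch of that route, assuming Tables.jl is installed and using a made-up two-column table. Tables.matrix copies the data into a dense matrix, so m[i, j] works directly, at the cost of promoting all columns to a common element type.

```julia
using Tables

# Hypothetical table: a named tuple of vectors with numeric columns.
t = (a = [1, 2, 3], b = [4.0, 5.0, 6.0])

# Copies the columns into a 3x2 matrix; Int and Float64 promote to Float64.
m = Tables.matrix(t)

x = m[2, 1]  # row 2, column 1
```

With columns of unrelated types (say integers and strings) the promoted element type becomes less specific, which is one reason to prefer the column-wise approach for heterogeneous tables.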

I get why Tables would want to be agnostic about the layout of columns, allowing for iterators or infinite streaming.
