Regarding the Tables.jl interface, I find the part about partitions quite confusing. For starters, the documentation is sparse: as far as I can tell there are only a few paragraphs [1] on Tables.partitions and Tables.partitioner. It also doesn't give you any context. It doesn't tell you what problem is being solved; it only tells you what the functions do when you call them.
I have the following specific problem, if anyone knows the answer:
Conceptually, let's say I have a partitioned table:
using DataFrames
df1 = DataFrame(:a=>[1,2,3,4,5,6,7])
df2 = DataFrame(:a=>[8,9,10])
t = [df1, df2]
I call this a partitioned table because I can iterate over it, and each element satisfies Tables.istable. However, t by itself does not satisfy Tables.istable. So if I want to make t into something conforming to the Tables.jl interface, what do I have to do? I want something that I can call Tables.rows on, and that will iterate first over df1 and then over df2 seamlessly.
At first I thought this was the point of either Tables.partitions or Tables.partitioner: take an iterator of tables and return a table. But that's not the case, because neither of the calls below returns true:
t |> Tables.partitions |> Tables.istable # false
t |> Tables.partitioner |> Tables.istable # false
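To make the behaviour of these two functions a bit more concrete, here is a minimal sketch. As far as I understand it, Tables.partitions falls back to treating its argument as a single partition, while Tables.partitioner only marks an iterator as "each element is one partition" and does not turn it into a table. The flattening with Iterators.flatten is my own workaround for illustration, not something the Tables.jl docs prescribe.

```julia
using DataFrames, Tables

df1 = DataFrame(:a => [1, 2, 3, 4, 5, 6, 7])
df2 = DataFrame(:a => [8, 9, 10])
t = [df1, df2]

# Without the wrapper, the generic fallback treats `t` as one single partition
# (assumption: this relies on the default `Tables.partitions(x) = (x,)`)
single = Tables.partitions(t)

# With the wrapper, each element of `t` is yielded as its own partition
parts = Tables.partitions(Tables.partitioner(t))

# Chain the row iterators of all partitions into one stream of rows
allrows = Iterators.flatten(Tables.rows(p) for p in parts)
vals = [row.a for row in allrows]
```

This gives seamless row iteration over df1 and then df2, but the flattened iterator is still not declared as a table, so Tables.istable stays false for it.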
[1] Home · Tables.jl
You might find this video from @quinnj useful:
I assume you want this:
julia> for row in Tables.rows(TableOperations.joinpartitions(Tables.partitioner([df1, df2])))
           @show row
       end
row = Tables.ColumnsRow{Tables.CopiedColumns{TableOperations.JoinedPartitions{Tables.Schema{(:a,), Tuple{Int64}}}}}:
:a 1
row = Tables.ColumnsRow{Tables.CopiedColumns{TableOperations.JoinedPartitions{Tables.Schema{(:a,), Tuple{Int64}}}}}:
:a 2
row = Tables.ColumnsRow{Tables.CopiedColumns{TableOperations.JoinedPartitions{Tables.Schema{(:a,), Tuple{Int64}}}}}:
:a 3
row = Tables.ColumnsRow{Tables.CopiedColumns{TableOperations.JoinedPartitions{Tables.Schema{(:a,), Tuple{Int64}}}}}:
:a 4
row = Tables.ColumnsRow{Tables.CopiedColumns{TableOperations.JoinedPartitions{Tables.Schema{(:a,), Tuple{Int64}}}}}:
:a 5
row = Tables.ColumnsRow{Tables.CopiedColumns{TableOperations.JoinedPartitions{Tables.Schema{(:a,), Tuple{Int64}}}}}:
:a 6
row = Tables.ColumnsRow{Tables.CopiedColumns{TableOperations.JoinedPartitions{Tables.Schema{(:a,), Tuple{Int64}}}}}:
:a 7
row = Tables.ColumnsRow{Tables.CopiedColumns{TableOperations.JoinedPartitions{Tables.Schema{(:a,), Tuple{Int64}}}}}:
:a 8
row = Tables.ColumnsRow{Tables.CopiedColumns{TableOperations.JoinedPartitions{Tables.Schema{(:a,), Tuple{Int64}}}}}:
:a 9
row = Tables.ColumnsRow{Tables.CopiedColumns{TableOperations.JoinedPartitions{Tables.Schema{(:a,), Tuple{Int64}}}}}:
:a 10
and then, of course,
julia> Tables.istable(TableOperations.joinpartitions(Tables.partitioner(t)))
true
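If you would rather not depend on TableOperations, a rough alternative is to write a small wrapper type and opt it into the interface yourself. The sketch below is only an illustration: ChainedTable is a hypothetical name I made up, it implements row access only, and it does not report a schema, so consumers that need one will have to infer it from the rows.

```julia
using DataFrames, Tables

# Hypothetical wrapper (not part of Tables.jl): holds an iterator of tables
struct ChainedTable{T}
    parts::T
end

# Opt in to the Tables.jl interface with row access only
Tables.istable(::Type{<:ChainedTable}) = true
Tables.rowaccess(::Type{<:ChainedTable}) = true

# Rows of the whole table are the rows of each partition, one after another
Tables.rows(ct::ChainedTable) =
    Iterators.flatten(Tables.rows(p) for p in ct.parts)

df1 = DataFrame(:a => [1, 2, 3, 4, 5, 6, 7])
df2 = DataFrame(:a => [8, 9, 10])
ct = ChainedTable([df1, df2])

vals = [row.a for row in Tables.rows(ct)]
```

With this, Tables.istable(ct) is true and Tables.rows(ct) walks df1 and then df2. It assumes all partitions share the same column names; nothing in the sketch checks that.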