Having trouble implementing a Tables.jl row-table; when using BadukGoWeiqiTools, DataFrame(tbl) no longer works!

I have checked through the other posts on implementing the Tables.jl row interface, and I think I’ve implemented everything the row-table interface requires.

Here’s an MWE:

```julia
using TableScraper, DataFrames
tbl = TableScraper.Table([["abc" for _ in 1:9] for j in 1:1], "names".*string.(1:9))
DataFrame(tbl) # this works!

using WeakRefStrings # the culprit
DataFrame(tbl) # the same code now fails
```

The second call gives the perplexing error shown further down.

Why does using WeakRefStrings change how DataFrame(tbl) works? This is really odd.

How do I go about debugging this? I tried tracing through the code, but the same code appears to be called in both cases. I'm a bit out of ideas at the moment.

```
ERROR: MethodError: promote_type(::Type{Union{}}, ::Type{String}) is ambiguous. Candidates:
  promote_type(::Type{Union{}}, ::Type{T}) where T in Base at promotion.jl:224
  promote_type(::Type{T}, ::Type{String}) where T<:WeakRefStrings.InlineString in WeakRefStrings at C:\Users\RTX2080\.julia\packages\WeakRefStrings\a3jYm\src\inlinestrings.jl:44
Possible fix, define
  promote_type(::Type{Union{}}, ::Type{String})
Stacktrace:
 [1] add_or_widen!(val::String, col::Int64, nm::Symbol, dest::Tables.EmptyVector, row::Int64, updated::Base.RefValue{Any}, L::Base.HasLength)
   @ Tables C:\Users\RTX2080\.julia\packages\Tables\gg6Id\src\fallbacks.jl:150
 [2] eachcolumns
   @ C:\Users\RTX2080\.julia\packages\Tables\gg6Id\src\utils.jl:127 [inlined]
 [3] _buildcolumns(rowitr::Tables.IteratorWrapper{TableScraper.Table}, row::Tables.IteratorRow{TableScraper.TableRow}, st::Int64, sch::Tables.Schema{(:names1, :names2, :names3, :names4, :names5, :names6, :names7, :names8, :names9), nothing}, columns::NTuple{9, Tables.EmptyVector}, updated::Base.RefValue{Any})
   @ Tables C:\Users\RTX2080\.julia\packages\Tables\gg6Id\src\fallbacks.jl:187
 [4] buildcolumns
   @ C:\Users\RTX2080\.julia\packages\Tables\gg6Id\src\fallbacks.jl:217 [inlined]
 [5] columns
   @ C:\Users\RTX2080\.julia\packages\Tables\gg6Id\src\fallbacks.jl:262 [inlined]
 [6] DataFrame(x::TableScraper.Table; copycols::Nothing)
   @ DataFrames C:\Users\RTX2080\.julia\packages\DataFrames\nxjiD\src\other\tables.jl:58
 [7] DataFrame(x::TableScraper.Table)
   @ DataFrames C:\Users\RTX2080\.julia\packages\DataFrames\nxjiD\src\other\tables.jl:49
 [8] top-level scope
   @ REPL[144]:1
```

versioninfo()

which is

```
Julia Version 1.6.1
Commit 6aaedecc44 (2021-04-23 05:59 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 6
  JULIA_PKG_DEVDIR = c:/git/
```

The `TableScraper.Table` implementation
```julia
import Tables
# import Tables: istable, rowaccess, columnaccess, rows, columnnames, getcolumn, AbstractRow
import Base: eltype, length, iterate

# a table structure local to TableScraper
struct Table
    rows
    columnnames
end

struct TableRow <: Tables.AbstractRow
    row::Int
    source::Table
end

Tables.istable(::Table) = true
Tables.rowaccess(::Table) = true
Tables.columnaccess(::Table) = true
Tables.columnnames(t::Table) = t.columnnames
Tables.rows(t::Table) = t

Base.eltype(::Table) = TableRow
Base.length(t::Table) = length(t.rows)
Base.iterate(t::Table, st = 1) = st > length(t) ? nothing : (TableRow(st, t), st+1)

function Tables.getcolumn(t::TableRow, ::Type, col::Int, nm::Symbol)
    # Use getfield throughout: property access on an AbstractRow
    # is forwarded to getcolumn, so `t.row` would look up a column named "row".
    tbl = getfield(t, :source)
    row = tbl.rows[getfield(t, :row)]
    row[col]
end

function Tables.getcolumn(t::TableRow, i::Int)
    tbl = getfield(t, :source)
    row_num = getfield(t, :row)
    row = tbl.rows[row_num]
    row[i]
end

function Tables.getcolumn(t::TableRow, nm::Symbol)
    tbl = getfield(t, :source)
    row_num = getfield(t, :row)
    row = tbl.rows[row_num]
    # column names are stored as strings, so compare against string(nm)
    col = indexin([string(nm)], tbl.columnnames)[1]
    row[col]
end

function Tables.getcolumn(t::TableRow, nm::String)
    Tables.getcolumn(t, Symbol(nm))
end

Tables.columnnames(t::TableRow) = getfield(t, :source).columnnames
```

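Somehow DataFrame indirectly calls promote_type(::Type{Union{}}, ::Type{T}) for T = String (the stack trace shows the call coming from Tables' add_or_widen!), which should hit this Base method: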

promote_type(::Type{Union{}}, ::Type{T}) where T in Base at promotion.jl:224

But WeakRefStrings.jl defines the method below, which causes an ambiguity because Union{} <: InlineString is true.

  promote_type(::Type{T}, ::Type{String}) where T<:InlineString in WeakRefStrings at c:\git\WeakRefStrings\src\inlinestrings.jl:44
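
One way to see which methods compete for a given call is to ask Julia for every method that matches the argument types. This is only a sketch using `methods` from Base; in a session without WeakRefStrings loaded it will list only Base's definitions, and the names `sig` and `candidates` are mine, not from the post.

```julia
# List every promote_type method that could apply to the failing call signature.
sig = Tuple{Type{Union{}}, Type{String}}
candidates = methods(promote_type, sig)

for m in candidates
    # Each Method records the module that defined it and its full signature.
    println(m.module, ": ", m.sig)
end
```

Running the same query before and after `using WeakRefStrings` should show whether loading the package added a matching method.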

The ambiguity is caused by promote_type(::Type{T}, ::Type{String}). Since every type is a supertype of Union{} (it’s the bottom type in the type lattice), Union{} also matches T<:InlineString, and neither method is more specific than the other. I’d investigate or open an issue about why DataFrames calls that promotion in the first place; it shouldn’t do that. WeakRefStrings seems to be behaving correctly here, since their T is restricted to subtypes of their own type.
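
The same situation can be sketched without any packages. Everything named here is a hypothetical stand-in: `MyInline` plays the role of `InlineString` and `mypromote` plays the role of `promote_type`. On Julia 1.6, the version in this thread, I would expect the call to raise an ambiguity `MethodError`; later Julia releases may resolve dispatch on `Type{Union{}}` differently, so the sketch only records whichever outcome occurs.

```julia
# Hypothetical stand-in for WeakRefStrings.InlineString.
abstract type MyInline end

# Mirrors Base's method for the bottom type.
mypromote(::Type{Union{}}, ::Type{T}) where {T} = T
# Mirrors the method WeakRefStrings defined directly on promote_type.
mypromote(::Type{T}, ::Type{String}) where {T<:MyInline} = String

# Union{} is a subtype of every type, so it also satisfies T<:MyInline.
@show Union{} <: MyInline

# Both methods match (Union{}, String); capture the result or the error.
outcome = try
    mypromote(Union{}, String)
catch err
    err
end
@show typeof(outcome)
```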

I think it is because WeakRefStrings overloaded promote_type directly instead of promote_rule.
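
As a minimal sketch of the difference, assuming a hypothetical type `ShortStr` in place of an InlineString: a package extends `promote_rule` for its own types, and Base's `promote_type` then consults that rule in both argument orders while keeping its own handling of `Union{}`.

```julia
# Hypothetical string-like type, standing in for an InlineString.
struct ShortStr
    data::String
end

# Extend promote_rule, not promote_type.
Base.promote_rule(::Type{ShortStr}, ::Type{String}) = String

@show promote_type(ShortStr, String)
@show promote_type(String, ShortStr)
# The bottom-type case is still handled by Base's own promote_type method.
@show promote_type(Union{}, String)
```

Because no new `promote_type` method is added, there is nothing for Base's `Type{Union{}}` method to be ambiguous with.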

PR to fix: "use promote_rule not promote_type" by oxinabox, Pull Request #73, JuliaData/WeakRefStrings.jl on GitHub
PR to document this: "Document that you should not overload promote_type directly" by oxinabox, Pull Request #41386, JuliaLang/julia on GitHub
(Someone will come and tell me there if I am wrong.)
