JuliaDB - left join with string error

Hi all, I ran into an issue overnight while loading some data via JuliaDB: a left join on the shared key columns of two tables can throw an "access to undefined reference" exception (UndefRefError) when the tables also carry string columns. Someone filed a ticket about this roughly six months ago (https://github.com/JuliaComputing/JuliaDB.jl/issues/216), but I'm not sure whether anyone has addressed it. The bug is easily reproducible:

using JuliaDB

l = table([1,1,2,2], [1,2,1,2], ["1","2","3","4"], names=[:a,:b,:c], pkey=(:a, :b))

r = table([0,1,1,3], [1,1,2,2], ["1","2","3","4"], names=[:a,:b,:d], pkey=(:a, :b))

join(l, r, how=:left)

which yields…

Stacktrace:
 [1] getindex at /home/ubuntu/.julia/packages/WeakRefStrings/RmyGQ/src/WeakRefStrings.jl:296 [inlined]
 [2] copyto!(::IndexLinear, ::Array{Union{Missing, String},1}, ::IndexLinear, ::WeakRefStrings.StringArray{String,1}) at ./abstractarray.jl:753
 [3] Type at ./abstractarray.jl:745 [inlined]
 [4] convert at ./array.jl:474 [inlined]
 [5] outvec(::WeakRefStrings.StringArray{String,1}, ::Array{Int64,1}, ::Type{Missing}) at /home/ubuntu/.julia/packages/IndexedTables/iX34D/src/join.jl:177
 [6] (::getfield(IndexedTables, Symbol("##259#261")){DataType})(::WeakRefStrings.StringArray{String,1}) at /home/ubuntu/.julia/packages/IndexedTables/iX34D/src/join.jl:389
 [7] macro expansion at ./namedtuple.jl:182 [inlined]
 [8] map(::getfield(IndexedTables, Symbol("##259#261")){DataType}, ::NamedTuple{(:d,),Tuple{WeakRefStrings.StringArray{String,1}}}) at ./namedtuple.jl:177
 [9] #join#257(::Symbol, ::Bool, ::Tuple{Symbol,Symbol}, ::Tuple{Symbol,Symbol}, ::Tuple{Int64}, ::Tuple{Int64}, ::Nothing, ::Bool, ::Nothing, ::Nothing, ::Bool, ::Type, ::typeof(join), ::typeof(IndexedTables.concat_tup), ::IndexedTable{StructArrays.StructArray{NamedTuple{(:a, :b, :c),Tuple{Int64,Int64,String}},1,NamedTuple{(:a, :b, :c),Tuple{Array{Int64,1},Array{Int64,1},Array{String,1}}}}}, ::IndexedTable{StructArrays.StructArray{NamedTuple{(:a, :b, :d),Tuple{Int64,Int64,String}},1,NamedTuple{(:a, :b, :d),Tuple{Array{Int64,1},Array{Int64,1},Array{String,1}}}}}) at /home/ubuntu/.julia/packages/IndexedTables/iX34D/src/join.jl:389
 [10] (::getfield(Base, Symbol("#kw##join")))(::NamedTuple{(:how,),Tuple{Symbol}}, ::typeof(join), ::Function, ::IndexedTable{StructArrays.StructArray{NamedTuple{(:a, :b, :c),Tuple{Int64,Int64,String}},1,NamedTuple{(:a, :b, :c),Tuple{Array{Int64,1},Array{Int64,1},Array{String,1}}}}}, ::IndexedTable{StructArrays.StructArray{NamedTuple{(:a, :b, :d),Tuple{Int64,Int64,String}},1,NamedTuple{(:a, :b, :d),Tuple{Array{Int64,1},Array{Int64,1},Array{String,1}}}}}) at ./none:0
 [11] (::getfield(Base, Symbol("#kw##join")))(::NamedTuple{(:how,),Tuple{Symbol}}, ::typeof(join), ::IndexedTable{StructArrays.StructArray{NamedTuple{(:a, :b, :c),Tuple{Int64,Int64,String}},1,NamedTuple{(:a, :b, :c),Tuple{Array{Int64,1},Array{Int64,1},Array{String,1}}}}}, ::IndexedTable{StructArrays.StructArray{NamedTuple{(:a, :b, :d),Tuple{Int64,Int64,String}},1,NamedTuple{(:a, :b, :d),Tuple{Array{Int64,1},Array{Int64,1},Array{String,1}}}}}) at /home/ubuntu/.julia/packages/IndexedTables/iX34D/src/join.jl:404
 [12] top-level scope at none:0

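I'm not sure whether this bug actually originates in JuliaDB (the trace runs through IndexedTables' join.jl) or in WeakRefStrings, but it seems like a pretty significant issue, since it breaks a basic left join whenever a string column is involved.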
