Hi all, I’m interested to know if it would be possible to adjust the Appender API in DuckDB to support JSON. Let me explain.
As noted here JSON Overview – DuckDB we can create columns with the JSON
data type. However, if we say create a dataframe with a column of formatted JSON strings and naively use the appender, it won’t work. I guess that when the appender gets a String
type from Julia, it will call down to the duckdb_append_varchar
method (see duckdb/tools/juliapkg/src/appender.jl at main · duckdb/duckdb · GitHub) but there is no automatic coversion of the VARCHAR
(corresponding DuckDB type) to JSON
. Although the bottom of this page indicates that it is possible, at least in some cases JSON Type – DuckDB.
So, to ask the DuckDB experts, is this possible? Is it possible from the Julia level to some how detect a column has JSON
type and allow VARCHAR
conversion? I also looked at the C API and am not even sure how (if) this would work at that level. Appender – DuckDB
For interest, here is a little code snippet that illustrates what I mean.
using DuckDB, DataFrames, JSON3
if isfile("./test.db")
rm("./test.db")
end
con = DBInterface.connect(DuckDB.DB, "./test.db")
DBInterface.execute(con, """
CREATE TABLE tab (
name TEXT PRIMARY KEY,
data JSON
);
""")
# works
# DBInterface.execute(con, """
# INSERT INTO tab VALUES
# ('a', '{"name":"Beavis", "age":30, "car":null}'),
# ('b', '{"name":"Butthead", "age":30, "car":"bad"}');
# """)
# Appender won't work
df = DataFrame(
name = ["a", "b"],
data = ["""{"name":"Beavis", "age":30, "car":null}""", """{"name":"Butthead", "age":30, "car":"bad"}"""]
)
appender = DuckDB.Appender(con, "tab")
for i in eachrow(df)
for j in i
DuckDB.append(appender, j)
end
DuckDB.end_row(appender)
end
# get the data
vals = DBInterface.execute(con, "SELECT * FROM tab;") |> DataFrame
JSON3.read(vals[1, :data])
DBInterface.close!(con)