DuckDB Array Not Supported?

I was able to use the DuckDB Appender API to insert data into a table in the form of:

CREATE OR REPLACE TABLE embeds (id INTEGER PRIMARY KEY, acc STRING, esm_embed NUMERIC(9,8)[1280])

where the esm_embed column is a vector of floats. However, when I query the table via

DBInterface.execute(dconn,"select * from embeds")

I get an error:

Unsupported type for duckdb_type_to_julia_type: DUCKDB_TYPE_ARRAY

It seems strange that the array type writes, but can’t be read back.

First question: Am I doing something wrong and this can work with correction?

Second question: If it truly isn’t supported, anybody have an idea of how hard it would be to implement? Seems like it should just be an exercise in mapping one type to another.

Is your DuckDB Julia package updated to the latest version? This was a recent addition; I posted about it recently in "Supporting appending `Vector` in DuckDB" (#3 by slwu89).

I’m looking specifically at the DuckDB ARRAY type from DuckDB Types.

This is the type used by the VSS extension for vector similarity search, so I think there would be a benefit to having it integrate well with Vector{AbstractFloat} types in Julia.


You can write a Vector{}, which DuckDB will interpret as a LIST type and then cast to an ARRAY if that is how the table's column type is defined. However, a queried ARRAY type has a fixed length, and that does not map to a Vector{} in Julia. Maybe it could map to an NTuple, but that was not implemented in that update.

If you want a Vector{} in Julia, you could try to explicitly cast the ARRAY column to a LIST in the select clause using select esm_embed::LIST, ...
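For reference, here is a minimal sketch of that cast workaround. It assumes a DuckDB.jl version recent enough to read LIST columns back into Julia, and it makes a few choices that are not from the thread: an in-memory database, a 3-element array instead of 1280, a made-up acc value, and the list type spelled with an explicit element type (DOUBLE[]), which is how DuckDB normally writes list types. I have not checked whether a bare LIST is accepted as a cast target. Row iteration over the query result is also assumed to follow the usual Tables.jl row interface.

```julia
using DuckDB  # the DuckDB.jl docs use DBInterface with only this import

# In-memory database, for illustration only
con = DBInterface.connect(DuckDB.DB, ":memory:")

# Same schema as in the question, with the array shortened to 3 elements
DBInterface.execute(con, """
    CREATE OR REPLACE TABLE embeds (
        id INTEGER PRIMARY KEY,
        acc STRING,
        esm_embed NUMERIC(9,8)[3]
    )
""")

# A list literal is cast to the fixed-size ARRAY column on insert
# ('P00000' is a placeholder accession)
DBInterface.execute(con, "INSERT INTO embeds VALUES (1, 'P00000', [0.1, 0.2, 0.3])")

# Cast the ARRAY column to a variable-length LIST of doubles in the SELECT,
# so the result contains a LIST column rather than DUCKDB_TYPE_ARRAY
result = DBInterface.execute(con, """
    SELECT id, acc, esm_embed::DOUBLE[] AS esm_embed
    FROM embeds
""")

for row in result
    # row.esm_embed should come back as a Julia vector
    println(row.id, " ", row.acc, " ", row.esm_embed)
end
```

Casting to DOUBLE[] also sidesteps reading DECIMAL elements on the Julia side; if you need the exact decimal values, NUMERIC(9,8)[] would be the equivalent list type to try.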
