How to quickly bulk insert into postgres

Here’s a minimal working example that uses COPY instead of INSERT INTO to append a DataFrame to an existing PostgreSQL table, assuming the columns and their types already line up:

```julia
using LibPQ, CSV, DataFrames

function append(conn::LibPQ.Connection, table::String, df::DataFrame)
    # Serialize the DataFrame into a CSV-formatted String
    buf = IOBuffer()
    CSV.write(buf, df, header=false)
    data = String(take!(buf))

    # Put the connection into COPY mode
    result = execute(conn, "COPY $table FROM STDIN (FORMAT csv, HEADER FALSE)")
    status(result) == LibPQ.libpq_c.PGRES_COPY_IN ||
        error("COPY did not enter COPY_IN state")

    # Send the data
    LibPQ.libpq_c.PQputCopyData(conn.conn, data, Int32(sizeof(data))) == 1 ||
        error("PQputCopyData failed")

    # Signal end-of-data; a null error message marks the copy as successful
    LibPQ.libpq_c.PQputCopyEnd(conn.conn, C_NULL) == 1 ||
        error("PQputCopyEnd failed")
end
```
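For context, a call might look like this (the connection string, table name, and columns below are placeholders for your own setup, and the table must already exist with matching columns):

```julia
using LibPQ, DataFrames

# Placeholder connection parameters -- adjust to your environment
conn = LibPQ.Connection("dbname=mydb user=me")

df = DataFrame(id = [1, 2, 3], name = ["a", "b", "c"])
append(conn, "my_table", df)

close(conn)
```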

In this example the whole DataFrame is serialized into one big String before being sent to the database, but the same thing can easily be done incrementally: call PQputCopyData() once per chunk of data, then finish with a single PQputCopyEnd().
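A chunked variant might look like the sketch below (the function name and `chunksize` keyword are my own; like the original, it assumes a live connection and a table whose columns already match the DataFrame):

```julia
using LibPQ, CSV, DataFrames

# Sketch: send the DataFrame in chunks of `chunksize` rows instead of
# serializing everything into one big String first.
function append_chunked(conn::LibPQ.Connection, table::String, df::DataFrame;
                        chunksize::Int = 10_000)
    # Put the connection into COPY mode
    result = execute(conn, "COPY $table FROM STDIN (FORMAT csv, HEADER FALSE)")
    status(result) == LibPQ.libpq_c.PGRES_COPY_IN ||
        error("COPY did not enter COPY_IN state")

    buf = IOBuffer()
    for first in 1:chunksize:nrow(df)
        last = min(first + chunksize - 1, nrow(df))
        # Serialize only this slice of rows and send it
        CSV.write(buf, df[first:last, :], header=false)
        chunk = String(take!(buf))
        LibPQ.libpq_c.PQputCopyData(conn.conn, chunk, Int32(sizeof(chunk))) == 1 ||
            error("PQputCopyData failed")
    end

    # One PQputCopyEnd() after all chunks have been sent
    LibPQ.libpq_c.PQputCopyEnd(conn.conn, C_NULL) == 1 ||
        error("PQputCopyEnd failed")
end
```

This keeps peak memory bounded by one chunk's worth of CSV text rather than the whole table.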
