Connect to Redshift with SearchLightPostgreSQL

Following the tutorial Working with Genie Apps - Accessing databases with SearchLight models

I’m trying to use SearchLightPostgreSQL to connect to a Redshift database. I noticed it lists LibPQ as one of the dependencies which is what I normally use to establish connections to Redshift. However, as mentioned in LibPQ s FAQ, this requires a disabling unsupported options with the following:

conn = LibPQ.Connection("dbname=myredshift"; options=Dict{String, String}())

Is there any way I can pass configure the LibPQ connection in this way? I can’t figure out how to do it. I’m currently getting the same error I would if I used LibPQ without disabling unsupported options:

[error | LibPQ]: could not translate host name "ENV["HC_DWH_HOST"]" to address: nodename nor servname provided, or not known

I feel like I’m not getting all the information from your post…what “unsupported” options? That error looks like you passed “ENV[\“HC_DWH_HOST\”]” as the host name rather than “$(ENV[“HC_DWH_HOST”])”.

In my app I’m using LibPQ to connect to the my database and I do:

LibPQ.Connection("dbname=$(ENV["DB_NAME"]) user=$(ENV["DB_USER"]) password=$(ENV["DB_PASS"]) host=$(ENV["DB_HOST"]) port=$(ENV["DB_PORT"])")

Which works fine for passing in the values from the environment to the connect string.

I linked to the LibPQs FAQ where they explain how to connect to Redshift by disabling “unsupported options”. From the link:

Can I use LibPQ.jl with Amazon Redshift?

Yes. However, LibPQ.jl by default sets some client options to make interactions more reliable. Unsupported options must be disabled for Redshift to allow connections. To override all options, pass an empty Dict{String, String} :

conn = LibPQ.Connection("dbname=myredshift"; options=Dict{String, String}())

I don’t think the problem is what you say because this works on a clean REPL using only LibPQ:

using LibPQ
PORT = "5439";
DB = "abc";
CONN_URL = "postgresql://"*USER*":"*PASS*"@"*HOST*":5439"*"/"*DB;
conn = LibPQ.Connection(CONN_URL; options=Dict{String, String}())

I think the problem is that I don’t see a way to disable the unsupported options in LibPQ through SearchLightPostgreSQL.

The other thing that comes to mind would be the db/connection.yaml file where I have those ENV variables setup:

  adapter: postgreSQL
  database: abc
  host: ENV["HC_DWH_HOST"]
  username: ENV["HC_DWH_USER"]
  password: ENV["HC_DWH_PASS"]
  port: 5439

Hi @franco - if I understand correctly you need to pass an empty Dict as options. Let me check the source code, SearchLight supports the options attribute but I’m not sure if it passes it when empty.


Edit 1:
I see that the Postgres adapter uses the dns string connection SearchLightPostgreSQL.jl/SearchLightPostgreSQL.jl at f2d7dd248f053109f4c20320eee8bc22fb802dd2 · GenieFramework/SearchLightPostgreSQL.jl · GitHub

if you can provide an example of how to pass the options into the dns string into libpq, I’ll add support for it, should be very straightforward (or if you want to add it yourself, I’ll be happy to review and merge).

This is a draft of the solution I have come up with:

using LibPQ
using DataFrames

const DatabaseHandle = LibPQ.Connection
const CONNECTIONS = DatabaseHandle[]

PORT = "5439"
DB = "hc"

config = Dict(
    "host" => HOST, 
    "username" => USER, 
    "password" => PASS, 
    "port" => PORT, 
    "database" => DB

Add options parameter to my_connect (SearchLight.connect in reality)
This is passed to LibPQ.Connection() and allows connecting to RedShift
Refer to the source code:
And the corresponding option in LibPQ:
function my_connect(conn_data::Dict, options::Dict{String, String}=()) :: DatabaseHandle
    dns = String[]

    haskey(conn_data, "host")     && push!(dns, string("host=", conn_data["host"]))
    haskey(conn_data, "hostaddr") && push!(dns, string("hostaddr=", conn_data["hostaddr"]))
    haskey(conn_data, "port")     && push!(dns, string("port=", conn_data["port"]))
    haskey(conn_data, "database") && push!(dns, string("dbname=", conn_data["database"]))
    haskey(conn_data, "username") && push!(dns, string("user=", conn_data["username"]))
    haskey(conn_data, "password") && push!(dns, string("password=", conn_data["password"]))
    haskey(conn_data, "passfile") && push!(dns, string("passfile=", conn_data["passfile"]))
    haskey(conn_data, "connect_timeout") && push!(dns, string("connect_timeout=", conn_data["connect_timeout"]))
    haskey(conn_data, "client_encoding") && push!(dns, string("client_encoding=", conn_data["client_encoding"]))

    push!(CONNECTIONS, LibPQ.Connection(join(dns, " "); options))[end]

conn = my_connect(config, Dict{String, String}())

sql = """
    FROM page_ranking.position_metrics
    WHERE unique_imp = \$1

result = execute(conn, sql, ["1526313609394_20180514155553.a57c7cbf4348"])
df = DataFrame(result)

Last part is a test I did to a Redshift cluster I work with. Let me know what you think :slight_smile:

Thanks @franco looks cool!

For the final version we need to extend SearchLight.connect similar to how the MySQL adapter does. The options will be sent as conn_data["options"] similar to how it’s done here: SearchLightMySQL.jl/SearchLightMySQL.jl at 4bac3cfd5bac1e61c236c11ff52011ee6570534a · GenieFramework/SearchLightMySQL.jl · GitHub

If you make a PR, I’ll be happy to review and merge.

1 Like