Oracle DBI driver

tk3369 · January 6, 2018, 7:01pm

Does anyone know if there’s any Oracle DBI package around? At my work place, Oracle is still a dominant data source and it would be nice to have an easy interface from Julia.

Oracle has a well-documented C API

Given Julia’s easy interface to C, this may not be too hard? Would Clang be able to do this quick?

Liso · January 7, 2018, 11:51am

I don’t think so. Look at JuliaDatabases · GitHub.

At my too.

I am a little confused if new API for Oracle has to be DBI or DBAPI driver.

DBAPI.jl is new(er) database interface proposal (See: Eric Davies: Towards A Consistent Database Interface on youtube).

But it seems it may need some period to stabilize (or revive) after 1.0 (See:
Small question about Nullable, Missing and DBAPI.jl)

Another suspicios fact is that DBAPI branch in postgresql is unfinished (last commit 2 years ago).

avik · January 7, 2018, 11:55am

I would suggest (but I am biased) using JDBC.jl to get data out of an Oracle database. You will find that to be the most reliable method IMO (provided you have java installed on the machine you are querying from). Another option is to use ODBC.jl.

tk3369 · March 8, 2018, 6:39am

I tried a different method: PyCall & cx_Oracle.

It mostly works for the trivial examples that I’ve tried so far. This code, however, does not work:

cur[:setinputsizes](Int, 20)

gives

PyError (ccall(@pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, arg, C_NULL)) <class 'cx_Oracle.NotSupportedError'>
NotSupportedError('Variable_TypeByPythonType(): unhandled data type',)

Am I bumping into a PyCall limitation?

Reference (executemany): http://www.oracle.com/technetwork/articles/dsl/python-091105.html

Liso · March 8, 2018, 7:03am

I was just thinking about this this hour!

Do you have some example/testing code to share?

tk3369 · March 8, 2018, 7:12am

Sure. It’s fairly straightforward:

using PyCall
@pyimport cx_Oracle

# convenient function to execute sql queries
function run(cur, sql)
    rs = cur[:execute](sql)
    if rs != nothing
        i = 0
        for r in rs
            i += 1
            println(i, ": ", r)
        end
        i < 1 && println("No result")
    else
        println("Done")
    end
end

# let's start
conn = cx_Oracle.connect("user/pass@server")
cur = conn[:cursor]()

# Some CRUD examples
run(cur, """
create table testme
(
    empid int primary key,
    name varchar2(100) 
)
""")

run(cur, "insert into testme values (1, 'Tom')")
run(cur, "update testme set name = 'John' where empid = 1")
run(cur, "delete from testme where empid = 1")
run(cur, "drop table testme")

# batch insert
rows = [ (1, "First" ),
         (2, "Second" ),
         (3, "Third" ),
         (4, "Fourth" ),
         (5, "Fifth" ),
         (6, "Sixth" ),
         (7, "Seventh" ) ]

cur[:bindarraysize] = 7
#cur[:setinputsizes](Int, 20)   # this line does not work
cur[:executemany]("insert into testme(empid, name) values (:1, :2)", rows)

# close cursor and connection
cur[:close]()
conn[:close]()

Liso · March 8, 2018, 7:21am

Cool, thanks!

Something like this:

cur[:setinputsizes](PyCall.PyInt_Type, 20)

could not work? Sorry I have not usable Oracle in this moment…

tk3369 · March 8, 2018, 7:24am

same error…

Liso · March 8, 2018, 9:43am

I installed oracle-xe docker but have problem to connect using PyCall…

Do you use some workaround?

julia> ENV["TNS_ADMIN"] = "path_to_tnsnames.ora";
julia> ENV["LD_LIBRARY_PATH"] = "path_to_libclntsh.so:..."
julia> using PyCall

julia> @pyimport cx_Oracle

julia> conn = cx_Oracle.connect("test/test@XE")
ERROR: PyError (ccall(@pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, arg, C_NULL)) <class 'cx_Oracle.DatabaseError'>
DatabaseError(<cx_Oracle._Error object at 0x7fd92d395a08>,)                                                                                                                 

Stacktrace:
 [...]
# although Julia's python directly work: 
shell> python
Python 2.7.12 (default, Dec  4 2017, 14:50:18) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import cx_Oracle
>>> conn = cx_Oracle.connect("test/test@XE")
>>> conn
<cx_Oracle.Connection to test@XE>

EDIT:
Seems like PyCall don’t inherit changes added by ENV[...] = ... !

exporting env variables before starting Julia helps.

Liso · March 8, 2018, 10:09am

I think this works:

cur[:setinputsizes](PyCall.pybuiltin("int"), 20)

tk3369 · March 8, 2018, 3:05pm

Nice, thanks! I’ll try that when I get back to work.

tk3369 · March 8, 2018, 3:07pm

BTW, I’ve also started wrapping the odpi-c library. It’s in a very early WIP stage at the moment and if I ever gain more traction I can put it up and share with the community.

ExpandingMan · March 8, 2018, 4:18pm

I second that JDBC.jl seems to be the most reliable and widely applicable database querying package available in Julia right now. In fact I’m going to try to do an 0.7 update PR to that in the not-so-distant-future, hopefully.

tk3369 · March 9, 2018, 2:45am

Thanks for the reminder. I will try that, too.

This JavaCall.jl disclaimer seems to be a bit of a turn-off, however.

While basic memory management has been implemented, there is the possibility of some remaining memory leaks in this system. While this is stable enough for scripting style tasks, please test yourself before deplying this to long running tasks

ExpandingMan · March 9, 2018, 2:57am

Ugh, that is indeed discouraging. For what it’s worth, I found the memory characteristics of some of the Python query frameworks to be rather bad (independently from Julia, but the situation is exacerbated when one has to port the data), which was one of the reasons I switched to JDBC. I assume that the memory situation is somewhat better with ODBC, but I find ODBC to just be a huge pain-in-the-ass (I’m sorry I don’t have a more concrete justification for my complaint, I just always feel like I have to get through so many hoops to get something working). From the bit of time I spent digging around in the internals of these things I came to the conviction that SQL is awful and you shouldn’t be doing it. (Sigh. If only such a thing were possible .)

Liso · March 9, 2018, 7:27am

It is your choice but I would publish probably also very early stage.

Discussion about API and first proof of concept tests could be useful to share (to you too).

tk3369 · March 9, 2018, 7:06pm

Any idea why JDBC.jl gives these warnings?

julia> using JDBC
Loaded /usr/jdk/jdk1.8.0_102/jre/lib/amd64/server/libjvm.so
WARNING: import of DBAPI.commit into JDBC conflicts with an existing identifier; ignored.
WARNING: import of DBAPI.rollback into JDBC conflicts with an existing identifier; ignored.

ExpandingMan · March 9, 2018, 7:20pm

As I recall there was an early attempt to standardize database interfaces called DBAPI which was abandoned quite a long time ago. There are still some remnants of it in JDBC.jl. They need to be removed. I’m pretty sure this doesn’t affect anything.

Clearly all of the Julia database stuff is in need of maintenance.

tk3369 · March 9, 2018, 7:28pm

I see. It also looks like that executeBatch isn’t supported yet. Maybe it’s not too difficult to wrap…

ExpandingMan · March 9, 2018, 7:42pm

As far as I know the wrappers are all trivial. I think the only “Julia” in JDBC right now is the DBAPI interface and the DataStreams interface that I added some time ago. I will probably have to go revisit this soon, as currently JDBC seems to be the best solution for handling a wide variety of different data sources.

It might be worthwhile to modernize DBAPI as well.