Why is there a `sleep` in ODBC.jl

Hi all :slight_smile:

Just reposting this here
# By removing the sleep - query executes much faster #360

Does anyone know why there is an sleep?
Does something horrible happen if I remove it?

Cheers

The sleep is there because the ccall right before that may return, even though the query is still executing. Some very fast queries may be bottlenecked by that sleep, while for other longer running queries that sleep is essential for correctness. Note that this is happening in a loop, which is only broken out of once the query truly is already done.

For some reason I’m quite bottlenecked by this sleep.

  • Why can the ccall return, even though the query is still executing
  • Does something horrible happen if I remove it though?
1 Like

I do not know - it’s just the design of the C-library being called, presumably in an attempt to allow the calling code to do some other useful work in a multitasking environment, while the database crunches on the query.

Well, your program would no longer be correct, as the wrapper is built with the assumption that the function with the ccall inside only returns once the query is actually completed.

But this is not guaranteed - ccall could take longer than 0.001 seconds. Should there be an way to wait for the ccall to finish?

I don’t know the ODBC libs, but I don’t think it would affect correctness. It’ll just burn lots of CPU time repeatedly asking “you done yet?” without actually doing any meaningful work, blocking other threads/tasks/programs on your same machine from doing other things and potentially making the SQL query itself slower.

2 Likes

In that case souldn’t there be a way to disable the sleep?
(By deving and disabling the sleep I go from 10min to 3sec)

A good method would be to start small and exponentially increase delay, until some limit or even perhaps fail then. Currently this loop might not exit.

Definitely recommends a code change.

            while true
                ret = ccall( ($func, "odbc32"), stdcall, SQLRETURN, $args, $(vals...))
                ret == SQL_STILL_EXECUTING || break
                sleep(0.001)
            end

https://github.com/JuliaDatabases/ODBC.jl/blob/4e0545de10b9d6f43a5955c873180870b3551836/src/API.jl#L66-L70

3 Likes

This could be a fix. What do you think?

Hi, sorry for the slow responses here; I’m the primary author of ODBC.jl.

(By deving and disabling the sleep I go from 10min to 3sec)

This doesn’t make sense to me. The code in question is:

while true
    ret = ccall( ($func, "odbc32"), stdcall, SQLRETURN, $args, $(vals...))
    ret == SQL_STILL_EXECUTING || break
    sleep(0.001)
end

i.e. we perform the ODBC ccall, then only call sleep if the return code from the ccall is SQL_STILL_EXECUTING, which from the spec, has this to say:

When the application executes a function with a statement or connection that is enabled for asynchronous processing, the driver performs a minimal amount of processing (such as checking arguments for errors), hands processing to the data source, and returns control to the application with the SQL_STILL_EXECUTING return code. The application then performs other tasks. To determine when the asynchronous function has finished, the application polls the driver at regular intervals by calling the function with the same arguments as it originally used. If the function is still executing, it returns SQL_STILL_EXECUTING; if it has finished executing, it returns the code it would have returned had it executed synchronously, such as SQL_SUCCESS, SQL_ERROR, or SQL_NEED_DATA.

Whether a function executes synchronously or asynchronously is driver specific. For example, suppose the result set metadata is cached in the driver. In this case, it takes very little time to execute SQLDescribeCol and the driver should simply execute the function rather than artificially delay execution. On the other hand, if the driver needs to retrieve the metadata from the data source, it should return control to the application while it is doing this. Therefore, the application must be able to handle a return code other than SQL_STILL_EXECUTING when it first executes a function asynchronously.

There’s perhaps something fishy going on w/ the specific driver you’re using and it not doing something correctly perhaps wrt asynchronous execution? But if you’re disabling the sleep, it doesn’t seem possible that you can guarantee things are valid, according to spec. It also seems suspicious that the timing is going from 10min to 3sec by disabling a 1millisecond sleep? Something doesn’t seem to add up in all this.

8 Likes

With the sleep

Without

I would guess this is a corner-case where the SQL execution takes on the order of 10”s to complete (but is asynchronous) and is called 100,000 times?

2 Likes

I tested it for a couple of queries and all of them are slower and sleep. I used the same snowflake odbc driver in python with the pyodbc PKG and it didn’t had these problems

Reminds me of ramp up the sleep time when connecting to the socket by KristofferC · Pull Request #169 · sciapp/gr · GitHub. Doing some ramp up of that sleep time might work well.

2 Likes

The logic in

is pretty much what is needed. Note that the logic there supports failing after a maximum number of retries, which should apply here too. Exponential growth with a 2x factor (or some hard-coded const factor) sounds reasonable.

Final note, there is a bunch of repeated code as the PR from @Impressium shows. Perhaps it is an opportunity to refactor the retry loop and avoid the almost identical 3 times repeated code.