Hi all
Just reposting this here
# By removing the sleep - query executes much faster #360
Does anyone know why there is an sleep?
Does something horrible happen if I remove it?
Cheers
Hi all
Just reposting this here
# By removing the sleep - query executes much faster #360
Does anyone know why there is an sleep?
Does something horrible happen if I remove it?
Cheers
The sleep
is there because the ccall
right before that may return, even though the query is still executing. Some very fast queries may be bottlenecked by that sleep
, while for other longer running queries that sleep
is essential for correctness. Note that this is happening in a loop, which is only broken out of once the query truly is already done.
For some reason Iâm quite bottlenecked by this sleep.
ccall
return, even though the query is still executingI do not know - itâs just the design of the C-library being called, presumably in an attempt to allow the calling code to do some other useful work in a multitasking environment, while the database crunches on the query.
Well, your program would no longer be correct, as the wrapper is built with the assumption that the function with the ccall
inside only returns once the query is actually completed.
But this is not guaranteed - ccall
could take longer than 0.001
seconds. Should there be an way to wait for the ccall
to finish
?
I donât know the ODBC libs, but I donât think it would affect correctness. Itâll just burn lots of CPU time repeatedly asking âyou done yet?â without actually doing any meaningful work, blocking other threads/tasks/programs on your same machine from doing other things and potentially making the SQL query itself slower.
In that case souldnât there be a way to disable the sleep
?
(By deving and disabling the sleep I go from 10min to 3sec)
A good method would be to start small and exponentially increase delay, until some limit or even perhaps fail then. Currently this loop might not exit.
Definitely recommends a code change.
while true
ret = ccall( ($func, "odbc32"), stdcall, SQLRETURN, $args, $(vals...))
ret == SQL_STILL_EXECUTING || break
sleep(0.001)
end
This could be a fix. What do you think?
Hi, sorry for the slow responses here; Iâm the primary author of ODBC.jl.
(By deving and disabling the sleep I go from 10min to 3sec)
This doesnât make sense to me. The code in question is:
while true
ret = ccall( ($func, "odbc32"), stdcall, SQLRETURN, $args, $(vals...))
ret == SQL_STILL_EXECUTING || break
sleep(0.001)
end
i.e. we perform the ODBC ccall, then only call sleep
if the return code from the ccall is SQL_STILL_EXECUTING
, which from the spec, has this to say:
When the application executes a function with a statement or connection that is enabled for asynchronous processing, the driver performs a minimal amount of processing (such as checking arguments for errors), hands processing to the data source, and returns control to the application with the SQL_STILL_EXECUTING return code. The application then performs other tasks. To determine when the asynchronous function has finished, the application polls the driver at regular intervals by calling the function with the same arguments as it originally used. If the function is still executing, it returns SQL_STILL_EXECUTING; if it has finished executing, it returns the code it would have returned had it executed synchronously, such as SQL_SUCCESS, SQL_ERROR, or SQL_NEED_DATA.
Whether a function executes synchronously or asynchronously is driver specific. For example, suppose the result set metadata is cached in the driver. In this case, it takes very little time to execute SQLDescribeCol and the driver should simply execute the function rather than artificially delay execution. On the other hand, if the driver needs to retrieve the metadata from the data source, it should return control to the application while it is doing this. Therefore, the application must be able to handle a return code other than SQL_STILL_EXECUTING when it first executes a function asynchronously.
Thereâs perhaps something fishy going on w/ the specific driver youâre using and it not doing something correctly perhaps wrt asynchronous execution? But if youâre disabling the sleep, it doesnât seem possible that you can guarantee things are valid, according to spec. It also seems suspicious that the timing is going from 10min to 3sec by disabling a 1millisecond sleep? Something doesnât seem to add up in all this.
With the sleep
Without
I would guess this is a corner-case where the SQL execution takes on the order of 10”s to complete (but is asynchronous) and is called 100,000 times?
I tested it for a couple of queries and all of them are slower and sleep. I used the same snowflake odbc driver in python with the pyodbc PKG and it didnât had these problems
Reminds me of ramp up the sleep time when connecting to the socket by KristofferC · Pull Request #169 · sciapp/gr · GitHub. Doing some ramp up of that sleep time might work well.
The logic in
is pretty much what is needed. Note that the logic there supports failing after a maximum number of retries, which should apply here too. Exponential growth with a 2x factor (or some hard-coded const
factor) sounds reasonable.
Final note, there is a bunch of repeated code as the PR from @Impressium shows. Perhaps it is an opportunity to refactor the retry loop and avoid the almost identical 3 times repeated code.