You’re correct. But what I said is still true for two reasons.
1. A lot of Go API-wrapping libraries are written in pure Go, and thus don’t suffer from blocked OS threads (whereas a number of Julia API libraries wrap C APIs, which do block the OS threads).
2. Go can create new OS threads at runtime, so even if all of your existing OS threads become blocked, the runtime can spin up new ones (though with a significant performance cost).
Julia wraps libuv for I/O. Do you think it’s still a problem? (I’ve never written a massively concurrent program, so I’m just curious.)
But I agree that this would be a problem if you use arbitrary non-Julia libraries via FFI in a concurrent application (it may even deadlock). IIUC Go’s approach is to create a dedicated OS thread to avoid this. Julia takes a somewhat more manual approach (you need to use Base.@threadcall instead of @ccall), and it’s rather limited ATM.
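To make the @threadcall/@ccall distinction concrete, here is a minimal sketch (assuming a Unix libc, using `sleep` as a stand-in for any blocking C call — not an actual database driver):

```julia
# A plain @ccall blocks the calling OS thread for the full duration
# of the foreign call; no other Julia task can run on that thread:
blocking() = @ccall sleep(1::Cuint)::Cuint

# Base.@threadcall ships the same call off to a libuv worker thread,
# so only the current *task* waits; the OS thread stays free to run
# other Julia tasks in the meantime:
nonblocking() = @threadcall(:sleep, Cuint, (Cuint,), 1)

nonblocking()  # returns 0 on success, like libc sleep
```

The catch, as noted below, is that @threadcall draws from the fixed-size libuv thread pool.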
> Julia wraps libuv for I/O. Do you think it’s still a problem?
For IO code written in Julia, not at all. :^) As you pointed out, the problem is using C libraries via FFI (which a good number of Julia libraries for things like Postgres, Mongo, etc. are).
Also, TIL: @threadcall
Unfortunately, this looks less than ideal:
> Concurrency is limited by size of the libuv thread pool, which defaults to 4 threads but can be increased by setting the UV_THREADPOOL_SIZE environment variable and restarting the julia process.
The inability to grow the pool at runtime is also unfortunate. You could probably spawn new threads yourself using ccall… but… good luck with that nightmare.
> IIUC Go’s approach is to create a dedicated OS thread to avoid this
I am not a Go person (I cannot stand the language; it seems to be the antithesis of Julia in a lot of ways), but my understanding is that it doesn’t create a new thread for every cgo call — rather, the runtime is willing to spin up more threads if foreign code ends up blocking the existing ones.
OK, that makes sense. But this is rather a problem of the C interface of those libraries, right? Wouldn’t any async framework have the same problem? Also, it sounds like using a single thread makes things worse. If you use @spawn, you can at least make the pure-Julia side concurrent and parallel.
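As a small sketch of that last point (assuming Julia was started with multiple threads, e.g. `julia -t4`; `parallel_sums` is just an illustrative name):

```julia
using Base.Threads: @spawn

# Pure-Julia work spawned with @spawn runs on other threads,
# so it keeps making progress even if one task is stuck
# inside a blocking foreign call elsewhere.
function parallel_sums(chunks)
    tasks = [@spawn sum(c) for c in chunks]
    return fetch.(tasks)
end

parallel_sums([1:10, 11:20, 21:30])  # -> [55, 155, 255]
```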
Ah, yes, that’s more accurate. I meant to say: use a dedicated thread pool, and create extra threads if it needs more.
The same is true for Julia — I could re-implement the MongoDB library in pure Julia, and it would be perfectly non-blocking. The problem lies in the number of third-party libraries for Julia and their implementation details. But hey, when Go was first created, it also had this problem. At the rate Julia’s popularity is increasing, it is just a matter of time.
I agree. Dynamically creating new threads would work, but it would be inefficient. The performance is reduced either by OS context switches or by blocking calls.
Even if I could guarantee that these C lib calls don’t take too much time, they would still slow down the program significantly. For example, inserting a single document into MongoDB may take just 10 msec. But out of that 10 msec, filling up the outgoing socket buffer and reading back the answer from the incoming buffer takes less than 1 msec. The remaining 9 msec is spent waiting for an answer. This network latency cannot be avoided, and it has a very bad effect on the number of requests that can be processed concurrently. Even if there are no “long running” calls, the performance will never be as good as it ought to be.
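The latency argument can be demonstrated with a toy sketch (using `sleep` to simulate the 9 msec of network wait; `overlapped` is an illustrative name, not anything from a real driver): when the waits overlap, n requests finish in roughly one latency period rather than n of them.

```julia
# Simulate n request handlers that each spend `latency` seconds
# waiting on the network. With @async the waits overlap, so the
# total wall time stays close to a single latency period.
function overlapped(n, latency)
    return @elapsed begin
        tasks = [@async sleep(latency) for _ in 1:n]
        foreach(wait, tasks)
    end
end

overlapped(10, 0.1)  # ≈ 0.1 s total, not ≈ 1.0 s
```

A driver that blocks its OS thread for those 9 msec gets the serial behavior instead, which is exactly the throughput loss described above.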
One of the reasons I want to switch to Julia is its superior performance. But there is this problem with blocking C lib calls and third-party libraries that aren’t written with concurrency in mind. The only true solution is to rewrite the library. Then (I hope?) I/O performance will be similar to what tornadoweb can do, and I can take advantage of the superior computing performance at the same time.
I have written web app servers in Python and tornadoweb, and I got the best performance when I started multiple single-threaded processes and balanced the load between them with nginx. Instead of starting multiple threads with shared state, it is better to start multiple processes with separated state and independent I/O loops. This avoids context switching, and takes advantage of all CPU cores at the same time. (The only exception would be a memory-bound problem, where the bottleneck is the amount of available memory.) I guess the same would be true for Julia.
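Julia’s standard library does support this multi-process model directly via Distributed. A minimal sketch (the squaring workload is just a placeholder for real request handling; in the nginx analogy each worker would instead run its own server loop):

```julia
using Distributed

# Start 4 worker processes, each with separate memory and its own
# event loop -- analogous to several single-threaded tornadoweb
# processes behind a load balancer, with no shared state to contend on.
addprocs(4)

# pmap farms independent units of work out to the workers.
results = pmap(x -> x^2, 1:8)  # -> [1, 4, 9, 16, 25, 36, 49, 64]
```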