Which async web server should I use?

Hello there!

I’m a Python programmer with 20+ years of programming background. I recently read about Julia and went through about 100 pages of the official documentation in one sitting. I’m fascinated by Julia’s type system, and I really like multiple dispatch and many other things.

At my work, I need to implement a microservice that uses async events and websockets to send and receive messages to/from web clients (browsers). I have some experience with tornadoweb and websockets, and could do this in a few days in Python. But I decided that, as a learning project, I would like to implement this in Julia, even if it takes some weeks. I have read about the scheduler built into Julia, and also about channels and tasks. These are relatively low-level tools. I wonder if there is a ready-to-use package in Julia for quickly creating an async web server? Something like tornadoweb or twisted.

I don’t need TLS encryption, servers will be put behind nginx proxies. Also, authentication, management of topics and forum members etc. is already implemented. All I need is a simple service that can use websockets (both as a server and as a client), and send/receive messages asynchronously and efficiently, and check JWT HMAC signatures for validity. (Mongodb connectivity would be a big plus.)

I have come across Websockets.jl ( https://github.com/JuliaWeb/WebSockets.jl ) but it may not be what I need. It seems to be an older version that has diverged from HTTP.jl a long time ago. Also examined HTTP.jl ( https://github.com/JuliaWeb/HTTP.jl#websocket-examples ) which seems to be the official package for the HTTP protocol, but it is not a real framework. Just to mention one example, I don’t see how to route websockets to different handlers based on the HTTP path (sent by the browser before HTTP Connection Upgrade).
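Roughly, what I have in mind is something like the sketch below. This is only a sketch: `chat_handler` and `events_handler` are placeholders, and I am not sure HTTP.jl’s WebSockets API actually exposes the upgrade path this way (the API also seems to have changed between versions).

```julia
using HTTP

# Placeholders for per-path websocket handlers.
chat_handler(ws)   = nothing
events_handler(ws) = nothing

const ROUTES = Dict(
    "/chat"   => chat_handler,
    "/events" => events_handler,
)

# Sketch: dispatch each upgraded connection on the path the browser
# requested before the HTTP Connection Upgrade.
HTTP.WebSockets.listen("127.0.0.1", 8080) do ws
    handler = get(ROUTES, ws.request.target, nothing)
    handler === nothing ? close(ws) : handler(ws)
end
```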

So the basic question is this: is there a popular package/framework for this already, or should I build my service from the ground up, using HTTP.jl?


There’s https://github.com/GenieFramework/Genie.jl but it may be too heavyweight. I don’t know how modular it is. @essenciary can say more about whether it would be useful.

Take a look at Mux.jl?

Although the task is very simple (receive and forward messages), the number of open sockets will be high. We already have about 60k users logged in every day. Even though they are distributed between about 10 application servers, there will still probably be more than 1000 users connected to the same server at the same time. I believe that anything that uses threads instead of async I/O will not work.

Genie does not seem to be async. Maybe it is, but I’m not sure. It is certainly a complete framework. It even has startup project generators. :slight_smile: I can also see that it has support for websockets ( https://genieframework.github.io/Genie.jl/documentation/17--Working_with_Web_Sockets.html ), but it seems to be very custom code that starts with sending their custom JavaScript to the browser. This seems very strange to me, but I’m not going to form opinions before I read more about it.

I also did not see @async or scheduler calls in the Mux examples. Please forgive me if I don’t see something that should be obvious. This is my third day with Julia. But I love her! :-).

HTTP.jl is the de facto Julia web server. As far as I know, all the web oriented packages/frameworks in the Julia ecosystem rely on HTTP.jl. Genie uses it and so does Mux. As far as I know there is no other web server at the moment (there was the defunct WebServer.jl years ago).

The request processing model in Genie is that when a request is received, it is picked up by HTTP.jl (which listens on the port) and the payload is forwarded to the default handler (which is Genie's entrypoint). HTTP.jl starts a thread for each request. So if that’s what you mean by async, it is running multi-threaded. In my tests (on my 2016 MacBook Pro) a Genie app can handle about 1200-1300 concurrent requests per second.

HTTP.jl does support routing but Genie's router is more sophisticated in terms of pattern matching, parameters extraction, type enforcing, etc.

Genie does have built-in support for web sockets, which is built upon HTTP.jl - which means that HTTP.jl also works with web sockets. The custom JS code in Genie is not for web sockets per se, it is for using a custom model for working with web sockets. Basically Genie's model automatically captures web sockets communication within its router/MVC model through the use of WebChannels (which are 100% Genie and have nothing to do with Julia’s Channels). You can opt out of WebChannels and handle the web sockets communication yourself (or rather you need to opt-in by including the extra files, per the docs you mention) but Genie makes it easy to turn the web sockets support on and off from its config.

To conclude, HTTP.jl, Mux, and Genie all run on the HTTP.jl server - but they are on a scale in terms of features. HTTP.jl has plenty (way beyond a basic server, like I said, it has routing for instance), Mux has more, while Genie has the most. Genie is a proper web framework - HTTP.jl is seen more as a web server implementation.


The request processing model in Genie is that when a request is received, it is picked up by HTTP.jl (which listens on the port) and the payload is forwarded to the default handler (which is Genie's entrypoint). HTTP.jl starts a thread for each request. So if that’s what you mean by async, it is running multi-threaded.

Thank you for your detailed answer!

What I meant by async I/O is a single-threaded I/O event loop, where context switches happen explicitly, similar to Python’s async/await, Node.js’s await/Promise, etc. This is why I brought up tornadoweb and twisted as examples. When context switching is explicit, it is much easier to write correct programs and much easier to reason about them. You can still start threads or fork processes, but that is optional and not required for concurrency.

By starting multiple preemptive threads, where context is switched at arbitrary locations, one must use reentrant locks, semaphores etc. to protect and guard resources. This is what I want to avoid.

If HTTP.jl can only work in multi-threaded mode, where threads are true OS threads that share program state (memory), then this is not what I was looking for.

HTTP.jl is by default 100% async, using “green threads” as other languages may call them :wink: In other words, tasks are scheduled asynchronously on the same thread, and you switch contexts with an explicit yield.


As usual, Simon (@sdanisch) is on the money!

Python uses an event loop, similar to NodeJS (with async/await syntax) because of design constraints. Julia is more in line with Go where the IO is actually non-blocking but appears synchronous because it is executed on green threads (which the Julia runtime takes care of scheduling - by default, there is only one OS thread that Julia will use).

Basically, this means that whenever Julia sees some “blocking” IO operation, it implicitly adds that operation to a queue of pending operations. When it’s done, Julia re-schedules the green thread and the code continues (under the hood, this uses an event loop; I believe it’s actually libuv, the same library Node.js uses).

You kind of get the best and worst of both worlds. The best is that all of this is automatic and doesn’t require extra syntax. The worst is that if you do have a long running computation that doesn’t ever hit something that would cause the green thread to suspend, you can block the underlying event loop (just like in Python or JavaScript).

You can pepper your code with yield()s to force Julia to schedule other green threads that are ready to execute (if there are any) but that’s typically not necessary.
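For example (a toy sketch), a pure computation that would otherwise monopolize the scheduler can yield periodically; the yield interval here is arbitrary:

```julia
function long_computation(n)
    acc = 0.0
    for i in 1:n
        acc += sin(i)                 # pure computation: never suspends by itself
        i % 100_000 == 0 && yield()   # give other ready tasks a chance to run
    end
    return acc
end
```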

tl;dr: Julia should already be able to deal with LOTS of WebSockets since the underlying IO implementation is async. You won’t be able to rebuild it in userland because essentially any function could (in the abstract) cause the current green thread to suspend (whereas rebuilding asyncio, for example, would require you to know that context switches only happen when you say they happen).

((caveat: this is my best understanding, but I think I have a reasonably good understanding of asyncio in Python and a good enough understanding of Julia’s concurrency model))


This is one thing you lose when going from the asyncio/explicit async/await world. You can kind of emulate this because Julia won’t suspend a thread until it hits an async operation, but you might be setting yourself up for bugs because it’s impossible to know from a function signature whether or not it will cause the thread to suspend.

However, you shouldn’t (I think!) typically need lots of locks on things like data structures because most operations are atomic.

e.g., with true pre-emptive concurrency, you’d need a lock to do

counter->value = counter->value + 1;

since the thread might be interrupted between reading the value and writing the value. However, in Julia, you can just do

counter.value = counter.value + 1

because that assignment won’t (or at least shouldn’t) cause the green thread to suspend during.

You can check out the extended example of the HTTP.Handlers module which provides simple routing functionality.

But yeah, in general I just use HTTP.jl for both http/websocket server/client needs. I have some sample code I could share if people are interested in doing a “threaded handler” where the request gets @spawned to another thread, which can be useful for long-running/heavy request handling, but by default, as other’s have mentioned, requests are handled via green thread.
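The sketch below shows roughly what I mean by a “threaded handler” (assuming Julia is started with multiple threads, e.g. `julia -t 4`; `handle_request` is a placeholder, and the exact `HTTP.serve` signature may differ between HTTP.jl versions):

```julia
using HTTP

# Placeholder for the application's real request handler.
handle_request(req) = HTTP.Response(200, "ok")

# Sketch: @spawn the (possibly heavy) handler onto another OS thread so it
# can't stall the green threads pinned to the accept loop's thread.
HTTP.serve("127.0.0.1", 8080) do req
    fetch(Threads.@spawn handle_request(req))
end
```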


@travigd > Basically, this means that whenever Julia sees some “blocking” IO operation, it implicitly adds that operation to a queue of pending operations.

Okay, so this works like an “await”, except that it is not explicit, and any function call or operator that does blocking I/O can (and probably will) switch context. I have now fully read the sections on channels and tasks in “Parallel Computing” ( https://docs.julialang.org/en/v1/manual/parallel-computing/#Parallel-Computing-1 ) and I have a few more questions, if you don’t mind. :slight_smile:

The examples in the documentation use sleep() to simulate long-running code. The official documentation of sleep does not mention anything about (green) threads or switching tasks/contexts. sleep does not do any I/O operation, but apparently it still switches tasks.

One key part I’m still missing is the ability to tell which built in functions and operators can and cannot switch tasks. This knowledge is needed to be able to write correct programs. For example, if I write a code block that computes values and put them into variables and in-memory objects, then probably Julia’s task scheduler will not switch tasks, because there is no blocking I/O involved. But this is just a hunch on my part. As the example programs have demonstrated, there are function calls like sleep() that do not do I/O at all, and yet they are able to switch contexts.

I’m not saying that I want this to be documented for every possible method that is built into Julia. I’m just asking for a general rule. I don’t want to know for sure if context switch will occur for every single function call, but I need to know something. Without at least some level of certainty, I won’t be confident enough to write concurrent programs in Julia.

The other key question is about third-party libs and their relation to Julia’s task scheduler. I wonder at what level this context switching is implemented. To be more specific, here is an example, the MongoDB driver for Julia: https://github.com/felipenoris/Mongoc.jl . It may use a C library or Julia’s sockets for making network requests to the MongoDB backend, or possibly both. If it uses Julia sockets only, then I can probably use it with cooperative tasks AKA green threads, because it won’t block them for long. But if it uses a C library with synchronous function calls, then it may block all tasks. I have the same uncertainty about many other libraries and tools, most notably database drivers like PostgreSQL and network messaging libs like ZeroMQ.

PostgreSQL is especially in my interest, I will have to use it all the time. The recommended driver is called “LibPQ.jl”. It uses the libpq C library for sure. I’m not familiar with the low level details of libpq, but I guess that libpq does not expose raw sockets to the users. As far as I know, it has both async and sync functions that can be called, but the driver’s documentation does not have a single word about blocking I/O and switching tasks.

Is it safe to assume that any decent third-party lib written for Julia will be cooperative and won’t block execution when it comes to blocking I/O?

I was wrong, it has some notes about it. :slight_smile:

Yeah, I think that’s what I like better about the async/await syntax in other languages: it is much, much more transparent where and when a yield could occur.


Most of my knowledge is about cooperative concurrency in general and we’re definitely at the edge of my knowledge about how it’s done in Julia. If someone sees errors in what I’m saying, please correct me!

I suspect that the simplest wrappers will block Julia since Julia code will only run on Julia managed threads. The Julia sleep() function will just suspend the green thread (via a call into Julia internals) and wake it back up when the sleep period has ended. However, if you’re calling the C sleep function, that won’t tell Julia to suspend the green thread and execute another (moreover, Julia can’t execute another green thread since the OS thread that it would execute on is blocked).

To see this:

julia> function test_c_sleep()
           start_time = time()
           offset = () -> time() - start_time
           task = @async begin
               println("Task starting at $(offset())")
               sleep(1)
               println("Task finished at $(offset())")
           end
           println("C-sleep starting at $(offset())")
           ccall(:sleep, UInt, (UInt, ), 3)
           println("C-sleep ended at $(offset())")
           wait(task)
           return time() - start_time
       end
test_c_sleep (generic function with 1 method)

julia> test_c_sleep()
C-sleep starting at 3.0994415283203125e-6
C-sleep ended at 3.012233018875122
Task starting at 0.006867170333862305
Task finished at 4.014479160308838
4.01466703414917

We see that nothing executes while the C sleep is running (note from the timestamps that the task got a brief chance to start when the parent first touched IO, but it could not make progress again until the blocking call returned and the wait call suspended the parent).

The reason that normal Julia sockets don’t block is because they are implemented using Julia’s internal event loop (which I believe is part of LibUV).


The wait call adds an event listener to the queue. Every time Julia suspends a green thread, Julia polls all of the event listeners to see if they are ready (e.g., the socket has data available in its buffer).

That doesn’t happen in most of the Julia external API wrappers since they’re usually just thin shims over the C APIs and C APIs use OS threads as the idiomatic concurrency scheme.

Looking at Mongoc.jl, mongoc_collection_insert_one (for example) is just defined as a ccall which would block the entire Julia event loop just like the ccall(:sleep, ...) example above.
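One partial escape hatch worth mentioning, if I remember correctly: for plain C calls that never call back into Julia, `Base.@threadcall` runs the call on a worker thread and suspends only the current task. The blocking sleep example above could be rewritten as:

```julia
# Unlike the plain ccall, @threadcall suspends only the calling task;
# other green threads keep running while the C function blocks.
@threadcall(:sleep, Cuint, (Cuint,), 3)
```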

One key part I’m still missing is the ability to tell which built in functions and operators can and cannot switch tasks.

Yeah, echoing @davidanthoff here, there’s no way to know except to assume conventions. For example, operations that do IO you can assume will suspend. Anything that would otherwise block (sleep, for example) probably suspends. Anything that is a “pure” function shouldn’t suspend. For example, check_is_valid_email(::String) probably shouldn’t suspend… unless it calls out to a foreign API or something (again, it’s impossible to know!).

I think the most idiomatic thing to do in Julia is to just avoid shared mutable state in general since you only start to run into race conditions when more than one (green) thread tries to write to the same state at once. Other than that, you should wrap any relevant data structures with a concurrent-safe shim.

# Non-idiomatic code
my_data_structure = MyDataStructure()
my_lock = ReentrantLock()

function foo()
  lock(my_lock)
  try
    some_mutating_method!(my_data_structure)
  finally
    unlock(my_lock)
  end
end

# Idiomatic code
struct MyShim
  data::MyDataStructure
  lock::ReentrantLock
end

my_data_shim = MyShim(MyDataStructure(), ReentrantLock())

some_mutating_method!(shim::MyShim) = lock(shim.lock) do
  some_mutating_method!(shim.data)
end

function foo()
  some_mutating_method!(my_data_shim)
end

This is what I was afraid of. HTTP.jl may handle 10k concurrent websockets, but if a single one of them starts a heavy aggregation query in MongoDB, it will block all the others. I’ll have to examine all the other libs that I need to use (ZeroMQ, PostgreSQL etc.)

It might be possible to put these heavy queries into different (true) OS threads, but:

  1. We would have to know which queries will be slow in advance, which is next to impossible. Response times depend on things that cannot be predicted.
  2. Julia’s real (not green) threading is not stable. Probably I don’t want to build a production server on top of that.

The only option I see right now is to write my own MongoDB driver that uses Julia’s event loop. That is challenging. :slight_smile:

There is an interesting article from the author of Motor (a MongoDB client that is integrated with I/O event loops) where he explains how he converted a synchronous library into an async one: https://emptysqua.re/blog/motor-internals-how-i-asynchronized-a-synchronous-library/ He had a big advantage: pymongo (the synchronous version of that library) was written in pure Python, so it was relatively easy to convert to async, because he had access to the raw sockets and could replace them with sockets that switch context before the socket would block.

I have the idea of tweaking the official C MongoDB library just a bit: expose the raw sockets and replace send/recv calls with callbacks that can poll the sockets and switch tasks when the socket is not ready to send or receive. Still challenging.

This is neat! I guess these co-operative ReentrantLocks are very lightweight, cheap to use.

If you start Julia with more than one thread and use Threads.@spawn instead of @async, this isn’t an issue, since multithreading in Julia also uses cooperative green threads. In that setup you’d have to have $number_of_os_threads tasks all running a heavy calculation or non-yielding work at the same time before everything blocked. Under the covers, the only difference between @async and Threads.@spawn is that the latter doesn’t set the sticky field of the underlying Task to true, meaning that it can be run on any OS thread, not just the one it was spawned from. They’re both green threads; one just happens to be pinned to a single OS thread.

I don’t think you’d see any advantage from only using Threads.@spawn for “heavy” tasks rather than for all tasks given above explanation.

Julia’s “real” threading (assuming you mean Threads.@spawn) is also green, as mentioned above. Threads are set to be considered stable in the upcoming 1.5 release without any big changes that I’m aware of compared to 1.4, so building upon them shouldn’t be an issue.
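You can see the sticky distinction directly on the Task objects (sticky is an internal field, so this could change between versions):

```julia
t1 = @async nothing          # pinned to the OS thread that created it
t2 = Threads.@spawn nothing  # free to run on (and migrate between) OS threads
(t1.sticky, t2.sticky)       # expected: (true, false)
```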


I’m eager to see that version. It seems that all of my problems will be solved soon. When is it coming? :slight_smile:

If you start Julia with more than one thread and use Threads.@spawn instead of @async , this isn’t an issue since multithreading in Julia uses cooperative green threads.

(emphasis mine)

I disagree. The main benefits of green threads are that they don’t come with the massive cost of kernel-level thread switching. The OS threads that are blocked can’t be used by other green threads, so this would still require a number of OS threads that would (I think!*) degrade performance substantially compared to an implementation that uses green threads for blocking IO. Also, if you saturate the number of OS threads you have, then you’re back to everything being blocked (e.g., launch 10 OS threads and have 10 Mongoc functions running, you’re no longer handling new requests). I would expect Go to dramatically outperform Julia in this area exactly because all of its IO operations use green threads (which are similarly mapped to multiple OS threads).

*I’ve not done any benchmarking here

Both Go and Julia use M:N threading. So, @spawn is pretty much like goroutine. In fact, there was some chance @spawn was called @go.