Trying to better understand how Julia listens on a UNIX domain socket or named pipe

Hi all,

I copied the following code from this Discourse response about linking Julia and node.js. I’m very new to the idea of both tasks and sockets, and although I’ve got the code working perfectly for my use-case, I wanted to gain a better understanding of what is actually happening.

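# Some setup code (assumed to load Sockets and JSON, and to define `pipe`)
@async begin
    server = listen(pipe)              # start listening on the socket path
    while true
        sock = accept(server)          # blocks this task until a client connects
        @async while isopen(sock)      # one task per connection
            input = JSON.parse(sock)   # blocks until a JSON value can be read
            # Some code that does something to input
        end
    end
end
# Some following code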

My understanding is this:

The first @async treats everything in the begin clause as a function, wraps it in a Task and immediately starts executing the Task, as well as allowing the main routine to proceed to any code after the begin clause ends. Within this first Task, a PipeServer is created and stored in the variable server by calling listen on the UNIX domain socket denoted by pipe. Then a while true loop is initiated. I think the purpose of this loop is to make sure that the first Task doesn’t finish, and we keep listening on the UNIX domain socket indefinitely. I think (hope) I understand everything up to this point. But I don’t understand the next bit, even though it works perfectly for me.

On the first iteration of the while true loop, we call accept(server). My understanding is that this creates an open IOStream and stores it in sock. Normally, I would have thought that sock would be open immediately after this call. Is this right? Or is sock only “open” when some data is sent through the domain socket?

If it is open immediately after the call, then I definitely don’t understand what happens next. We create a second Task that iterates on while isopen(sock). This task will keep running, since sock is open, and since it is an @async task, our routine can also continue on to the next iteration of the while true loop, where we create another sock? This doesn’t sound right at all. Also, how does the code in that second Task, in the while isopen(sock) line, know when data is coming through and it has to run, versus when no data is coming through and it shouldn’t run?

So maybe sock is not open immediately after the accept(server) call, but this also defies everything I currently understand about how an IOStream works.

As you can see, I’m going in circles here.

I apologize for the rather wordy post, but if anyone can take the time to explain what is actually happening here in simple terms I would be very appreciative.

Cheers and thanks,

Colin

listen tells the OS you want to receive connections: on a TCP/IP port in the network case, or on a UNIX domain socket or named pipe as in your code. accept waits for a connection to come in. While it is waiting, other tasks get to run. So the reason to put the whole block in an @async block is to allow you to do something else while waiting for a connection. If you don’t want to do anything else (including entering commands in the REPL), then you don’t need the first @async.
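To make this concrete, here is a minimal sketch of listen and accept on a UNIX domain socket. The socket path is a hypothetical one created in a temporary directory (on Windows a named pipe path would be needed instead), and the client is opened from the same Julia session purely for illustration. It suggests that accept only blocks the task that calls it, and that the returned socket is already open when accept returns, before any data has been sent.

```julia
using Sockets

# Hypothetical socket path, for illustration only.
path = joinpath(mktempdir(), "demo.sock")

server = listen(path)              # a Sockets.PipeServer

# accept blocks only the task that calls it, so run it in its own task.
acceptor = @async accept(server)

# The main task keeps running while the acceptor task waits for a client.
sleep(0.1)
still_waiting = !istaskdone(acceptor)

# Once a client connects, accept returns the server-side end of the connection.
client = connect(path)
sock = fetch(acceptor)
sock_open = isopen(sock)           # open right away, no data sent yet

close(client); close(sock); close(server)
```

Here still_waiting and sock_open should both be true: the acceptor task sits idle until a client shows up, and the connection counts as open from the moment it is accepted.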

The while true loop is just there so you can accept the next connection, and the next, and the next. In this way you can have multiple connections active at once. The inner @async is there so you can go back and wait for another connection. Normally there is a backlog of about 16 connections (it may be more these days; the exact limit depends on the OS and on the backlog requested when listening). That means that 16 additional connections can come in and the OS will hold them for you until you call accept again. The 17th connection will be rejected, so you need to get back and call accept as fast as you can. Also, if you don’t accept soon enough, the connection could time out and the remote end will give up. So the inner @async creates a task to handle the connection, so that the main task can go back and accept the next connection.
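As a rough sketch of that pattern, the following runs an accept loop that hands each connection to its own task, then opens two clients at the same time. The path, the received channel, and the use of newline-terminated text messages are all assumptions made for this example, not part of the original code.

```julia
using Sockets

path = joinpath(mktempdir(), "multi.sock")   # hypothetical path
server = listen(path)
received = Channel{String}(10)               # collects lines so we can inspect them

# Accept loop: each connection gets its own task, so the loop can
# return to accept immediately.
@async while isopen(server)
    sock = try
        accept(server)
    catch
        break                                # accept failed, e.g. the server was closed
    end
    @async begin
        for line in eachline(sock)           # ends when the client closes
            put!(received, line)
        end
        close(sock)
    end
end

# Two clients connected at the same time, writing in the opposite
# order to the one in which they connected.
a = connect(path)
b = connect(path)
println(b, "from b")
println(a, "from a")
close(a); close(b)

msgs = sort([take!(received), take!(received)])
close(server)
```

Both messages arrive even though client b writes first while client a connected first, because neither handler task has to wait for the other.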

A connection can drop at any time. Someone could pull a network cable, a computer can crash, a router can crash, a program can crash. The isopen() check ensures that the connection is open “now”. In this case it also appears to be there so that the client can send multiple JSON objects before it closes the connection. So if the client sent one JSON object and then closed the connection, you would read one JSON object, the while condition would become false, and you would be done. If the client sent two, you would go around the while loop twice, then the connection would be closed and you would be done.
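Here is a small sketch of a handler that processes several messages on one connection and stops when the client closes it. To keep it free of external packages, it reads newline-terminated strings with readline rather than calling JSON.parse, and it uses eof(sock) as the loop condition instead of isopen(sock); both are substitutions for this example. eof blocks the handler task until either more data arrives or the peer closes, which is one way for the loop to "know" when there is something to read.

```julia
using Sockets

path = joinpath(mktempdir(), "msgs.sock")    # hypothetical path
server = listen(path)

handler = @async begin
    sock = accept(server)
    n_messages = 0
    # eof blocks this task until data is available or the client closes.
    while !eof(sock)
        line = readline(sock)                # one message per line (an assumption)
        n_messages += 1
    end
    close(sock)
    n_messages                               # value returned by fetch(handler)
end

# The client sends two messages, then closes the connection.
client = connect(path)
println(client, "{\"n\": 1}")
println(client, "{\"n\": 2}")
close(client)

handled = fetch(handler)                     # number of loop iterations
close(server)
```

With two messages sent, the loop body runs twice and handled ends up as 2; the third call to eof sees the closed connection and ends the loop.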

Reading and writing to the network, screen, or disk are all tasks that “take a long time” in computer terms. Basically you are waiting for hardware to do something and the CPU is idle. With Julia when you do these operations the current task gives up the CPU so another task can run.

So with this example the outer @async task gets to run when there is a connection and the inner @async tasks are waiting. The inner @async tasks run when they actually need the CPU and nobody else is doing anything with it. Where this will all break down is if you have to calculate something that takes a long time, like, say, pi to the millionth decimal place. At that point threads need to get involved, or you need to add “yields/sleeps” into the code so that the CPU-intensive task will allow the other tasks a chance to run.
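The effect of a long computation can be sketched without any sockets. The hypothetical busy function below starts a background "ticker" task with @async and then runs a CPU-bound loop, optionally calling yield() every 1000 iterations; it reports how many times the ticker got to run before the loop finished. The function name, the loop body, and the yield interval are all made up for illustration.

```julia
# Count how often a background task runs while a CPU-bound loop
# executes in the task that spawned it.
function busy(n; cooperative::Bool)
    ticks = Ref(0)
    running = Ref(true)
    ticker = @async while running[]
        ticks[] += 1
        yield()                       # hand control back to other tasks
    end
    acc = 0.0
    for i in 1:n
        acc += sin(i)                 # stand-in for real work
        # Optionally give other tasks a chance to run every 1000 iterations.
        cooperative && i % 1000 == 0 && yield()
    end
    seen = ticks[]                    # ticks observed during the loop
    running[] = false
    wait(ticker)                      # let the ticker notice the flag and finish
    return seen
end

without_yield = busy(100_000; cooperative=false)
with_yield = busy(100_000; cooperative=true)
```

Without the yield() calls the ticker never gets a turn while the loop runs, so without_yield is 0, whereas with_yield is positive. In the socket example, a connection handler would be starved in the same way by a long computation in another task.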

You are a wonderful human. This is really clear. In particular:

really clarified a lot of things for me in an instant. Hopefully this helps other newbies to the topic too.

Cheers and thanks again,

Colin
