Client/Server TCP Hidden Error - is Julia reliable?

I have a server and a client, both written in Julia. I’m using the Sockets module from the standard Julia library.

I have an issue on the client where I connect to a server and read from a server. I read from the socket in a separate process via a remotecall. In the main thread I then send data to the server.

I experience a soft lock after writing. The culprit appears to be the call to eof() in the remote call (which blocks), but the result is odd. There doesn’t seem to be an exception or error. It also doesn’t seem to be blocked by any IO, as the write() appears to succeed. It just seems to halt directly after I write (and then print some debug info).

For the sake of this question I have simplified the implementation to demonstrate what’s happening.

This is the server side. I basically send data every 0.5 seconds.

import Sockets
import Future
import Distributed

function main()

    server = Sockets.listen(50009)
    sock = Sockets.accept(server)
    dt = 0.5
    startTime = time();

    while (true)

          
        elapsed = time() - startTime

        if elapsed > dt
            write(sock, 1)

            startTime = time()
        end

    end

end

main()

Then the client side, where I read and send data every 0.5 seconds. I read via a remotecall.

import Sockets
import Future
import Distributed

function ReadAsync(sock::Sockets.TCPSocket)

    try
        println("   ASYNC Begin.")
        timestamp = 0.0
        funcVal = 0.0

        arr = Float64[]
        while !eof(sock)
        # push!(arr, read(sock, Float64))
        end

        println("   ASYNC END.")
        return timestamp, funcVal
    catch e
        println("caught it $e")
    end
    
end

function main()

    clientTime = 0.0
    lastHeardTimeout = 10.0
    dt = 0.5
    timestamp = 0.0
    funcVal = 0.0

    sock = Sockets.connect("localhost", 50009);
    println("Ready and waiting...")

    # Start a thread that checks for reads.
    r = Distributed.remotecall(ReadAsync, 1, sock)  # qualified, since `import Distributed` does not bring remotecall into scope

    startTime = time();
    lastHeardTime = time();
    try
    
        number = 0.0
    
        while true
    
            elapsed = time() - startTime

            if elapsed > dt
            
                clientTime = clientTime + dt

                elapsedHeardTime = time() - lastHeardTime
        
                if (elapsedHeardTime < lastHeardTimeout && iswritable(sock))
                    println("Socket Writable.")
                    numBytes = write(sock, clientTime)
                    println("Socket Wrritten.")
                    println("   IT HALTS HERE !!!!")
                    println("Continues")
                else 
                    println("I haven't heard from the server in awhile...")
                end
    
                startTime = time() # the sync waits for the previous async to be finished before starting the next iteration   
            end
        end
    
    catch e
        println("caught it $e")
        close(sock)
    end

    close(sock)
end

main()

I’m obviously doing something wrong here. But my lack of experience and lack of immediate error is making me a bit stuck.

Why did you comment out the read call?

I did this just to demonstrate the issue. With the read commented out, I would not expect the remote call to return a result, but I also would not expect it to halt the main process (in an unexpected way), which is what currently happens, and I’m not sure why.

First idea: I think you’re missing a yield in:
while !eof(sock)

Isn’t it the case that you have an infinite loop with no yield points (eof returns false immediately)?

Inserting a yield() in there should fix it.

Would not having a yield cause unexpected results? I get that the remote call will never return a value in this circumstance, but why does that affect the main process?

The halt occurs right after the write(). The “IT HALTS HERE” line is never reached. Is this expected to happen when a remote call lacks a yield?

                    numBytes = write(sock, clientTime)
                    println("Socket Wrritten.")
                    println("   IT HALTS HERE !!!!")

Because you schedule the remotecall on the main process…
You’d need to do addprocs(1) and then schedule on 2
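
As a rough sketch of what that looks like (not the original poster’s code): the snippet below adds one worker and uses myid() to show which process a remotecall actually runs on. The variable names are made up for illustration. One caveat I would assume rather than rely on: a TCPSocket opened on process 1 probably cannot simply be handed to a worker, so in practice the worker would open its own connection.

```julia
using Distributed

addprocs(1)                       # start one extra worker process
pid = first(workers())            # id of the new worker (typically 2)

# remotecall returns a Future; fetch waits for the result.
on_main   = fetch(remotecall(myid, 1))    # runs on the main process
on_worker = fetch(remotecall(myid, pid))  # runs on the worker

println("remotecall(_, 1) ran on process ", on_main)
println("remotecall(_, $pid) ran on process ", on_worker)

rmprocs(pid)                      # shut the worker down again
```

Scheduling on id 1 just creates a task in the process you are already in, which is why the spinning loop starves the main loop.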

The comment

# Start a thread that checks for reads.

seems to be a part of the misunderstanding. You are not spawning a new thread; you simply start a task (coroutine) on the main thread. If that task is spinning in a loop, nothing else in the thread can progress.

Ah okay, I think I misunderstood remote calls. How would I go about implementing this then, since eof() will block, right?
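
To make the cooperative part concrete, here is a minimal sketch with two tasks on one thread. The names (interleave_demo, events) are invented for the example. Each task calls yield() on every iteration, which is what lets the other one run; if you removed the yield() calls, each loop would run to completion before the other task got a turn.

```julia
function interleave_demo()
    events = String[]

    # Two tasks started with @async run on the same thread as the caller.
    reader = @async for i in 1:3
        push!(events, "reader $i")
        yield()   # hand control back to the scheduler
    end

    writer = @async for i in 1:3
        push!(events, "writer $i")
        yield()
    end

    wait(reader)
    wait(writer)
    return events
end

events = interleave_demo()
foreach(println, events)
```

A loop like while !eof(sock) end with data already buffered never reaches a yield point, so it behaves like the version without yield(), except that it never finishes.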

eof can block but if it blocks it will yield to the scheduler so another task can progress.

So the scheduler “skips” over IO functionality that blocks?
Edit: Also shouldn’t eof() yield?

Not sure what you mean. IO that blocks will yield to the scheduler so that if there are other tasks in flight, they can take over and make progress.

In the case that there is data available in the stream it immediately returns false. I don’t know if it would be better to unconditionally yield in eof.

In this case, there is data in the stream, thus it should return false, which means this coroutine is yielding, right?

edit: ah wait sorry, it’s in a while loop.

extra edit: So how would I actually check whether data is available to be read from the socket?

There is bytesavailable. But why do you need that?

Because according to the documentation eof() will block to wait for more data. So I need to check whether the stream is empty before checking whether it’s at the end of the file. Otherwise, in the event my server has sent me nothing, I’ll be stuck in my coroutine again, blocked by eof().

You will not be stuck because it will yield to another task which can progress. Sure, the particular task will be blocked, waiting for data, but that doesn’t mean your other tasks are stuck. That’s kind of the point of asynchronous code: different parts of the code can be blocked on different things while other tasks are working. When the task gets something to read, the scheduler will resume it and it will continue.
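
Here is a small self-contained sketch of that, assuming a throwaway local server in the same process (the function name eof_demo and the values sent are made up). The reader task does the actual read inside the while !eof(sock) loop, so it blocks in eof or read whenever nothing has arrived, and the main loop keeps ticking in the meantime.

```julia
using Sockets

function eof_demo()
    # Pick a free local port, starting the search at 50009.
    port, server = listenany(50009)

    # Server task: send three Float64 values slowly, then close.
    server_task = @async begin
        conn = accept(server)
        for x in (1.0, 2.0, 3.0)
            write(conn, x)
            sleep(0.1)
        end
        close(conn)
    end

    sock = connect(ip"127.0.0.1", port)

    # Reader task: blocks (and yields) while waiting for data.
    received = Float64[]
    reader = @async while !eof(sock)
        push!(received, read(sock, Float64))
    end

    # Main task keeps running while the reader is blocked.
    ticks = 0
    while !istaskdone(reader)
        ticks += 1
        sleep(0.05)
    end

    wait(server_task)
    close(sock)
    close(server)
    return received, ticks
end

received, ticks = eof_demo()
println("received = ", received, ", main loop iterations = ", ticks)
```

The loop only ends because the server closes the connection, at which point eof(sock) returns true.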

It’s only asynchronous in that the scheduler keeps track of the point of execution. Tasks are still only processed one after the other (at least on the same thread). That creates a level of complexity which can be quite troublesome, because then I need to ensure the tasks are synchronized (in order to maintain the order of the data that I receive and send). I’m sure there is an API to do this, but I could achieve the same outcome serially if I had the ability to do non-blocking IO.

That is what asynchronous (or concurrent) execution means (see e.g. multithreading - How to articulate the difference between asynchronous and parallel programming? - Stack Overflow). Perhaps you want to run things in parallel. Then you need to (as has already been said) start workers with addprocs and start tasks on the new workers with remotecall(f, pid), with pid not being 1.