Timeout for connect (Socket)

Hi,

I have a problem with a local server that is sometimes down, in which case

  connect("ip-of-server", port)

blocks indefinitely. I would like a timeout that throws an exception I can catch, e.g. after one second. Is there a way to accomplish this?

Thanks,

Tobi

I have come up with this:

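using Sockets

# Poll the connecting task N times over timeout_ms milliseconds.
function Sockets.connect(ip, port, timeout_ms, N=100)
  t = @async connect(ip, port)
  for i=1:N
    if istaskdone(t)
      return fetch(t)  # connected (or rethrows the task's failure)
    end
    sleep(timeout_ms / N / 1000)
  end
  # Timed out: interrupt the pending task, then report the failure.
  @async Base.throwto(t, EOFError())
  error("could not connect to $(ip) on port $(port)!")
end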

not sure if that is correct though.

I’ll give my usual warning: if you are attempting to deal with time-outs, you are probably setting up your application logic wrong, which may lead to headaches or, worse, could accidentally create packet storms that DoS your network and host. Also, it doesn’t actually block infinitely: your operating system already comes with a connect timeout that you can probably configure. I think a typical default is to keep retrying, with the necessary exponential backoff, for about 2 minutes.

But it is quite simple: the wait-able objects in Julia are expected to support the close operation and can be used as cancellation tokens:

julia> t = TCPSocket();

julia> Timer(_ -> close(t), 5);

julia> connect(t, "1.2.3.4", 8000)
ERROR: IOError: connect: operation canceled (ECANCELED)
Stacktrace:
 [1] wait_connected(x::TCPSocket)
   @ Sockets /data/vtjnash/julia/usr/share/julia/stdlib/v1.7/Sockets/src/Sockets.jl:532
 [2] connect(::TCPSocket, ::String, ::Int64)
   @ Sockets /data/vtjnash/julia/usr/share/julia/stdlib/v1.7/Sockets/src/Sockets.jl:567
 [3] top-level scope
   @ REPL[13]:1

julia> # TODO: using the same cancellation concept as before, we need close(timer) here too

Thank you Jameson, this looks cleaner than my version.

And you can be reassured that my application logic is not wrong. I have a local device directly connected to my computer, and when the user has forgotten to switch it on I don’t want to wait 2 minutes (it actually feels much longer than that) just to inform the user.


So, based on @jameson's feedback, this is my solution:

using Sockets

function Sockets.connect(host, port::Integer, timeout::Number)
  s = TCPSocket()
  # Closing the socket from the timer cancels the pending connect.
  t = Timer(_ -> close(s), timeout)
  try
    connect(s, host, port)
  catch e
    error("Could not connect to $(host) on port $(port) " *
          "since the operation timed out after $(timeout) seconds!")
  finally
    close(t)  # stop the timer so it cannot close a connected socket later
  end
  return s
end
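
One caveat with the version above is that the catch block reports every failure as a timeout, including e.g. a refused connection. A possible variant, only a sketch: the name connect_with_timeout and the timed_out flag are my own and not part of Sockets, and using a separate function name avoids adding a method to Sockets.connect. The timer callback records that it fired, so only a real timeout is reported as one and any other error is rethrown unchanged.

```julia
using Sockets

# Hypothetical helper (not part of Sockets): like `connect`, but gives up
# after `timeout` seconds by closing the socket from a Timer callback.
function connect_with_timeout(host, port::Integer, timeout::Real)
    s = TCPSocket()
    timed_out = Ref(false)
    # The callback records that the timer fired, then cancels the connect.
    t = Timer(timeout) do _
        timed_out[] = true
        close(s)
    end
    try
        connect(s, host, port)
    catch
        if timed_out[]
            error("Could not connect to $(host) on port $(port): " *
                  "timed out after $(timeout) seconds")
        end
        rethrow()  # some other failure, e.g. connection refused
    finally
        close(t)   # stop the timer whether or not it has fired
    end
    return s
end
```

It would be called as connect_with_timeout("ip-of-server", port, 1) and returns the connected TCPSocket on success.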

Why do you want to error and restart the whole program though, rather than tell the user to turn on the device, and let it finish the connection attempt as soon as it responds on the network?

There are some reasons for that. Most importantly, the function doing the connection is very local, and I cannot open a Gtk dialog within that low-level function. What is quite common in that situation is to throw an exception and handle it at an appropriate layer in the stack. This is exactly what I am doing here.

The second reason is: even if I implemented what you suggested and let the connection attempt keep going (asynchronously, while asking the user to do something), I would still need to handle the case that the user did not switch on the device. And by the way, it's not just switching on the device; there can also be other errors in the device that need more time to be investigated before the system should restart.
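
To illustrate the layering I mean (throw in the low-level function, handle it higher up), here is a rough sketch. All names in it (DeviceUnavailable, open_device, try_open_device) are made up for this example, plain connect stands in for the timeout variant, and println stands in for the Gtk dialog.

```julia
using Sockets

# Hypothetical exception type for this sketch.
struct DeviceUnavailable <: Exception
    host::String
    port::Int
end

# Low-level layer: knows nothing about the GUI, it only throws.
function open_device(host::AbstractString, port::Integer)
    try
        return connect(host, port)
    catch err
        err isa Base.IOError || rethrow()
        throw(DeviceUnavailable(host, port))
    end
end

# Higher layer: decides how to inform the user.
function try_open_device(host, port)
    try
        return open_device(host, port)
    catch err
        err isa DeviceUnavailable || rethrow()
        println("Device at $(err.host):$(err.port) is not reachable; is it switched on?")
        return nothing
    end
end
```

The low-level function stays free of any GUI dependency, and the caller can decide whether to retry, show a dialog, or abort.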
