Questions on parallelism, blocking, and a simple web server

#1

I’m working on a simple “web” server (to be run only locally) which allows other programs running locally to request solutions to optimization problems.

Firstly, and most importantly, solving these optimization problems can take a prohibitively long time. If an optimization problem runs for too long, it’s best to just give up on it.

Question 1: What’s the idiomatic way to start a job (be it via @async or @spawn or something else) and wait until either that job is completed or a certain amount of time has elapsed, aborting the job if it has not completed in time?

I see two ways, both of which feel clunky and therefore likely wrong:

  1. Have the master process start the job, e.g. with @spawn, and then sleep in short chunks. In each wake between sleeps, check if the job is completed and, if so, exit the sleep loop; also exit the loop if the total sleep time exceeds the limit.

  2. Have the master process wait on a Condition. The worker process triggers the condition when work is complete, and an additional process sleeps the allotted amount of time before triggering the condition.

Either way, is a long-running job then aborted with Base.interrupt?
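For reference, here is a third option along these lines, using plain tasks and a Channel instead of sleeping in chunks or juggling a Condition. This is only a sketch: with_timeout is a name I made up, not an existing API, and it abandons the slow job rather than interrupting it.

```julia
# Run f() in a task and wait at most `secs` seconds for its result.
# The channel has room for both outcomes, so the losing task never blocks.
function with_timeout(f, secs)
    ch = Channel{Any}(2)
    @async put!(ch, (:done, f()))
    @async (sleep(secs); put!(ch, (:timeout, nothing)))
    tag, val = take!(ch)
    tag == :done || error("job timed out after $secs seconds")
    return val
end
```

Note that the timed-out task keeps running in the background; to actually reclaim a worker process you would still need something like interrupt(pid).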


I have a simple version running with HttpServer, using approximately the following code:

using HttpServer

function process_request(req::Request, res::Response)
    x = extract_data_from_request(req)
    y = solve_optimization_problem(x) # takes a few seconds
    return Response(200, answer_to_json(y))
end

server = Server(process_request)
run(server, 8000)

This works well as long as requests aren’t coming in too fast, at which point things really slow down. I’ve tried code like the following, which seems to help.

@everywhere function solve_optimization_problem(x) ...

function process_request(req::Request, res::Response)
    x = extract_data_from_request(req)
    y = @fetch solve_optimization_problem(x) # takes a few seconds
    return Response(200, answer_to_json(y))
end

server = Server(process_request)
run(server, 8000)
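For completeness, this assumes worker processes were added before the @everywhere definition runs; without any workers, @fetch just runs everything on process 1. Something like the following (the solver body below is a stand-in for the real one):

```julia
using Distributed   # needed explicitly on Julia >= 0.7; part of Base before that
addprocs(4)         # start 4 local worker processes first

# Stand-in body; the real solver is defined the same way, on every process.
@everywhere solve_optimization_problem(x) = x + 1
```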

Question 2: is this the ‘right’ way?


Intuitively, it seems like we want the master process to be able to say to a worker “you answer this request,” which would free the master up for other requests; the master never has to revisit the current request. Something like

@everywhere function optimize_and_package(x)
    y = solve_optimization_problem(x)
    return Response(200, answer_to_json(y))
end

function process_request(req::Request, res::Response)
    x = extract_data_from_request(req)
    @return_when_complete @spawn optimize_and_package(x)
end

Question 3: Is there a way to do this / is it even desirable?
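The closest thing I've found in base Julia is wrapping the remote call in a task, so the handler returns control immediately while the response is produced whenever the worker finishes. A sketch (@return_when_complete above is imaginary; the stand-in function and dispatch name here are mine, and @spawnat :any is Julia >= 1.3 syntax):

```julia
using Distributed

# Stand-in for the real function; it would be defined with @everywhere.
@everywhere optimize_and_package(x) = x * 2

# Hand the work to some worker and return a Task immediately; the caller
# (or server framework) can fetch the Task later, while the master's
# event loop stays free to serve other requests in the meantime.
dispatch(x) = @async fetch(@spawnat :any optimize_and_package(x))
```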


Preemptive thanks for the help!


#2

This probably doesn’t directly answer your questions, but you might be interested in looking at what Nanosoldier.jl does. Take anything you see in that package with a large grain of salt; I feel that most of the code needs to be refactored. For example, Nanosoldier is quite a low-volume server, and there are a bunch of places where severe bottlenecks would occur if the rate of requests rose significantly. That being said, the basic strategy employed might be useful to you.

For background, Nanosoldier.jl is the codebase for Julia’s “Nanosoldier” performance testing service, where users submit “benchmark jobs” to the server via GitHub comments with special syntax. When the master node’s event handler is called, it parses the given data into an appropriate job type, and then pushes the object into a local queue (note that a high volume server would probably want the parsing to happen on the workers).

Here’s the bit that might be useful to you: The server cooperatively schedules tasks on the master node in order to feed jobs from the queue to each worker node. A simpler implementation of this idea can be found in Julia’s pmap function.
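Stripped down, that pmap-style loop looks roughly like this (solve is a placeholder for the real work; feed_jobs is my name for the sketch):

```julia
using Distributed

# One feeder task per worker: each repeatedly takes a job from `jobs` and
# blocks on remotecall_fetch while its worker computes, so idle workers
# pull new jobs as soon as they finish. The `for job in jobs` loop ends
# once `jobs` is closed and drained.
function feed_jobs(solve, jobs::Channel, results::Channel)
    @sync for p in workers()
        @async for job in jobs
            put!(results, remotecall_fetch(solve, p, job))
        end
    end
    close(results)
end

# Usage: make sure `results` is buffered large enough (or consume it from
# another task), otherwise the feeders block on put!.
jobs = Channel{Int}(32)
results = Channel{Int}(32)
foreach(j -> put!(jobs, j), 1:5)
close(jobs)
feed_jobs(x -> x^2, jobs, results)
```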

While cooperative task scheduling is really convenient, you should note that Julia’s tasks aren’t thread-backed, so there are a bunch of opportunities for blocking (e.g. the job-feeding loop can be blocked by calls to the server’s event handler). Things could thus be improved by leveraging Julia’s shared memory parallelism, or better yet, utilizing Julia’s multithreading capabilities, once it’s non-experimental and has a more fine-grained API.
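For concreteness, the thread-backed version that paragraph anticipates would look something like this (requires starting Julia with multiple threads, e.g. julia -t 4; Threads.@spawn landed in Julia 1.3, after this was written):

```julia
# A blocking solver runs on another thread, so tasks on the main thread
# (e.g. the server's event loop) keep running instead of being blocked.
slow_solve(x) = (sleep(0.1); x + 1)   # stand-in for real work

t = Threads.@spawn slow_solve(41)
# ... the event loop keeps servicing requests here ...
fetch(t)
```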

You might also want to look into other libraries for message routing and queueing (which base Julia doesn’t facilitate so well, IMO), like ZMQ.jl.

EDIT: For clarity’s sake, I should point out that this strategy is one way of tackling this part of your post:

Intuitively, it seems like we want the master process to be able to say to a worker “you answer this request,” which would free the master up for other requests; the master never has to revisit the current request.