Julia's deployment in a production environment (100k~200k QPS)

perchouli · October 27, 2023, 2:03am

Two years ago we rewrote our original Java service in Julia. Performance is no worse than before, but we have run into some problems.

Our system uses Nginx as the load balancer, which forwards requests to Julia (HTTP.jl) for processing. Each request queries the Redis cluster, and the service computes whether to return a result. Some requests also need to call external services, so a package like HTTP.jl that includes both a server and a client is convenient.

We run 160 Julia processes on 40 machines (8 cores, 32 GB each), 4 processes per machine, which averages about 650 QPS per process and 2600 QPS per machine. Here is our startup bash script:

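for i in $(seq "$OFFSET_PROC" $((CPU_CORE_NUM - 1 + OFFSET_PROC)))
do
    IMG_ARG=""
    # Use the custom system image if one has been built.
    if [ -e "$SYS_IMAGE" ]; then
        echo "Using $SYS_IMAGE..."
        IMG_ARG="-J$SYS_IMAGE"
    fi
    # One single-threaded Julia process per index; -pi passes the process index.
    julia --threads 1 --color=yes --project=@. -q $IMG_ARG -- "$(dirname "$0")/../bootstrap.jl" "$@" -pi "$i" &
done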
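As a side check on the "--threads 1" flag in the script above, a process can report how its threads are split between the default and interactive pools. This is a minimal sketch that assumes Julia 1.9 or newer; `threadpool_report` is a hypothetical helper name, not something from our code base. Julia also accepts a two-part form such as "--threads 3,1" (3 default threads, 1 interactive thread), which may be worth trying before "auto".

```julia
# Report how many threads each pool has (requires Julia 1.9 or newer).
# `threadpool_report` is a hypothetical helper used only for illustration.
function threadpool_report()
    return (
        default = Threads.nthreads(:default),          # worker threads for ordinary tasks
        interactive = Threads.nthreads(:interactive),  # threads reserved for latency-sensitive tasks
        current_pool = Threads.threadpool(),           # pool the calling thread belongs to
    )
end

report = threadpool_report()
println("default threads:     ", report.default)
println("interactive threads: ", report.interactive)
println("main task runs on:   ", report.current_pool)
```

Logging this once at startup makes it easier to confirm which configuration a given process is actually running with when comparing CPU usage between settings.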

And the main function:

function main()
    port = 8000
    proc_index = 0

    # The startup script passes the process index as the last two arguments: -pi <n>
    if length(ARGS) > 1 && ARGS[end-1] == "-pi"
        proc_index = parse(Int, ARGS[end])
    end

    function start_server(i)
        access_log_formatter = init_logging(i)
        routers = setup_router()

        println("Running on http://127.0.0.1:$(port + i)/")
        # Warm up the routes in the background by requesting each of them.
        schedule(@task run(pipeline(`python3 scripts/compile_routes.py -i $i`)))
        HTTP.serve(routers, "0.0.0.0", port + i; access_log=access_log_formatter)
    end

    start_server(proc_index)
end
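The "-pi" handling in main only works when the flag and its value are the last two arguments. A slightly more tolerant variant could look like the sketch below. `parse_proc_index` is a hypothetical helper, and unlike the code above it accepts the flag anywhere in the argument list and falls back to a default when the value is missing or not an integer.

```julia
# Hypothetical helper: find "-pi <n>" anywhere in the argument list.
function parse_proc_index(args::AbstractVector{<:AbstractString}; default::Int = 0)
    pos = findfirst(==("-pi"), args)
    # No flag at all, or the flag is the last argument with no value after it.
    (pos === nothing || pos == length(args)) && return default
    idx = tryparse(Int, args[pos + 1])
    # Fall back to the default when the value is not an integer.
    return idx === nothing ? default : idx
end

base_port = 8000
proc_index = parse_proc_index(["--verbose", "-pi", "3"])
println("process ", proc_index, " listens on port ", base_port + proc_index)
```

In the real service the call would be `parse_proc_index(ARGS)`, with the port computed as the base port plus the index, as in main.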

The problems we encountered were:

1. Beginning in Julia 1.9, the main server loop is spawned on the interactive threadpool by default. We limit each process to "--threads 1" because with "--threads auto" CPU usage becomes high, QPS does not improve significantly, and crashes become more likely.
2. Our main function runs a Python script in a background task that requests all the routes. However, some functions are still left uncompiled, so the first requests are processed more slowly. The load balancer lets us work around this by forwarding a small number of requests first and sending more once everything is compiled, but it would be better if the service were ready to go right at startup.
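One alternative to the external warm-up script is to call the handlers in-process before the server starts listening, so that the compiled code is in place when the first real request arrives. The sketch below only illustrates the idea: `SampleRequest`, `handle`, and `warm_up` are hypothetical stand-ins, not HTTP.jl API. With HTTP.jl, the handler would presumably be the router itself and the samples real request objects. It covers only the code paths the sample requests actually exercise, so branches that depend on Redis results or external services would still need representative inputs.

```julia
# Stand-in for a request object; an assumption made to keep the sketch self-contained.
struct SampleRequest
    method::String
    target::String
end

# Hypothetical handler standing in for the application's router.
function handle(req::SampleRequest)
    req.target == "/health" && return 200
    req.target == "/score" && return 200
    return 404
end

# Call the handler once per sample request so its code is compiled
# before the server accepts traffic. Failures are counted, not rethrown.
function warm_up(handler, requests)
    ok = 0
    failed = 0
    for req in requests
        try
            handler(req)
            ok += 1
        catch err
            @warn "warm-up request failed" target = req.target exception = err
            failed += 1
        end
    end
    return (ok = ok, failed = failed)
end

samples = [SampleRequest("GET", "/health"), SampleRequest("GET", "/score")]
result = warm_up(handle, samples)
println("warmed ", result.ok, " routes, ", result.failed, " failures")
# Only after this point would the real server start listening.
```

The same sample calls could also serve as the workload when building the system image, so that more of the compiled code is stored in the image instead of being generated at each startup.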

If anyone else has run into similar issues, we would appreciate your advice.

Other than that, Julia is performing well. We are using 1.10-beta in our production environment and it’s working fine. Thanks!

mbauman · October 27, 2023, 2:44am

Welcome! You may find this topic from a few weeks ago relevant, particularly with regards to multithreading:
