[Help Wanted] gRPCClient2.jl: Production Grade gRPC Client

Hello Julia Community,

I have been working on a new gRPC client with an emphasis on production-grade performance and reliability. The client is already thread/async safe and uses the fast and up-to-date 1.0 version of ProtoBuf.jl. There is no extra memory copying or buffering between the client and libCURL, and there are optimizations to reduce the overhead of running many small streams over a multiplexed HTTP/2 connection.

Repo: GitHub - csvance/gRPCClient2.jl: Production Grade gRPC Client for Julia
Docs: gRPCClient2.jl · gRPCClient2

The name of the package is just a placeholder while it is under rapid development. The client borrows some code from gRPCClient.jl and Downloads.jl, so thanks to the maintainers/contributors of those packages for helping bootstrap this effort.

Looking for collaborators in general, but right now I need help with the following:

  • general usage testing / feedback on interfaces and API
  • more test coverage / test against more gRPC servers than just Python
  • streaming RPC support: needs to be done in a way that does not negatively impact performance for the unary RPC case (a hypothetical interface sketch follows this list)
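On the streaming point, the idea would be a separate code path so the unary fast path stays untouched. A purely hypothetical sketch of what a server-streaming interface could look like; grpc_server_stream, handle_message, and SomeRequest do not exist in the package and are made up here for illustration:

# Purely hypothetical server-streaming sketch; grpc_server_stream and
# handle_message do not exist in the package yet and are made up here.
using gRPCClient2

grpc = gRPCCURL()   # client handle, as in the benchmark below
stream = grpc_server_stream(grpc, "grpc://host:8001/pkg.Service/Method", SomeRequest())
for msg in stream            # iterate response messages as the server sends them
    handle_message(msg)      # placeholder for user code
end
close(stream)                # release the underlying HTTP/2 stream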

Of course I am working through most of these myself, but I would appreciate any help if you are interested in having a production-grade gRPC client in Julia.

The latency / overhead / resource usage is currently quite minimal. Some benchmarks below (API not final):

// Benchmark a unary RPC with small protobufs to demonstrate overhead per request
// subset of grpc_predict_v2.proto for testing

syntax = "proto3";
package inference;

message ServerReadyRequest {}

message ServerReadyResponse
{
  // True if the inference server is ready, false if not ready.
  bool ready = 1;
}

service GRPCInferenceService
{
  // The ServerReady API indicates if the server is ready for inferencing.
  rpc ServerReady(ServerReadyRequest) returns (ServerReadyResponse) {}
}

# Julia benchmark script (uses the bindings generated from the proto above)
using ProtoBuf
using BenchmarkTools
using gRPCClient2
using Base.Threads

include("grpc_predict_v2_pb.jl")

const grpc = gRPCCURL()

function bench_ready(n)
    @sync begin
        # Issue all n requests without waiting for any responses
        requests = Vector{gRPCRequest}()
        for i in 1:n
            request = ServerReadyRequest()
            # once we generate bindings from the service definition this will be much cleaner
            req = grpc_unary_async_request(grpc, "grpc://rpctest.local:8001/inference.GRPCInferenceService/ServerReady", request)
            push!(requests, req)
        end

        # Now await each response in order
        for req in requests
            response = grpc_unary_async_await(grpc, req, ServerReadyResponse)
        end
    end
end
# Sync usage (must wait for response before sending next request)
@benchmark bench_ready(1)

BenchmarkTools.Trial: 6821 samples with 1 evaluation per sample.
 Range (min … max):  370.152 μs …   6.193 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     520.608 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   722.812 μs ± 671.093 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▆█▆▅▄▃▂▂▁                                                    ▁
  ███████████▇▆▅▆▄▄▅▅▄▆▄▄▅▆▆▆▅▂▄▅▅▅▃▅▅▄▅▅▅▅▄▅▄▄▆▆▆▆▆▆▇▇▇▆▅▄▄▄▄▂ █
  370 μs        Histogram: log(frequency) by time        3.9 ms <

 Memory estimate: 5.36 KiB, allocs estimate: 109.
# Async usage (send all requests as fast as possible and then wait for all responses)
@benchmark bench_ready(1000)

BenchmarkTools.Trial: 32 samples with 1 evaluation per sample.
 Range (min … max):  149.132 ms … 181.694 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     157.729 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   160.711 ms ±   7.917 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

        ▁        ▄█▁    ▁                 ▁
  ▆▁▁▆▁█▆▁▆▆▁▁▁▆███▆▁▁▆█▆▆▁▁▁▆▁▁▆▁▁▁▁▁▆▁▁█▁▁▆▁▆▁▁▁▁▁▁▁▁▁▆▁▁▁▁▁▆ ▁
  149 ms           Histogram: frequency by time          182 ms <

 Memory estimate: 2.37 MiB, allocs estimate: 54179.

Dividing by 1000 we get a mean of 160.711 μs per request, down from 722.812 μs in the sync case: around a 4.5× speedup from not having to wait for a response before sending the next request. The ICMP RTT to this server is ~300 μs from my computer on the LAN.
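As a quick sanity check on that arithmetic:

# Speedup estimate from the two benchmark means above
sync_mean_us  = 722.812           # mean per request, bench_ready(1)
async_mean_us = 160_711 / 1000    # 160.711 ms mean for 1000 requests, in μs
speedup = sync_mean_us / async_mean_us   # ≈ 4.5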


Opened a pull request to add support for RPC client stub code generation with ProtoBuf.jl. Depending on how fast I’m able to get services support fully worked out and merged, we could have the v0.1 release in the next few weeks. Tests and CI/precompile infrastructure have been set up, and considerable work has been done to smooth out rough edges in terms of having useful exception messages, fixing memory/handle leaks, etc.
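For context, a sketch of how the bindings used in the next example might be generated, assuming the codegen PR keeps ProtoBuf.jl’s existing protojl entry point (the exact invocation may differ once the PR is merged):

using ProtoBuf

# Generate Julia bindings (and, with the PR, RPC client stubs) from test.proto;
# protojl(relative_path, search_dir, output_dir) is ProtoBuf.jl's codegen entry point
protojl("test.proto", "proto", "gen/proto")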

I also cleaned up the public interface / API:

using gRPCClient2

# Include the protobuf definitions and RPC client stubs
include("gen/proto/test_pb.jl")

# Initialize the gRPC package - grpc_shutdown() does the opposite for use with Revise.
grpc_init()

# Create a client from the generated client stub
client = TestService_TestRPC_Client("localhost", 8001; secure=false)

# Sync API
test_response = grpc_sync_request(client, TestRequest(1))

# Async API (send all requests, then await all of the responses)
requests = Vector{gRPCRequest}()
for i in 1:10
    push!(requests, grpc_async_request(client, TestRequest(1)))
end

for request in requests
    response = grpc_async_await(client, request)
end
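Since the client is thread/async safe, the async API should also compose with Julia tasks. Below is a minimal sketch under that assumption; parallel_ready is a made-up helper name, and it presumes the grpc_async_request / grpc_async_await calls shown above may be issued from concurrently running tasks:

using Base.Threads

# Hypothetical sketch (not part of the package): fan requests out across tasks.
# Assumes the async API above is safe to call from multiple tasks, per the
# thread/async safety claim.
function parallel_ready(client, n)
    tasks = map(1:n) do _
        Threads.@spawn begin
            req = grpc_async_request(client, TestRequest(1))
            grpc_async_await(client, req)   # returns the decoded response
        end
    end
    return fetch.(tasks)                    # collect all n responses
end

responses = parallel_ready(client, 10)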

Now that the API is relatively stable I’m going to start writing documentation and will continue to stress test the client and fix any remaining undiscovered bugs.

Looks like some neat work!

As someone who is specifically not a fan of xyz2, xyz3 packages, are you interested in talking to the gRPCClient.jl folks about potentially replacing it with your package?


Nice to see some interest and movement in gRPC support for Julia. This has long been a stumbling block when integrating Julia services into larger projects, and there aren’t that many alternatives really; e.g. there is also no AMQP v1 support in the ecosystem either (although I did make a start there, maybe I need to ask for help with that too). So it is nice to see that someone is willing to push this domain further. Maybe someday we’ll also have a gRPC server in Julia.

Nice to see some basic integration testing set up as well. Maybe I could help (if I get some spare time, lol) with setting up test servers in a few more languages to test against. At least JS/TS and Go would be nice to catch any inconsistencies in implementations (I heard that there can be some).


Good idea. Once we are a little farther along with documentation and testing we can open the discussion. I just didn’t want to bother them until it was clear how serious this effort is :sweat_smile:

Indeed, when I was first trying to adopt Julia for a project at work this ended up being such a large roadblock I almost gave up on the language. So it will be good if no one else ever has to go through that again :sweat_smile:

A gRPC server in Julia is currently on my radar; it may be possible to build one on top of nghttp2, which already has a JLL package. Once the client initiative is complete I will look into it more.

That would be much appreciated! Get me some spare time too :grinning_face_with_smiling_eyes: