[ANN] ProtoBuf.jl 1.0.0

ProtoBuf.jl 1.0.0

At RelationalAI, we created a new Julia package for working with the Protocol Buffers format, and I’m happy to announce that this complete rewrite will soon be registered as a (very) breaking version of the current ProtoBuf.jl: 1.0.0. This new release brings the following benefits:

  • The generated Julia structs and codec methods often result in noticeably faster serialization with lower memory overhead.*
  • We dropped the protoc_jll dependency, which currently carries a large (100+MB) binary. Time to first protojl (our variant of protoc) is also much lower.
  • Enumerations are now generated using the EnumX.jl package, meaning they are proper subtypes of Base.Enum.
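
To make the enum point concrete, here is a minimal sketch of the module-scoped enum pattern that EnumX.jl is built around, written with Base Julia only. It is a hand-written approximation and not actual protojl output; the Color enum and its members are hypothetical, and the real generated code may differ in its details.

```julia
# Hand-written approximation of a module-scoped enum; NOT actual protojl output.
# Hypothetical proto definition: enum Color { RED = 0; GREEN = 1; BLUE = 2; }
module Color
    # The enum type lives inside the module as `T`, so members are
    # referenced as Color.RED instead of polluting the enclosing namespace.
    @enum T::Int32 RED=0 GREEN=1 BLUE=2
end

c = Color.GREEN

# The scoped type is an ordinary subtype of Base.Enum ...
is_enum = Color.T <: Base.Enum

# ... so the usual Base functions work on it.
as_int = Int32(c)            # integer value of the member
from_int = Color.T(2)        # member for a given integer value
all_members = instances(Color.T)
```

Because the type is a regular Base.Enum, code that dispatches on Enum or calls instances keeps working without special cases.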

* For example, see the following benchmark from issue #179 “Deserialization is extremely slow for messages with many small sub-messages.”

Here’s the pre-1.0 ProtoBuf.jl:

julia> BenchmarkTools.@benchmark(read_example_proto()) |> display
BenchmarkTools.Trial: 71 samples with 1 evaluation.
 Range (min … max):  66.590 ms … 76.673 ms  ┊ GC (min … max): 0.00% … 10.80%
 Time  (median):     71.134 ms              ┊ GC (median):    5.37%
 Time  (mean ± σ):   70.467 ms ±  2.704 ms  ┊ GC (mean ± σ):  4.42% ±  3.48%

  █ ▁▆ ▁▁                      ▆   ▁  ▁  ▃  ▃
  █▄██▇██▄▄▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁▄▄▄▇▇█▄▄▄█▇▁█▇▄█▇▇█▄▇▇▁▁▁▁▁▁▁▁▁▁▁▁▄ ▁
  66.6 ms         Histogram: frequency by time        75.8 ms <

 Memory estimate: 33.81 MiB, allocs estimate: 625475.

and here is ProtoBuf.jl 1.0.0:

julia> BenchmarkTools.@benchmark(read_example_proto_new()) |> display
BenchmarkTools.Trial: 741 samples with 1 evaluation.
 Range (min … max):  6.541 ms …   9.074 ms  ┊ GC (min … max): 0.00% … 26.07%
 Time  (median):     6.647 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.748 ms ± 386.710 μs  ┊ GC (mean ± σ):  1.16% ±  4.39%

  ▄▆█▇▅▃▂▁▂
  ██████████▄▅▅▇▇▆▅▁▁▁▄▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▁▁▄▅▄▁▅▄▄▁▄▄▆▅▄▇ ▇
  6.54 ms      Histogram: log(frequency) by time      8.65 ms <

 Memory estimate: 1.85 MiB, allocs estimate: 2668.

As mentioned earlier, this release is very breaking, and migrating to 1.0.0 will require some effort. Here are some differences from the pre-1.0.0 version that you should take into account:

  • Services and RPCs are not yet implemented. We will focus on these in the near future as part of our effort to build native gRPC libraries for Julia.
  • All generated structs are immutable and don’t share a common abstract type. By default, no convenience constructors are generated for these structs, but you can pass the add_kwarg_constructors=true option to protojl to get keyword constructors.
  • oneof fields are now translated to OneOf{T} types with fields name::Symbol and val::T containing the name and the value of the chosen member.
  • Nested definitions (e.g. message Parent { message Child {} }) were previously named like Parent_Child. Now the Child message is translated to a struct called var"Parent.Child".
  • When translating proto files with a package directive, protojl will generate a directory structure that mirrors the levels of said package. In the future, we want to support generation of full-blown Julia packages that are easy to register in (private) registries.
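
As a rough illustration of the struct, oneof and nested-name points above, the following sketch shows what code in that shape could look like. Everything here is hand-written and hypothetical: the Parent/Child schema, the describe helper and the local OneOf stand-in (with the name and val fields as described in this post) are assumptions for illustration, not output of protojl, and the keyword constructor is written by hand to mimic the kind of convenience constructor the option enables.

```julia
# Hand-written approximation; NOT actual protojl output.
# Hypothetical schema:
#   message Parent {
#     message Child { int32 id = 1; }
#     Child child = 1;
#     oneof payload { string text = 2; int64 number = 3; }
#   }

# Local stand-in with the shape described above (the package ships its own type).
struct OneOf{T}
    name::Symbol   # which member of the oneof is set
    val::T         # the value of that member
end

# "Parent.Child" is not a plain identifier, so it has to be written as var"...".
struct var"Parent.Child"
    id::Int32
end

# Immutable struct with no common abstract supertype.
struct Parent
    child::Union{Nothing,var"Parent.Child"}
    payload::Union{Nothing,OneOf{String},OneOf{Int64}}
end

# Hand-written keyword constructor, similar in spirit to a generated one.
Parent(; child=nothing, payload=nothing) = Parent(child, payload)

p = Parent(child=var"Parent.Child"(Int32(7)), payload=OneOf(:text, "hello"))

# Structs are immutable, so "updating" a field means building a new value.
q = Parent(child=p.child, payload=OneOf(:number, Int64(42)))

# Hypothetical helper: inspect the chosen oneof member via its name and value.
describe(x::Parent) =
    x.payload === nothing ? "no payload" : "$(x.payload.name) = $(x.payload.val)"
```

Code that used to mutate messages in place or refer to Parent_Child will need to be adapted along these lines when migrating.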

Please see the docs for more information about the package.

Also, please note that in this case the 1.0.0 version number is not meant to signal stability of the package. There are no planned breaking changes, but there might be some rough edges here and there, so please report any issues you encounter.


Any particular reason to not call the package ProtocolBuffers.jl?


Huh, I just noticed that the package name is spelled ProtoBuf, not ProtoBuff. If you’re going to abbreviate Protocol Buffer, I would have expected it to be spelled ProtoBuff.


protobuf is how it’s usually shortened: GitHub - protocolbuffers/protobuf: Protocol Buffers - Google's data interchange format


FWIW, during its development, the package was called ProtocolBuffers.jl, but since ProtoBuf.jl was already established, we chose not to fragment the ecosystem with a competing implementation.


How does it compare to the reference C++ implementation?

Would it make sense to merge this into the existing ProtoBuf.jl package as v2.0?


How does it compare to the reference C++ implementation?

Not sure! The Google repo does contain some benchmarking code that I was able to run today, but I’m not sure I can easily make an apples-to-apples comparison with our code; the setup they use seemed quite involved… I’ll look into it again at some point.

Would it make sense to merge this into the existing ProtoBuf.jl package as v2.0?

We bumped the existing ProtoBuf.jl package from 0.11.5 → 1.0.0, sorry if that was not clear!
Edit: Assuming you brought this up for semver/compat reasons, this version change is as breaking as a bump to 2.0 would be.
