At RelationalAI, we created a new Julia package for working with the Protocol Buffers format, and I'm happy to announce that this complete rewrite will soon be registered as a (very) breaking version of the current ProtoBuf.jl, that is 1.0.0. This new release brings the following benefits:
The generated Julia structs and codec methods often result in noticeably faster serialization with lower memory overhead.*
We dropped the protoc_jll dependency, which currently carries a large (100+ MB) binary. Time to first protojl (our variant of protoc) is also much lower.
Enumerations are now generated using the EnumX.jl package, meaning they are proper subtypes of Base.Enum.
* For example, see the following benchmark from issue #179, "Deserialization is extremely slow for messages with many small sub-messages."
Here's the pre-1.0 ProtoBuf.jl:
julia> BenchmarkTools.@benchmark(read_example_proto()) |> display
BenchmarkTools.Trial: 71 samples with 1 evaluation.
Range (min … max): 66.590 ms … 76.673 ms ┊ GC (min … max): 0.00% … 10.80%
Time (median): 71.134 ms ┊ GC (median): 5.37%
Time (mean ± σ): 70.467 ms ± 2.704 ms ┊ GC (mean ± σ): 4.42% ± 3.48%
66.6 ms Histogram: frequency by time 75.8 ms <
Memory estimate: 33.81 MiB, allocs estimate: 625475.
and here is ProtoBuf.jl 1.0.0:
julia> BenchmarkTools.@benchmark(read_example_proto_new()) |> display
BenchmarkTools.Trial: 741 samples with 1 evaluation.
Range (min … max): 6.541 ms … 9.074 ms ┊ GC (min … max): 0.00% … 26.07%
Time (median): 6.647 ms ┊ GC (median): 0.00%
Time (mean ± σ): 6.748 ms ± 386.710 μs ┊ GC (mean ± σ): 1.16% ± 4.39%
6.54 ms Histogram: log(frequency) by time 8.65 ms <
Memory estimate: 1.85 MiB, allocs estimate: 2668.
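To put the two reports side by side, here is a small sketch that only divides the figures printed above (median time, memory estimate, allocation count). The variable names are made up for illustration; nothing here is measured anew.

```julia
# Figures copied from the two benchmark reports above.
before = (median_ms = 71.134, mem_mib = 33.81, allocs = 625_475)
after  = (median_ms = 6.647,  mem_mib = 1.85,  allocs = 2_668)

# Plain ratios of old to new; larger means a bigger improvement.
time_ratio  = before.median_ms / after.median_ms
mem_ratio   = before.mem_mib / after.mem_mib
alloc_ratio = before.allocs / after.allocs

println("median time:  ", round(time_ratio, digits = 1), "x lower")
println("memory:       ", round(mem_ratio, digits = 1), "x lower")
println("allocations:  ", round(alloc_ratio, digits = 1), "x fewer")
```

For this particular benchmark that works out to roughly 10.7x on median time, about 18x on memory, and about 234x on allocations. Other message shapes may behave differently.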
As mentioned earlier, this release is very breaking, so migrating to 1.0.0 will require some effort. Here are some differences from the pre-1.0.0 version that you should take into account:
Services and RPCs are not yet implemented. We will focus on these in the near future as part of our effort to build native gRPC libraries for Julia.
All generated structs are immutable and don't share a common abstract type. By default, no convenience constructors are generated for these structs, but you can use the add_kwarg_constructors=true option in protojl.
oneof fields are now translated to OneOf{T} types with fields name::Symbol and val::T containing the name and the value of the chosen member.
Nested definitions (e.g. message Parent { message Child {} }) were previously named like Parent_Child; now the Child message is translated to a struct called var"Parent.Child".
When translating proto files with a package directive, protojl will generate a directory structure that mirrors the levels of said package. In the future, we want to support generation of full-blown Julia packages that are easy to register in (private) registries.
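To make the points about immutable structs, oneof fields, and nested names more concrete, here is a hand-written sketch of the shapes described above. It does not use ProtoBuf.jl and is not actual protojl output: the `OneOf` struct below is a local stand-in that copies the description in this post, and the `Parent` message, its fields, and the keyword constructor are hypothetical.

```julia
# Local stand-in for the OneOf wrapper described above (assumption: the real
# type ships with ProtoBuf.jl and may differ in detail).
struct OneOf{T}
    name::Symbol   # which member of the oneof is set
    val::T         # the value of that member
end

# Nested message: the struct name contains a dot, so it needs var"..." syntax.
struct var"Parent.Child"
    id::Int64
end

# Hypothetical parent message with a oneof of `text` (String) or `number` (Int64).
struct Parent
    child::var"Parent.Child"
    payload::Union{Nothing,OneOf{<:Union{String,Int64}}}
end

# Hand-written keyword constructor, similar in spirit to an opt-in convenience constructor.
Parent(; child = var"Parent.Child"(0), payload = nothing) = Parent(child, payload)

# Branch on the name of the chosen oneof member.
function describe(p::Parent)
    if p.payload === nothing
        return "no payload"
    elseif p.payload.name === :text
        return "text: $(p.payload.val)"
    else
        return "number: $(p.payload.val)"
    end
end

p = Parent(child = var"Parent.Child"(7), payload = OneOf(:text, "hello"))
println(describe(p))
```

Because the structs are immutable, "updating" a field means constructing a new value, which is where keyword constructors come in handy.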
Please see the docs for more information about the package.
Also, please note that in this case the 1.0.0 version is not meant to signal stability of the package. There are no planned breaking changes, but there might be some rough edges here and there, so please report any issues you encounter.
Huh, I just noticed that the package name is spelled ProtoBuf, not ProtoBuff. If you're going to abbreviate Protocol Buffer, I would have expected it to be spelled ProtoBuff.
FWIW, during its development, the package was called ProtocolBuffers.jl, but since ProtoBuf.jl was already established, we chose not to fragment the ecosystem with a competing implementation.
How does it compare to the reference C++ implementation?
Not sure! The Google repo does contain some benchmarking code that I was able to run today, but I'm not sure I can easily make an apples-to-apples comparison with our code; the setup they use seemed quite involved… I'll look into it again at some point.
Would it make sense to merge this into the existing ProtoBuf.jl package as v2.0?
We bumped the existing ProtoBuf.jl package from 0.11.5 → 1.0.0, sorry if that was not clear!
Edit: Assuming you brought this up for semver/compat reasons, this version change is as breaking as a bump to 2.0 would be.
Hello, thanks for working on this package. I am a user of this package and I noticed that encoding of unpacked representations is withheld (commented out here: Allow unpacked repeated primitives). Is there any reason it's not included yet? I can see it needs a test set. Any other reasons?
This would be better to discuss in an issue, but the reason is that I don't understand the use case: it is strictly more efficient to encode the array as packed. Maybe I'm missing something.
We need it when supporting an existing protocol that uses the unpacked representation for encoding and decoding. If the other end only encodes and decodes, it is not important. But if a hash is used to verify the received object before decoding in the remote session, we run into issues: the hash of the encoded representation will be different, and the message will be rejected in our use case. For now I had to make a fork with these changes and continue using it. It would be ideal if the changes were upstream. I assumed you must have had a reason, so I didn't bother to raise an issue. I hope you are convinced about the use case. If I were the protocol designer and implementer I would definitely avoid the unpacked representation, but to adhere to an existing protocol it makes sense to support it.
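To illustrate why the hash changes, here is a minimal sketch that hand-encodes the same repeated varint field both ways, following the protobuf wire format. It does not use ProtoBuf.jl; the helper names and the choice of field number 1 with values 1, 2, 3 are assumptions made for the example, and the varint helper only handles non-negative values.

```julia
using SHA  # standard library, used only to compare digests

# Append `x` to `buf` as a base-128 varint (non-negative values only).
function write_varint!(buf::Vector{UInt8}, x::Integer)
    x >= 0 || throw(ArgumentError("this sketch handles non-negative values only"))
    u = UInt64(x)
    while u >= 0x80
        push!(buf, UInt8(u & 0x7f) | 0x80)  # low 7 bits plus continuation bit
        u >>= 7
    end
    push!(buf, UInt8(u))
    return buf
end

const WIRE_VARINT = 0  # wire type for varint-encoded scalars
const WIRE_LEN = 2     # wire type for length-delimited payloads

field_tag(field_number, wire_type) = (field_number << 3) | wire_type

# Unpacked: every element carries its own tag.
function encode_unpacked(field_number, values)
    buf = UInt8[]
    for v in values
        write_varint!(buf, field_tag(field_number, WIRE_VARINT))
        write_varint!(buf, v)
    end
    return buf
end

# Packed: one tag, a byte length, then all elements back to back.
function encode_packed(field_number, values)
    payload = UInt8[]
    for v in values
        write_varint!(payload, v)
    end
    buf = UInt8[]
    write_varint!(buf, field_tag(field_number, WIRE_LEN))
    write_varint!(buf, length(payload))
    append!(buf, payload)
    return buf
end

unpacked = encode_unpacked(1, [1, 2, 3])
packed = encode_packed(1, [1, 2, 3])

println("unpacked: ", bytes2hex(unpacked))
println("packed:   ", bytes2hex(packed))
println("same digest? ", sha256(unpacked) == sha256(packed))
```

Both byte strings decode to the same list of values, but since the bytes differ, any digest computed over the encoded message differs as well. That is the situation described above, where a peer verifies a hash of the raw bytes.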
Thanks for the pointer and thanks for the update @FireCrumb.
Maybe I am missing something, but I would like to parse a JSON file into proto and not a protobuf text file. We are using JSON as the human-readable intermediary and several systems are already dumping and depending on it. I checked the protobuf text format, but it is quite different than JSON.
Do you have a suggestion for reading/parsing a json file (or a Dictionary object) into a ProtoType object directly or via the binary encoding in Julia?
Ah, there is no automatic mapping between JSON and ProtoBuf (that I know of); the closest to that was a recent proposal to integrate tightly with StructTypes.jl. In the end we decided that this would be best implemented in a separate package.
Thanks @drvi! Yeah, I was misled by other protobuf APIs having JSON parsers/writers and the C++ API referring to JSON as "proto3 JSON format", but they are all custom parsers I guess.