UPBKit explores integrating upb with the Google Objective-C Protocol Buffers
library. (I'll refer to the Google Objective-C Protocol Buffers library as "GPB"
from now on. GPB is the three-letter Objective-C prefix that the library uses
for its classes, and is likely an abbreviation for Google Protocol Buffers.) It
leverages upb's fast protobuf parsing (a.k.a. decoding/deserialization) to parse
protos faster, while keeping the familiar Objective-C protobuf API that folks
are used to. If your application deserializes large proto objects, you could see
a significant speedup. This works by using upb to decode the serialized
protobuf, then copying the data from the decoded upb_Message to a standard
GPBMessage object. The copying sounds slow, but current benchmarks show that
decoding with upb and then copying is often faster than parsing with GPB alone.
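To make the decode-then-copy idea concrete, here is a minimal sketch in plain C. It is not UPBKit's or upb's code: every type and function name (`StringView`, `DecodedResult`, `OwnedResult`, `decode_result`, `copy_result`) is hypothetical. `DecodedResult` only plays the role of a decoded upb_Message, and `OwnedResult` the role of a GPBMessage. The sketch assumes a message with two string fields (numbers 1 and 2) and handles only length-delimited fields of the protobuf wire format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* A borrowed slice of the serialized input; nothing is allocated for it. */
typedef struct { const uint8_t *data; size_t size; } StringView;

/* Stage 1 output (hypothetical stand-in for a decoded upb_Message):
 * fields still point into the serialized bytes. */
typedef struct { StringView url; StringView title; } DecodedResult;

/* Stage 2 output (hypothetical stand-in for a GPBMessage):
 * owns NUL-terminated copies of the strings. */
typedef struct { char *url; char *title; } OwnedResult;

/* Reads a base-128 varint. Returns the bytes consumed, or 0 on error. */
static size_t read_varint(const uint8_t *p, size_t n, uint64_t *out) {
    uint64_t value = 0;
    for (size_t i = 0; i < n && i < 10; i++) {
        value |= (uint64_t)(p[i] & 0x7F) << (7 * i);
        if ((p[i] & 0x80) == 0) { *out = value; return i + 1; }
    }
    return 0;
}

/* Stage 1: decode. Returns 1 on success, 0 on malformed or unsupported input.
 * Only wire type 2 (length-delimited) is handled in this sketch. */
int decode_result(const uint8_t *buf, size_t size, DecodedResult *out) {
    assert(buf != NULL && out != NULL);
    *out = (DecodedResult){ { NULL, 0 }, { NULL, 0 } };
    size_t pos = 0;
    while (pos < size) {
        uint64_t tag = 0, len = 0;
        size_t used = read_varint(buf + pos, size - pos, &tag);
        if (used == 0) return 0;
        pos += used;
        if ((tag & 7) != 2) return 0;
        used = read_varint(buf + pos, size - pos, &len);
        if (used == 0) return 0;
        pos += used;
        if (len > size - pos) return 0;  /* payload runs past the buffer */
        StringView view = { buf + pos, (size_t)len };
        if ((tag >> 3) == 1) out->url = view;
        else if ((tag >> 3) == 2) out->title = view;
        /* Any other length-delimited field is skipped. */
        pos += (size_t)len;
    }
    return 1;
}

/* Copies a view into a freshly allocated NUL-terminated string. */
static char *copy_view(StringView v) {
    char *s = malloc(v.size + 1);
    if (s == NULL) return NULL;
    if (v.size > 0) memcpy(s, v.data, v.size);
    s[v.size] = '\0';
    return s;
}

/* Stage 2: copy the decoded fields into an object that owns its data. */
int copy_result(const DecodedResult *in, OwnedResult *out) {
    out->url = copy_view(in->url);
    out->title = copy_view(in->title);
    if (out->url == NULL || out->title == NULL) {
        free(out->url);
        free(out->title);
        out->url = out->title = NULL;
        return 0;
    }
    return 1;
}

void free_result(OwnedResult *r) {
    free(r->url);
    free(r->title);
    r->url = r->title = NULL;
}
```

A caller would run `decode_result` on the serialized bytes, then `copy_result` to get an object that no longer depends on the input buffer. While both exist, the data is held twice, which is why the two-step approach trades memory for speed.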
UPBKit is a research project, and isn't intended for production use at the moment.
Caveat: memory usage will probably be higher, because the upb_Message and the
GPBMessage Objective-C object graph are both present. However, as with all
things performance, it depends on which tradeoffs are important to you.
UPBKit can also create submessages lazily, which can dramatically speed up parse time (10× faster isn't out of the question). This is similar to Swift's lazy stored properties. Take this example from the proto3 language guide:

    message SearchResponse {
      repeated Result results = 1; // `results` is an NSArray<GPBMessage *> object
    }

    message Result {
      string url = 1; // `url` is an NSString object
      string title = 2; // `title` is an NSString object
      repeated string snippets = 3; // `snippets` is an NSArray<NSString *> object
    }

Here, the only Objective-C object that would be created is the top-level
SearchResponse object. UPBKit can create the results sub-object as a special
"lazy object" that is fully created only when it's used. Compare this to the
full object graph that would be instantiated from a normal parse:
SearchResponse, the NSArray for results, the NSStrings for url and
title, and the NSArray of NSStrings for snippets. The parse-time savings
become greater as protos become larger, since submessages, strings and bytes
appear frequently.
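The lazy-creation pattern itself can be sketched in plain C, independently of how UPBKit actually implements it. All names here (`DecodedFields`, `ResultObject`, `LazyResult`, `lazy_result_get`, and so on) are hypothetical: `DecodedFields` stands in for data that has already been decoded, and `ResultObject` for the object that is expensive to build. The sketch is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for already-decoded data (borrowed, not owned). */
typedef struct {
    const char *url;
    const char *title;
} DecodedFields;

/* Hypothetical stand-in for the object the application works with.
 * It owns copies of its strings, so building it costs allocations. */
typedef struct {
    char *url;
    char *title;
} ResultObject;

/* Lazy wrapper: cheap to create, builds the real object on first access. */
typedef struct {
    const DecodedFields *decoded; /* must outlive the wrapper */
    ResultObject *object;         /* NULL until first access */
    int materialize_count;        /* how many times the object was built */
} LazyResult;

/* Duplicates a NUL-terminated string (strdup is not standard C). */
static char *copy_string(const char *s) {
    size_t n = strlen(s) + 1;
    char *out = malloc(n);
    if (out != NULL) memcpy(out, s, n);
    return out;
}

/* Creating the wrapper allocates nothing. */
LazyResult lazy_result_make(const DecodedFields *decoded) {
    assert(decoded != NULL);
    LazyResult lazy = { decoded, NULL, 0 };
    return lazy;
}

/* Returns the materialized object, building it on the first call only.
 * Returns NULL if allocation fails. */
const ResultObject *lazy_result_get(LazyResult *lazy) {
    assert(lazy != NULL);
    if (lazy->object == NULL) {
        ResultObject *obj = malloc(sizeof *obj);
        if (obj == NULL) return NULL;
        obj->url = copy_string(lazy->decoded->url);
        obj->title = copy_string(lazy->decoded->title);
        if (obj->url == NULL || obj->title == NULL) {
            free(obj->url);
            free(obj->title);
            free(obj);
            return NULL;
        }
        lazy->object = obj;
        lazy->materialize_count++;
    }
    return lazy->object;
}

/* Releases the materialized object, if there is one. */
void lazy_result_free(LazyResult *lazy) {
    if (lazy->object != NULL) {
        free(lazy->object->url);
        free(lazy->object->title);
        free(lazy->object);
        lazy->object = NULL;
    }
}
```

If the application never calls `lazy_result_get`, no strings are copied at all. The first call pays the construction cost once, and later calls return the cached object.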
Note that upb still does a full parse of the serialized proto, including validating all strings as UTF-8 where the protobuf specification requires it. It's only the Objective-C object graph that's lazily created.
UPBKit uses Bazel as its primary build system. Xcode projects are also provided in the source code repository for convenience (and are generated by Bazel).
The author uses Visual Studio Code with the following extensions for development:
- Bazel
- clangd
- vscode-proto3
We use compile_commands.json to integrate with VSCode's semantic language features. To refresh the compile_commands database, run:

    bazel run @hedron_compile_commands//:refresh_all

You probably want to install SwiftProtobuf to tinker with the Swift bindings. On macOS:

    brew install swift-protobuf

We use rules_xcodeproj to generate Xcode projects from Bazel's BUILD files. To refresh the Xcode projects:

    bazel run //GPBExtensions/tests:benchmark_xcodeproj
    bazel run //GPBExtensions/tests:GPBMessage_UPBDecodingTest_xcodeproj

If you're surprised that Bazel needs Internet access to do a build, prefetching the external dependencies may help:

    bazel fetch //...

For contribution guidelines, see CONTRIBUTING.md.
Licensed under Apache 2.0; see LICENSE for details.
This project is not an official Google project. It is not supported by Google and Google specifically disclaims all warranties as to its quality, merchantability, or fitness for a particular purpose.