@khaledh khaledh commented Oct 22, 2025

What is the purpose of the change

Flink currently uses protobuf-java 3.x, causing compatibility issues for applications requiring protobuf 4.x. This upgrade to protobuf 4.32.1 enables:

  • Applications to use protobuf 4.x features (e.g., Protobuf Editions)
  • Resolution of dependency conflicts and forward compatibility as the protobuf ecosystem moves toward 4.x

The parquet-protobuf integration required a compatibility patch (PatchedProtoWriteSupport) because upstream parquet-java 1.15.2 still uses protobuf 3.x APIs. This patch can be removed once parquet-java adds native protobuf 4.x support (apache/parquet-java#3175).

Brief change log

  • Upgrade protobuf-java from 3.x to 4.32.1
  • Add PatchedProtoWriteSupport to maintain compatibility with parquet-java 1.15.2 (which still uses protobuf 3.x APIs)
  • Replace enum-based syntax detection (removed in protobuf 4.x) with string-based detection
  • All changes are internal - no public API changes
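
As an illustration of the string-based detection mentioned in the change log, here is a minimal sketch. It is not the code from this PR. It assumes the syntax string is read from the file descriptor's proto form, for example via `descriptor.getFile().toProto().getSyntax()`, and it treats an empty string as proto2 because the `syntax` field defaults to proto2 when unset. The class and method names are hypothetical.

```java
/**
 * Illustrative sketch only (hypothetical helper, not part of this PR).
 * Classifies a .proto file by its syntax string instead of the
 * FileDescriptor.Syntax enum.
 */
final class ProtoSyntaxCheck {

    private ProtoSyntaxCheck() {}

    /**
     * @param syntax value of FileDescriptorProto#getSyntax(), assumed to be obtained
     *     via descriptor.getFile().toProto().getSyntax()
     * @return true only for files that declare syntax = "proto3"
     */
    static boolean isProto3(String syntax) {
        return "proto3".equals(syntax);
    }

    /**
     * proto2 files may leave the syntax field unset, in which case the
     * string reads back as empty; both cases are treated as proto2 here.
     */
    static boolean isProto2(String syntax) {
        return syntax == null || syntax.isEmpty() || "proto2".equals(syntax);
    }
}
```

A caller would pass the string from the message's file descriptor, e.g. `ProtoSyntaxCheck.isProto3(descriptor.getFile().toProto().getSyntax())`. Files using Protobuf Editions report `"editions"` and match neither check, so they would need separate handling.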

Verifying this change

This change added tests and can be verified as follows:

  • All existing tests pass (no regressions)
  • New test suite PatchedProtoWriteSupportTest with 6 tests:
    • 4 unit tests validating proto2/proto3 syntax detection with direct API
    • 2 integration tests validating production code path through ParquetProtoWriters
  • Tests confirm round-trip write/read integrity for both proto2 and proto3 messages

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): yes
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@flinkbot
Collaborator

flinkbot commented Oct 22, 2025

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

  <py4j.version>0.10.9.7</py4j.version>
  <beam.version>2.54.0</beam.version>
- <protoc.version>3.21.7</protoc.version>
+ <protoc.version>4.32.1</protoc.version>
Contributor

This upgrade sounds like a great idea. I think we should update the docs to draw users' attention to this new support.

Author

Good call. I'll update the relevant docs.

Author

Updated the protobuf format docs and added a section to the 2.1 release notes.

@github-actions github-actions bot added the community-reviewed label (PR has been reviewed by the community) Oct 24, 2025
vulnerability found in [CVE-2025-30065](https://nvd.nist.gov/vuln/detail/CVE-2025-30065).

#### Upgrade Protocol Buffers to 4.32.1
Author

Should this be moved to a new flink-2.2.md release notes file?

Contributor

AFAIK, the release notes are generated by the release manager during the actual release, based on the "release notes" section you fill in on the Jira ticket when closing it.

It's definitely worthwhile mentioning this change, so please put it into the ticket's release notes so that it shows up in the 2.2 release notes :)

@MartijnVisser MartijnVisser requested a review from fapaul October 26, 2025 21:38
Contributor

@fapaul fapaul left a comment

Sorry for the late review; I just came back from PTO. Thanks a lot for doing this upgrade, and great idea with the patched proto write support.

Overall looks good to me; I left two inline comments.

* <p>The original source can be found here:
* https://github.com/apache/parquet-java/blob/apache-parquet-1.15.2/parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoWriteSupport.java
*/
class PatchedProtoWriteSupport<T extends MessageOrBuilder> extends WriteSupport<T> {
Contributor

Can you mark the code blocks you have patched, to make it easier to distinguish them from what is copied?
