Skip to content

Conversation

@bufdev
Copy link
Member

@bufdev bufdev commented Nov 8, 2025

@github-actions
Copy link
Contributor

github-actions bot commented Nov 8, 2025

The latest Buf updates on your PR. Results from workflow Buf CI / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedNov 8, 2025, 4:25 PM

@hudlow
Copy link

hudlow commented Nov 8, 2025

@bufdev

I don't understand why I have to come up with a unique ID...

Unique IDs are critical for code receiving and interpreting validation failures to have a stable method to remediate or regurgitate a failure in a context-appropriate way.

...and why I have to pay the mental overhead of coming up with a human-readable message when all the context I need is right in the expression.

For many consumers of validation failures, CEL is an implementation detail, and not something they should ever have to understand. Moreover, even if the semantic of the failure seems clear from the expression, I don't think it's correct to conflate the two.

I would consider the logic, the stable identifier, and the semantic message to be table stakes for every validation rule.

This is actually making me ponder if we ought to have supported custom messages for standard rules vs assuming the semantic is also standardized. Consider something like this:

message Committee {
  uint32 size = 1 [(buf.validate.field).uint32 = {
    in: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
  }];
}

The semantic in this case is probably something along the lines of "The headcount for a committee must be a prime number less than 32." Of course, you can always make a custom rule with a semantic message, but should you have to?

But I digress — the net is that I think the linter is correct as written in requiring an ID and message for a CEL-based rule. If you had to remove the requirement for one of them, I would make it the message — because at least with a stable ID, the processor of a validation failure may be able to refer to documentation and reconstruct a semantic message in contextually appropriate terms.

@bufdev
Copy link
Member Author

bufdev commented Nov 8, 2025

Unique IDs are critical for code receiving and interpreting validation failures to have a stable method to remediate or regurgitate a failure in a context-appropriate way.

Not critical - the only thing users are reading is the error message, I do not think we have a case of someone programatically interpreting a rule ID. If it is critical in some scenario for a specific Rule, then the ID can be added, but I have no evidence of this being used. I have never seen a user programmatically evaluate a rule ID.

For many consumers of validation failures, CEL is an implementation detail, and not something they should ever have to understand. Moreover, even if the semantic of the failure seems clear from the expression, I don't think it's correct to conflate the two.

The CEL expression is very clear 99% of the time. It's basically English, which is supposed to be one of the points of CEL - it's supposed to be very simple. For the 1% of the time where it's not, then I can write a custom message, however in the 2+ years of using Protovalidate, I have yet to find a situation where I want to write a custom message (even if I have the option to).

The overhead of how I have to write CEL rules in protovalidate (with an ID and message that basically just convey the information I have in the expression, usually in an even less-readable way ironically) makes it so I don't even want to do it most of the time, it's a chore. It should be super easy to use this feature.

When all I want to do is this (which is 98% of all I ever want to do with custom CEL rules):

message Invoice {
  option (buf.validate.message).cel_expression = "this.foo.size > 10";
} 

It is more maintenance, less clear, and more overhead to have to write:

message Invoice {
  // This is not a toy example.
  // 98 out of 100 custom CEL Rules I write in Protovalidate are this simple and of this form.
  option (buf.validate.message).cel = {
    id: "foo_size_gt_10"
    message: "foo's size must be greater than 10"
    expression: "this.foo.size > 10"
  };

The resulting error message foo's size must be greater than 10 is significantly less clear to me than "this.foo.size > 10" returned false. Not only did I have to make up an ID that basically mimicked the expression, and write in English something that is even more clear as a CEL expression when I read it as a human, but I ended up with less information in the English text than I did with the CEL expression. In the CEL expression, I know exactly what field is the problem.

All the common complicated rules are already covered in the builtin rules. And great, make custom messages for them - we already did. For all the simple ones - which represent the vast majority of custom Protovalidate Rules - do not force users to do something that adds negative value.

message Committee {
  uint32 size = 1 [(buf.validate.field).uint32 = {
    in: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
  }];
}

I have no real-world feature where something like this would make sense. The closest I've seen is validating that a value is 1 of 3 enum values, but that was used from a builtin Rule, and no message was added - again, showing that such a message or ID is not critical.

@bufdev
Copy link
Member Author

bufdev commented Nov 8, 2025

Taking a step further: return values of string from a CEL expression are failures, and this is how we do the vast majority of our custom messages for predefined rules, furthering the lack of a need to enforce or expect the message field. Example:

  repeated float not_in = 7 [(predefined).cel = {
    id: "float.not_in"
    expression: "this in rules.not_in ? 'value must not be in list %s'.format([rules.not_in]) : ''"
  }];

@hudlow
Copy link

hudlow commented Nov 9, 2025

For posterity: I think @bufdev mostly convinced me of the pragmatic merits of this change (if not the entire philosophy) in offline conversation.

@timostamm
Copy link
Member

bufbuild/protovalidate-go#288 makes two changes:

  1. If the rule does not define Rule.message, synthesize a message using the expression string (Rule.expression).
  2. If a rule doesn't define an ID (Rule.id), do not append " []" to the error message.

I don't see any downsides to those changes.

Does that mean that buf lint should never complain about missing Rule.id / Rule.message? IMO this needs more investigation. If a field has multiple custom predefined and standard rules, it can become tedious to understand a violation. Violation.path points to the rule, but it's difficult to access.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants