-
Notifications
You must be signed in to change notification settings - Fork 4.5k
xds: client outlier detection metrics [A91] #8437
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #8437 +/- ##
==========================================
- Coverage 82.26% 82.22% -0.05%
==========================================
Files 414 414
Lines 40410 40454 +44
==========================================
+ Hits 33243 33262 +19
- Misses 5796 5814 +18
- Partials 1371 1378 +7
🚀 New features to boost your workflow:
|
endpoints: resolver.NewEndpointMap[*endpointInfo](), | ||
metricsRecorder: cc.MetricsRecorder(), | ||
target: bOpts.Target.String(), | ||
backendService: "", // Will be set in UpdateClientConnState |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be removed since when nothing is set, fields take default zero value which is an empty string. The comment can be put when writing the definition of the struct here
Also please add release notes. |
Co-authored-by: eshitachandwani <[email protected]>
Co-authored-by: eshitachandwani <[email protected]>
Co-authored-by: eshitachandwani <[email protected]>
Adding @dfawley for a second review. |
|
Hi @PardhuKonakanchi, the test failure is due to a recent bug introduced in the master branch. The fix has been merged in #8451. I'll re-trigger CI. |
@dfawley Gentle ping for a review on this one, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, thank you very much. I have just a couple minor comments and test requests inline.
pickerUpdateCh: buffer.NewUnbounded(), | ||
channelzParent: bOpts.ChannelzParent, | ||
endpoints: resolver.NewEndpointMap[*endpointInfo](), | ||
metricsRecorder: cc.MetricsRecorder(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: since we have cc
captured, maybe we don't also want to save the MetricsRecorder
in another field? Less state seems better.
// extractBackendService extracts the backend service from resolver state attributes. | ||
// This is a placeholder implementation - the actual extraction logic should be | ||
// implemented based on the specific resolver attributes available. | ||
func extractBackendService(resolver.State) string { | ||
// TODO: Implement backend service extraction from resolver attributes per A89 and A75 | ||
// For now, return empty string as this is optional | ||
return "" | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please remove this? If it's not clear this needs to be done as part of A75/A89 implementation work, then you could file a bug so that we don't forget about it. Thanks!
channelzParent channelz.Identifier | ||
metricsRecorder estats.MetricsRecorder | ||
target string | ||
backendService string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can also be removed along with the above.
if float64(b.numEndpointsEjected)/float64(b.endpoints.Len())*100 >= float64(b.cfg.MaxEjectionPercent) { | ||
// Record unenforced ejection due to max ejection percentage | ||
ejectionsUnenforcedMetric.Record(b.metricsRecorder, 1, b.target, "success_rate", "max_ejection_overflow", b.backendService) | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add tests that include the cases that are currently missed? I.e. ejection did not occur for various reasons.
Implements A91 in grpc-go
Two counters have been added according to the gRFC within the outlier detection algorithm on either ejections or unenforced ejection.
Currently,
grpc.lb.backend_service
is an optional label with an empty string as per the design until A89 and A75 are both implemented in grpc-go, after which the string can be updated for these metrics.RELEASE NOTES:
grpc.lb.outlier_detection.ejections_enforced
) and unenforced (grpc.lb.outlier_detection.ejections_unenforced
) outlier ejections.Tests
Updating existing outlier detection balancer tests to check if ejection stats are measured