@zenador (Contributor) commented Nov 7, 2025

What this PR does

Start migrating to native histograms with custom buckets (NHCB):

  • enable NHCB conversion in microservices docker-compose
  • make more dashboard panels use the classic/native histogram toggle
  • make more recording rules record for both classic and native histograms
  • add a native histogram version of some alerts
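
For readers unfamiliar with NHCB conversion, the fragment below is a minimal sketch of what enabling it on a Prometheus scrape job can look like when the config is generated from Jsonnet. It assumes the scraper is Prometheus and relies on the scrape-config option `convert_classic_histograms_to_nhcb`. The job name and target are hypothetical, and this is not necessarily how the docker-compose setup in this PR is wired.

```jsonnet
// Sketch only: job name and target below are hypothetical.
{
  scrape_configs: [
    {
      job_name: 'mimir-microservices',  // hypothetical job name
      // Ask the scraper to convert classic histograms into native
      // histograms with custom buckets (NHCB) at ingestion time.
      convert_classic_histograms_to_nhcb: true,
      static_configs: [
        { targets: ['querier:8080'] },  // hypothetical target
      ],
    },
  ],
}
```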

Which issue(s) this PR fixes or relates to

N/A

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]. If changelog entry is not needed, please add the changelog-not-needed label to the PR.
  • about-versioning.md updated with experimental features.

@zenador (Contributor, Author) commented Nov 7, 2025

So far, I have focused on converting the following functions to their native histogram versions:

  • qpsPanel
  • latencyPanel
  • latencyRecordingRulePanel

In the following .libsonnet files:

  • operations/mimir-mixin/dashboards/compactor.libsonnet
  • operations/mimir-mixin/dashboards/dashboard-utils.libsonnet
  • operations/mimir-mixin/dashboards/queries.libsonnet

The associated recording rules are updated as well, and native histogram versions are recorded where possible in operations/mimir-mixin/recording_rules.libsonnet. I also added a native histogram version of an alert that uses one of the metrics updated above.
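
To illustrate what "recording for both classic and native histograms" can mean, here is a rough sketch. The helper `histogramRules`, the rule naming scheme, and the `cluster`/`job` aggregation labels are assumptions made for this example and are not the mixin's actual implementation.

```jsonnet
// Hypothetical helper: emits classic and native recording rules for one histogram.
local histogramRules(metric, labels, interval='1m') =
  local prefix = std.join('_', labels);
  [
    // Classic histograms: one rule per suffix; buckets must also keep `le`.
    {
      local by = std.join(', ', labels + (if suffix == '_bucket' then ['le'] else [])),
      record: '%s:%s%s:sum_rate' % [prefix, metric, suffix],
      expr: 'sum by (%s) (rate(%s%s[%s]))' % [by, metric, suffix, interval],
    }
    for suffix in ['_bucket', '_sum', '_count']
  ] + [
    // Native histograms: a single series carries buckets, sum and count,
    // so one rule on the bare metric name is enough.
    {
      record: '%s:%s:sum_rate' % [prefix, metric],
      expr: 'sum by (%s) (rate(%s[%s]))' % [std.join(', ', labels), metric, interval],
    },
  ];

{
  groups: [
    {
      name: 'example_histogram_rules',  // hypothetical group name
      rules: histogramRules('cortex_kv_request_duration_seconds', ['cluster', 'job']),
    },
  ],
}
```

The native rule needs no `le` label because bucket boundaries live inside each histogram sample.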

This affects the following dashboards, but there may still be panels in these dashboards that need updating:

  • Alertmanager
  • Compactor
  • Queries
  • Reads
  • Remote Ruler Reads
  • Ruler
  • Writes

This covers the following metrics (non-exhaustive list):

  • cortex_querier_request_duration_seconds
  • cortex_storegateway_client_request_duration_seconds
  • cortex_ingester_client_request_duration_seconds
  • cortex_query_frontend_retries
  • cortex_query_frontend_queue_duration_seconds
  • cortex_ingester_queried_series
  • cortex_ingester_queried_samples
  • cortex_ingester_queried_exemplars
  • cortex_kv_request_duration_seconds
  • cortex_compactor_meta_sync_duration_seconds
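
As a sketch of how a panel query differs between the two histogram types, the snippet below builds classic and native variants of a QPS and a latency query. The function names `qpsQueries` and `latencyQueries` and the selector are made up for this example and do not mirror the signatures of qpsPanel or latencyPanel.

```jsonnet
// Hypothetical helpers returning both query variants for a toggle to pick from.
local qpsQueries(metric, selector, interval='$__rate_interval') = {
  // Classic: request rate comes from the _count series.
  classic: 'sum(rate(%s_count{%s}[%s]))' % [metric, selector, interval],
  // Native: extract the count from the histogram rate.
  native: 'sum(histogram_count(rate(%s{%s}[%s])))' % [metric, selector, interval],
};

local latencyQueries(metric, selector, quantile='0.99', interval='$__rate_interval') = {
  // Classic: aggregate the _bucket series and keep the `le` label.
  classic: 'histogram_quantile(%s, sum by (le) (rate(%s_bucket{%s}[%s])))'
           % [quantile, metric, selector, interval],
  // Native: aggregate the histogram samples directly, no `le` label involved.
  native: 'histogram_quantile(%s, sum(rate(%s{%s}[%s])))'
          % [quantile, metric, selector, interval],
};

{
  // The selector is an assumption for illustration.
  qps: qpsQueries('cortex_querier_request_duration_seconds', 'job=~".*querier"'),
  latency: latencyQueries('cortex_querier_request_duration_seconds', 'job=~".*querier"'),
}
```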

},
},
{
alert: $.alertName('KVStoreFailureNativeHistogram'),

Please take a look at MimirRequestErrors. There is no need to add the suffix "NativeHistogram", but we do need another label called histogram, with the value classic or native as appropriate. The label makes it possible to route the alerts as we need.
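
One possible shape for this suggestion is sketched below: a single alert name emitted twice, distinguished only by a `histogram` label. The expression, the `for` duration, the severity, and the `cluster`, `kv_name` and `status_code` labels are assumptions for illustration. The real rule would keep using `$.alertName(...)` and the mixin's configured aggregation labels.

```jsonnet
// Hypothetical expression builder: picks the classic or native form of the request count.
local kvFailureExpr(histogram) =
  local count(selector) =
    if histogram == 'native' then
      'histogram_count(rate(cortex_kv_request_duration_seconds{%s}[1m]))' % selector
    else
      'rate(cortex_kv_request_duration_seconds_count{%s}[1m])' % selector;
  // Fire only when every request fails (failed / total == 1).
  '(sum by (cluster, kv_name) (%s) / sum by (cluster, kv_name) (%s)) == 1' % [
    count('status_code!~"2.+"'),
    count(''),
  ];

{
  alerts: [
    {
      alert: 'KVStoreFailure',  // same name for both variants, no suffix
      expr: kvFailureExpr(histogram),
      'for': '5m',  // assumed duration
      labels: {
        severity: 'critical',  // assumed severity
        histogram: histogram,  // 'classic' or 'native', usable for routing
      },
    }
    for histogram in ['classic', 'native']
  ],
}
```

With this shape, an Alertmanager route can match on `histogram="native"` or `histogram="classic"` without a second alert name.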

# Conflicts:
#	development/mimir-microservices-mode/docker-compose.jsonnet
#	development/mimir-microservices-mode/docker-compose.yml
#	operations/mimir-mixin/dashboards/dashboard-utils.libsonnet
Signed-off-by: György Krajcsovits <[email protected]>