Closed
Labels
- Developer: Torrust Improvement Experience
- Needs Research: We Need to Know More About This
Description
I'm working on this issue:
In the beginning, I only wanted to refactor stats a little bit:
- To make it easier to add more metrics without breaking changes in the data structure.
- To align the in-memory format with the Prometheus format.
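To make the two goals above concrete, here is a minimal sketch of what an extensible in-memory format aligned with Prometheus could look like. It is not the tracker's actual data structure: `Counters`, `increment` and `to_prometheus` are hypothetical names, only counters are covered, and label values are assumed not to need escaping.

```rust
use std::collections::BTreeMap;

/// Hypothetical label set. A sorted map keeps the rendered output deterministic.
type Labels = BTreeMap<String, String>;

/// Hypothetical in-memory store: metric name -> label set -> counter value.
/// New metrics are new keys, so adding one does not change the struct layout.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Counters {
    samples: BTreeMap<String, BTreeMap<Labels, u64>>,
}

impl Counters {
    /// Add `by` to the counter identified by `name` and `labels`.
    pub fn increment(&mut self, name: &str, labels: &[(&str, &str)], by: u64) {
        let key: Labels = labels
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        *self
            .samples
            .entry(name.to_string())
            .or_default()
            .entry(key)
            .or_insert(0) += by;
    }

    /// Render all counters in the Prometheus text exposition format.
    /// Assumption: label values contain no quotes, backslashes or newlines.
    pub fn to_prometheus(&self) -> String {
        let mut out = String::new();
        for (name, series) in &self.samples {
            out.push_str(&format!("# TYPE {name} counter\n"));
            for (labels, value) in series {
                if labels.is_empty() {
                    out.push_str(&format!("{name} {value}\n"));
                } else {
                    let rendered: Vec<String> = labels
                        .iter()
                        .map(|(k, v)| format!("{k}=\"{v}\""))
                        .collect();
                    out.push_str(&format!("{name}{{{}}} {value}\n", rendered.join(",")));
                }
            }
        }
        out
    }
}
```

With a shape like this, a new metric or a new label is just new data, which is the "no breaking changes" property I'm after.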
It's not complex work, but I might be reinventing the wheel. There are many crates to handle metrics. After some preliminary research, it seems there are two main ways to handle metrics:
- Protocol-agnostic instrumentation: `metrics` (https://github.com/metrics-rs/metrics)
  - Similar to log/tracing
- Based on Prometheus: `prometheus` (https://github.com/tikv/rust-prometheus)
There is also a crate to be able to use both in the same project: https://github.com/instrumentisto/metrics-prometheus-rs
`metrics` can export to TCP and Prometheus:
- metrics-exporter-tcp - outputs metrics to clients over TCP
- metrics-exporter-prometheus - serves a Prometheus scrape endpoint
So in theory it does what I'm trying to implement.
Our requirements are:
- Expose via REST API in JSON format.
- Expose via REST API in Prometheus format.
- Expose via GraphQL API in the future.
- Extendable without breaking changes.
- Zero overhead when disabled.
- Allow merging metrics.
- Metrics metadata (metrics name) or labels (Prometheus name)
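Some of these requirements can be sketched with the standard library alone. The example below is only an illustration under assumptions: `Snapshot`, `merge`, `to_json` and `Recorder` are hypothetical names, metric names are assumed to need no JSON escaping, and "zero overhead when disabled" is approximated by a single `Option` check rather than by compile-time elimination.

```rust
use std::collections::BTreeMap;

/// Hypothetical snapshot type: metric name -> counter value.
pub type Snapshot = BTreeMap<String, u64>;

/// Merge two snapshots by summing counters that share a name.
pub fn merge(a: &Snapshot, b: &Snapshot) -> Snapshot {
    let mut merged = a.clone();
    for (name, value) in b {
        *merged.entry(name.clone()).or_insert(0) += value;
    }
    merged
}

/// Render a snapshot as a flat JSON object.
/// Assumption: metric names contain no characters that need escaping.
pub fn to_json(snapshot: &Snapshot) -> String {
    let fields: Vec<String> = snapshot
        .iter()
        .map(|(name, value)| format!("\"{name}\":{value}"))
        .collect();
    format!("{{{}}}", fields.join(","))
}

/// Hypothetical handle. `None` means stats are disabled, so recording
/// costs one branch and never allocates.
pub struct Recorder {
    inner: Option<Snapshot>,
}

impl Recorder {
    pub fn enabled() -> Self {
        Self { inner: Some(Snapshot::new()) }
    }

    pub fn disabled() -> Self {
        Self { inner: None }
    }

    /// Increment a counter by one. This is a no-op when stats are disabled.
    pub fn increment(&mut self, name: &str) {
        if let Some(snapshot) = self.inner.as_mut() {
            *snapshot.entry(name.to_string()).or_insert(0) += 1;
        }
    }

    /// Return a copy of the current values (empty when disabled).
    pub fn snapshot(&self) -> Snapshot {
        self.inner.clone().unwrap_or_default()
    }
}
```

Merging like this only makes sense for counters. Gauges or histograms would need their own merge rules, which is one of the things to check in the third-party crates.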
My first impressions are:
- I should finish the current "stats overhaul" epic with our custom solution based on events. I think it won't take me long. After that we can plan a proper migration to one of these crates.
- I would use the generic one, `metrics`, if there are no drawbacks compared to using Prometheus directly.
- I guess that if we use `metrics` we have to build our own recorder if we want to expose the metrics in our APIs. However, we should review the idea of exposing the metrics with independent APIs. There could be other crates to expose metrics via REST or GraphQL. For example: https://github.com/naamancurtis/async_graphql_telemetry_extension
- Even if we end up using a third-party crate, events are going to be useful for other things or to let people build their own stats. Events are now decoupled from stats, so it would be easy to migrate to a metrics crate and keep events.
- I don't know how easy it would be to test with these crates using macros. My experience with tracing is that it's hard because it also uses a global recorder. These crates also have local recorders (like tracing), but for tracing I could not make it work. See "Unfinished attempt to use the `tracing-test` crate" #1147
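To illustrate the testing concern, here is a minimal sketch of the alternative to a global recorder: passing the recorder in explicitly, so each test owns its own instance. This is not the API of `metrics` or `prometheus`. The `Recorder` trait, `InMemoryRecorder` and `handle_announce` are hypothetical names used only for this example.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical recorder abstraction that is injected, not global.
pub trait Recorder: Send + Sync {
    fn increment_counter(&self, name: &'static str, by: u64);
}

/// Test-friendly recorder that keeps counters in memory.
#[derive(Default)]
pub struct InMemoryRecorder {
    counters: Mutex<HashMap<&'static str, u64>>,
}

impl InMemoryRecorder {
    /// Current value of a counter, or 0 if it was never incremented.
    pub fn get(&self, name: &str) -> u64 {
        self.counters
            .lock()
            .expect("recorder mutex poisoned")
            .get(name)
            .copied()
            .unwrap_or(0)
    }
}

impl Recorder for InMemoryRecorder {
    fn increment_counter(&self, name: &'static str, by: u64) {
        let mut counters = self.counters.lock().expect("recorder mutex poisoned");
        *counters.entry(name).or_insert(0) += by;
    }
}

/// Hypothetical handler. It depends only on the trait, so a test can
/// pass its own recorder and assert on it without shared global state.
pub fn handle_announce(recorder: &dyn Recorder) {
    recorder.increment_counter("announces_total", 1);
}
```

Because nothing is global, tests running in parallel cannot see each other's counters. The cost is that the recorder has to be threaded through the code, which is exactly what the macro-based crates avoid.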
I will:
- Read the `metrics` documentation.
- Compare the three alternatives:
  - Custom metrics
  - `metrics` crate
  - `prometheus` crate

And share my conclusions.
cc @da2ce7