Commit 0ba830b

markmc authored and lk-chen committed
[V1][Metrics] Fix http metrics middleware (vllm-project#15894)
1 parent 7d2da84 commit 0ba830b

2 files changed: 29 additions and 18 deletions

docs/source/design/v1/metrics.md

Lines changed: 11 additions & 0 deletions
@@ -86,6 +86,17 @@ See [the PR which added this Dashboard](gh-pr:2316) for interesting and useful b
 
 Prometheus support was initially added [using the aioprometheus library](gh-pr:1890), but a switch was made quickly to [prometheus_client](gh-pr:2730). The rationale is discussed in both linked PRs.
 
+With the switch to `aioprometheus`, we lost a `MetricsMiddleware` to track HTTP metrics, but this was reinstated [using prometheus_fastapi_instrumentator](gh-pr:15657):
+
+```bash
+$ curl http://0.0.0.0:8000/metrics 2>/dev/null | grep -P '^http_(?!.*(_bucket|_created|_sum)).*'
+http_requests_total{handler="/v1/completions",method="POST",status="2xx"} 201.0
+http_request_size_bytes_count{handler="/v1/completions"} 201.0
+http_response_size_bytes_count{handler="/v1/completions"} 201.0
+http_request_duration_highr_seconds_count 201.0
+http_request_duration_seconds_count{handler="/v1/completions",method="POST"} 201.0
+```
+
 ### Multi-process Mode
 
 In v0, metrics are collected in the engine core process and we use multi-process mode to make them available in the API server process. See <gh-pr:7279>.
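
The `grep -P` filter in the added docs example keeps `http_` series and drops the `_bucket`, `_created`, and `_sum` lines. A minimal Python sketch of the same filter follows, assuming the `/metrics` response body is already available as text. The `filter_http_metrics` name and the `SAMPLE` payload (including its values) are hypothetical and only illustrate the pattern.

```python
import re

# Same pattern as the grep -P call in the docs example: lines that start with
# "http_" and do not contain "_bucket", "_created" or "_sum" after the prefix.
HTTP_SUMMARY = re.compile(r"^http_(?!.*(_bucket|_created|_sum)).*")


def filter_http_metrics(exposition: str) -> list[str]:
    """Return the lines of a Prometheus text payload that the pattern keeps."""
    return [line for line in exposition.splitlines() if HTTP_SUMMARY.match(line)]


# Hypothetical payload, made up for illustration only.
SAMPLE = """\
# HELP http_requests_total Total number of requests
http_requests_total{handler="/v1/completions",method="POST",status="2xx"} 201.0
http_requests_created{handler="/v1/completions",method="POST",status="2xx"} 1.0
http_request_size_bytes_count{handler="/v1/completions"} 201.0
http_request_size_bytes_sum{handler="/v1/completions"} 12345.0
http_request_duration_seconds_bucket{handler="/v1/completions",le="0.1",method="POST"} 10.0
vllm:num_requests_running 0.0
"""

if __name__ == "__main__":
    for line in filter_http_metrics(SAMPLE):
        print(line)
```

Comment lines (`# HELP ...`) and non-HTTP series fall out because the pattern is anchored at `http_`.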

vllm/entrypoints/openai/api_server.py

Lines changed: 18 additions & 18 deletions
@@ -310,33 +310,33 @@ def mount_metrics(app: FastAPI):
     # We need to set PROMETHEUS_MULTIPROC_DIR environment variable
     # before prometheus_client is imported.
     # See https://prometheus.github.io/client_python/multiprocess/
-    from prometheus_client import (CollectorRegistry, make_asgi_app,
+    from prometheus_client import (REGISTRY, CollectorRegistry, make_asgi_app,
                                    multiprocess)
     from prometheus_fastapi_instrumentator import Instrumentator
 
+    registry = REGISTRY
+
     prometheus_multiproc_dir_path = os.getenv("PROMETHEUS_MULTIPROC_DIR", None)
     if prometheus_multiproc_dir_path is not None:
         logger.debug("vLLM to use %s as PROMETHEUS_MULTIPROC_DIR",
                      prometheus_multiproc_dir_path)
         registry = CollectorRegistry()
         multiprocess.MultiProcessCollector(registry)
-        Instrumentator(
-            excluded_handlers=[
-                "/metrics",
-                "/health",
-                "/load",
-                "/ping",
-                "/version",
-                "/server_info",
-            ],
-            registry=registry,
-        ).add().instrument(app).expose(app)
-
-        # Add prometheus asgi middleware to route /metrics requests
-        metrics_route = Mount("/metrics", make_asgi_app(registry=registry))
-    else:
-        # Add prometheus asgi middleware to route /metrics requests
-        metrics_route = Mount("/metrics", make_asgi_app())
+
+    Instrumentator(
+        excluded_handlers=[
+            "/metrics",
+            "/health",
+            "/load",
+            "/ping",
+            "/version",
+            "/server_info",
+        ],
+        registry=registry,
+    ).add().instrument(app).expose(app)
+
+    # Add prometheus asgi middleware to route /metrics requests
+    metrics_route = Mount("/metrics", make_asgi_app(registry=registry))
 
     # Workaround for 307 Redirect for /metrics
     metrics_route.path_regex = re.compile("^/metrics(?P<path>.*)$")
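
The control-flow change in this hunk can be sketched without any Prometheus dependency. The model below is a simplified illustration, not vLLM code: `mount_before` and `mount_after` are hypothetical names, the registry is represented by a plain string, and the boolean stands in for whether the `Instrumentator(...)` call runs.

```python
from typing import Mapping, Tuple


def mount_before(environ: Mapping[str, str]) -> Tuple[str, bool]:
    """Pre-change flow: the Instrumentator call sat inside the multiprocess branch."""
    if environ.get("PROMETHEUS_MULTIPROC_DIR") is not None:
        registry = "multiprocess"   # stand-in for a fresh CollectorRegistry()
        http_instrumented = True    # stand-in for Instrumentator(...).instrument(app)
    else:
        registry = "default"        # make_asgi_app() was called with no registry
        http_instrumented = False   # no Instrumentator call on this path
    return registry, http_instrumented


def mount_after(environ: Mapping[str, str]) -> Tuple[str, bool]:
    """Post-change flow: pick the registry first, then always instrument."""
    registry = "default"            # stand-in for prometheus_client.REGISTRY
    if environ.get("PROMETHEUS_MULTIPROC_DIR") is not None:
        registry = "multiprocess"   # stand-in for a fresh CollectorRegistry()
    http_instrumented = True        # the Instrumentator call is now unconditional
    return registry, http_instrumented


if __name__ == "__main__":
    for env in ({}, {"PROMETHEUS_MULTIPROC_DIR": "/tmp/prom"}):
        print(env, mount_before(env), mount_after(env))
```

In this model the two functions differ only when `PROMETHEUS_MULTIPROC_DIR` is unset, which is the path where the old code skipped the `Instrumentator` call.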
