Commit 6e2527a

[Doc] Update documentation on Tensorizer (#5471)
1 parent cdab68d commit 6e2527a

3 files changed: +14 -1 lines changed

docs/source/index.rst

Lines changed: 1 addition & 0 deletions
@@ -81,6 +81,7 @@ Documentation
    serving/env_vars
    serving/usage_stats
    serving/integrations
+   serving/tensorizer
 
 .. toctree::
    :maxdepth: 1

docs/source/serving/tensorizer.rst

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+.. _tensorizer:
+
+Loading Models with CoreWeave's Tensorizer
+==========================================
+vLLM supports loading models with `CoreWeave's Tensorizer <https://docs.coreweave.com/coreweave-machine-learning-and-ai/inference/tensorizer>`_.
+vLLM model tensors that have been serialized to disk, an HTTP/HTTPS endpoint, or S3 endpoint can be deserialized
+at runtime extremely quickly directly to the GPU, resulting in significantly
+shorter Pod startup times and CPU memory usage. Tensor encryption is also supported.
+
+For more information on CoreWeave's Tensorizer, please refer to
+`CoreWeave's Tensorizer documentation <https://github.com/coreweave/tensorizer>`_. For more information on serializing a vLLM model, as well a general usage guide to using Tensorizer with vLLM, see
+the `vLLM example script <https://docs.vllm.ai/en/stable/getting_started/examples/tensorize_vllm_model.html>`_.
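
As a rough illustration of what the new page describes, the sketch below assembles a server launch command that selects the tensorizer loader. The `--load-format tensorizer` value comes from the help text touched by this commit. The `--model-loader-extra-config` flag and its `tensorizer_uri` key are assumptions based on the vLLM example script and should be checked against the installed version. The model name and S3 URI are placeholders, and `build_server_command` is a hypothetical helper, not part of vLLM.

```python
import json
import shlex


def build_server_command(model: str, tensorizer_uri: str) -> list[str]:
    """Build argv for the vLLM OpenAI-compatible server with tensorizer loading.

    Assumption: the extra loader config is passed as a JSON string whose
    "tensorizer_uri" key points at the serialized tensors.
    """
    extra_config = json.dumps({"tensorizer_uri": tensorizer_uri})
    return [
        "python", "-m", "vllm.entrypoints.openai.api_server",
        "--model", model,
        "--load-format", "tensorizer",
        "--model-loader-extra-config", extra_config,
    ]


# Placeholder model name and bucket path, for illustration only.
command = build_server_command(
    "facebook/opt-125m",
    "s3://my-bucket/vllm/facebook/opt-125m/v1/model.tensors",
)
# shlex.join quotes the JSON argument so the line can be pasted into a shell.
print(shlex.join(command))
```

Building the argument list in Python rather than hand-writing the shell line avoids quoting mistakes in the JSON argument.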

vllm/engine/arg_utils.py

Lines changed: 1 addition & 1 deletion
@@ -230,7 +230,7 @@ def add_cli_args(
 '* "dummy" will initialize the weights with random values, '
 'which is mainly for profiling.\n'
 '* "tensorizer" will load the weights using tensorizer from '
-'CoreWeave. See the Tensorize vLLM Model script in the Examples'
+'CoreWeave. See the Tensorize vLLM Model script in the Examples '
 'section for more information.\n'
 '* "bitsandbytes" will load the weights using bitsandbytes '
 'quantization.\n')
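
The added trailing space matters because Python joins adjacent string literals with nothing in between, so the old help text rendered as "Examplessection". A minimal reproduction using only the two fragments from this hunk (the variable names `before` and `after` are illustrative):

```python
# Adjacent string literals are concatenated at compile time with no separator.
before = ('CoreWeave. See the Tensorize vLLM Model script in the Examples'
          'section for more information.\n')
after = ('CoreWeave. See the Tensorize vLLM Model script in the Examples '
         'section for more information.\n')

# Without the trailing space the two words run together.
print("Examplessection" in before)
# With the trailing space the help text reads correctly.
print("Examples section" in after)
```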
