3 files changed, 14 insertions(+), 4 deletions(-)
@@ -293,6 +293,7 @@ steps:
   parallelism: 4

 - label: PyTorch Compilation Unit Tests
+  torch_nightly: true
   source_file_dependencies:
   - vllm/
   - tests/compile
@@ -302,6 +303,7 @@ steps:
   - pytest -v -s compile/test_sequence_parallelism.py

 - label: PyTorch Fullgraph Smoke Test # 9min
+  torch_nightly: true
   source_file_dependencies:
   - vllm/
   - tests/compile
@@ -312,6 +314,7 @@ steps:
   - pytest -v -s compile/piecewise/test_toy_llama.py

 - label: PyTorch Fullgraph Test # 18min
+  torch_nightly: true
   source_file_dependencies:
   - vllm/
   - tests/compile
@@ -436,6 +439,7 @@ steps:
 ##### models test #####

 - label: Basic Models Test # 24min
+  torch_nightly: true
   source_file_dependencies:
   - vllm/
   - tests/models
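
Reviewer note: this diff only adds the `torch_nightly: true` flag; how the flag is consumed is not shown here. As a rough sketch, assuming a pipeline generator picks out the flagged steps when building against a PyTorch nightly, the selection might look like the following. `select_nightly_steps` and the in-memory step dictionaries are hypothetical and are not part of vLLM or Buildkite.

```python
from typing import Any

Step = dict[str, Any]


def select_nightly_steps(steps: list[Step]) -> list[Step]:
    """Return only the steps explicitly flagged with torch_nightly: true.

    Hypothetical helper: assumes steps are already parsed into dictionaries
    and that a missing flag means "do not run against the nightly build".
    """
    return [step for step in steps if step.get("torch_nightly") is True]


# Illustrative data: the first two labels come from the diff above,
# the third is made up to show an unflagged step being skipped.
steps = [
    {"label": "PyTorch Compilation Unit Tests", "torch_nightly": True},
    {"label": "PyTorch Fullgraph Smoke Test", "torch_nightly": True},
    {"label": "Hypothetical Unflagged Step"},
]

nightly_labels = [step["label"] for step in select_nightly_steps(steps)]
```

Treating an absent flag as "skip" keeps the nightly run opt-in, which matches the per-step way the flag is added in this change.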
@@ -23,5 +23,11 @@ runai-model-streamer-s3==0.11.0
 tensorizer>=2.9.0
 lm-eval==0.4.8
 buildkite-test-collector==0.1.9
-
 lm-eval[api]==0.4.8 # required for model evaluation test
+
+# required for quantization test
+bitsandbytes>=0.45.3
+
+# required for minicpmo_26 test
+vector_quantize_pytorch
+vocos
@@ -186,9 +186,9 @@ class SamplingParams(
         logits_processors: list of functions that modify logits based on
             previously generated tokens, and optionally prompt tokens as
             a first argument.
-        truncate_prompt_tokens: If set to -1, will use the truncation size
-            supported by the model. If set to an integer k, will use only
-            the last k tokens from the prompt (i.e., left truncation).
+        truncate_prompt_tokens: If set to -1, will use the truncation size
+            supported by the model. If set to an integer k, will use only
+            the last k tokens from the prompt (i.e., left truncation).
             Defaults to None (i.e., no truncation).
         guided_decoding: If provided, the engine will construct a guided
             decoding logits processor from these parameters. Defaults to None.
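
Reviewer note: the `truncate_prompt_tokens` semantics described in the docstring can be illustrated with a small standalone sketch. This is not vLLM's implementation; `left_truncate` and its `model_max_len` parameter are hypothetical, and the sketch assumes "the truncation size supported by the model" means a maximum prompt length known to the caller.

```python
from typing import Optional


def left_truncate(
    prompt_token_ids: list[int],
    truncate_prompt_tokens: Optional[int],
    model_max_len: int,
) -> list[int]:
    """Keep only the last k prompt tokens (left truncation).

    Hypothetical helper mirroring the docstring:
      None -> no truncation
      -1   -> use the model's own limit (assumed here to be model_max_len)
      k    -> keep the last k tokens
    """
    if truncate_prompt_tokens is None:
        return prompt_token_ids
    k = model_max_len if truncate_prompt_tokens == -1 else truncate_prompt_tokens
    if k < 1:
        raise ValueError("truncate_prompt_tokens must be -1 or a positive integer")
    # Negative slicing drops tokens from the left and keeps the tail.
    return prompt_token_ids[-k:]


tokens = [1, 2, 3, 4, 5]
kept_last_two = left_truncate(tokens, 2, model_max_len=4)
kept_model_limit = left_truncate(tokens, -1, model_max_len=4)
kept_all = left_truncate(tokens, None, model_max_len=4)
```

Dropping tokens from the left keeps the most recent part of the prompt, which is the part closest to where generation starts.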