Feel free to reach me at: 📫 [email protected]
- 🧩 PR #147: Contributed to `runpod-workers/worker-vllm`
  Added integration and environment support for the `bitsandbytes` quantization option, based on the vLLM docs:
  - Updated `requirements.txt` and resolved compatibility issues
  - Modified `README.md` and `worker-config.json` to reflect the new option
  - Fixed a compatibility issue with `typing-extensions`
  - Added automatic `quantization` fallback logic in `engine_args.py`
  - Deployed and tested the feature successfully in a RunPod serverless environment
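To give a rough idea of what a quantization fallback can look like, here is a minimal sketch. It is not the code from the PR: the function name `resolve_quantization`, the `SUPPORTED_QUANTIZATION` allow-list, and the choice to fall back to no quantization on an unknown value are all assumptions made for illustration. Pairing `load_format="bitsandbytes"` with `quantization="bitsandbytes"` follows the pattern shown in the vLLM docs for some versions and may not be required in others.

```python
from typing import Mapping, Optional

# Hypothetical allow-list for this sketch; the real worker may accept other values.
SUPPORTED_QUANTIZATION = {"awq", "gptq", "squeezellm", "bitsandbytes"}


def resolve_quantization(env: Mapping[str, str]) -> dict:
    """Build quantization-related engine arguments from environment-style settings.

    Assumed behavior: an empty or unrecognized QUANTIZATION value falls back
    to None (no quantization) instead of raising an error.
    """
    raw = env.get("QUANTIZATION", "").strip().lower()
    quantization: Optional[str] = raw if raw in SUPPORTED_QUANTIZATION else None

    args = {"quantization": quantization}
    if quantization == "bitsandbytes":
        # Assumption: the matching load format is set alongside the method.
        args["load_format"] = "bitsandbytes"
    return args
```

Calling `resolve_quantization(os.environ)` (after `import os`) would read the setting from the process environment; passing a plain dict makes the function easy to test.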