Error loading models since versions 0.6.1xxx #8745

@IdoAmit198

Description

Your current environment

PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.6 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Clang version: Could not collect
CMake version: version 3.30.0
Libc version: glibc-2.31

Python version: 3.11.7 (main, Dec 15 2023, 18:12:31) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.4.0-187-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: 12.3.107
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: 
GPU 0: NVIDIA A40
GPU 1: NVIDIA A40
GPU 2: NVIDIA A40
GPU 3: NVIDIA A40

Nvidia driver version: 535.183.01
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Byte Order:                         Little Endian
Address sizes:                      46 bits physical, 57 bits virtual
CPU(s):                             96
On-line CPU(s) list:                0-95
Thread(s) per core:                 2
Core(s) per socket:                 24
Socket(s):                          2
NUMA node(s):                       2
Vendor ID:                          GenuineIntel
CPU family:                         6
Model:                              106
Model name:                         Intel(R) Xeon(R) Gold 6336Y CPU @ 2.40GHz
Stepping:                           6
CPU MHz:                            800.012
BogoMIPS:                           4800.00
Virtualization:                     VT-x
...

How you are installing vllm

pip install -U vllm

It seems that over the past few weeks a lot of crucial updates needed to use vLLM properly have been made; they exist in version 0.6.1 but are missing from version 0.6.1.post2. However, the version available through pip is the old 0.6.1.post2.

For example, #8157 possibly fixes issue #8553, which I am also hitting.
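In case it helps, this is how I pinned the older release, plus the from-source route that, as I understand it, picks up fixes not yet released on PyPI (a sketch, assuming a standard CUDA build environment):

pip install vllm==0.6.1

# or, to get fixes that are only on main:
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -e .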


An update - after installing version 0.6.1 via pip, I am still getting the error from issue #8553 when I try to initialize the model (which I had already downloaded through the Hugging Face interface).
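For reference, the initialization is roughly the following minimal sketch. The checkpoint name is a hypothetical stand-in (the real one is a Phi-3 variant, truncated in the error below), and tensor_parallel_size=4 just matches the four A40s in my environment:

from vllm import LLM

llm = LLM(
    model="microsoft/Phi-3.5-MoE-instruct",  # hypothetical stand-in for my actual Phi-3 checkpoint
    trust_remote_code=True,                  # Phi-3 ships custom modeling code
    tensor_parallel_size=4,                  # assumption: one worker per A40
)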

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/ido.amit/miniconda3/envs/benchmark/lib/python3.11/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ido.amit/miniconda3/envs/benchmark/lib/python3.11/multiprocessing/spawn.py", line 132, in _main
    self = reduction.pickle.load(from_parent)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'transformers_modules.microsoft.Phi-3'
ERROR 09-24 00:33:22 multiproc_worker_utils.py:120] Worker VllmWorkerProcess pid 3840118 died, exit code: 1
INFO 09-24 00:33:22 multiproc_worker_utils.py:123] Killing local vLLM worker processes
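Since the failure happens while the spawned worker unpickles its state, one knob that might be related (an assumption on my part, not a confirmed fix) is vLLM's worker start method, controlled by the VLLM_WORKER_MULTIPROC_METHOD environment variable:

VLLM_WORKER_MULTIPROC_METHOD=fork python my_script.py

Here my_script.py is a placeholder for whatever script builds the LLM; with fork, the workers inherit the parent's already-imported transformers_modules instead of re-importing it in a fresh interpreter.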

Thanks in advance for the help!
