CUDA Error 801: Operation Not Supported #870

@t19cs045-sub

Description

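I encountered a CUDA error while running a script that loads a model through the Llama class with GPU offloading. The model loads and all layers are offloaded, but the process then fails with “CUDA error 801 at ggml-cuda.cu:6799: operation not supported” and reports the current device as 0.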

Code Snippet:

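from llama_cpp import Llama

def question(message):
    # LLM setup: load the GGUF model and offload 32 layers to the GPU
    llm = Llama(model_path="./japanese-stablelm-instruct-gamma-7b-q8_0.gguf",
                n_gpu_layers=32)

    # The original snippet used `prompt` without defining it; passing the
    # message through unchanged is an assumption.
    prompt = message

    # Run inference
    output = llm(
        prompt,
        temperature=1,
        top_p=0.95,
        stop=["指示:", "入力:", "応答:"],
        echo=False,
        max_tokens=1024
    )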

Error Message:
llm_load_tensors: ggml ctx size = 0.11 MB
llm_load_tensors: using CUDA for GPU acceleration
llm_load_tensors: mem required = 132.92 MB
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 35/35 layers to GPU
llm_load_tensors: VRAM used: 7205.83 MB
...................................................................................................
llama_new_context_with_model: n_ctx = 512
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: offloading v cache to GPU
llama_kv_cache_init: offloading k cache to GPU
llama_kv_cache_init: VRAM kv self = 64.00 MB
llama_new_context_with_model: kv self size = 64.00 MB
llama_build_graph: non-view tensors processed: 740/740
llama_new_context_with_model: compute buffer total size = 79.63 MB
llama_new_context_with_model: VRAM scratch buffer: 73.00 MB
llama_new_context_with_model: total VRAM used: 7342.83 MB (model: 7205.83 MB, context: 137.00 MB)
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |

CUDA error 801 at ggml-cuda.cu:6799: operation not supported
current device: 0
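
For triage, the failing line can be pulled out of a captured log automatically. The following is a minimal sketch, not part of the original script: `parse_cuda_error` and `CUDA_ERROR_NAMES` are hypothetical names introduced here, and the regular expression assumes the log format shown above. The only mapping included is the one known from the CUDA runtime API, where code 801 is `cudaErrorNotSupported`.

```python
import re

# Known value from the CUDA runtime API: 801 is cudaErrorNotSupported.
# Other codes are intentionally left out of this sketch.
CUDA_ERROR_NAMES = {801: "cudaErrorNotSupported"}

# Assumed log format: "CUDA error <code> at <file>:<line>: <message>"
ERROR_RE = re.compile(r"CUDA error (\d+) at ([\w.\-]+):(\d+): (.+)")


def parse_cuda_error(log_text):
    """Return details of the first CUDA error line in log_text, or None."""
    match = ERROR_RE.search(log_text)
    if match is None:
        return None
    code = int(match.group(1))
    return {
        "code": code,
        "name": CUDA_ERROR_NAMES.get(code, "unknown"),
        "file": match.group(2),
        "line": int(match.group(3)),
        "message": match.group(4).strip(),
    }
```

Calling `parse_cuda_error` on the captured output of the script yields the error code, source file, and line number in one dictionary, which is convenient when comparing against other reports of the same failure.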

Environment:

NVIDIA-SMI 545.23.06
Driver Version: 545.23.06
CUDA Version: 12.3
GPU: Nvidia Quadro M4000 8GB
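
One detail that may be worth adding to the environment list is the GPU's compute capability, since a "not supported" error on an older card can mean the build relies on a feature the device lacks. That is an assumption about the cause, not a confirmed diagnosis. The sketch below reads the value from `nvidia-smi`; it assumes the installed driver supports the `compute_cap` query field, and `parse_gpu_line` and `query_gpus` are hypothetical helper names.

```python
import subprocess


def parse_gpu_line(line):
    """Split one 'name, compute_cap' CSV line into (name, (major, minor))."""
    name, cap = (part.strip() for part in line.rsplit(",", 1))
    major, minor = (int(x) for x in cap.split("."))
    return name, (major, minor)


def query_gpus():
    """Ask nvidia-smi for each GPU's name and compute capability.

    Assumes nvidia-smi is on PATH and accepts the compute_cap field.
    """
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_gpu_line(row) for row in result.stdout.splitlines() if row.strip()]
```

Running `query_gpus()` on the affected machine returns one `(name, (major, minor))` pair per GPU. The Quadro M4000 is a Maxwell-generation card with compute capability 5.2.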

Any help in resolving this issue would be greatly appreciated.
