Gibberish output with Llama-2-7b-chat-hf-q4f32_1 #356

@beaufortfrancois

Chrome Version: 125.0.6283.3
OS: ChromeOS
GPU: Intel(R) Graphics (ADL GT2) - Intel open-source Mesa driver: Mesa 23.3.0 (git-5cb3f1e4fa)
Dawn Backend: Vulkan

What steps will reproduce the problem?

  1. Go to https://webllm.mlc.ai/#chat-demo
  2. Select Llama-2-7b-chat-hf-q4f32_1
  3. Enter "What color is the dress?"

What is the expected result?
Some text that at least makes sense.

What happens instead?
Some gibberish text appears.
DevTools JavaScript console contains the following logs:

llm_chat.ts:150 Using prefillChunkSize:  1024
llm_chat.ts:180 Using maxWindowLength:  4096
llm_chat.ts:202 Using Paged KVCache
15vkAllocateMemory failed with VK_ERROR_OUT_OF_DEVICE_MEMORY
    at CheckVkOOMThenSuccessImpl (..<URL>)

15vkAllocateMemory failed with VK_ERROR_OUT_OF_DEVICE_MEMORY
    at CheckVkOOMThenSuccessImpl (..<URL>)

Then I enter "What color is the dress?"

97[Invalid Buffer (unlabeled)] is invalid.
 - While validating entries[0] as a Buffer.
Expected entry layout: { type: BufferBindingType::Storage, hasDynamicOffset: 0, minBindingSize: 0 }
 - While validating [BindGroupDescriptor] against [BindGroupLayout (unlabeled)]
 - While calling [Device].CreateBindGroup([BindGroupDescriptor]).

162[Invalid BindGroup (unlabeled)] is invalid.
 - While encoding [ComputePassEncoder (unlabeled)].SetBindGroup(0, [Invalid BindGroup (unlabeled)], 0, ...).

161[Invalid CommandBuffer] is invalid.
 - While calling [Queue].Submit([[Invalid CommandBuffer]])

97[Invalid Buffer (unlabeled)] is invalid.
 - While validating entries[0] as a Buffer.
Expected entry layout: { type: BufferBindingType::Storage, hasDynamicOffset: 0, minBindingSize: 0 }
 - While validating [BindGroupDescriptor] against [BindGroupLayout (unlabeled)]
 - While calling [Device].CreateBindGroup([BindGroupDescriptor]).

65[Invalid Buffer (unlabeled)] is invalid.
 - While validating entries[0] as a Buffer.
Expected entry layout: { type: BufferBindingType::ReadOnlyStorage, hasDynamicOffset: 0, minBindingSize: 0 }
 - While validating [BindGroupDescriptor] against [BindGroupLayout (unlabeled)]
 - While calling [Device].CreateBindGroup([BindGroupDescriptor]).

162[Invalid BindGroup (unlabeled)] is invalid.
 - While encoding [ComputePassEncoder (unlabeled)].SetBindGroup(0, [Invalid BindGroup (unlabeled)], 0, ...).

161[Invalid CommandBuffer] is invalid.
 - While calling [Queue].Submit([[Invalid CommandBuffer]])

65[Invalid Buffer (unlabeled)] is invalid.
 - While validating entries[0] as a Buffer.
Expected entry layout: { type: BufferBindingType::ReadOnlyStorage, hasDynamicOffset: 0, minBindingSize: 0 }
 - While validating [BindGroupDescriptor] against [BindGroupLayout (unlabeled)]
 - While calling [Device].CreateBindGroup([BindGroupDescriptor]).

/#chat-demo:1 WebGPU: too many warnings, no more warnings will be reported to the console for this GPUDevice.
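
The `VK_ERROR_OUT_OF_DEVICE_MEMORY` lines above suggest that some GPU allocations failed while the model was loading, and the later `Invalid Buffer` errors look like the result of binding those failed buffers. Below is a minimal sketch of how a page could detect this early with WebGPU error scopes. `createBufferChecked` and the `*Like` interfaces are hypothetical names introduced for illustration and are not part of WebLLM. The sketch assumes the failure surfaces as an `out-of-memory` error on `createBuffer`.

```typescript
// Minimal structural types so the sketch compiles without @webgpu/types.
// A real GPUDevice satisfies DeviceLike.
interface ErrorLike { message: string }
interface BufferLike { destroy(): void }
interface DeviceLike {
  pushErrorScope(filter: "out-of-memory" | "validation" | "internal"): void;
  popErrorScope(): Promise<ErrorLike | null>;
  createBuffer(descriptor: { size: number; usage: number }): BufferLike;
}

// Hypothetical helper: create a buffer and turn a failed allocation into a
// thrown error instead of silently returning an invalid buffer.
async function createBufferChecked(
  device: DeviceLike,
  size: number,
  usage: number,
): Promise<BufferLike> {
  device.pushErrorScope("out-of-memory");
  const buffer = device.createBuffer({ size, usage });
  const error = await device.popErrorScope();
  if (error !== null) {
    // The returned object is an invalid buffer; release it and report.
    buffer.destroy();
    throw new Error(`Allocating ${size} bytes failed: ${error.message}`);
  }
  return buffer;
}
```

In a browser this could be called as `await createBufferChecked(device, byteLength, GPUBufferUsage.STORAGE)`, so that an allocation failure stops model loading with a clear message rather than producing gibberish later.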

Note
It works properly with the following f16 variants: Llama-2-7b-chat-hf-q4f16_1 and Llama-2-7b-chat-hf-q4f16_1-1k.
I can also reproduce the issue with Llama-2-13b-chat-hf-q4f16_1.
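
One possible explanation for this pattern is KV cache size, which the rough estimate below works through. It is only a sketch under stated assumptions: that the cache is sized for the full `maxWindowLength` of 4096 from the log, that its element type matches the `f32`/`f16` part of the model name, and that the layer counts and hidden sizes are the published Llama 2 values (32 × 4096 for 7B, 40 × 5120 for 13B). Weights and other buffers are ignored, and `kvCacheBytes` is a hypothetical helper, not a WebLLM function.

```typescript
interface ModelShape {
  layers: number;     // number of transformer layers
  hiddenSize: number; // heads * headDim
}

// Bytes needed to hold keys and values for every layer and every position.
function kvCacheBytes(
  shape: ModelShape,
  contextLength: number,
  bytesPerElement: number,
): number {
  const keysAndValues = 2;
  return keysAndValues * shape.layers * contextLength * shape.hiddenSize * bytesPerElement;
}

const GiB = 1024 * 1024 * 1024;
const llama2_7b: ModelShape = { layers: 32, hiddenSize: 4096 };
const llama2_13b: ModelShape = { layers: 40, hiddenSize: 5120 };

const estimates = {
  "7b f32, 4096 ctx": kvCacheBytes(llama2_7b, 4096, 4) / GiB,   // 4
  "7b f16, 4096 ctx": kvCacheBytes(llama2_7b, 4096, 2) / GiB,   // 2
  "7b f16, 1024 ctx": kvCacheBytes(llama2_7b, 1024, 2) / GiB,   // 0.5
  "13b f16, 4096 ctx": kvCacheBytes(llama2_13b, 4096, 2) / GiB, // 3.125
};
console.log(estimates);
```

Under these assumptions the two failing configurations need the largest caches (4 GiB and about 3.1 GiB), while the working f16 7B variants need 2 GiB or less. That fits the out-of-memory errors but does not confirm the cause.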

