benchmark for chatglm2-6b failed #138

@elinx

I converted the chatglm2-6b model, and the engine build ran fine with this command:

python3 build.py --model_dir=${model_dir} \
                 --dtype float16 \
                 --use_gpt_attention_plugin float16 \
                 --use_gemm_plugin float16

However, the benchmark failed when run with the following command:

../../cpp/build/benchmarks/gptSessionBenchmark --duration 30 --model chatglm2-6b --engine_dir /code/tensorrt_llm/examples/chatglm2-6b/trtModel --batch_size 1 --input_output_len 32,1

Error message:

[TensorRT-LLM][ERROR] [TensorRT-LLM][ERROR] Assertion failed: position_ids: expected 2 dims, provided 3 dims (/code/tensorrt_llm/cpp/tensorrt_llm/runtime/tllmRuntime.cpp:139)
1       0x561e1c97c6ee tensorrt_llm::common::throwRuntimeError(char const*, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 100
2       0x7f6be25bd53b tensorrt_llm::runtime::TllmRuntime::setInputTensors(int, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<tensorrt_llm::runtime::ITensor>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<tensorrt_llm::runtime::ITensor> > > > const&) + 1867
3       0x7f6be2587453 tensorrt_llm::runtime::GptSession::generateSingleBatch(tensorrt_llm::runtime::GenerationOutput&, tensorrt_llm::runtime::GenerationInput const&, tensorrt_llm::runtime::SamplingConfig const&) + 2211
4       0x561e1c980537 ../../cpp/build/benchmarks/gptSessionBenchmark(+0x17537) [0x561e1c980537]
5       0x7f6ba4edcd90 /usr/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f6ba4edcd90]
6       0x7f6ba4edce40 __libc_start_main + 128
7       0x561e1c981fe5 ../../cpp/build/benchmarks/gptSessionBenchmark(+0x18fe5) [0x561e1c981fe5]
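
To make the assertion easier to read: the message reports a rank mismatch for the position_ids input, where 2 dimensions were expected and 3 were provided. The snippet below is only a minimal Python sketch of that kind of check. It is not TensorRT-LLM code; the helper name check_rank and the shapes are hypothetical, chosen to reproduce the wording of the message.

```python
def check_rank(name, shape, expected_rank):
    """Raise if the number of dimensions in `shape` differs from `expected_rank`."""
    if len(shape) != expected_rank:
        raise ValueError(
            f"{name}: expected {expected_rank} dims, provided {len(shape)} dims"
        )


# Hypothetical shapes for illustration only (assumed, not taken from the engine).
batch_size, input_len = 1, 32

# A 2-D tensor passes the check.
check_rank("position_ids", (batch_size, input_len), 2)

# A 3-D tensor fails with the same wording as the reported assertion.
try:
    check_rank("position_ids", (batch_size, 2, input_len), 2)
except ValueError as err:
    message = str(err)
    print(message)
```

The batch size and input length mirror the --batch_size 1 and --input_output_len 32,1 arguments above; the extra middle dimension in the failing case is an assumption made only to trigger the mismatch.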

Labels: bug (Something isn't working), triaged (Issue has been triaged by maintainers)
