
[Feature]: Decouple the benchmark script from the components of vLLM #5586

@zhyncs

Description


🚀 The feature, motivation and pitch

Currently, vLLM's benchmark script supports multiple backends and offers fairly rich functionality overall.

It relies on backend_request_func and get_tokenizer. backend_request_func is self-contained in a separate file, but to use get_tokenizer we currently have to clone the vLLM repository or install the Python package.

from backend_request_func import (ASYNC_REQUEST_FUNCS, RequestFuncInput,
                                  RequestFuncOutput)
from tqdm.asyncio import tqdm
from transformers import PreTrainedTokenizerBase
from vllm.transformers_utils.tokenizer import get_tokenizer

def get_tokenizer(

When we use the vLLM benchmark script to benchmark other backends, we typically do not want to depend on vLLM components at all, i.e. we do not want to have to clone the repository or install the Python package.

May I submit a PR that extracts the get_tokenizer function into backend_request_func? Would that be acceptable, or do you have other suggestions? Thanks.
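
For illustration, a minimal self-contained version could look roughly like the sketch below. This is only a sketch, assuming transformers' AutoTokenizer covers the common cases; the signature here is a placeholder, not vLLM's actual implementation, which also handles extra options (e.g. a slow-tokenizer mode).

from typing import Union

from transformers import (AutoTokenizer, PreTrainedTokenizer,
                          PreTrainedTokenizerFast)

def get_tokenizer(
    pretrained_model_name_or_path: str,
    trust_remote_code: bool = False,
) -> Union[PreTrainedTokenizer, PreTrainedTokenizerFast]:
    # Hypothetical standalone helper: load the tokenizer directly via
    # transformers so the benchmark script has no dependency on vLLM.
    return AutoTokenizer.from_pretrained(
        pretrained_model_name_or_path,
        trust_remote_code=trust_remote_code)

The benchmark script could then import get_tokenizer from backend_request_func instead of from vllm.transformers_utils.tokenizer.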

@ywang96 @simon-mo

Alternatives

No response

Additional context

No response
