How would you like to use vllm
I need to extend the context length of the gemma2-9b model, and also of other models like llama3.1-8b. Can this be done with RoPE scaling? If so, how do I use the --rope-scaling and --rope-theta arguments? And do the two configs below have different things to consider for RoPE scaling? I need to extend the context up to 128k tokens.
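For reference, this is roughly the invocation I am picturing. Both flags are engine arguments, and I am assuming --rope-scaling takes a JSON string; the key names "rope_type", "factor", and "original_max_position_embeddings" are my guess based on the transformers rope_scaling config format, not something I have verified against vLLM:

# NOTE: JSON key names below are assumed from the transformers rope_scaling format
$ vllm serve <model> \
    --rope-scaling '{"rope_type": "yarn", "factor": 16.0, "original_max_position_embeddings": 8192}' \
    --rope-theta 10000 \
    --max-model-len 131072

Since both models ship with max_position_embeddings of 8192, reaching 128k tokens would mean a factor of 131072 / 8192 = 16.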
$ cat models--google--gemma-2-9b-it/config.json
{
"architectures": [
"Gemma2ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"attn_logit_softcapping": 50.0,
"bos_token_id": 2,
"cache_implementation": "hybrid",
"eos_token_id": 1,
"final_logit_softcapping": 30.0,
"head_dim": 256,
"hidden_act": "gelu_pytorch_tanh",
"hidden_activation": "gelu_pytorch_tanh",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 8192,
"model_type": "gemma2",
"num_attention_heads": 16,
"num_hidden_layers": 42,
"num_key_value_heads": 8,
"pad_token_id": 0,
"query_pre_attn_scalar": 256,
"rms_norm_eps": 1e-06,
"rope_theta": 10000.0,
"sliding_window": 4096,
"sliding_window_size": 4096,
"torch_dtype": "bfloat16",
"transformers_version": "4.42.0.dev0",
"use_cache": true,
"vocab_size": 256000
}
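The fields I think matter here are max_position_embeddings (8192) and rope_theta (10000.0), so gemma-2 would need the full 16x factor. A sketch of what I would try, under the same assumptions as above:

# factor assumed as 131072 / 8192 = 16; key names assumed from the transformers format
$ vllm serve google/gemma-2-9b-it \
    --rope-scaling '{"rope_type": "yarn", "factor": 16.0, "original_max_position_embeddings": 8192}' \
    --max-model-len 131072

I am also unsure how the 4096-token sliding_window interacts with RoPE scaling here, which is part of my question.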
$ cat NousResearch--Meta-Llama-3-8B-Instruct/config.json
{
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"eos_token_id": 128009,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 8192,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 500000.0,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.0.dev0",
"use_cache": true,
"vocab_size": 128256
}
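For the Llama config, rope_theta is already 500000.0 and rope_scaling is null. My understanding (an assumption, not verified) is that Llama 3.1 checkpoints already ship a rope_scaling block with rope_type "llama3" for 128k context, so an override like the sketch below may only be needed for the Llama 3 weights shown above:

# same assumed key names as above; --rope-theta left at the checkpoint default of 500000
$ vllm serve NousResearch/Meta-Llama-3-8B-Instruct \
    --rope-scaling '{"rope_type": "yarn", "factor": 16.0, "original_max_position_embeddings": 8192}' \
    --max-model-len 131072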