System Info
Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.
- `transformers` version: 4.29.2
- Platform: Linux-3.10.0-1160.42.2.el7.x86_64-x86_64-with-glibc2.35
- Python version: 3.9.16
- Huggingface_hub version: 0.14.1
- Safetensors version: not installed
- PyTorch version (GPU?): 2.0.1+cu118 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Code:

```python
from transformers import AutoTokenizer, LlamaTokenizer

auto_tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b", add_eos_token=True, use_fast=True)
llama_tokenizer = LlamaTokenizer.from_pretrained("huggyllama/llama-7b", add_eos_token=True, use_fast=True)
print(auto_tokenizer.decode(auto_tokenizer.encode("auto_tokenizer", add_special_tokens=True)))
print(llama_tokenizer.decode(llama_tokenizer.encode("llama_tokenizer", add_special_tokens=True)))
```

Results:

```
<s> auto_tokenizer
<s> llama_tokenizer</s>
```

Expected behavior
Both tokenizers should add the EOS token, like:

```
<s> auto_tokenizer</s>
<s> llama_tokenizer</s>
```
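The expected invariant can be written as a small self-contained check. This is a sketch only: `ends_with_eos` is a hypothetical helper, not part of `transformers`, and the token ids in the examples are made up for illustration.

```python
def ends_with_eos(token_ids, eos_token_id):
    """Return True if an encoded sequence ends with the EOS token id."""
    return len(token_ids) > 0 and token_ids[-1] == eos_token_id

# With add_eos_token=True, any Llama tokenizer should satisfy:
#   ids = tok.encode("some text", add_special_tokens=True)
#   ends_with_eos(ids, tok.eos_token_id)  -> True
# Per the reproduction above, the slow LlamaTokenizer passes this check
# while the fast tokenizer returned by AutoTokenizer does not.

print(ends_with_eos([1, 3148, 2], 2))  # sequence ending in eos id 2 -> True
print(ends_with_eos([1, 3148], 2))     # eos missing -> False
```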