System Info
transformers 4.34.0.dev0, running on a TPU v4-8. Might happen on other platforms as well.
Who can help?
Reproduction
Decoding is extremely slow using Transformers 4.34.0.dev0.
A small script to reproduce:
import argparse
import time

from transformers import AutoTokenizer


def measure_tokenization_speed(tokenizer, sentences):
    start_time = time.time()
    outputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    end_time = time.time()
    print(f"Time taken for encoding: {end_time - start_time} seconds")
    return outputs["input_ids"]


def measure_detokenization_speed(tokenizer, input_ids):
    start_time = time.time()
    decoded_sentences = tokenizer.batch_decode(input_ids)
    end_time = time.time()
    print(f"Time taken for decoding: {end_time - start_time} seconds")


def main(args):
    tokenizer = AutoTokenizer.from_pretrained("openai/whisper-medium", use_fast=True)
    # Create an array of 1000 sentences
    sentences = ["This is a sample sentence."] * 1000
    input_ids = measure_tokenization_speed(tokenizer, sentences)
    measure_detokenization_speed(tokenizer, input_ids)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Measure the speed of HuggingFace tokenizer.")
    args = parser.parse_args()
    main(args)
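For less noisy numbers than a single `time.time()` pair, the measurement can also be done with a small `timeit`-based helper (a sketch using only the standard library; `benchmark` is a hypothetical name, not part of the script above):

```python
import timeit


def benchmark(fn, *args, repeat=5):
    """Best wall-clock time in seconds over `repeat` single runs of fn(*args).

    Taking the minimum over several runs filters out one-off interference
    (warm-up, caching, background load) better than a single measurement.
    """
    times = timeit.repeat(lambda: fn(*args), number=1, repeat=repeat)
    return min(times)
```

For example, `benchmark(tokenizer.batch_decode, input_ids)` would report the best-case decode time instead of whatever a single run happens to measure.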
tpu v4-8 (transformers 4.34.0.dev0)
Time taken for encoding: 1.1659502983093262 seconds
Time taken for decoding: 39.807389974594116 seconds
tpu v4-8 (transformers 4.30.1)
Time taken for encoding: 1.2527313232421875 seconds
Time taken for decoding: 1.8215229511260986 seconds
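A follow-up diagnostic (a sketch; `decode_fn` is a stand-in parameter, so running it does not require transformers) is to time decoding over growing batch sizes, which shows whether the regression grows linearly with the number of sequences or superlinearly:

```python
import time


def time_over_batch_sizes(decode_fn, ids, sizes=(10, 100, 1000)):
    """Time decode_fn on progressively larger slices of ids.

    Returns a dict mapping batch size -> elapsed seconds. Roughly linear
    growth suggests a fixed per-sequence cost; superlinear growth suggests
    overhead that compounds as the batch gets larger.
    """
    timings = {}
    for n in sizes:
        start = time.perf_counter()
        decode_fn(ids[:n])
        timings[n] = time.perf_counter() - start
    return timings
```

Calling it as `time_over_batch_sizes(tokenizer.batch_decode, input_ids)` would produce one timing per batch size for comparison across transformers versions.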
Expected behavior
Decoding should take approximately as long as encoding.