diff --git a/benchmarks/cpp/README.md b/benchmarks/cpp/README.md index 569da3dcfce..0b89bae6029 100644 --- a/benchmarks/cpp/README.md +++ b/benchmarks/cpp/README.md @@ -41,7 +41,7 @@ python3 prepare_dataset.py \ ``` For datasets that don't have prompt key, set --dataset-prompt instead. -Take [cnn_dailymail dataset](https://huggingface.co/datasets/cnn_dailymail) for example: +Take [cnn_dailymail dataset](https://huggingface.co/datasets/abisee/cnn_dailymail) for example: ``` python3 prepare_dataset.py \ --tokenizer \ diff --git a/examples/models/contrib/baichuan/README.md b/examples/models/contrib/baichuan/README.md index 757af510845..13e3b01e889 100644 --- a/examples/models/contrib/baichuan/README.md +++ b/examples/models/contrib/baichuan/README.md @@ -30,7 +30,7 @@ The script accepts an argument named model_version, whose value should be `v1_7b In addition, there are two shared files in the folder [`examples`](../../../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * FP16 diff --git a/examples/models/contrib/bloom/README.md b/examples/models/contrib/bloom/README.md index 44d1969860e..f4f738c35e1 100644 --- a/examples/models/contrib/bloom/README.md +++ b/examples/models/contrib/bloom/README.md @@ -24,7 +24,7 @@ The TensorRT-LLM BLOOM implementation can be found in [tensorrt_llm/models/bloom In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * FP16 diff --git a/examples/models/contrib/chatglm-6b/README.md b/examples/models/contrib/chatglm-6b/README.md index bc8381508da..fbe463b4c5c 100644 --- a/examples/models/contrib/chatglm-6b/README.md +++ b/examples/models/contrib/chatglm-6b/README.md @@ -34,7 +34,7 @@ The TensorRT-LLM ChatGLM example code is located in [`examples/models/contrib/ch In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix diff --git a/examples/models/contrib/chatglm2-6b/README.md b/examples/models/contrib/chatglm2-6b/README.md index 0626c48ad9a..30fc3ce3933 100644 --- a/examples/models/contrib/chatglm2-6b/README.md +++ b/examples/models/contrib/chatglm2-6b/README.md @@ -34,7 +34,7 @@ The TensorRT-LLM ChatGLM example code is located in [`examples/models/contrib/ch In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix diff --git a/examples/models/contrib/chatglm3-6b-32k/README.md b/examples/models/contrib/chatglm3-6b-32k/README.md index 831b7da6543..211844d95e4 100644 --- a/examples/models/contrib/chatglm3-6b-32k/README.md +++ b/examples/models/contrib/chatglm3-6b-32k/README.md @@ -34,7 +34,7 @@ The TensorRT-LLM ChatGLM example code is located in [`examples/models/contrib/ch In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix diff --git a/examples/models/contrib/deepseek_v1/README.md b/examples/models/contrib/deepseek_v1/README.md index 530520111e9..3e18c3a7da2 100755 --- a/examples/models/contrib/deepseek_v1/README.md +++ b/examples/models/contrib/deepseek_v1/README.md @@ -32,7 +32,7 @@ The TensorRT-LLM Deepseek-v1 implementation can be found in [tensorrt_llm/models In addition, there are three shared files in the parent folder [`examples`](../../../) can be used for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the model inference output by given an input text. -* [`../../../summarize.py`](../../../summarize.py) to summarize the article from [cnn_dailmail](https://huggingface.co/datasets/cnn_dailymail) dataset, it can running the summarize from HF model and TensorRT-LLM model. +* [`../../../summarize.py`](../../../summarize.py) to summarize the article from [cnn_dailmail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset, it can running the summarize from HF model and TensorRT-LLM model. * [`../../../mmlu.py`](../../../mmlu.py) to running score script from https://github.com/declare-lab/instruct-eval to compare HF model and TensorRT-LLM model on the MMLU dataset. ## Support Matrix diff --git a/examples/models/contrib/deepseek_v2/README.md b/examples/models/contrib/deepseek_v2/README.md index df3f2298ab7..b26ba54fadf 100644 --- a/examples/models/contrib/deepseek_v2/README.md +++ b/examples/models/contrib/deepseek_v2/README.md @@ -34,7 +34,7 @@ The TensorRT-LLM Deepseek-v2 implementation can be found in [tensorrt_llm/models In addition, there are three shared files in the parent folder [`examples`](../../../) can be used for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the model inference output by given an input text. -* [`../../../summarize.py`](../../../summarize.py) to summarize the article from [cnn_dailmail](https://huggingface.co/datasets/cnn_dailymail) dataset, it can running the summarize from HF model and TensorRT-LLM model. +* [`../../../summarize.py`](../../../summarize.py) to summarize the article from [cnn_dailmail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset, it can running the summarize from HF model and TensorRT-LLM model. * [`../../../mmlu.py`](../../../mmlu.py) to running score script from https://github.com/declare-lab/instruct-eval to compare HF model and TensorRT-LLM model on the MMLU dataset. ## Support Matrix diff --git a/examples/models/contrib/falcon/README.md b/examples/models/contrib/falcon/README.md index 7a28a9615c9..613def2eb0b 100644 --- a/examples/models/contrib/falcon/README.md +++ b/examples/models/contrib/falcon/README.md @@ -25,7 +25,7 @@ The TensorRT-LLM Falcon implementation can be found in [tensorrt_llm/models/falc In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * FP16 @@ -193,7 +193,7 @@ If the engines are built successfully, you will see output like (falcon-rw-1b as ### 4. Run summarization task with the TensorRT engine(s) The `../../../summarize.py` script can run the built engines to summarize the articles from the -[cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +[cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ```bash # falcon-rw-1b diff --git a/examples/models/contrib/gptj/README.md b/examples/models/contrib/gptj/README.md index 56c682ca621..35d6e1cc52c 100644 --- a/examples/models/contrib/gptj/README.md +++ b/examples/models/contrib/gptj/README.md @@ -26,7 +26,7 @@ code is located in [`examples/models/contrib/gptj`](./). There is one main file: In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * FP16 @@ -238,7 +238,7 @@ python3 ../../../run.py --max_output_len=50 --engine_dir=gptj_engine --tokenizer ## Summarization using the GPT-J model The following section describes how to run a TensorRT-LLM GPT-J model to summarize the articles from the -[cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. For each summary, the script can compute the +[cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. For each summary, the script can compute the [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores and use the `ROUGE-1` score to validate the implementation. The script can also perform the same summarization using the HF GPT-J model. diff --git a/examples/models/contrib/gptneox/README.md b/examples/models/contrib/gptneox/README.md index 500c18862f2..5c0a7289947 100644 --- a/examples/models/contrib/gptneox/README.md +++ b/examples/models/contrib/gptneox/README.md @@ -27,7 +27,7 @@ The TensorRT-LLM GPT-NeoX implementation can be found in [`tensorrt_llm/models/g In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * FP16 @@ -118,7 +118,7 @@ trtllm-build --checkpoint_dir ./gptneox/20B/trt_ckpt/int8_wo/2-gpu/ \ ### 4. Summarization using the GPT-NeoX model The following section describes how to run a TensorRT-LLM GPT-NeoX model to summarize the articles from the -[cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. For each summary, the script can compute the +[cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. For each summary, the script can compute the [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores and use the `ROUGE-1` score to validate the implementation. The script can also perform the same summarization using the HF GPT-NeoX model. diff --git a/examples/models/contrib/grok/README.md b/examples/models/contrib/grok/README.md index 2e2bfede269..0e6f228ffa7 100644 --- a/examples/models/contrib/grok/README.md +++ b/examples/models/contrib/grok/README.md @@ -29,7 +29,7 @@ The TensorRT-LLM Grok-1 implementation can be found in [tensorrt_llm/models/grok In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * INT8 Weight-Only diff --git a/examples/models/contrib/internlm/README.md b/examples/models/contrib/internlm/README.md index 6cb30640d04..b9a063caafa 100644 --- a/examples/models/contrib/internlm/README.md +++ b/examples/models/contrib/internlm/README.md @@ -24,7 +24,7 @@ The TensorRT-LLM InternLM example code lies in [`examples/models/contrib/internl In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * FP16 / BF16 diff --git a/examples/models/contrib/jais/README.md b/examples/models/contrib/jais/README.md index b110c9d840b..5c54d4631bd 100644 --- a/examples/models/contrib/jais/README.md +++ b/examples/models/contrib/jais/README.md @@ -23,7 +23,7 @@ The TensorRT-LLM support for Jais is based on the GPT model, the implementation In addition, there are two shared files in the parent folder [`examples`](../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix The tested configurations are: diff --git a/examples/models/contrib/mpt/README.md b/examples/models/contrib/mpt/README.md index 80bbfcbc0a4..8223fc7acc0 100644 --- a/examples/models/contrib/mpt/README.md +++ b/examples/models/contrib/mpt/README.md @@ -29,7 +29,7 @@ The TensorRT-LLM MPT implementation can be found in [`tensorrt_llm/models/mpt/mo In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * FP16 diff --git a/examples/models/contrib/opt/README.md b/examples/models/contrib/opt/README.md index f2b2b9b52de..c2cb288ff46 100644 --- a/examples/models/contrib/opt/README.md +++ b/examples/models/contrib/opt/README.md @@ -25,7 +25,7 @@ The TensorRT-LLM OPT implementation can be found in [`tensorrt_llm/models/opt/mo In addition, there are two shared files in the parent folder [`examples`](../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * FP16 @@ -127,7 +127,7 @@ trtllm-build --checkpoint_dir ./opt/66B/trt_ckpt/fp16/4-gpu/ \ ### 4. Summarization using the OPT model The following section describes how to run a TensorRT-LLM OPT model to summarize the articles from the -[cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. For each summary, the script can compute the +[cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. For each summary, the script can compute the [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores and use the `ROUGE-1` score to validate the implementation. The script can also perform the same summarization using the HF OPT model. diff --git a/examples/models/contrib/skywork/README.md b/examples/models/contrib/skywork/README.md index 72c4127ad27..ff3f7032ef2 100644 --- a/examples/models/contrib/skywork/README.md +++ b/examples/models/contrib/skywork/README.md @@ -12,7 +12,7 @@ The TensorRT-LLM Skywork example code lies in [`examples/models/contrib/skywork` In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * FP16 & BF16 @@ -78,7 +78,7 @@ trtllm-build --checkpoint_dir ./skywork-13b-base/trt_ckpt/bf16 \ ### 4. Summarization using the Engines -After building TRT engines, we can use them to perform various tasks. TensorRT-LLM provides handy code to run summarization on [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset and get [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores. The `ROUGE-1` score can be used to validate model implementations. +After building TRT engines, we can use them to perform various tasks. TensorRT-LLM provides handy code to run summarization on [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset and get [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores. The `ROUGE-1` score can be used to validate model implementations. ```bash # fp16 diff --git a/examples/models/contrib/smaug/README.md b/examples/models/contrib/smaug/README.md index a7070161013..736151e8cf5 100644 --- a/examples/models/contrib/smaug/README.md +++ b/examples/models/contrib/smaug/README.md @@ -11,7 +11,7 @@ The TensorRT-LLM support for Smaug-72B-v0.1 is based on the LLaMA model, the imp In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`../../../run.py`](../../../run.py) to run the inference on an input text; -* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`../../../summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix @@ -43,7 +43,7 @@ trtllm-build --checkpoint_dir ./tllm_checkpoint_8gpu_tp8 \ ### Run Summarization -After building TRT engine, we can use it to perform various tasks. TensorRT-LLM provides handy code to run summarization on [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset and get [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores. The `ROUGE-1` score can be used to validate model implementations. +After building TRT engine, we can use it to perform various tasks. TensorRT-LLM provides handy code to run summarization on [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset and get [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores. The `ROUGE-1` score can be used to validate model implementations. ```bash mpirun -n 8 -allow-run-as-root python ../../../summarize.py \ diff --git a/examples/models/core/commandr/README.md b/examples/models/core/commandr/README.md index 43381f92ca1..3bffe933cce 100644 --- a/examples/models/core/commandr/README.md +++ b/examples/models/core/commandr/README.md @@ -26,7 +26,7 @@ The TensorRT-LLM Command-R example code is located in [`examples/models/core/com In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`run.py`](../../../run.py) to run the inference on an input text; -* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix diff --git a/examples/models/core/gemma/README.md b/examples/models/core/gemma/README.md index 2ade71b3c3f..bffb254dc25 100644 --- a/examples/models/core/gemma/README.md +++ b/examples/models/core/gemma/README.md @@ -81,7 +81,7 @@ trtllm-build --checkpoint_dir ${UNIFIED_CKPT_PATH} \ We provide three examples to run inference `run.py`, `summarize.py` and `mmlu.py`. `run.py` only run inference with `input_text` and show the output. -`summarize.py` runs summarization on [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset and evaluate the model by [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores and use the `ROUGE-1` score to validate the implementation. +`summarize.py` runs summarization on [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset and evaluate the model by [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores and use the `ROUGE-1` score to validate the implementation. `mmlu.py` runs MMLU to evaluate the model by accuracy. diff --git a/examples/models/core/glm-4-9b/README.md b/examples/models/core/glm-4-9b/README.md index 02626bbc940..04988e59e82 100644 --- a/examples/models/core/glm-4-9b/README.md +++ b/examples/models/core/glm-4-9b/README.md @@ -34,7 +34,7 @@ The TensorRT-LLM ChatGLM example code is located in [`examples/models/core/glm-4 In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`run.py`](../../../run.py) to run the inference on an input text; -* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix diff --git a/examples/models/core/gpt/README.md b/examples/models/core/gpt/README.md index dd74fad2bb9..376839b3c4a 100644 --- a/examples/models/core/gpt/README.md +++ b/examples/models/core/gpt/README.md @@ -44,7 +44,7 @@ The TensorRT-LLM GPT implementation can be found in [`tensorrt_llm/models/gpt/mo In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`run.py`](../../../run.py) to run the inference on an input text; -* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * FP16 @@ -222,7 +222,7 @@ Input [Text 0]: "Born in north-east France, Soyer trained as a" Output [Text 0 Beam 0]: " chef before moving to London in the early" ``` -The [`summarize.py`](../../../summarize.py) script can run the built engines to summarize the articles from the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +The [`summarize.py`](../../../summarize.py) script can run the built engines to summarize the articles from the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. For each summary, the script can compute the [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores and use the `ROUGE-1` score to validate the implementation. By passing `--test_trt_llm` flag, the script will evaluate TensorRT-LLM engines. You may also pass `--test_hf` flag to evaluate the HF model. diff --git a/examples/models/core/internlm2/README.md b/examples/models/core/internlm2/README.md index 2cbb12d1650..7073999b850 100644 --- a/examples/models/core/internlm2/README.md +++ b/examples/models/core/internlm2/README.md @@ -14,7 +14,7 @@ The TensorRT-LLM InternLM2 example code lies in [`examples/models/core/internlm2 In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`run.py`](../../../run.py) to run the inference on an input text; -* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * FP16 / BF16 diff --git a/examples/models/core/llama/README.md b/examples/models/core/llama/README.md index 61d25158419..cdf660035c2 100644 --- a/examples/models/core/llama/README.md +++ b/examples/models/core/llama/README.md @@ -47,7 +47,7 @@ The TensorRT-LLM LLaMA implementation can be found in [tensorrt_llm/models/llama In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`run.py`](../../../run.py) to run the inference on an input text; -* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * BF16/FP16 diff --git a/examples/models/core/mamba/README.md b/examples/models/core/mamba/README.md index 63f31956cb8..325935737f8 100644 --- a/examples/models/core/mamba/README.md +++ b/examples/models/core/mamba/README.md @@ -20,7 +20,7 @@ The TensorRT-LLM Mamba implementation can be found in [`tensorrt_llm/models/mamb In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`run.py`](../../../run.py) to run the inference on an input text; -* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix @@ -177,7 +177,7 @@ If `paged_state` is disabled, engine will be built with the contiguous stage cac ### 4. Run summarization task with the TensorRT engine(s) The following section describes how to run a TensorRT-LLM Mamba model to summarize the articles from the -[cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. For each summary, the script can compute the +[cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. For each summary, the script can compute the [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores and use the `ROUGE-1` score to validate the implementation. ```bash diff --git a/examples/models/core/nemotron/README.md b/examples/models/core/nemotron/README.md index 8c6e187094f..c99d17435a0 100644 --- a/examples/models/core/nemotron/README.md +++ b/examples/models/core/nemotron/README.md @@ -19,7 +19,7 @@ The TensorRT-LLM Nemotron implementation is based on the GPT model, which can be In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`run.py`](../../../run.py) to run the inference on an input text; -* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix * FP16/BF16 @@ -157,7 +157,7 @@ trtllm-build --checkpoint_dir nemotron-3-8b/trt_ckpt/int4_awq/1-gpu \ ### Run Inference The `summarize.py` script can run the built engines to summarize the articles from the -[cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +[cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ```bash # single gpu diff --git a/examples/models/core/phi/README.md b/examples/models/core/phi/README.md index b3b61d61626..3a3543c3536 100644 --- a/examples/models/core/phi/README.md +++ b/examples/models/core/phi/README.md @@ -21,7 +21,7 @@ The TensorRT-LLM Phi implementation can be found in [`tensorrt_llm/models/phi/mo In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`run.py`](../../../run.py) to run the inference on an input text; -* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix @@ -83,7 +83,7 @@ trtllm-build \ ### 3. Summarization using the Phi model -The following section describes how to run a TensorRT-LLM Phi model to summarize the articles from the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. For each summary, the script can compute the [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores and use the `ROUGE-1` score to validate the implementation. +The following section describes how to run a TensorRT-LLM Phi model to summarize the articles from the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. For each summary, the script can compute the [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores and use the `ROUGE-1` score to validate the implementation. The script can also perform the same summarization using the HF Phi model. As previously explained, the first step is to build the TensorRT engine as described above using HF weights. You also have to install the requirements: diff --git a/examples/models/core/qwen/README.md b/examples/models/core/qwen/README.md index 669ae274a02..3e589dc3e17 100644 --- a/examples/models/core/qwen/README.md +++ b/examples/models/core/qwen/README.md @@ -39,7 +39,7 @@ The TensorRT-LLM Qwen implementation can be found in [models/qwen](../../../../t In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`run.py`](../../../run.py) to run the inference on an input text; -* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix | Model Name | FP16/BF16 | FP8 | WO | AWQ | GPTQ | SQ | TP | PP | Arch | diff --git a/examples/models/core/recurrentgemma/README.md b/examples/models/core/recurrentgemma/README.md index 713af6607c3..c3c398f6ec0 100644 --- a/examples/models/core/recurrentgemma/README.md +++ b/examples/models/core/recurrentgemma/README.md @@ -11,7 +11,7 @@ The TensorRT-LLM RecurrentGemma implementation can be found in [`tensorrt_llm/mo In addition, there are two shared files in the parent folder [`examples`](../../../) for inference and evaluation: * [`run.py`](../../../run.py) to run the inference on an input text; -* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset. +* [`summarize.py`](../../../summarize.py) to summarize the articles in the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. ## Support Matrix | Checkpoint type | FP16 | BF16 | FP8 | INT8 SQ | INT4 AWQ | TP | @@ -171,7 +171,7 @@ trtllm-build --checkpoint_dir ${UNIFIED_CKPT_2B_IT_FLAX_PATH} \ We provide three examples to run inference `run.py`, `summarize.py` and `mmlu.py`. `run.py` only run inference with `input_text` and show the output. -`summarize.py` runs summarization on [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset and evaluate the model by [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores and use the `ROUGE-1` score to validate the implementation. +`summarize.py` runs summarization on [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset and evaluate the model by [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) scores and use the `ROUGE-1` score to validate the implementation. `mmlu.py` runs MMLU to evaluate the model by accuracy. diff --git a/examples/models/core/whisper/README.md b/examples/models/core/whisper/README.md index 78016abdd4b..4a5d5652cfb 100755 --- a/examples/models/core/whisper/README.md +++ b/examples/models/core/whisper/README.md @@ -20,7 +20,7 @@ The TensorRT-LLM Whisper example code is located in [`examples/models/core/whisp * [`convert_checkpoint.py`](./convert_checkpoint.py) to convert weights from OpenAI Whisper format to TRT-LLM format. * `trtllm-build` to build the [TensorRT](https://developer.nvidia.com/tensorrt) engine(s) needed to run the Whisper model. - * [`run.py`](./run.py) to run the inference on a single wav file, or [a HuggingFace dataset](https://huggingface.co/datasets/librispeech_asr) [\(Librispeech test clean\)](https://www.openslr.org/12). + * [`run.py`](./run.py) to run the inference on a single wav file, or [a HuggingFace dataset](https://huggingface.co/datasets/openslr/librispeech_asr) [\(Librispeech test clean\)](https://www.openslr.org/12). ## Support Matrix * FP16