darraghdog
Contributor

@darraghdog darraghdog commented Jun 3, 2025

Support the ReDrafter functionality used in AIMO2:

  • With an fp8 base model.
  • With a Qwen base model; a qwen-7b example is added to the README.
  • Break convert_checkpoint.py into two steps, to give more flexibility around the base model used:
    1/ Convert or quantise the base model with the normal convert_model.py or quantise.py specific to that base model.
    2/ Convert the redrafter and attach it to the converted base model.
  • [SOLVED] Issue using bfloat16 in the redrafter build: if the base model has a bfloat16 layer, it is now switched to float16.
  • Move the main ReDrafter logic to a mixin.
  • Add new classes, ReDrafterForLLaMALM and ReDrafterForQWenLM. More can be added as needed, but each one needs to be tested; I may add some more later, e.g. Deepseek.
  • Update the redrafter pytests for the changes above.
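
As a rough illustration of the mixin bullet above, here is a minimal sketch of how shared logic in a mixin can be combined with different base models. Every name in it (`ReDrafterMixinSketch`, `LlamaBaseSketch`, `QwenBaseSketch`, `init_redrafter`, `redrafter_summary`) is hypothetical and chosen for illustration only; it is not the code in this PR and does not use the TensorRT-LLM API.

```python
class ReDrafterMixinSketch:
    """Hypothetical mixin holding logic shared by every ReDrafter variant."""

    def init_redrafter(self, beam_width: int, draft_len: int) -> None:
        # Shared drafter settings, stored once regardless of the base model.
        self.beam_width = beam_width
        self.draft_len = draft_len

    def redrafter_summary(self) -> str:
        # Relies on the base class to provide `base_name`.
        return (f"{self.base_name} base + ReDrafter "
                f"(beam_width={self.beam_width}, draft_len={self.draft_len})")


class LlamaBaseSketch:
    """Hypothetical stand-in for a LLaMA base model class."""
    base_name = "llama"


class QwenBaseSketch:
    """Hypothetical stand-in for a Qwen base model class."""
    base_name = "qwen"


class ReDrafterForLlamaSketch(ReDrafterMixinSketch, LlamaBaseSketch):
    def __init__(self, beam_width: int, draft_len: int) -> None:
        self.init_redrafter(beam_width, draft_len)


class ReDrafterForQwenSketch(ReDrafterMixinSketch, QwenBaseSketch):
    def __init__(self, beam_width: int, draft_len: int) -> None:
        self.init_redrafter(beam_width, draft_len)


if __name__ == "__main__":
    print(ReDrafterForLlamaSketch(6, 6).redrafter_summary())
    print(ReDrafterForQwenSketch(6, 6).redrafter_summary())
```

With this layout, supporting another base model would only need one more small subclass, which is presumably why each new class still has to be tested individually.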

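The bfloat16 item in the list above can be pictured with a small sketch. It assumes, purely for illustration, that layer dtypes are available as a plain mapping from layer name to a dtype string; the function name `switch_bfloat16_to_float16` and the layer names are hypothetical and are not taken from the PR.

```python
from typing import Dict


def switch_bfloat16_to_float16(layer_dtypes: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the mapping with every bfloat16 entry set to float16.

    Assumption: dtypes are represented as strings; other dtypes are left as-is.
    """
    return {
        name: ("float16" if dtype == "bfloat16" else dtype)
        for name, dtype in layer_dtypes.items()
    }


if __name__ == "__main__":
    # Hypothetical layer names, for illustration only.
    example = {"lm_head": "bfloat16", "embedding": "float16", "mlp.fc": "fp8"}
    print(switch_bfloat16_to_float16(example))
```

The sketch returns a new mapping rather than mutating its input, so the original dtypes stay available for logging or comparison.
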
Tests were run on the README examples; logs are attached here:
redrafter_pr_test_logs.txt

Update on tests with draft_len 6, beam_width 6:
redrafter_pr_beam6_draftlen_6.txt

[FIXED] Second update on tests with draft_len 6, beam_width 6:
redrafter_pr_beam6_base_beam1.txt

@rakib-hasan rakib-hasan self-requested a review June 5, 2025 23:46
Signed-off-by: darraghdog <[email protected]>
@darraghdog
Contributor Author

/bot run

@darraghdog
Contributor Author

/bot run

@rakib-hasan
Collaborator

/bot run

@rakib-hasan rakib-hasan enabled auto-merge (squash) June 11, 2025 21:47
@tensorrt-cicd
Collaborator

PR_Github #8552 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #8552 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #6203 completed with status: 'FAILURE'

auto-merge was automatically disabled June 25, 2025 12:21

Head branch was pushed to by a user without write access

@darraghdog
Contributor Author

/bot run

@rakib-hasan
Collaborator

/bot run

@rakib-hasan rakib-hasan enabled auto-merge (squash) June 25, 2025 15:57
@tensorrt-cicd
Collaborator

PR_Github #9897 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9897 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7304 completed with status: 'FAILURE'

auto-merge was automatically disabled June 27, 2025 13:02

Head branch was pushed to by a user without write access

@darraghdog
Contributor Author

darraghdog commented Jun 27, 2025

@rakib-hasan build_redrafter_engines.py should be fixed now.
Would it be possible to rebase the branch and run CI?

@rakib-hasan
Collaborator

/bot run

@rakib-hasan rakib-hasan enabled auto-merge (squash) June 27, 2025 15:56
@tensorrt-cicd
Collaborator

PR_Github #10170 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10170 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7508 completed with status: 'SUCCESS'

@rakib-hasan rakib-hasan merged commit 5437075 into NVIDIA:main Jun 27, 2025
3 checks passed
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 9, 2025
Signed-off-by: darraghdog <[email protected]>
Signed-off-by: Darragh Hanley <[email protected]>
Co-authored-by: rakib-hasan <[email protected]>
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025
Signed-off-by: darraghdog <[email protected]>
Signed-off-by: Darragh Hanley <[email protected]>
Co-authored-by: rakib-hasan <[email protected]>
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 11, 2025
Signed-off-by: darraghdog <[email protected]>
Signed-off-by: Darragh Hanley <[email protected]>
Co-authored-by: rakib-hasan <[email protected]>