
Conversation

@jmamou (Collaborator) commented Jan 9, 2025

The token added by the target during the validation step is decoded as a string, which the draft always interprets as the beginning of a new word.
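To make the failure mode concrete, here is a minimal sketch (not taken from the PR; the model names and the sample word are only illustrative) of how a token decoded in isolation loses its word-boundary context when it is re-tokenized by the draft:

```python
from transformers import AutoTokenizer

# Illustrative target/draft pair with different tokenizers (different space-sign
# conventions), mirroring the benchmark pairs mentioned in this thread.
target_tok = AutoTokenizer.from_pretrained("microsoft/Phi-3-medium-128k-instruct")
draft_tok = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")

text = "internationalization"
target_ids = target_tok.encode(text, add_special_tokens=False)

# Pretend the last id is the token the target appended during validation.
new_target_id = target_ids[-1]
fragment = target_tok.decode([new_target_id])  # a sub-word piece, e.g. "ization"

# Re-encoding the decoded fragment on its own drops the information that it was
# a continuation of the previous word, so the draft tokenizes it as if a new
# word had started rather than as a suffix of the preceding tokens.
print(repr(fragment))
print(draft_tok.tokenize(fragment))  # the fragment tokenized in isolation
print(draft_tok.tokenize(text))      # the same piece tokenized with full context
```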

@jmamou jmamou force-pushed the fix_prepare_assistant_input_ids branch from 3c5f39c to d20c488 on January 12, 2025 13:16
@keyboardAnt (Owner) left a comment

I added a few comments. My primary concern is merging untested changes, as they will be automatically added to our open PR to upstream. Has this branch been tested? What are the expected speedups we will achieve after merging it into the existing code?

@jmamou (Collaborator, Author) commented Jan 12, 2025

> I added a few comments. My primary concern is merging untested changes, as they will be automatically added to our open PR to upstream. Has this branch been tested? What are the expected speedups we will achieve after merging it into the existing code?

It has been tested; we still need to add the relevant automatic tests.

Here is the impact on model pairs that use different space signs, compared with the previous benchmark:

| Target model | Assistant (draft) model | Speedup (before → after) |
| --- | --- | --- |
| microsoft/Phi-3-medium-128k-instruct | Qwen/Qwen2-0.5B-Instruct | 0.63x → 0.78x |
| codellama/CodeLlama-13b-Instruct-hf | bigcode/tiny_starcoder_py | 0.74x → 0.82x |

My primary question was about the implementation. WDYT about it?

@keyboardAnt (Owner) left a comment

.

@keyboardAnt keyboardAnt self-requested a review January 12, 2025 15:16
@keyboardAnt (Owner) left a comment

LGTM.

@keyboardAnt keyboardAnt merged commit a556947 into usd Jan 12, 2025
@keyboardAnt keyboardAnt deleted the fix_prepare_assistant_input_ids branch January 12, 2025 15:25
@keyboardAnt (Owner) commented

@jmamou, it seems like the CI tests fail:
[screenshot of the failing CI tests]

@jmamou (Collaborator, Author) commented Jan 12, 2025

It seems that the merge occurred before the last commit, which renamed `target_to_assistant_input_id` to `target_to_assistant_input_ids`.

keyboardAnt added a commit that referenced this pull request Feb 28, 2025
* move `TestAssistedCandidateGeneratorDifferentTokenizers` into a new testing file

* refactor

* NOTHING. add space to rerun github actions tests

* remove it...

* `UniversalSpeculativeDecodingGenerator`

* Use `UniversalSpeculativeDecodingGenerator` when `generation_config.do_sample=True`

* assistant tokenizes only the target's new suffix

* formatting

* fix code

* fix code

* formatting

* add `TestGenerateWithDifferentModels`

* `TestGenerateWithDifferentModels` parameterize on `do_sample`

* `AssistantVocabMapping` & `AssistantVocabMappingCache`

* formatting

* `AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_logits`

* improve `_get_assistant_to_target_input_ids` & formatting

* renaming

* WIP: debugging `min_new_tokens`

* fix get_target_ids

* `UniversalSpeculativeDecodingGenerator`

* assistant tokenizes only the target's new suffix

* formatting

* fix code

* fix code

* formatting

* `TestGenerateWithDifferentModels` parameterize on `do_sample`

* `AssistantVocabMapping` & `AssistantVocabMappingCache`

* formatting

* `AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_logits`

* improve `_get_assistant_to_target_input_ids` & formatting

* renaming

* WIP: debugging `min_new_tokens`

* fix get_target_ids

* fix device issue

* fix get_assistant_input_ids

* add `TestAssistedCandidateGeneratorDifferentTokenizers`

* formatting

* `AssistantVocabTranslatorCache` refactor & tests

* revert changes in `src/transformers/generation/logits_process.py`

* refactor `AssistedCandidateGenerator`

* refactor `AssistedCandidateGeneratorDifferentTokenizers`

* formatting

* refactor `UniversalSpeculativeDecodingGenerator`

* fix negative value for max_new_tokens

* fix generation length: target + attention_mask vs. assistant + attention_mask

* fix device

* fix negative max_new_tokens bug

* fix UAG

* minor

* formatting

* `AssistedCandidateGeneratorDifferentTokenizers` `lookbehind`s init

* resolve conflict & formatting

* rerun CI tests

* remove space...

* remove old code

* fix candidate_input_ids device

* minor

* formatting

* Fix prepare + apply (#7)

* fix prepare + apply

* move to cpu

* simplify suppress_tokens

* fix bugs and refactoring

* device move

* handle self.config.vocab_size > len(target_tokenizer.get_vocab())

* no need to normalize in candidate_generator

* address Nadav's comments + minor

* optimize device move + SuppressTokensLogitsProcessor

* AssistantToTargetTranslator, SuppressTokensLogitsProcessor and tokenizers mapping improvements

* padding size

* padding improvement

* fix and simplify get_target_logits

* renaming in get_target_logits

* minor

* add filter_value and suppress_tokens_id

* style + rename

* remove TODO

* restore original SelectTokensLogitsProcessor with modification

* fix style

* fix _update_past_and_masks and optimize code

* remove assistant_vocab_size arg

* fix attention_mask

* call _prepare_attention_mask also if not has_past_key_values

* handling attention mask for first generation

* comment

* restore test

* remove SelectTokensLogitsProcessor

* _update_past_and_masks implementation for USD

* Add unittests for Universal Assisted generation

* fix style

* update tests

* Remove unused import and fix `test_speculation_depth` test

* exclude special and reserved tokens from tokenizer for UAG

* mv `test_universal_assisted_generation.py` to `generation/test_candidate_generator.py`

* Remove unused imports and fix style using `make style` (#9)

* formatting

* Swap gated `meta-llama/llama-3.2` with `allenai/llama` (#10)

* Fix space sign disagreement (#12)

* default values for AssistantToTargetTranslator fields

* fix space sign

* minor

* fix test + style

* Default values for some fields of assistant to target translator (#11)

* default values for AssistantToTargetTranslator fields

* fix

* add support to empty logit_processors

* Update candidate_generator.py (#15)

fix typo

* BUG fix in _prepare_assistant_input_ids (#14)

* fix _prepare_assistant_input_ids

* target_to_assistant_input_ids

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Nadav Timor <[email protected]>

---------

Co-authored-by: Nadav Timor <[email protected]>

* typo (`target_to_assistant_input_ids`)

* formatting

* merge upstream/main

* Fix minor review comments (#16)

* Fix: `token_ids.to(torch.int64)` (#18)

* tok ids to `torch.int64` (reference: https://huggingface.co/docs/transformers.js/en/api/tokenizers)

* `LongTensor`

* fix dtype

* `assistant_input_ids.to(dtype=torch.long)`

* Remove unused import from test_candidate_generator.py

* Remove unused import from test_candidate_generator.py

* Remove `numpy` import

* resolve pr comments (#19)

* `AssistantToTargetTranslator` docstring

* (per gante's comment) `filter_value` and `suppress_tokens_id` to class constants

* update `AssistantToTargetTranslator` docstring

* (gante's comment) replace `match-case`

* formatting

* Fix Joao's comments (#21)

* remove threading

* fix logits_processor

* fix test device

* fix style (#23)

* Move atm (#24)

* move AssistantToTargetTranslator

* fixup

* fix logit_processor

* add atm_translator test

* refactor test

* remove threading from test

* add require_torch in tests

* move AssistantVocabTranslatorCache + add tests

* ruff fix

---------

Co-authored-by: jmamou <[email protected]>
Co-authored-by: Gaurav <[email protected]>
Co-authored-by: Gaurav Jain <[email protected]>
Co-authored-by: gauravjain14 <[email protected]>