Skip to content

Conversation

@AnastasiyaKukharska
Copy link

Here I simplify getting the number of embedding tokens according to discussions on PR 26024

AnastasiyaKukharska and others added 12 commits October 13, 2023 01:11
* fix

---------

Co-authored-by: ydshieh <[email protected]>
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>
* Fix backward compatibility of Conversation

I ran into a case where an external library was depending on the `new_user_input` field of Conversation. https://github.com/SeldonIO/MLServer/blob/release/1.4.x/runtimes/huggingface/mlserver_huggingface/codecs/utils.py#L37 

This field was deprecated as part of the refactor, but if `transformers` wants to maintain backwards compatibility for now (which is mentioned in a few comments) then there's a good argument for supporting it. Some comments referred to it as an "internal" property, but it didn't start with `_` as is Python convention, so I think it's reasonable that other libraries were referencing it directly.

It's not difficult to add it to the other supported backwards-compatible properties. In addition, the implementation of `past_user_inputs` didn't actually match the past behavior (it would contain the most recent message as well) so I updated that as well.

* make style

---------

Co-authored-by: Matt <[email protected]>
* llm prompting guide

* updated code examples

* an attempt to fix the code example tests

* set seed in examples

* added a doctest comment

* added einops to the doc_test_job

* string formatting

* string formatting, again

* added the toc to slow_documentation_tests.txt

* minor list fix

* string formatting + pipe renamed

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <[email protected]>

* replaced max_length with max_new_tokens and updated the outputs to match

* minor formatting fix

* removed einops from circleci config

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <[email protected]>

* removed einops and trust_remote_code parameter

---------

Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Lysandre Debut <[email protected]>
* Remove UniSpeechConfig

* Remove , at the end otherwise check_docstring changes order

* Auto add new docstring

* Update docstring for UniSpeechConfig

* Remove from check_docstrings

* Remove UniSpeechSatConfig and UniSpeechSatForCTC from check_docstrings

* Remove , at the end

* Fix docstring

* Update docstring for Wav2Vec2ForCTC

* Update Wav2Vec2ForCTC docstring

Co-authored-by: Yih-Dar <[email protected]>

* fix style

---------

Co-authored-by: Yih-Dar <[email protected]>
* [DOCS] Update docstrings for  and  tokenizer

* [DOCS] add pad_token argument to whisper tokenizer docstring

* [FIX] Reword pad_token description

* [CHORE] Apply style formatting

---------

Co-authored-by: jmcdonnell <[email protected]>
* [docstring] Remove 'BertGenerationConfig' from OBJECTS_TO_IGNORE

* [docstring] Fix docstring for 'BertGenerationConfig' (#26638)
Skip `TrainerIntegrationFSDP::test_basic_run_with_cpu_offload` if `torch < 2.1` (#26764)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants