
Commit 2d0e384

mishig25, Tylersuard, xenova, zspo, and younesbelkada authored
merge main (#23866)
* Debug example code for MegaForCausalLM (#23382) * Debug example code for MegaForCausalLM set ignore_mismatched_sizes=True in model loading code * Fix up * Remove erroneous `img` closing tag (#23646) See #23625 * Fix tensor device while attention_mask is not None (#23538) * Fix tensor device while attention_mask is not None * Fix tensor device while attention_mask is not None * Fix accelerate logger bug (#23650) * fix logger bug * Update tests/mixed_int8/test_mixed_int8.py Co-authored-by: Zachary Mueller <[email protected]> * import `PartialState` --------- Co-authored-by: Zachary Mueller <[email protected]> * Muellerzr fix deepspeed (#23657) * Fix deepspeed recursion * Better fix * Bugfix: LLaMA layer norm incorrectly changes input type and consumers lots of memory (#23535) * Fixed bug where LLaMA layer norm would change input type. * make fix-copies --------- Co-authored-by: younesbelkada <[email protected]> * Fix wav2vec2 is_batched check to include 2-D numpy arrays (#23223) * Fix wav2vec2 is_batched check to include 2-D numpy arrays * address comment * Add tests * oops * oops * Switch to np array Co-authored-by: Sanchit Gandhi <[email protected]> * Switch to np array * condition merge * Specify mono channel only in comment * oops, add other comment too * make style * Switch list check from falsiness to empty --------- Co-authored-by: Sanchit Gandhi <[email protected]> * changing the requirements to a cpu torch version that works (#23483) * Fix SAM tests and use smaller checkpoints (#23656) * Fix SAM tests and use smaller checkpoints * Override test_model_from_pretrained to use sam-vit-base as well * make fixup * Update all no_trainer with skip_first_batches (#23664) * Update workflow files (#23658) * fix * fix --------- Co-authored-by: ydshieh <[email protected]> * [image-to-text pipeline] Add conditional text support + GIT (#23362) * First draft * Remove print statements * Add conditional generation * Add more tests * Remove scripts * Remove BLIP specific linkes * Add support for pix2struct * Add fast test * Address comment * Fix style * small fix to remove unused eos in processor when it's not used. (#23408) * Bump requests from 2.27.1 to 2.31.0 in /examples/research_projects/decision_transformer (#23673) Bump requests in /examples/research_projects/decision_transformer Bumps [requests](https://github.com/psf/requests) from 2.27.1 to 2.31.0. - [Release notes](https://github.com/psf/requests/releases) - [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md) - [Commits](psf/requests@v2.27.1...v2.31.0) --- updated-dependencies: - dependency-name: requests dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump requests from 2.22.0 to 2.31.0 in /examples/research_projects/visual_bert (#23670) Bump requests in /examples/research_projects/visual_bert Bumps [requests](https://github.com/psf/requests) from 2.22.0 to 2.31.0. - [Release notes](https://github.com/psf/requests/releases) - [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md) - [Commits](psf/requests@v2.22.0...v2.31.0) --- updated-dependencies: - dependency-name: requests dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump requests from 2.22.0 to 2.31.0 in /examples/research_projects/lxmert (#23668) Bump requests in /examples/research_projects/lxmert Bumps [requests](https://github.com/psf/requests) from 2.22.0 to 2.31.0. - [Release notes](https://github.com/psf/requests/releases) - [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md) - [Commits](psf/requests@v2.22.0...v2.31.0) --- updated-dependencies: - dependency-name: requests dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Add PerSAM [bis] (#23659) * Add PerSAM args * Make attn_sim optional * Rename to attention_similarity * Add docstrigns * Improve docstrings * Fix typo in a parameter name for open llama model (#23637) * Update modeling_open_llama.py Fix typo in `use_memorry_efficient_attention` parameter name * Update configuration_open_llama.py Fix typo in `use_memorry_efficient_attention` parameter name * Update configuration_open_llama.py Take care of backwards compatibility ensuring that the previous parameter name is taken into account if used * Update configuration_open_llama.py format to adjust the line length * Update configuration_open_llama.py proper code formatting using `make fixup` * Update configuration_open_llama.py pop the argument not to let it be set later down the line * Fix PyTorch SAM tests (#23682) fix Co-authored-by: ydshieh <[email protected]> * Making `safetensors` a core dependency. (#23254) * Making `safetensors` a core dependency. To be merged later, I'm creating the PR so we can try it out. * Update setup.py * Remove duplicates. * Even more redundant. * 🌐 [i18n-KO] Translated `tasks/monocular_depth_estimation.mdx` to Korean (#23621) docs: ko: `tasks/monocular_depth_estimation` Co-authored-by: Hyeonseo Yun <[email protected]> Co-authored-by: Sohyun Sim <[email protected]> Co-authored-by: Gabriel Yang <[email protected]> Co-authored-by: Wonhyeong Seo <[email protected]> Co-authored-by: Jungnerd <[email protected]> * Fix a `BridgeTower` test (#23694) fix Co-authored-by: ydshieh <[email protected]> * [`SAM`] Fixes pipeline and adds a dummy pipeline test (#23684) * add a dummy pipeline test * change test name * TF version compatibility fixes (#23663) * New TF version compatibility fixes * Remove dummy print statement, move expand_1d * Make a proper framework inference function * Make a proper framework inference function * ValueError -> TypeError * [`Blip`] Fix blip doctest (#23698) fix blip doctest * is_batched fix for remaining 2-D numpy arrays (#23309) * Fix is_batched code to allow 2-D numpy arrays for audio * Tests * Fix typo * Incorporate comments from PR #23223 * Skip `TFCvtModelTest::test_keras_fit_mixed_precision` for now (#23699) fix Co-authored-by: ydshieh <[email protected]> * fix: load_best_model_at_end error when load_in_8bit is True (#23443) Ref: huggingface/peft#394 Loading a quantized checkpoint into non-quantized Linear8bitLt is not supported. 
call module.cuda() before module.load_state_dict() * Fix some docs what layerdrop does (#23691) * Fix some docs what layerdrop does * Update src/transformers/models/data2vec/configuration_data2vec_audio.py Co-authored-by: Sylvain Gugger <[email protected]> * Fix more docs --------- Co-authored-by: Sylvain Gugger <[email protected]> * add GPTJ/bloom/llama/opt into model list and enhance the jit support (#23291) Signed-off-by: Wang, Yi A <[email protected]> * 4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479) * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. * Initial draft. Some tests fail. * Fixed dtype bug. * Fixed bug caused by torch_dtype='auto'. * All test green for 8-bit and 4-bit layers. * Added fix for fp32 layer norms and bf16 compute in LLaMA. * Initial draft. Some tests fail. * Fixed dtype bug. * Fixed bug caused by torch_dtype='auto'. * All test green for 8-bit and 4-bit layers. * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. * Fixing issues for PR #23479. * Added fix for fp32 layer norms and bf16 compute in LLaMA. * Reverted variable name change. * Initial draft. Some tests fail. * Fixed dtype bug. * Fixed bug caused by torch_dtype='auto'. * All test green for 8-bit and 4-bit layers. * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. * Added missing tests. * Fixup changes. * Added fixup changes. * Missed some variables to rename. * revert trainer tests * revert test trainer * another revert * fix tests and safety checkers * protect import * simplify a bit * Update src/transformers/trainer.py * few fixes * add warning * replace with `load_in_kbit = load_in_4bit or load_in_8bit` * fix test * fix tests * this time fix tests * safety checker * add docs * revert torch_dtype * Apply suggestions from code review Co-authored-by: Sylvain Gugger <[email protected]> * multiple fixes * update docs * version checks and multiple fixes * replace `is_loaded_in_kbit` * replace `load_in_kbit` * change methods names * better checks * oops * oops * address final comments --------- Co-authored-by: younesbelkada <[email protected]> Co-authored-by: Younes Belkada <[email protected]> Co-authored-by: Sylvain Gugger <[email protected]> * Paged Optimizer + Lion Optimizer for Trainer (#23217) * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. --------- Co-authored-by: younesbelkada <[email protected]> * Export to ONNX doc refocused on using optimum, added tflite (#23434) * doc refocused on using optimum, tflite * minor updates to fix checks * Apply suggestions from code review Co-authored-by: regisss <[email protected]> * TFLite to separate page, added links * Removed the onnx list builder * make style * Update docs/source/en/serialization.mdx Co-authored-by: regisss <[email protected]> --------- Co-authored-by: regisss <[email protected]> * fix: use bool instead of uint8/byte in Deberta/DebertaV2/SEW-D to make it compatible with TensorRT (#23683) * Use bool instead of uint8/byte in DebertaV2 to make it compatible with TensorRT TensorRT cannot accept onnx graph with uint8/byte intermediate tensors. 
This PR uses bool tensors instead of unit8/byte tensors to make the exported onnx file can work with TensorRT. * fix: use bool instead of uint8/byte in Deberta and SEW-D --------- Co-authored-by: Yuxian Qiu <[email protected]> * fix gptj could not jit.trace in GPU (#23317) Signed-off-by: Wang, Yi A <[email protected]> * Better TF docstring types (#23477) * Rework TF type hints to use | None instead of Optional[] for tf.Tensor * Rework TF type hints to use | None instead of Optional[] for tf.Tensor * Don't forget the imports * Add the imports to tests too * make fixup * Refactor tests that depended on get_type_hints * Better test refactor * Fix an old hidden bug in the test_keras_fit input creation code * Fix for the Deit tests * Minor awesome-transformers.md fixes (#23453) Minor docs fixes * TF SAM memory reduction (#23732) * Extremely small change to TF SAM dummies to reduce memory usage on build * remove debug breakpoint * Debug print statement to track array sizes * More debug shape printing * More debug shape printing * Now remove the debug shape printing * make fixup * make fixup * fix: delete duplicate sentences in `document_question_answering.mdx` (#23735) fix: delete duplicate sentence * fix: Whisper generate, move text_prompt_ids trim up for max_new_tokens calculation (#23724) move text_prompt_ids trimming to top * Overhaul TF serving signatures + dummy inputs (#23234) * Let's try autodetecting serving sigs * Don't clobber existing sigs * Change shapes for multiplechoice models * Make default dummy inputs smarter too * Fix missing f-string * Let's YOLO a serving output too * Read __class__.__name__ properly * Don't just pass naked lists in there and expect it to be okay * Code cleanup * Update default serving sig * Clearer error messages * Further updates to the default serving output * make fixup * Update the serving output a bit more * Cleanups and renames, raise errors appropriately when we can't infer inputs * More renames * we're building in a functional context again, yolo * import DUMMY_INPUTS from the right place * import DUMMY_INPUTS from the right place * Support cross-attention in the dummies * Support cross-attention in the dummies * Complete removal of dummy/serving overrides in BERT * Complete removal of dummy/serving overrides in RoBERTa * Obliterate lots and lots of serving sig and dummy overrides * merge type hint changes * Fix for token_type_ids with vocab_size 1 * Add missing property decorator * Fix T5 and hopefully some models that take conv inputs * More signature pruning * Fix T5's signature * Fix Wav2Vec2 signature * Fix LongformerForMultipleChoice input signature * Fix BLIP and LED * Better default serving output error handling * Fix BART dummies * Fix dummies for cross-attention, esp encoder-decoder models * Fix visionencoderdecoder signature * Fix BLIP serving output * Small tweak to BART dummies * Cleanup the ugly parameter inspection line that I used in a few places * committed a breakpoint again * Move the text_dims check * Remove blip_text serving_output * Add decoder_input_ids to the default input sig * Remove all the manual overrides for encoder-decoder model signatures * Tweak longformer/led input sigs * Tweak default serving output * output.keys() -> output * make fixup * [Whisper] Reduce batch size in tests (#23736) * Fix the regex in `get_imports` to support multiline try blocks and excepts with specific exception types (#23725) * fix and test get_imports for multiline try blocks, and excepts with specific errors * fixup * add some more tests * 
add license * Fix sagemaker DP/MP (#23681) * Check for use_sagemaker_dp * Add a check for is_sagemaker_mp when setting _n_gpu again. Should be last broken thing * Try explicit check? * Quality * Enable prompts on the Hub (#23662) * Enable prompts on the Hub * Update src/transformers/tools/prompts.py Co-authored-by: amyeroberts <[email protected]> * Address review comments --------- Co-authored-by: amyeroberts <[email protected]> * Remove the last few TF serving sigs (#23738) Remove some more serving methods that (I think?) turned up while this PR was open * Fix `pip install --upgrade accelerate` command in modeling_utils.py (#23747) Fix command in modeling_utils.py * Add LlamaIndex to awesome-transformers.md (#23484) * Fix psuh_to_hub in Trainer when nothing needs pushing (#23751) * Revamp test selection for the example tests (#23737) * Revamp test selection for the example tests * Rename old XLA test and fake modif in run_glue * Fixes * Fake Trainer modif * Remove fake modifs * [LongFormer] code nits, removed unused parameters (#23749) * remove unused parameters * remove unused parameters in config * Fix is_ninja_available() (#23752) * Fix is_ninja_available() search ninja using subprocess instead of importlib. * Fix style * Fix doc * Fix style * Bump tornado from 6.0.4 to 6.3.2 in /examples/research_projects/lxmert (#23766) Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.0.4 to 6.3.2. - [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst) - [Commits](tornadoweb/tornado@v6.0.4...v6.3.2) --- updated-dependencies: - dependency-name: tornado dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump tornado from 6.0.4 to 6.3.2 in /examples/research_projects/visual_bert (#23767) Bump tornado in /examples/research_projects/visual_bert Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.0.4 to 6.3.2. - [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst) - [Commits](tornadoweb/tornado@v6.0.4...v6.3.2) --- updated-dependencies: - dependency-name: tornado dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [`Nllb-Moe`] Fix nllb moe accelerate issue (#23758) fix nllb moe accelerate issue * [OPT] Doc nit, using fast is fine (#23789) small doc nit * Fix RWKV backward on GPU (#23774) * Update trainer.mdx class_weights example (#23787) class_weights tensor should follow model's device * no_cuda does not take effect in non distributed environment (#23795) Signed-off-by: Wang, Yi <[email protected]> * Fix no such file or directory error (#23783) * Fix no such file or directory error * Address comment * Fix formatting issue * Log the right train_batch_size if using auto_find_batch_size and also log the adjusted value seperately. 
(#23800) * Log right bs * Log * Diff message * Enable code-specific revision for code on the Hub (#23799) * Enable code-specific revision for code on the Hub * invalidate old revision * [Time-Series] Autoformer model (#21891) * ran `transformers-cli add-new-model-like` * added `AutoformerLayernorm` and `AutoformerSeriesDecomposition` * added `decomposition_layer` in `init` and `moving_avg` to config * added `AutoformerAutoCorrelation` to encoder & decoder * removed caninical self attention `AutoformerAttention` * added arguments in config and model tester. Init works! 😁 * WIP autoformer attention with autocorrlation * fixed `attn_weights` size * wip time_delay_agg_training * fixing sizes and debug time_delay_agg_training * aggregation in training works! 😁 * `top_k_delays` -> `top_k_delays_index` and added `contiguous()` * wip time_delay_agg_inference * finish time_delay_agg_inference 😎 * added resize to autocorrelation * bug fix: added the length of the output signal to `irfft` * `attention_mask = None` in the decoder * fixed test: changed attention expected size, `test_attention_outputs` works! * removed unnecessary code * apply AutoformerLayernorm in final norm in enc & dec * added series decomposition to the encoder * added series decomp to decoder, with inputs * added trend todos * added autoformer to README * added to index * added autoformer.mdx * remove scaling and init attention_mask in the decoder * make style * fix copies * make fix-copies * inital fix-copies * fix from #22076 * make style * fix class names * added trend * added d_model and projection layers * added `trend_projection` source, and decomp layer init * added trend & seasonal init for decoder input * AutoformerModel cannot be copied as it has the decomp layer too * encoder can be copied from time series transformer * fixed generation and made distrb. out more robust * use context window to calculate decomposition * use the context_window for decomposition * use output_params helper * clean up AutoformerAttention * subsequences_length off by 1 * make fix copies * fix test * added init for nn.Conv1d * fix IGNORE_NON_TESTED * added model_doc * fix ruff * ignore tests * remove dup * fix SPECIAL_CASES_TO_ALLOW * do not copy due to conv1d weight init * remove unused imports * added short summary * added label_length and made the model non-autoregressive * added params docs * better doc for `factor` * fix tests * renamed `moving_avg` to `moving_average` * renamed `factor` to `autocorrelation_factor` * make style * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: NielsRogge <[email protected]> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: NielsRogge <[email protected]> * fix configurations * fix integration tests * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <[email protected]> * fixing `lags_sequence` doc * Revert "fixing `lags_sequence` doc" This reverts commit 21e3491. 
* Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <[email protected]> * Apply suggestions from code review Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <[email protected]> * model layers now take the config * added `layer_norm_eps` to the config * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <[email protected]> * added `config.layer_norm_eps` to AutoformerLayernorm * added `config.layer_norm_eps` to all layernorm layers * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <[email protected]> * fix variable names * added inital pretrained model * added use_cache docstring * doc strings for trend and use_cache * fix order of args * imports on one line * fixed get_lagged_subsequences docs * add docstring for create_network_inputs * get rid of layer_norm_eps config * add back layernorm * update fixture location * fix signature * use AutoformerModelOutput dataclass * fix pretrain config * no need as default exists * subclass ModelOutput * remove layer_norm_eps config * fix test_model_outputs_equivalence test * test hidden_states_output * make fix-copies * Update src/transformers/models/autoformer/configuration_autoformer.py Co-authored-by: amyeroberts <[email protected]> * removed unused attr * Update tests/models/autoformer/test_modeling_autoformer.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/models/autoformer/modeling_autoformer.py Co-authored-by: amyeroberts <[email protected]> * use AutoFormerDecoderOutput * fix formatting * fix formatting --------- Co-authored-by: Kashif Rasul <[email protected]> Co-authored-by: NielsRogge <[email protected]> Co-authored-by: amyeroberts <[email protected]> * add type hint in pipeline model argument (#23740) * add type hint in pipeline model argument * add pretrainedmodel and tfpretainedmodel type hint * make type hints string * TF SAM shape flexibility fixes (#23842) SAM shape flexibility fixes for compilation * fix Whisper tests on GPU (#23753) * move input features to GPU * skip these tests because undefined behavior * unskip tests * 🌐 [i18n-KO] Translated `fast_tokenizers.mdx` to Korean (#22956) * docs: ko: fast_tokenizer.mdx content - 
translated Co-Authored-By: Gabriel Yang <[email protected]> Co-Authored-By: Nayeon Han <[email protected]> Co-Authored-By: Hyeonseo Yun <[email protected]> Co-Authored-By: Sohyun Sim <[email protected]> Co-Authored-By: Jungnerd <[email protected]> Co-Authored-By: Wonhyeong Seo <[email protected]> * Update docs/source/ko/fast_tokenizers.mdx Co-authored-by: Sohyun Sim <[email protected]> * Update docs/source/ko/fast_tokenizers.mdx Co-authored-by: Sohyun Sim <[email protected]> * Update docs/source/ko/fast_tokenizers.mdx Co-authored-by: Sohyun Sim <[email protected]> * Update docs/source/ko/fast_tokenizers.mdx Co-authored-by: Sohyun Sim <[email protected]> * Update docs/source/ko/fast_tokenizers.mdx Co-authored-by: Sohyun Sim <[email protected]> * Update docs/source/ko/fast_tokenizers.mdx Co-authored-by: Sohyun Sim <[email protected]> * Update docs/source/ko/fast_tokenizers.mdx Co-authored-by: Hyeonseo Yun <[email protected]> * Update fast_tokenizers.mdx * Update fast_tokenizers.mdx * Update fast_tokenizers.mdx * Update fast_tokenizers.mdx * Update _toctree.yml --------- Co-authored-by: Gabriel Yang <[email protected]> Co-authored-by: Nayeon Han <[email protected]> Co-authored-by: Hyeonseo Yun <[email protected]> Co-authored-by: Sohyun Sim <[email protected]> Co-authored-by: Jungnerd <[email protected]> Co-authored-by: Wonhyeong Seo <[email protected]> Co-authored-by: Hyeonseo Yun <[email protected]> * [i18n-KO] Translated video_classification.mdx to Korean (#23026) * task/video_classification translated Co-Authored-By: Hyeonseo Yun <[email protected]> Co-Authored-By: Gabriel Yang <[email protected]> Co-Authored-By: Sohyun Sim <[email protected]> Co-Authored-By: Nayeon Han <[email protected]> Co-Authored-By: Wonhyeong Seo <[email protected]> Co-Authored-By: Jungnerd <[email protected]> * Update docs/source/ko/tasks/video_classification.mdx Co-authored-by: Jungnerd <[email protected]> * Update docs/source/ko/tasks/video_classification.mdx Co-authored-by: Jungnerd <[email protected]> * Update docs/source/ko/tasks/video_classification.mdx Co-authored-by: Jungnerd <[email protected]> * Update docs/source/ko/tasks/video_classification.mdx Co-authored-by: Jungnerd <[email protected]> * Update docs/source/ko/tasks/video_classification.mdx Co-authored-by: Jungnerd <[email protected]> * Update docs/source/ko/tasks/video_classification.mdx Co-authored-by: Jungnerd <[email protected]> * Update docs/source/ko/tasks/video_classification.mdx Co-authored-by: Jungnerd <[email protected]> * Update docs/source/ko/tasks/video_classification.mdx Co-authored-by: Jungnerd <[email protected]> * Update docs/source/ko/tasks/video_classification.mdx Co-authored-by: Sohyun Sim <[email protected]> * Update docs/source/ko/tasks/video_classification.mdx Co-authored-by: Sohyun Sim <[email protected]> * Apply suggestions from code review Co-authored-by: Sohyun Sim <[email protected]> Co-authored-by: Hyeonseo Yun <[email protected]> Co-authored-by: Jungnerd <[email protected]> Co-authored-by: Gabriel Yang <[email protected]> * Update video_classification.mdx * Update _toctree.yml * Update _toctree.yml * Update _toctree.yml * Update _toctree.yml --------- Co-authored-by: Hyeonseo Yun <[email protected]> Co-authored-by: Gabriel Yang <[email protected]> Co-authored-by: Sohyun Sim <[email protected]> Co-authored-by: Nayeon Han <[email protected]> Co-authored-by: Wonhyeong Seo <[email protected]> Co-authored-by: Jungnerd <[email protected]> Co-authored-by: Hyeonseo Yun <[email protected]> * 🌐 [i18n-KO] Translated 
`troubleshooting.mdx` to Korean (#23166) * docs: ko: troubleshooting.mdx * revised: fix _toctree.yml #23112 * feat: nmt draft `troubleshooting.mdx` * fix: manual edits `troubleshooting.mdx` * revised: resolve suggestions troubleshooting.mdx Co-authored-by: Sohyun Sim <[email protected]> --------- Co-authored-by: Sohyun Sim <[email protected]> * Adds a FlyteCallback (#23759) * initial flyte callback * lint * logs should still be saved to Flyte even if pandas isn't install (unlikely) * cr - flyte team * add docs for Flytecallback * fix doc string - cr sgugger * Apply suggestions from code review cr - sgugger fix doc strings Co-authored-by: Sylvain Gugger <[email protected]> --------- Co-authored-by: Sylvain Gugger <[email protected]> * Update collating_graphormer.py (#23862) * [LlamaTokenizerFast] nit update `post_processor` on the fly (#23855) * Update the processor when changing add_eos and add_bos * fixup * update * add a test * fix failing tests * fixup * #23388 Issue: Update RoBERTa configuration (#23863) * [from_pretrained] imporve the error message when `_no_split_modules` is not defined (#23861) * Better warning * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <[email protected]> * format line --------- Co-authored-by: Sylvain Gugger <[email protected]> --------- Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: Wang, Yi A <[email protected]> Signed-off-by: Wang, Yi <[email protected]> Co-authored-by: Tyler <[email protected]> Co-authored-by: Joshua Lochner <[email protected]> Co-authored-by: zspo <[email protected]> Co-authored-by: Younes Belkada <[email protected]> Co-authored-by: Zachary Mueller <[email protected]> Co-authored-by: Tim Dettmers <[email protected]> Co-authored-by: younesbelkada <[email protected]> Co-authored-by: LWprogramming <[email protected]> Co-authored-by: Sanchit Gandhi <[email protected]> Co-authored-by: sshahrokhi <[email protected]> Co-authored-by: Matt <[email protected]> Co-authored-by: Yih-Dar <[email protected]> Co-authored-by: ydshieh <[email protected]> Co-authored-by: NielsRogge <[email protected]> Co-authored-by: Nicolas Patry <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Alex <[email protected]> Co-authored-by: Nayeon Han <[email protected]> Co-authored-by: Hyeonseo Yun <[email protected]> Co-authored-by: Sohyun Sim <[email protected]> Co-authored-by: Gabriel Yang <[email protected]> Co-authored-by: Wonhyeong Seo <[email protected]> Co-authored-by: Jungnerd <[email protected]> Co-authored-by: 小桐桐 <[email protected]> Co-authored-by: Sylvain Gugger <[email protected]> Co-authored-by: Wang, Yi <[email protected]> Co-authored-by: Maria Khalusova <[email protected]> Co-authored-by: regisss <[email protected]> Co-authored-by: uchuhimo <[email protected]> Co-authored-by: Yuxian Qiu <[email protected]> Co-authored-by: pagarsky <[email protected]> Co-authored-by: Connor Henderson <[email protected]> Co-authored-by: Daniel King <[email protected]> Co-authored-by: amyeroberts <[email protected]> Co-authored-by: Eric J. 
Wang <[email protected]> Co-authored-by: Ravi Theja <[email protected]> Co-authored-by: Arthur <[email protected]> Co-authored-by: 玩火 <[email protected]> Co-authored-by: amitportnoy <[email protected]> Co-authored-by: Ran Ran <[email protected]> Co-authored-by: Eli Simhayev <[email protected]> Co-authored-by: Kashif Rasul <[email protected]> Co-authored-by: Samin Yasar <[email protected]> Co-authored-by: Matthijs Hollemans <[email protected]> Co-authored-by: Kihoon Son <[email protected]> Co-authored-by: Hyeonseo Yun <[email protected]> Co-authored-by: peridotml <[email protected]> Co-authored-by: Clémentine Fourrier <[email protected]> Co-authored-by: Vijeth Moudgalya <[email protected]>
1 parent c9f3cff commit 2d0e384
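
Among the changes rolled up above, the largest user-facing one is the 4-bit bitsandbytes support from #23479, which lets `from_pretrained` quantize a base model to 4 bits for QLoRA-style fine-tuning. A minimal sketch of that loading path, assuming `bitsandbytes` is installed and a GPU is available (the checkpoint name and the particular quantization options are illustrative, not part of this commit):

    # Sketch of the 4-bit loading path added in #23479; checkpoint and options are illustrative.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # quantize linear weights to 4 bits at load time
        bnb_4bit_quant_type="nf4",              # NormalFloat4 data type
        bnb_4bit_compute_dtype=torch.bfloat16,  # keep matmuls in bf16 for stability
    )

    tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
    model = AutoModelForCausalLM.from_pretrained(
        "facebook/opt-350m",
        quantization_config=quant_config,
        device_map="auto",
    )

    inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))

The same PR also accepts `load_in_4bit=True` directly in `from_pretrained`, and #23217 exposes the paged and Lion optimizers to the `Trainer`.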

File tree

320 files changed: +9944 additions, -8114 deletions


.circleci/config.yml

Lines changed: 7 additions & 14 deletions
@@ -43,6 +43,12 @@ jobs:
           else
             touch test_preparation/test_list.txt
           fi
+      - run: |
+          if [ -f examples_test_list.txt ]; then
+            mv examples_test_list.txt test_preparation/examples_test_list.txt
+          else
+            touch test_preparation/examples_test_list.txt
+          fi
       - run: |
           if [ -f doctest_list.txt ]; then
             cp doctest_list.txt test_preparation/doctest_list.txt
@@ -62,19 +68,6 @@ jobs:
           else
             touch test_preparation/filtered_test_list.txt
           fi
-      - run: python utils/tests_fetcher.py --filters tests examples | tee examples_tests_fetched_summary.txt
-      - run: |
-          if [ -f test_list.txt ]; then
-            mv test_list.txt test_preparation/examples_test_list.txt
-          else
-            touch test_preparation/examples_test_list.txt
-          fi
-      - run: |
-          if [ -f filtered_test_list_cross_tests.txt ]; then
-            mv filtered_test_list_cross_tests.txt test_preparation/filtered_test_list_cross_tests.txt
-          else
-            touch test_preparation/filtered_test_list_cross_tests.txt
-          fi
       - store_artifacts:
           path: test_preparation/test_list.txt
       - store_artifacts:
@@ -111,7 +104,7 @@ jobs:
       - run: |
           mkdir test_preparation
           echo -n "tests" > test_preparation/test_list.txt
-          echo -n "tests" > test_preparation/examples_test_list.txt
+          echo -n "all" > test_preparation/examples_test_list.txt
           echo -n "tests/repo_utils" > test_preparation/test_repo_utils.txt
       - run: |
           echo -n "tests" > test_list.txt

.circleci/create_circleci_config.py

Lines changed: 11 additions & 4 deletions
@@ -342,7 +342,6 @@ def job_name(self):
         "pip install .[sklearn,torch,sentencepiece,testing,torch-speech]",
         "pip install -r examples/pytorch/_tests_requirements.txt",
     ],
-    tests_to_run="./examples/pytorch/",
 )


@@ -355,7 +354,6 @@ def job_name(self):
         "pip install .[sklearn,tensorflow,sentencepiece,testing]",
         "pip install -r examples/tensorflow/_tests_requirements.txt",
     ],
-    tests_to_run="./examples/tensorflow/",
 )


@@ -367,7 +365,6 @@ def job_name(self):
         "pip install .[flax,testing,sentencepiece]",
         "pip install -r examples/flax/_tests_requirements.txt",
     ],
-    tests_to_run="./examples/flax/",
 )


@@ -551,7 +548,17 @@ def create_circleci_config(folder=None):

     example_file = os.path.join(folder, "examples_test_list.txt")
     if os.path.exists(example_file) and os.path.getsize(example_file) > 0:
-        jobs.extend(EXAMPLES_TESTS)
+        with open(example_file, "r", encoding="utf-8") as f:
+            example_tests = f.read().split(" ")
+        for job in EXAMPLES_TESTS:
+            framework = job.name.replace("examples_", "").replace("torch", "pytorch")
+            if example_tests == "all":
+                job.tests_to_run = [f"examples/{framework}"]
+            else:
+                job.tests_to_run = [f for f in example_tests if f.startswith(f"examples/{framework}")]
+
+            if len(job.tests_to_run) > 0:
+                jobs.append(job)

     doctest_file = os.path.join(folder, "doctest_list.txt")
     if os.path.exists(doctest_file):
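
In plain terms, the examples test list file now carries a space-separated list of test paths (or the sentinel "all"), and each framework's examples job keeps only the entries under its own examples/<framework>/ directory, being dropped entirely when nothing matches. A standalone sketch of that filtering step (the file contents and job names below are hypothetical, for illustration only):

    # Hypothetical illustration of the per-framework filtering introduced in create_circleci_config.py.
    example_tests = "examples/pytorch/test_pytorch_examples.py examples/flax/test_flax_examples.py".split(" ")

    for job_name in ["examples_torch", "examples_tensorflow", "examples_flax"]:
        framework = job_name.replace("examples_", "").replace("torch", "pytorch")
        tests_to_run = [t for t in example_tests if t.startswith(f"examples/{framework}")]
        if tests_to_run:  # only schedule the job when it has something to run
            print(job_name, "->", tests_to_run)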

.github/workflows/self-push.yml

Lines changed: 16 additions & 0 deletions
@@ -195,6 +195,10 @@ jobs:
           git checkout ${{ env.CI_SHA }}
           echo "log = $(git log -n 1)"

+      - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
+        working-directory: /transformers
+        run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
+
       - name: Echo folder ${{ matrix.folders }}
         shell: bash
         # For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
@@ -284,6 +288,10 @@ jobs:
           git checkout ${{ env.CI_SHA }}
           echo "log = $(git log -n 1)"

+      - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
+        working-directory: /transformers
+        run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
+
       - name: Echo folder ${{ matrix.folders }}
         shell: bash
         # For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
@@ -373,6 +381,10 @@ jobs:
           git checkout ${{ env.CI_SHA }}
           echo "log = $(git log -n 1)"

+      - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
+        working-directory: /workspace/transformers
+        run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
+
       - name: Remove cached torch extensions
         run: rm -rf /github/home/.cache/torch_extensions/

@@ -459,6 +471,10 @@ jobs:
           git checkout ${{ env.CI_SHA }}
           echo "log = $(git log -n 1)"

+      - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
+        working-directory: /workspace/transformers
+        run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
+
       - name: Remove cached torch extensions
         run: rm -rf /github/home/.cache/torch_extensions/

.github/workflows/self-scheduled.yml

Lines changed: 24 additions & 0 deletions
@@ -119,6 +119,10 @@ jobs:
         working-directory: /transformers
         run: git fetch && git checkout ${{ github.sha }}

+      - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
+        working-directory: /transformers
+        run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
+
       - name: NVIDIA-SMI
         run: |
           nvidia-smi
@@ -176,6 +180,10 @@ jobs:
         working-directory: /transformers
         run: git fetch && git checkout ${{ github.sha }}

+      - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
+        working-directory: /transformers
+        run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
+
       - name: NVIDIA-SMI
         run: |
           nvidia-smi
@@ -221,6 +229,10 @@ jobs:
         working-directory: /transformers
         run: git fetch && git checkout ${{ github.sha }}

+      - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
+        working-directory: /transformers
+        run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
+
       - name: NVIDIA-SMI
         run: |
           nvidia-smi
@@ -268,6 +280,10 @@ jobs:
         working-directory: /transformers
         run: git fetch && git checkout ${{ github.sha }}

+      - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
+        working-directory: /transformers
+        run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
+
       - name: NVIDIA-SMI
         run: |
           nvidia-smi
@@ -315,6 +331,10 @@ jobs:
         run: |
           git fetch && git checkout ${{ github.sha }}

+      - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
+        working-directory: /transformers
+        run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
+
       - name: NVIDIA-SMI
         run: |
           nvidia-smi
@@ -361,6 +381,10 @@ jobs:
         working-directory: /workspace/transformers
         run: git fetch && git checkout ${{ github.sha }}

+      - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
+        working-directory: /workspace/transformers
+        run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
+
       - name: Remove cached torch extensions
         run: rm -rf /github/home/.cache/torch_extensions/

README.md

Lines changed: 1 addition & 0 deletions
@@ -292,6 +292,7 @@ Current number of checkpoints: ![](https://img.shields.io/endpoint?url=https://h
 1. **[ALIGN](https://huggingface.co/docs/transformers/model_doc/align)** (from Google Research) released with the paper [Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision](https://arxiv.org/abs/2102.05918) by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.
 1. **[AltCLIP](https://huggingface.co/docs/transformers/model_doc/altclip)** (from BAAI) released with the paper [AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities](https://arxiv.org/abs/2211.06679) by Chen, Zhongzhi and Liu, Guang and Zhang, Bo-Wen and Ye, Fulong and Yang, Qinghong and Wu, Ledell.
 1. **[Audio Spectrogram Transformer](https://huggingface.co/docs/transformers/model_doc/audio-spectrogram-transformer)** (from MIT) released with the paper [AST: Audio Spectrogram Transformer](https://arxiv.org/abs/2104.01778) by Yuan Gong, Yu-An Chung, James Glass.
+1. **[Autoformer](https://huggingface.co/docs/transformers/main/model_doc/autoformer)** (from Tsinghua University) released with the paper [Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting](https://arxiv.org/abs/2106.13008) by Haixu Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long.
 1. **[BART](https://huggingface.co/docs/transformers/model_doc/bart)** (from Facebook) released with the paper [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461) by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer.
 1. **[BARThez](https://huggingface.co/docs/transformers/model_doc/barthez)** (from École polytechnique) released with the paper [BARThez: a Skilled Pretrained French Sequence-to-Sequence Model](https://arxiv.org/abs/2010.12321) by Moussa Kamal Eddine, Antoine J.-P. Tixier, Michalis Vazirgiannis.
 1. **[BARTpho](https://huggingface.co/docs/transformers/model_doc/bartpho)** (from VinAI Research) released with the paper [BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese](https://arxiv.org/abs/2109.09701) by Nguyen Luong Tran, Duong Minh Le and Dat Quoc Nguyen.
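
The Autoformer entry added to each README corresponds to the time-series model merged in #21891. A minimal sketch of instantiating the new classes from a freshly initialized config (the hyperparameter values are illustrative, not defaults taken from the PR):

    # Build a randomly initialized Autoformer for long-term series forecasting.
    # Hyperparameter values here are illustrative only.
    from transformers import AutoformerConfig, AutoformerForPrediction

    config = AutoformerConfig(
        prediction_length=24,  # forecast horizon
        context_length=48,     # conditioning window fed to the encoder
    )
    model = AutoformerForPrediction(config)
    print(f"{sum(p.numel() for p in model.parameters()):,} parameters")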

README_es.md

Lines changed: 1 addition & 0 deletions
@@ -267,6 +267,7 @@ Número actual de puntos de control: ![](https://img.shields.io/endpoint?url=htt
 1. **[ALIGN](https://huggingface.co/docs/transformers/model_doc/align)** (from Google Research) released with the paper [Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision](https://arxiv.org/abs/2102.05918) by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.
 1. **[AltCLIP](https://huggingface.co/docs/transformers/model_doc/altclip)** (from BAAI) released with the paper [AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities](https://arxiv.org/abs/2211.06679) by Chen, Zhongzhi and Liu, Guang and Zhang, Bo-Wen and Ye, Fulong and Yang, Qinghong and Wu, Ledell.
 1. **[Audio Spectrogram Transformer](https://huggingface.co/docs/transformers/model_doc/audio-spectrogram-transformer)** (from MIT) released with the paper [AST: Audio Spectrogram Transformer](https://arxiv.org/abs/2104.01778) by Yuan Gong, Yu-An Chung, James Glass.
+1. **[Autoformer](https://huggingface.co/docs/transformers/main/model_doc/autoformer)** (from Tsinghua University) released with the paper [Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting](https://arxiv.org/abs/2106.13008) by Haixu Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long.
 1. **[BART](https://huggingface.co/docs/transformers/model_doc/bart)** (from Facebook) released with the paper [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461) by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer.
 1. **[BARThez](https://huggingface.co/docs/transformers/model_doc/barthez)** (from École polytechnique) released with the paper [BARThez: a Skilled Pretrained French Sequence-to-Sequence Model](https://arxiv.org/abs/2010.12321) by Moussa Kamal Eddine, Antoine J.-P. Tixier, Michalis Vazirgiannis.
 1. **[BARTpho](https://huggingface.co/docs/transformers/model_doc/bartpho)** (from VinAI Research) released with the paper [BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese](https://arxiv.org/abs/2109.09701) by Nguyen Luong Tran, Duong Minh Le and Dat Quoc Nguyen.

README_hd.md

Lines changed: 1 addition & 0 deletions
@@ -239,6 +239,7 @@ conda install -c huggingface transformers
 1. **[ALIGN](https://huggingface.co/docs/transformers/model_doc/align)** (Google Research से) Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig. द्वाराअनुसंधान पत्र [Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision](https://arxiv.org/abs/2102.05918) के साथ जारी किया गया
 1. **[AltCLIP](https://huggingface.co/docs/transformers/model_doc/altclip)** (from BAAI) released with the paper [AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities](https://arxiv.org/abs/2211.06679) by Chen, Zhongzhi and Liu, Guang and Zhang, Bo-Wen and Ye, Fulong and Yang, Qinghong and Wu, Ledell.
 1. **[Audio Spectrogram Transformer](https://huggingface.co/docs/transformers/model_doc/audio-spectrogram-transformer)** (from MIT) released with the paper [AST: Audio Spectrogram Transformer](https://arxiv.org/abs/2104.01778) by Yuan Gong, Yu-An Chung, James Glass.
+1. **[Autoformer](https://huggingface.co/docs/transformers/main/model_doc/autoformer)** (from Tsinghua University) released with the paper [Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting](https://arxiv.org/abs/2106.13008) by Haixu Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long.
 1. **[BART](https://huggingface.co/docs/transformers/model_doc/bart)** (फेसबुक) साथ थीसिस [बार्ट: प्राकृतिक भाषा निर्माण, अनुवाद के लिए अनुक्रम-से-अनुक्रम पूर्व प्रशिक्षण , और समझ] (https://arxiv.org/pdf/1910.13461.pdf) पर निर्भर माइक लुईस, यिनहान लियू, नमन गोयल, मार्जन ग़ज़विनिनेजाद, अब्देलरहमान मोहम्मद, ओमर लेवी, वेस स्टोयानोव और ल्यूक ज़ेटलमॉयर
 1. **[BARThez](https://huggingface.co/docs/transformers/model_doc/barthez)** (से École polytechnique) साथ थीसिस [BARThez: a Skilled Pretrained French Sequence-to-Sequence Model](https://arxiv.org/abs/2010.12321) पर निर्भर Moussa Kamal Eddine, Antoine J.-P. Tixier, Michalis Vazirgiannis रिहाई।
 1. **[BARTpho](https://huggingface.co/docs/transformers/model_doc/bartpho)** (VinAI Research से) साथ में पेपर [BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese](https://arxiv.org/abs/2109.09701)गुयेन लुओंग ट्रान, डुओंग मिन्ह ले और डाट क्वोक गुयेन द्वारा पोस्ट किया गया।

README_ja.md

Lines changed: 1 addition & 0 deletions
@@ -301,6 +301,7 @@ Flax、PyTorch、TensorFlowをcondaでインストールする方法は、それ
 1. **[ALIGN](https://huggingface.co/docs/transformers/model_doc/align)** (Google Research から) Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig. から公開された研究論文 [Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision](https://arxiv.org/abs/2102.05918)
 1. **[AltCLIP](https://huggingface.co/docs/transformers/model_doc/altclip)** (BAAI から) Chen, Zhongzhi and Liu, Guang and Zhang, Bo-Wen and Ye, Fulong and Yang, Qinghong and Wu, Ledell から公開された研究論文: [AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities](https://arxiv.org/abs/2211.06679)
 1. **[Audio Spectrogram Transformer](https://huggingface.co/docs/transformers/model_doc/audio-spectrogram-transformer)** (MIT から) Yuan Gong, Yu-An Chung, James Glass から公開された研究論文: [AST: Audio Spectrogram Transformer](https://arxiv.org/abs/2104.01778)
+1. **[Autoformer](https://huggingface.co/docs/transformers/main/model_doc/autoformer)** (from Tsinghua University) released with the paper [Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting](https://arxiv.org/abs/2106.13008) by Haixu Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long.
 1. **[BART](https://huggingface.co/docs/transformers/model_doc/bart)** (Facebook から) Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer から公開された研究論文: [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461)
 1. **[BARThez](https://huggingface.co/docs/transformers/model_doc/barthez)** (École polytechnique から) Moussa Kamal Eddine, Antoine J.-P. Tixier, Michalis Vazirgiannis から公開された研究論文: [BARThez: a Skilled Pretrained French Sequence-to-Sequence Model](https://arxiv.org/abs/2010.12321)
 1. **[BARTpho](https://huggingface.co/docs/transformers/model_doc/bartpho)** (VinAI Research から) Nguyen Luong Tran, Duong Minh Le and Dat Quoc Nguyen から公開された研究論文: [BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese](https://arxiv.org/abs/2109.09701)
