Skip to content

Conversation

@younesbelkada
Copy link
Contributor

@younesbelkada younesbelkada commented Jul 27, 2023

What does this PR do?

Refactors the IdeficsConfig to match the configuration composition patterns of multimodal models on transformers

original PR: #24796

Summary of the changes

  • Removed the copy of CLIPTextConfig, CLIPConfig in clip.py as they were used for type hints only
  • Retrieve the correct attributes on modeling_idefics.py (i.e. attributes from perceiver_config & vision_config
  • Adapted CI tests accordingly
  • Make the utils/check_config_attributes.py pass - since there is a duplicated CLIPVisionConfig (1 in the clip itself and the other in configuration_idefics.py), that script checks the unused attributes of that config for some reason (didn't
    investigated further)

For compatiblity with weights on the Hub, changes similar than: https://huggingface.co/HuggingFaceM4/tiny-random-idefics/discussions/3 needs to be applied

The docstring of the new config objects needs to be cleaned up, but can be done on the main PR.

cc @stas00

@younesbelkada younesbelkada marked this pull request as draft July 27, 2023 15:55
@younesbelkada younesbelkada marked this pull request as ready for review July 27, 2023 16:03
@younesbelkada younesbelkada requested a review from stas00 July 27, 2023 16:03
@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jul 27, 2023

The documentation is not available anymore as the PR was closed or merged.

@stas00 stas00 merged commit 9d38545 into huggingface:add-model-idefics Jul 27, 2023
stas00 added a commit that referenced this pull request Aug 18, 2023
* rename

* restore

* mappings

* unedited tests+docs

* docs

* fixes

* fix auto-sync breakage

* cleanup

* wip

* wip

* add fetch_images

* remove einops dependency

* update

* fix

* fix

* fix

* fix

* fix

* re-add

* add batching

* rework

* fix

* improve

* add Leo as I am extending his work

* cleanup

* fix

* cleanup

* slow-test

* fix

* fix

* fixes

* deal with warning

* rename modified llama classes

* rework fetch_images

* alternative implementation

* cleanup

* strict version

* cleanup

* [`IDEFICS`] Fix idefics ci (#25056)

* Fix IDEFICS CI

* fix test file

* fixup

* some changes to make tests pass

* fix

* fixup

* Update src/transformers/models/idefics/configuration_idefics.py

Co-authored-by: Stas Bekman <[email protected]>

---------

Co-authored-by: Stas Bekman <[email protected]>

* remove compat checks

* style

* explain that Idefics is not for training from scratch

* require pt>=2.0

* fix idefics vision config (#25092)

* fix idefics vision config

* fixup

* clean

* Update src/transformers/models/idefics/configuration_idefics.py

---------

Co-authored-by: Stas Bekman <[email protected]>

* cleanup

* style

* cleanup

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* upcase

* sequence of images

* handle the case with no images

* Update src/transformers/image_processing_utils.py

Co-authored-by: Victor SANH <[email protected]>

* support pure lm take 2

* support tokenizer options

* parameterize num_channels

* fix upcase

* s|IdeficsForCausalLM|IdeficsForVisionText2Text|g

* manual to one line

* addressing review

* unbreak

* remove clip dependency

* fix test

* consistency

* PIL import

* Idefics prefix

* Idefics prefix

* hack to make tests work

* style

* fix

* fix

* revert

* try/finally

* cleanup

* clean up

* move

* [`IDEFICS`] Fix idefics config refactor (#25149)

* refactor config

* nuke init weights

* more refactor

* oops

* remove visual question answering pipeline support

* Update src/transformers/models/idefics/clip.py

Co-authored-by: Stas Bekman <[email protected]>

* Update src/transformers/models/idefics/modeling_idefics.py

* cleanup

* mv clip.py vision.py

* tidyup

---------

Co-authored-by: Stas Bekman <[email protected]>
Co-authored-by: Stas Bekman <[email protected]>

* fix

* license

* condition on pt

* fix

* style

* fix

* rm torchvision dependency, allow custom transforms

* address review

* rework device arg

* add_eos_token

* s/transforms/transform/

* fix top level imports

* fix return value

* cleanup

* cleanup

* fix

* style

* license

* license

* Update src/transformers/models/idefics/image_processing_idefics.py

Co-authored-by: Sylvain Gugger <[email protected]>

* add a wrapper to freeze vision layears

* tidyup

* use the correct std/mean settings

* parameterize values from config

* add tests/models/idefics/test_image_processing_idefics.py

* add test_processor_idefics.py

* cleanup

* cleanups

* fix

* fix

* move to the right group

* style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* add perceiver config

* reset

* missing arg docs

* Apply suggestions from code review

Co-authored-by: Leo Tronchon <[email protected]>

* address review comments

* inject automatic end of utterance tokens (#25218)

* inject automatic end of utterance tokens

* fix

* fix

* fix

* rework to not use the config

* not end_of_utterance_token at the end

* Update src/transformers/models/idefics/processing_idefics.py

Co-authored-by: Sylvain Gugger <[email protected]>

* address review

* Apply suggestions from code review

Co-authored-by: Joao Gante <[email protected]>

* Update src/transformers/image_processing_utils.py

Co-authored-by: Nicolas Patry <[email protected]>

* [`Idefics`] add image_embeddings option in generate-related methods (#25442)

* add image_embeddings option in generate-related methods

* style

* rename image_embeddings and allow perceiver embeddings precomputation

* compute embeddings within generate

* make is_encoder_decoder= True the default in config

* nested if else fix

* better triple check

* switch if elif order for pixel values / img embeds

* update model_kwargs perceiver only at the end

* use _prepare_model_inputs instead of encoder_decoder logic

* fix comment typo

* fix config default for is_encoder_decoder

* style

* add typehints

* precompute in forward

* doc builder

* style

* pop instead of get image hidden states

* Trigger CI

* Update src/transformers/models/idefics/modeling_idefics.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/models/idefics/modeling_idefics.py

Co-authored-by: Arthur <[email protected]>

* fix * + indentation + style

* simplify a bit the use_resampler logic using comments

* update diocstrings

* Trigger CI

---------

Co-authored-by: Arthur <[email protected]>

* fix rebase changes

* unbreak #25237 - to be fixed in follow up PRs

* is_composition = False

* no longer needed

---------

Co-authored-by: leot13 <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: Victor SANH <[email protected]>
Co-authored-by: Joao Gante <[email protected]>
Co-authored-by: Nicolas Patry <[email protected]>
Co-authored-by: Arthur <[email protected]>
sgugger added a commit that referenced this pull request Aug 21, 2023
* rename

* restore

* mappings

* unedited tests+docs

* docs

* fixes

* fix auto-sync breakage

* cleanup

* wip

* wip

* add fetch_images

* remove einops dependency

* update

* fix

* fix

* fix

* fix

* fix

* re-add

* add batching

* rework

* fix

* improve

* add Leo as I am extending his work

* cleanup

* fix

* cleanup

* slow-test

* fix

* fix

* fixes

* deal with warning

* rename modified llama classes

* rework fetch_images

* alternative implementation

* cleanup

* strict version

* cleanup

* [`IDEFICS`] Fix idefics ci (#25056)

* Fix IDEFICS CI

* fix test file

* fixup

* some changes to make tests pass

* fix

* fixup

* Update src/transformers/models/idefics/configuration_idefics.py

Co-authored-by: Stas Bekman <[email protected]>

---------

Co-authored-by: Stas Bekman <[email protected]>

* remove compat checks

* style

* explain that Idefics is not for training from scratch

* require pt>=2.0

* fix idefics vision config (#25092)

* fix idefics vision config

* fixup

* clean

* Update src/transformers/models/idefics/configuration_idefics.py

---------

Co-authored-by: Stas Bekman <[email protected]>

* cleanup

* style

* cleanup

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* upcase

* sequence of images

* handle the case with no images

* Update src/transformers/image_processing_utils.py

Co-authored-by: Victor SANH <[email protected]>

* support pure lm take 2

* support tokenizer options

* parameterize num_channels

* fix upcase

* s|IdeficsForCausalLM|IdeficsForVisionText2Text|g

* manual to one line

* addressing review

* unbreak

* remove clip dependency

* fix test

* consistency

* PIL import

* Idefics prefix

* Idefics prefix

* hack to make tests work

* style

* fix

* fix

* revert

* try/finally

* cleanup

* clean up

* move

* [`IDEFICS`] Fix idefics config refactor (#25149)

* refactor config

* nuke init weights

* more refactor

* oops

* remove visual question answering pipeline support

* Update src/transformers/models/idefics/clip.py

Co-authored-by: Stas Bekman <[email protected]>

* Update src/transformers/models/idefics/modeling_idefics.py

* cleanup

* mv clip.py vision.py

* tidyup

---------

Co-authored-by: Stas Bekman <[email protected]>
Co-authored-by: Stas Bekman <[email protected]>

* fix

* license

* condition on pt

* fix

* style

* fix

* rm torchvision dependency, allow custom transforms

* address review

* rework device arg

* add_eos_token

* s/transforms/transform/

* fix top level imports

* fix return value

* cleanup

* cleanup

* fix

* style

* license

* license

* Update src/transformers/models/idefics/image_processing_idefics.py

Co-authored-by: Sylvain Gugger <[email protected]>

* add a wrapper to freeze vision layears

* tidyup

* use the correct std/mean settings

* parameterize values from config

* add tests/models/idefics/test_image_processing_idefics.py

* add test_processor_idefics.py

* cleanup

* cleanups

* fix

* fix

* move to the right group

* style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* add perceiver config

* reset

* missing arg docs

* Apply suggestions from code review

Co-authored-by: Leo Tronchon <[email protected]>

* address review comments

* inject automatic end of utterance tokens (#25218)

* inject automatic end of utterance tokens

* fix

* fix

* fix

* rework to not use the config

* not end_of_utterance_token at the end

* Update src/transformers/models/idefics/processing_idefics.py

Co-authored-by: Sylvain Gugger <[email protected]>

* address review

* Apply suggestions from code review

Co-authored-by: Joao Gante <[email protected]>

* Update src/transformers/image_processing_utils.py

Co-authored-by: Nicolas Patry <[email protected]>

* [`Idefics`] add image_embeddings option in generate-related methods (#25442)

* add image_embeddings option in generate-related methods

* style

* rename image_embeddings and allow perceiver embeddings precomputation

* compute embeddings within generate

* make is_encoder_decoder= True the default in config

* nested if else fix

* better triple check

* switch if elif order for pixel values / img embeds

* update model_kwargs perceiver only at the end

* use _prepare_model_inputs instead of encoder_decoder logic

* fix comment typo

* fix config default for is_encoder_decoder

* style

* add typehints

* precompute in forward

* doc builder

* style

* pop instead of get image hidden states

* Trigger CI

* Update src/transformers/models/idefics/modeling_idefics.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/models/idefics/modeling_idefics.py

Co-authored-by: Arthur <[email protected]>

* fix * + indentation + style

* simplify a bit the use_resampler logic using comments

* update diocstrings

* Trigger CI

---------

Co-authored-by: Arthur <[email protected]>

* fix rebase changes

* unbreak #25237 - to be fixed in follow up PRs

* is_composition = False

* no longer needed

---------

Co-authored-by: leot13 <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: Victor SANH <[email protected]>
Co-authored-by: Joao Gante <[email protected]>
Co-authored-by: Nicolas Patry <[email protected]>
Co-authored-by: Arthur <[email protected]>
parambharat pushed a commit to parambharat/transformers that referenced this pull request Sep 26, 2023
* rename

* restore

* mappings

* unedited tests+docs

* docs

* fixes

* fix auto-sync breakage

* cleanup

* wip

* wip

* add fetch_images

* remove einops dependency

* update

* fix

* fix

* fix

* fix

* fix

* re-add

* add batching

* rework

* fix

* improve

* add Leo as I am extending his work

* cleanup

* fix

* cleanup

* slow-test

* fix

* fix

* fixes

* deal with warning

* rename modified llama classes

* rework fetch_images

* alternative implementation

* cleanup

* strict version

* cleanup

* [`IDEFICS`] Fix idefics ci (huggingface#25056)

* Fix IDEFICS CI

* fix test file

* fixup

* some changes to make tests pass

* fix

* fixup

* Update src/transformers/models/idefics/configuration_idefics.py

Co-authored-by: Stas Bekman <[email protected]>

---------

Co-authored-by: Stas Bekman <[email protected]>

* remove compat checks

* style

* explain that Idefics is not for training from scratch

* require pt>=2.0

* fix idefics vision config (huggingface#25092)

* fix idefics vision config

* fixup

* clean

* Update src/transformers/models/idefics/configuration_idefics.py

---------

Co-authored-by: Stas Bekman <[email protected]>

* cleanup

* style

* cleanup

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* upcase

* sequence of images

* handle the case with no images

* Update src/transformers/image_processing_utils.py

Co-authored-by: Victor SANH <[email protected]>

* support pure lm take 2

* support tokenizer options

* parameterize num_channels

* fix upcase

* s|IdeficsForCausalLM|IdeficsForVisionText2Text|g

* manual to one line

* addressing review

* unbreak

* remove clip dependency

* fix test

* consistency

* PIL import

* Idefics prefix

* Idefics prefix

* hack to make tests work

* style

* fix

* fix

* revert

* try/finally

* cleanup

* clean up

* move

* [`IDEFICS`] Fix idefics config refactor (huggingface#25149)

* refactor config

* nuke init weights

* more refactor

* oops

* remove visual question answering pipeline support

* Update src/transformers/models/idefics/clip.py

Co-authored-by: Stas Bekman <[email protected]>

* Update src/transformers/models/idefics/modeling_idefics.py

* cleanup

* mv clip.py vision.py

* tidyup

---------

Co-authored-by: Stas Bekman <[email protected]>
Co-authored-by: Stas Bekman <[email protected]>

* fix

* license

* condition on pt

* fix

* style

* fix

* rm torchvision dependency, allow custom transforms

* address review

* rework device arg

* add_eos_token

* s/transforms/transform/

* fix top level imports

* fix return value

* cleanup

* cleanup

* fix

* style

* license

* license

* Update src/transformers/models/idefics/image_processing_idefics.py

Co-authored-by: Sylvain Gugger <[email protected]>

* add a wrapper to freeze vision layears

* tidyup

* use the correct std/mean settings

* parameterize values from config

* add tests/models/idefics/test_image_processing_idefics.py

* add test_processor_idefics.py

* cleanup

* cleanups

* fix

* fix

* move to the right group

* style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* add perceiver config

* reset

* missing arg docs

* Apply suggestions from code review

Co-authored-by: Leo Tronchon <[email protected]>

* address review comments

* inject automatic end of utterance tokens (huggingface#25218)

* inject automatic end of utterance tokens

* fix

* fix

* fix

* rework to not use the config

* not end_of_utterance_token at the end

* Update src/transformers/models/idefics/processing_idefics.py

Co-authored-by: Sylvain Gugger <[email protected]>

* address review

* Apply suggestions from code review

Co-authored-by: Joao Gante <[email protected]>

* Update src/transformers/image_processing_utils.py

Co-authored-by: Nicolas Patry <[email protected]>

* [`Idefics`] add image_embeddings option in generate-related methods (huggingface#25442)

* add image_embeddings option in generate-related methods

* style

* rename image_embeddings and allow perceiver embeddings precomputation

* compute embeddings within generate

* make is_encoder_decoder= True the default in config

* nested if else fix

* better triple check

* switch if elif order for pixel values / img embeds

* update model_kwargs perceiver only at the end

* use _prepare_model_inputs instead of encoder_decoder logic

* fix comment typo

* fix config default for is_encoder_decoder

* style

* add typehints

* precompute in forward

* doc builder

* style

* pop instead of get image hidden states

* Trigger CI

* Update src/transformers/models/idefics/modeling_idefics.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/models/idefics/modeling_idefics.py

Co-authored-by: Arthur <[email protected]>

* fix * + indentation + style

* simplify a bit the use_resampler logic using comments

* update diocstrings

* Trigger CI

---------

Co-authored-by: Arthur <[email protected]>

* fix rebase changes

* unbreak huggingface#25237 - to be fixed in follow up PRs

* is_composition = False

* no longer needed

---------

Co-authored-by: leot13 <[email protected]>
Co-authored-by: Younes Belkada <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: Victor SANH <[email protected]>
Co-authored-by: Joao Gante <[email protected]>
Co-authored-by: Nicolas Patry <[email protected]>
Co-authored-by: Arthur <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants