Add ColPali to 🤗 transformers #33736
Merged
137 commits
- 0f5c6a7 feat: run `add-new-model-like` (tonywu71)
- 726f156 feat: add paligemma code with "copied from" (tonywu71)
- 9a88bf1 feat: add ColPaliProcessor (tonywu71)
- fab4e46 feat: add ColPaliModel (tonywu71)
- 66656f6 feat: add ColPaliConfig (tonywu71)
- a377d60 feat: rename `ColPaliForConditionalGeneration` to `ColPaliModel` (tonywu71)
- 0addcab fixup modeling colpali (tonywu71)
- e8979b9 fix: fix root import shortcuts (tonywu71)
- 49fb8ba fix: fix `modeling_auto` dict (tonywu71)
- 88b0212 feat: comment out ColPali test file (tonywu71)
- cbd781b fix: fix typos from `add-new-model-like` (tonywu71)
- 44fcd04 feat: explicit the forward input args (tonywu71)
- a6ca45a feat: move everything to `modular_colpali.py` (tonywu71)
- af9ca36 fix: put back ColPaliProcesor (tonywu71)
- 087870b feat: add auto-generated files (tonywu71)
- cc11ef8 fix: run `fix-copies` (tonywu71)
- f69ee9b fix: remove DOCStRING constants to make modular converter work (tonywu71)
- fbe5665 fix: fix typo + modular converter (tonywu71)
- e58794c fix: add missing imports (tonywu71)
- 2dd5218 feat: no more errors when loading ColPaliModel (tonywu71)
- e05ea43 fix: remove unused args in forward + tweak doc (tonywu71)
- bda6916 feat: rename `ColPaliModel` to `ColPaliForRetrieval` (tonywu71)
- bfff564 fix: apply `fix-copies` (tonywu71)
- da4c566 feat: add ColPaliProcessor to `modular_colpali` (tonywu71)
- ae37f18 fix: run make quality + make style (tonywu71)
- 38f0d8c fix: remove duplicate line in configuration_auto (tonywu71)
- c63a302 feat: make ColPaliModel inehrit from PaliGemmaForConditionalGeneration (tonywu71)
- d66606e fix: tweak and use ColPaliConfig (tonywu71)
- 7f750d3 feat: rename `score` to `post_process_retrieval` (tonywu71)
- 41dbbb8 build: run modular formatter + make style (tonywu71)
- 28592c9 feat: convert colpali weights + fixes (tonywu71)
- 84763a3 feat: remove old weight converter file (tonywu71)
- 672bdb2 feat: add and validate tests (tonywu71)
- f7ce9b1 feat: replace harcoded path to "vidore/colpali-v1.2-hf" in tests (tonywu71)
- 3789a6e fix: add bfloat16 conversion in weight converter (tonywu71)
- 5e09645 feat: replace pytest with unittest in modeling colpali test (tonywu71)
- 8ea8273 feat: add sanity check for weight conversion (doesn't work yet) (tonywu71)
- d100779 feat: add shape sanity check in weigth converter (tonywu71)
- e6bdf40 feat: make ColPaliProcessor args explicit (tonywu71)
- abe3232 doc: add doc for ColPali (tonywu71)
- 6ae178c fix: trying to fix output mismatch (tonywu71)
- 6d35b27 feat: tweaks (tonywu71)
- 0653340 fix: ColPaliModelOutput inherits from ModelOutput instead of PaliGemm… (tonywu71)
- 97a6468 fix: address comments on PR (tonywu71)
- 8212717 fix: adapt tests to the Hf norm (tonywu71)
- a7b297a wip: try things (tonywu71)
- 592e716 feat: add `__call__` method to `ColPaliProcessor` (tonywu71)
- f50a979 feat: remove need for dummy image in `process_queries` (tonywu71)
- 25eb21b build: run new modular converter (tonywu71)
- 3ed7627 fix: fix incorrect method override (tonywu71)
- 9038ead Fix tests, processing, modular, convert (yonigozlan)
- cb7e301 fix tokenization auto (yonigozlan)
- 3f118ca hotfix: manually fix processor -> fixme once convert modular is fixed (tonywu71)
- 3aa11a6 fix: convert weights working (tonywu71)
- 8ff8962 feat: rename and improve convert weight script (tonywu71)
- 7a54fec feat: tweaks (tonywu71)
- 2c94eaa fest: remove `device` input for `post_process_retrieval` (tonywu71)
- 2d7e96f refactor: remove unused `get_torch_device` (tonywu71)
- 1189340 Fix all tests (yonigozlan)
- 246b67e docs: update ColPali model doc (tonywu71)
- 4a5bc0c wip: fix convert weights to hf (tonywu71)
- afbbc98 fix logging modular (yonigozlan)
- 9db013d docs: add acknowledgements in model doc (tonywu71)
- c4e156c docs: add missing docstring to ColPaliProcessor (tonywu71)
- 0b4e089 docs: tweak (tonywu71)
- d6a0bde docs: add doc for `ColPaliForRetrievalOutput.forward` (tonywu71)
- 1f115f9 feat: add modifications from colpali-engine v0.3.2 in ColPaliProcessor (tonywu71)
- 20d1927 fix: fix and upload colapli hf weights (tonywu71)
- 5ef48fb refactor: rename `post_process_retrieval` to `score_retrieval` (tonywu71)
- 5ae2bac fix: fix wrong typing for `score_retrieval` (tonywu71)
- ffe894a test: add integration test for ColPali (tonywu71)
- b0e33be chore: rerun convert modular (tonywu71)
- f052927 build: fix root imports (tonywu71)
- ad09d67 Update docs/source/en/index.md (tonywu71)
- 0dd1524 fix: address PR comments (tonywu71)
- b647788 wip: reduce the prediction gap in weight conversion (tonywu71)
- 153f339 docs: add comment in weight conversion script (tonywu71)
- 97b3a24 docs: add example for `ColPaliForRetrieval.forward` (tonywu71)
- a711fa7 tests: change dataset path to the new one in hf-internal (tonywu71)
- e9035d9 fix: colpali weight conversion works (tonywu71)
- 9f7299b test: add fine-grained check for ColPali integration test (tonywu71)
- 43274d2 fix: fix typos in convert weight script (tonywu71)
- f6e3155 docs: move input docstring in a variable (tonywu71)
- da03264 fix: remove hardcoded torch device in test (tonywu71)
- 930f91a fix: run the new modular refactor (tonywu71)
- db37344 docs: fix python example for ColPali (tonywu71)
- e72c379 feat: add option to choose `score_retrieval`'s output dtype and device (tonywu71)
- 5b11870 docs: update doc for `score_retrieval` (tonywu71)
- c53ffcb feat: add `patch_size` property in ColPali model (tonywu71)
- 5346292 chore: run `make fix-copies` (tonywu71)
- b102738 docs: update description for ColPali cookbooks (tonywu71)
- 1d24773 fix: remove `ignore_index` methods (tonywu71)
- 73d607a feat: remove non-transformers specific methods (tonywu71)
- 1db4c6c feat: update `__init__.py` to new hf format (tonywu71)
- da05b70 fix: fix root imports in transformers (tonywu71)
- 1b4f8f3 feat: remove ColPali's inheritance from PaliGemma (tonywu71)
- f100888 Fix CI issues (yonigozlan)
- 38210dc nit remove prints (yonigozlan)
- aee8d7c feat: remove ColPali config and model from `modular_colpali.py` (tonywu71)
- f53ae20 feat: add `ColPaliPreTrainedModel` and update modeling and configurat… (tonywu71)
- b93c76b fix: fix auto-removed imports in root `__init__.py` (tonywu71)
- 87a16fd fix: various fixes (tonywu71)
- fba3b77 fix: fix `_init_weight` (tonywu71)
- 1e6c4ab temp: comment `AutoModel.from_config` for experiments (tonywu71)
- 6d20088 fix: add missing `output_attentions` arg in ColPali's forward (tonywu71)
- be6a0bd fix: fix `resize_token_embeddings` (tonywu71)
- ecc7982 fix: make `input_ids` optional in forward (tonywu71)
- b1a25ce feat: rename `projection_layer` to `embedding_proj_layer` (tonywu71)
- 84fefad wip: fix convert colpali weight script (tonywu71)
- 836dc97 fix tests and convert weights from original repo (yonigozlan)
- 1eaa3d3 fix unprotected import (yonigozlan)
- f187bc0 fix unprotected torch import (yonigozlan)
- 3646790 fix style (yonigozlan)
- c8efb8a change vlm_backbone_config to vlm_config (yonigozlan)
- a30a74d fix unprotected import in modular this time (yonigozlan)
- c42c61b fix: load config from Hub + tweaks in convert weight script (tonywu71)
- e981b71 docs: move example usage from model docstring to model markdown (tonywu71)
- 2ce28f5 docs: fix input docstring for ColPali's forward method (tonywu71)
- a582f48 fix: use `sub_configs` for ColPaliConfig (tonywu71)
- 9f34d80 fix: remove non-needed sanity checks in weight conversion script + tw… (tonywu71)
- 05c29da fix: fix issue with `replace_return_docstrings` in ColPali's `forward` (tonywu71)
- f67e217 docs: update docstring for `ColPaliConfig` (tonywu71)
- 2ed868c test: change model path in ColPali test (tonywu71)
- 2aa5e9d fix: fix ColPaliConfig (tonywu71)
- e6944ad fix: fix weight conversion script (tonywu71)
- 337a0a0 test: fix expected weights for ColPali model (tonywu71)
- c10e760 docs: update ColPali markdown (tonywu71)
- 69d01fc docs: fix minor typo in ColPaliProcessor (tonywu71)
- 8061469 Fix tests and add _no_split_modules (yonigozlan)
- 7dce43f add text_config to colpali config (yonigozlan)
- 855f139 [run slow] colpali (yonigozlan)
- 603e9e4 Merge branch 'main' into add-colpali (yonigozlan)
- c41bad4 move inputs to torch_device in integration test (yonigozlan)
- 21c1309 skip test_model_parallelism (yonigozlan)
- 505ad9e docs: clarify quickstart snippet in ColPali's model card (tonywu71)
- 655bac7 docs: update ColPali's model card (tonywu71)
- e9af3a5 Merge remote-tracking branch 'upstream/main' into add-colpali (yonigozlan)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,95 @@ | ||
| <!--Copyright 2024 The HuggingFace Team. All rights reserved. | ||
|
|
||
| Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with | ||
| the License. You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on | ||
| an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the | ||
| specific language governing permissions and limitations under the License. | ||
|
|
||
| ⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be | ||
| rendered properly in your Markdown viewer. | ||
|
|
||
| --> | ||
|
|
||
| # ColPali | ||
|
|
||
| ## Overview | ||
|
|
||
| The ColPali model was proposed in [ColPali: Efficient Document Retrieval with Vision Language Models](https://doi.org/10.48550/arXiv.2407.01449) by **Manuel Faysse***, **Hugues Sibille***, **Tony Wu***, Bilel Omrani, Gautier Viaud, Céline Hudelot, Pierre Colombo (* denotes equal contribution). | ||
|
|
||
| With our new model *ColPali*, we propose to leverage VLMs to construct efficient multi-vector embeddings in the visual space for document retrieval. By feeding the ViT output patches from PaliGemma-3B to a linear projection, we create a multi-vector representation of documents. We train the model to maximize the similarity between these document embeddings and the query embeddings, following the ColBERT method. | ||
|
|
||
| ColPali replaces potentially complex and brittle layout recognition and OCR pipelines with a single model that can take into account both the textual and visual content (layout, charts, ...) of a document. ColPali is also highly interpretable: similarity maps can be obtained between patches and query tokens. These maps highlight ColPali’s strong OCR capabilities and chart understanding. | ||
|
|
||
| **Paper abstract:** | ||
|
|
||
| > Documents are visually rich structures that convey information through text, but also figures, page layouts, tables, or even fonts. Since modern retrieval systems mainly rely on the textual information they extract from document pages to index documents -often through lengthy and brittle processes-, they struggle to exploit key visual cues efficiently. This limits their capabilities in many practical document retrieval applications such as Retrieval Augmented Generation (RAG). To benchmark current systems on visually rich document retrieval, we introduce the Visual Document Retrieval Benchmark *ViDoRe*, composed of various page-level retrieval tasks spanning multiple domains, languages, and practical settings. The inherent complexity and performance shortcomings of modern systems motivate a new concept; doing document retrieval by directly embedding the images of the document pages. We release *ColPali*, a Vision Language Model trained to produce high-quality multi-vector embeddings from images of document pages. Combined with a late interaction matching mechanism, *ColPali* largely outperforms modern document retrieval pipelines while being drastically simpler, faster and end-to-end trainable. | ||
| > | ||
| > We release models, data, code and benchmarks under open licenses at [https://huggingface.co/vidore](https://huggingface.co/vidore). | ||
|
|
||
| ## Resources | ||
|
|
||
| - The official blog post detailing ColPali can be found [here](https://huggingface.co/blog/manu/colpali). | ||
| - The original model implementation code for the ColPali model and for the `colpali-engine` package can be found [here](https://github.com/illuin-tech/colpali). | ||
| - Cookbooks for learning to use the transformers-native version of ColPali, fine-tuning, and similarity maps generation can be found [here](https://github.com/tonywu71/colpali-cookbooks). | ||
|
|
||
| This model was contributed by [@tonywu71](https://huggingface.co/tonywu71) and [@yonigozlan](https://huggingface.co/yonigozlan). | ||
|
|
||
| ## Usage | ||
|
|
||
| This example demonstrates how to use ColPali to embed both queries and images, calculate their similarity scores, and identify the most relevant matches. For a specific query, you can retrieve the top-k most similar images by selecting the ones with the highest similarity scores. | ||
|
|
||
| ```python | ||
| import torch | ||
| from PIL import Image | ||
|
|
||
| from transformers import ColPaliForRetrieval, ColPaliProcessor | ||
|
|
||
| model_name = "vidore/colpali-v1.2-hf" | ||
|
|
||
| model = ColPaliForRetrieval.from_pretrained( | ||
|     model_name, | ||
|     torch_dtype=torch.bfloat16, | ||
|     device_map="cuda:0",  # or "mps" if on Apple Silicon | ||
| ).eval() | ||
|
|
||
| processor = ColPaliProcessor.from_pretrained(model_name) | ||
|
|
||
| # Your inputs (replace dummy images with screenshots of your documents) | ||
| images = [ | ||
|     Image.new("RGB", (32, 32), color="white"), | ||
|     Image.new("RGB", (16, 16), color="black"), | ||
| ] | ||
| queries = [ | ||
|     "What is the organizational structure for our R&D department?", | ||
|     "Can you provide a breakdown of last year’s financial performance?", | ||
| ] | ||
|
|
||
| # Process the inputs | ||
| batch_images = processor(images=images).to(model.device) | ||
| batch_queries = processor(text=queries).to(model.device) | ||
|
|
||
| # Forward pass | ||
| with torch.no_grad(): | ||
|     image_embeddings = model(**batch_images).embeddings | ||
|     query_embeddings = model(**batch_queries).embeddings | ||
|
|
||
| # Score the queries against the images | ||
| scores = processor.score_retrieval(query_embeddings, image_embeddings) | ||
| ``` | ||
|
|
||
| ## ColPaliConfig | ||
|
|
||
| [[autodoc]] ColPaliConfig | ||
|
|
||
| ## ColPaliProcessor | ||
|
|
||
| [[autodoc]] ColPaliProcessor | ||
|
|
||
| ## ColPaliForRetrieval | ||
|
|
||
| [[autodoc]] ColPaliForRetrieval | ||
| - forward | ||
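The usage snippet in this doc file ends with `processor.score_retrieval(...)`, and the overview refers to ColBERT-style late interaction. As a rough, dependency-free sketch of that idea (not the processor's actual implementation), the score of a page for a query can be computed by taking, for each query token vector, its best dot product against the page's patch vectors, and summing those maxima. The helper names and the toy 2-dimensional vectors below are invented for illustration; real ColPali embeddings are 128-dimensional by default.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def late_interaction_score(query: Sequence[Vector], page: Sequence[Vector]) -> float:
    """MaxSim: best-matching page vector per query vector, summed over the query."""
    return sum(max(dot(q, p) for p in page) for q in query)


def top_k_pages(query: Sequence[Vector], pages: Sequence[Sequence[Vector]], k: int) -> List[int]:
    """Return the indices of the k highest-scoring pages, best first."""
    scores = [late_interaction_score(query, page) for page in pages]
    ranked = sorted(range(len(pages)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy data (assumption): one query with two token vectors, three pages with two patch vectors each.
toy_query = [[1.0, 0.0], [0.0, 1.0]]
toy_pages = [
    [[1.0, 0.0], [0.0, 0.0]],  # matches only the first query vector
    [[0.5, 0.5], [0.0, 1.0]],  # partial match on the first, full match on the second
    [[1.0, 0.0], [0.0, 1.0]],  # matches both query vectors exactly
]

toy_scores = [late_interaction_score(toy_query, page) for page in toy_pages]
best_two = top_k_pages(toy_query, toy_pages, k=2)
print(toy_scores, best_two)
```

Selecting the top-k pages for a query, as described in the Usage paragraph, then amounts to sorting pages by this score.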
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -53,6 +53,7 @@ | ||
| codegen, | ||
| cohere, | ||
| cohere2, | ||
| colpali, | ||
| conditional_detr, | ||
| convbert, | ||
| convnext, | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| # Copyright 2024 The HuggingFace Team. All rights reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| from typing import TYPE_CHECKING | ||
|
|
||
| from ...utils import _LazyModule | ||
| from ...utils.import_utils import define_import_structure | ||
|
|
||
|
|
||
| if TYPE_CHECKING: | ||
|     from .configuration_colpali import * | ||
|     from .modeling_colpali import * | ||
|     from .processing_colpali import * | ||
| else: | ||
|     import sys | ||
|
|
||
|     _file = globals()["__file__"] | ||
|     sys.modules[__name__] = _LazyModule(__name__, _file, define_import_structure(_file), module_spec=__spec__) | ||
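The `__init__.py` above swaps the package's entry in `sys.modules` for a lazy module, so submodules are only imported when one of their names is first accessed, while the `TYPE_CHECKING` branch keeps the eager imports visible to static tooling. The following is a minimal standard-library sketch of that general pattern. It does not reproduce how `_LazyModule` or `define_import_structure` work internally; `LazyNamespace` and its name-to-module mapping are hypothetical.

```python
import importlib
import types


class LazyNamespace(types.ModuleType):
    """Hypothetical lazy module: maps attribute names to the module that defines them."""

    def __init__(self, name, attr_to_module):
        super().__init__(name)
        self._attr_to_module = dict(attr_to_module)
        self.loaded = []  # backing modules imported so far, in order

    def __getattr__(self, attr):
        # Only reached when normal attribute lookup fails, i.e. on first access.
        mapping = self.__dict__.get("_attr_to_module", {})
        if attr not in mapping:
            raise AttributeError(f"module {self.__name__!r} has no attribute {attr!r}")
        module_name = mapping[attr]
        module = importlib.import_module(module_name)  # the deferred import happens here
        self.loaded.append(module_name)
        value = getattr(module, attr)
        setattr(self, attr, value)  # cache, so later lookups bypass __getattr__
        return value


# Stand-in mapping using standard-library modules instead of real submodules.
lazy = LazyNamespace("lazy_demo", {"sqrt": "math", "dumps": "json"})
before = list(lazy.loaded)   # nothing has been resolved yet
root = lazy.sqrt(9.0)        # first access resolves `sqrt` through `math`
root_again = lazy.sqrt(16.0) # served from the cached attribute
after = list(lazy.loaded)
print(before, root, root_again, after)
```

In a real package the instance would be assigned to `sys.modules[__name__]`, as the last line of the diff does.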
106 changes: 106 additions & 0 deletions
src/transformers/models/colpali/configuration_colpali.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,106 @@ | ||
| # coding=utf-8 | ||
| # Copyright 2024 The HuggingFace Inc. team. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| """ColPali model configuration""" | ||
|
|
||
| import logging | ||
| from copy import deepcopy | ||
|
|
||
| from ...configuration_utils import PretrainedConfig | ||
| from ..auto import CONFIG_MAPPING, AutoConfig | ||
|
|
||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
| class ColPaliConfig(PretrainedConfig): | ||
|     r""" | ||
|     Configuration class to store the configuration of a [`ColPaliForRetrieval`]. It is used to instantiate an instance | ||
|     of `ColPaliForRetrieval` according to the specified arguments, defining the model architecture following the methodology | ||
|     from the "ColPali: Efficient Document Retrieval with Vision Language Models" paper. | ||
|
|
||
|     Creating a configuration with the default settings will result in a configuration where the VLM backbone is set to the | ||
|     default PaliGemma configuration, i.e. the one from [vidore/colpali-v1.2](https://huggingface.co/vidore/colpali-v1.2). | ||
|
|
||
|     The ColPali config is very similar to [`PaliGemmaConfig`], but with an extra attribute defining the embedding dimension. | ||
|
|
||
|     Note that contrary to what the class name suggests (the name actually refers to the ColPali **methodology**), you can | ||
|     use a different VLM backbone model than PaliGemma by passing the corresponding VLM configuration to the class constructor. | ||
|
|
||
|     Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the | ||
|     documentation from [`PretrainedConfig`] for more information. | ||
|
|
||
|     Args: | ||
|         vlm_config (`PretrainedConfig`, *optional*): | ||
|             Configuration of the VLM backbone model. | ||
|         text_config (`PretrainedConfig`, *optional*): | ||
|             Configuration of the text backbone model. Overrides the `text_config` attribute of the `vlm_config` if provided. | ||
|         embedding_dim (`int`, *optional*, defaults to 128): | ||
|             Dimension of the multi-vector embeddings produced by the model. | ||
|
|
||
|     Example: | ||
|
|
||
|     ```python | ||
|     from transformers.models.colpali import ColPaliConfig, ColPaliForRetrieval | ||
|
|
||
|     config = ColPaliConfig() | ||
|     model = ColPaliForRetrieval(config) | ||
|     ``` | ||
|     """ | ||
|
|
||
|     model_type = "colpali" | ||
|     sub_configs = {"vlm_config": PretrainedConfig, "text_config": AutoConfig} | ||
|
|
||
|     def __init__( | ||
|         self, | ||
|         vlm_config=None, | ||
|         text_config=None, | ||
|         embedding_dim: int = 128, | ||
|         **kwargs, | ||
|     ): | ||
|         if vlm_config is None: | ||
|             vlm_config = CONFIG_MAPPING["paligemma"]() | ||
|             logger.info( | ||
|                 "`vlm_config` is `None`. Initializing `vlm_config` with the `PaliGemmaConfig` with default values." | ||
|             ) | ||
|         elif isinstance(vlm_config, dict): | ||
|             vlm_config = deepcopy(vlm_config) | ||
|             if "model_type" not in vlm_config: | ||
|                 raise KeyError( | ||
|                     "The `model_type` key is missing in the `vlm_config` dictionary. Please provide the model type." | ||
|                 ) | ||
|             elif vlm_config["model_type"] not in CONFIG_MAPPING: | ||
|                 raise ValueError( | ||
|                     f"The model type `{vlm_config['model_type']}` is not supported. Please provide a valid model type." | ||
|                 ) | ||
|             vlm_config = CONFIG_MAPPING[vlm_config["model_type"]](**vlm_config) | ||
|         elif isinstance(vlm_config, PretrainedConfig): | ||
|             vlm_config = vlm_config | ||
|         else: | ||
|             raise TypeError( | ||
|                 f"Invalid type for `vlm_config`. Expected `PretrainedConfig`, `dict`, or `None`, but got {type(vlm_config)}." | ||
|             ) | ||
|
|
||
|         self.vlm_config = vlm_config | ||
|         self.text_config = text_config = text_config if text_config is not None else vlm_config.text_config | ||
|         if isinstance(self.text_config, dict): | ||
|             text_config["model_type"] = text_config["model_type"] if "model_type" in text_config else "gemma" | ||
|             self.text_config = CONFIG_MAPPING[text_config["model_type"]](**text_config) | ||
|
|
||
|         self.embedding_dim = embedding_dim | ||
|
|
||
|         super().__init__(**kwargs) | ||
|
|
||
|
|
||
| __all__ = ["ColPaliConfig"] | ||
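The constructor in `configuration_colpali.py` accepts `vlm_config` as `None`, a `dict`, or a config object, and rejects anything else. The sketch below isolates that dispatch in plain Python so the branches are easy to exercise without installing transformers. `ToyConfig`, `DefaultBackboneConfig`, `REGISTRY`, and `resolve_backbone_config` are hypothetical stand-ins for `PretrainedConfig`, the default PaliGemma config, `CONFIG_MAPPING`, and the inline logic; they are not part of the library.

```python
from copy import deepcopy


class ToyConfig:
    """Hypothetical stand-in for a config base class."""

    model_type = "toy"

    def __init__(self, hidden_size: int = 8, **kwargs):
        kwargs.pop("model_type", None)  # the class attribute already carries it
        self.hidden_size = hidden_size
        self.extra = kwargs


class DefaultBackboneConfig(ToyConfig):
    """Hypothetical default backbone, playing the role of the default VLM config."""

    model_type = "default_backbone"


# Hypothetical registry keyed by model type.
REGISTRY = {cls.model_type: cls for cls in (ToyConfig, DefaultBackboneConfig)}


def resolve_backbone_config(config=None):
    """Turn None, a dict, or a config object into a config object."""
    if config is None:
        return REGISTRY["default_backbone"]()
    if isinstance(config, dict):
        config = deepcopy(config)  # never mutate the caller's dict
        if "model_type" not in config:
            raise KeyError("The `model_type` key is missing from the config dictionary.")
        if config["model_type"] not in REGISTRY:
            raise ValueError(f"Unsupported model type: {config['model_type']!r}")
        return REGISTRY[config["model_type"]](**config)
    if isinstance(config, ToyConfig):
        return config
    raise TypeError(f"Expected a config object, dict, or None, got {type(config)}.")


default_cfg = resolve_backbone_config()
from_dict = resolve_backbone_config({"model_type": "toy", "hidden_size": 16})
passthrough = resolve_backbone_config(from_dict)
print(default_cfg.model_type, from_dict.hidden_size, passthrough is from_dict)
```

Raising distinct exception types for a missing key, an unknown model type, and a wrong argument type mirrors the diff and makes each failure mode easy to tell apart in calling code.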