
Conversation

@ltBai ltBai commented Mar 5, 2025

What does this PR do?

We are trying to add a new model named long-vita to the transformers repository. It is a multimodal LLM (MLLM) that can handle long contexts of up to 1 million tokens. Looking forward to your feedback!

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

github-actions bot commented Mar 5, 2025

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the Ready for review button (at the bottom of the PR page).

@github-actions github-actions bot marked this pull request as draft March 5, 2025 07:33

Rocketknight1 commented Mar 6, 2025

Hi @ltBai,

In general, we recommend that most models are uploaded as custom code using the steps here, without needing a PR to the transformers library. This will let you share the model immediately, and it'll work exactly the same as a library model (except that users will need to set `trust_remote_code=True`).
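A minimal sketch of that custom-code workflow (the class names, the `long-vita` `model_type`, and the `your-org/long-vita` repo id below are illustrative placeholders, not part of this PR):

```python
import torch
from transformers import PretrainedConfig, PreTrainedModel


class LongVitaConfig(PretrainedConfig):
    model_type = "long-vita"  # hypothetical identifier for the custom model

    def __init__(self, hidden_size=1024, **kwargs):
        self.hidden_size = hidden_size
        super().__init__(**kwargs)


class LongVitaModel(PreTrainedModel):
    config_class = LongVitaConfig

    def __init__(self, config):
        super().__init__(config)
        # stand-in layer for the real long-context multimodal architecture
        self.proj = torch.nn.Linear(config.hidden_size, config.hidden_size)

    def forward(self, hidden_states):
        return self.proj(hidden_states)


# Register the classes so the modeling code is uploaded next to the weights
LongVitaConfig.register_for_auto_class()
LongVitaModel.register_for_auto_class("AutoModel")

# LongVitaModel(LongVitaConfig()).push_to_hub("your-org/long-vita")
```

Users then load it exactly like a library model, opting in to the remote code:

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("your-org/long-vita", trust_remote_code=True)
```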

In general we only accept PRs to add new architectures to the core transformers library when one of these is true:

  • There's a pretrained model with a lot of interested users
  • There's a paper on the architecture with SOTA results or lots of interest/citations
  • The model comes from a company or research group whose past models have gotten a lot of usage (because this means the new model will probably get a lot of users too)

The reason for this is that once a model is actually in transformers itself, then the team at Hugging Face takes full responsibility for maintaining the code, testing it and making sure it stays compatible with new versions of transformers. We can't do this for every model architecture!

Remember that a model being a custom code model doesn't make it less important. A lot of extremely popular and high-performance models are custom code models and don't have Transformers PRs. For example, Phi-4-multimodal is the top trending model on Hugging Face today, and it is also a custom code model without a library PR! Starting with a custom code model is definitely the right approach for most authors.
