🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Parsit is a specialized vision-language model designed for document analysis tasks. Parsit combines Qwen3 language models with SigLIP-2 vision encoders to provide state-of-the-art performance on do…