Retab is the complete developer platform and SDK for shipping state-of-the-art document processing in the age of LLMs.
We built Retab for one defined purpose: to help you SHIP FAST automations that deliver STRUCTURED & QUALITY data.
To get you there, we provide best-in-class preprocessing, help you generate prompts and extraction schemas that fit your preferred model providers, let you iterate on and evaluate the accuracy of your configuration, and make it easy to ship your automation directly from your code or through your preferred platforms such as n8n or Dify [WIP].
Because of a new, lighter paradigm
Large Language Models collapse entire layers of legacy OCR pipelines into a single, elegant abstraction. When a model can read, reason, and structure text natively, we no longer need brittle heuristics, handcrafted parsers, or heavyweight ETL jobs. Instead, we can expose a small, principled API: "give me the document, tell me the schema, and get back structured truth." Complexity evaporates, reliability rises, speed follows, and costs fall—because every component you remove is one that can no longer break.
LLM-first design lets us focus less on plumbing and more on the questions we actually want answered, and that is where Retab stands. We help you unlock these capabilities by offering all the software-defined primitives you need to build your own document processing solutions. We see it as Stripe for document processing.
Check our documentation.
Join our Discord and share your feedback.
To use the API, you need to sign up on Retab.
- Install the SDK
pip install retab
- Generate a Schema
from retab import Retab

client = Retab(api_key="YOUR_RETAB_API_KEY")

response = client.schemas.generate(
    documents=["Invoice.pdf"],
    model="gpt-4.1",      # or any model your plan supports
    temperature=0.0,      # keep the generation deterministic
    modality="native",    # "native" = let the API decide the best modality
)
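The extraction step below expects the schema as a JSON file, so persist the generated schema to disk first. This is a minimal sketch that assumes the generate() response exposes the schema as a plain dict under a json_schema attribute; check the response object in your SDK version for the exact field name.
import json

# Assumption: the response exposes the generated schema as a dict under `json_schema`.
with open("Invoice_schema.json", "w") as f:
    json.dump(response.json_schema, f, indent=2)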
- Extract Data
from retab import Retab

client = Retab()  # uses the API key from your environment when none is passed explicitly

response = client.documents.extract(
    json_schema="Invoice_schema.json",
    document="Invoice.pdf",
    model="gpt-4.1-nano",
    temperature=0,
)
print(response)
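Before wiring the output into downstream systems, you may want to validate it against the same schema. The snippet below is a sketch under two assumptions: that the response follows the familiar chat-completion shape, with the extracted JSON in response.choices[0].message.content, and that the jsonschema package is installed; the total_amount field is purely illustrative.
import json
from jsonschema import validate

# Load the schema that was used for the extraction.
with open("Invoice_schema.json") as f:
    invoice_schema = json.load(f)

# Assumption: the extracted payload is returned as a JSON string in the
# chat-completion-style message content; adjust the accessor to your SDK version.
extracted = json.loads(response.choices[0].message.content)

# Fail fast if the model's output drifts from the schema.
validate(instance=extracted, schema=invoice_schema)
print(extracted.get("total_amount"))  # hypothetical field, for illustration only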
On the Platform, Projects provide a systematic way to test and validate your extraction schemas against known ground truth data. Think of it as unit testing for document AI—you can measure accuracy, compare different models, and optimize your extraction pipelines with confidence.
The project workflow for schema optimization:
- Run initial project → identify low-accuracy fields
- Refine descriptions and add reasoning prompts → re-run project
- Compare accuracy improvements → iterate until satisfied
- Deploy optimized schema to production
from retab import Retab

client = Retab()

# Submit a single document to the deployed configuration
completion = client.deployments.extract(
    project_id="eval_***",
    iteration_id="base-configuration",  # or the configuration that gave you the best precision score
    document="path/to/document.pdf",
)
print(completion)
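Because the deployment call is a single function, batch processing is just a loop. The sketch below runs every PDF in an invoices/ folder through the same deployment and collects the completions; the folder path and result handling are illustrative, not part of the SDK.
from pathlib import Path

results = {}
for pdf in Path("invoices").glob("*.pdf"):  # illustrative input folder
    results[pdf.name] = client.deployments.extract(
        project_id="eval_***",
        iteration_id="base-configuration",
        document=str(pdf),
    )

print(f"Processed {len(results)} documents")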
Projects give you an automation engine that is easy to integrate into your codebase and workflows.
Check our documentation.
Let's create the future of document processing together.
Join our Discord to share your journey, discuss best practices, and give us your feedback. You can also follow us on X (Twitter).
We can't wait to see how you'll use Retab.
- API: Documentation
- SDKs: Python & JavaScript SDK
- Low-code Frameworks: Dify
We share our roadmap publicly. Please submit your feature requests on GitHub.
Among the features we're working on:
- Schema optimization autopilot
- Sources API
- Document Edit API
- n8n plugin