mitulgarg/Extract-Structured-Data-from-Image-Using-OCR-and-GeminiAI-API

Data inside Image -> Structured Data

OCRLLM.py first runs OCR on birth certificate images using the 'pytesseract' library and extracts the text as a string.
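The OCR step can be sketched as follows. This is a minimal illustration, not the code in OCRLLM.py: the function name `extract_text` and the optional `ocr` parameter are assumptions introduced here so the step can be swapped out or tested without Tesseract installed. The default path uses `pytesseract.image_to_string` on an image opened with Pillow.

```python
from pathlib import Path


def extract_text(image_path, ocr=None):
    """Return the raw OCR text for one image file.

    `ocr` is a hypothetical hook: any callable that takes a path and
    returns a string. When omitted, pytesseract is used.
    """
    if ocr is None:
        # Imported lazily so this module loads even without Tesseract.
        from PIL import Image
        import pytesseract

        def ocr(path):
            # image_to_string returns the recognised text as one str.
            with Image.open(path) as img:
                return pytesseract.image_to_string(img)

    return ocr(Path(image_path))
```

With Tesseract and pytesseract installed, `extract_text("certificate.png")` would return the recognised text; the file name is only a placeholder.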

The extracted text is then cleaned by removing redundant symbols with the 're' library.
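A cleaning pass of this kind might look like the sketch below. The exact patterns used in OCRLLM.py are not given here, so the character whitelist and the function name `clean_ocr_text` are assumptions chosen for illustration.

```python
import re


def clean_ocr_text(raw):
    """Drop stray OCR symbols and normalise whitespace."""
    # Assumed whitelist: letters, digits, whitespace and a few
    # punctuation marks that commonly appear in names and dates.
    text = re.sub(r"[^A-Za-z0-9\s.,:/()-]", " ", raw)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim each line and discard lines that end up empty.
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

A whitelist is used rather than a blacklist because OCR noise is unpredictable; the trade-off is that any legitimate character outside the list (accented letters, for example) would also be removed.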

The cleaned text is sent to Gemini AI together with a one-shot example, and the model converts the unstructured text from the image into structured data.
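One way to build such a one-shot prompt and read the reply is sketched below. Everything specific here is an assumption made for illustration: the field names, the invented example record, the prompt wording, the helper names, and the default model name are not taken from OCRLLM.py. The Gemini call uses the `google.generativeai` package and is imported lazily so the prompt helpers work without it.

```python
import json

# Invented placeholder record used as the single worked example.
EXAMPLE_TEXT = "Name: John Doe\nDate of Birth: 01/02/2000"
EXAMPLE_JSON = {"name": "John Doe", "date_of_birth": "01/02/2000"}


def build_one_shot_prompt(cleaned_text):
    """Return a prompt with one worked example followed by the new text."""
    return (
        "Extract the fields from the birth certificate text as a JSON object.\n\n"
        "Example text:\n" + EXAMPLE_TEXT + "\n"
        "Example JSON:\n" + json.dumps(EXAMPLE_JSON) + "\n\n"
        "Text:\n" + cleaned_text + "\n"
        "JSON:"
    )


def parse_model_reply(reply):
    """Parse the JSON object found between the first '{' and last '}'."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model reply")
    return json.loads(reply[start:end + 1])


def ask_gemini(prompt, api_key, model_name="gemini-1.5-flash"):
    """Send the prompt to Gemini and return the reply text.

    The default model name is an assumption; use whichever model
    your account has access to.
    """
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    return model.generate_content(prompt).text
```

The reply is parsed defensively because models often wrap JSON in extra text; `parse_model_reply(ask_gemini(build_one_shot_prompt(text), api_key))` would then yield a Python dict.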

The result is that a birth certificate image is translated into structured data.
