pdf-text-extraction

Here are 11 public repositories matching this topic...

houking-can / PDFSDK

A Python interface based on the Foxit Quick PDF Library

pdf-merge pdf-split pdf-document-processor pdf-sdk pdf-text-extraction

Updated Apr 4, 2020
Python

vijayengineer / PDFTextSpeechConverter

Converts scanned and ordinary documents into MP3 speech using Amazon Polly

pdf text images speech aws-polly audiobook synthesis scanned-documents pdf-text-extraction

Updated Dec 30, 2020
Python

PrathameshDhande22 / PdfTxtBot

A Telegram bot that extracts text from PDFs and also extracts images of PDF pages. Made with Python.

python telegram telegram-bot python3 python-telegram-bot image-extractor python-telegram pdf-text pdf-text-extraction pdf-image

Updated Feb 27, 2023
Python

Zeeshanahmad4 / NLP-Pdf-Minning-Extracting-text-from-pdf

NLP PDF mining: extracting text from PDFs

python pdf pdf-converter text-extraction pdfkit pdf-files extract-text pdftotext pdf-format pdf-document-processor pdftoimage pdftools pdftohtml pdf-text-extraction pdfcon

Updated Apr 2, 2020
Python

eli64s / pdflex

CLI for merging PDF contexts.

pdf-converter pdf-document pdf-generator pdf-manipulation pdf-extractor pdf-library pdf-parser pdf-data-extraction pdf-processor pdf-tools pdf-document-processor python-pdf pdf-search pdf-text-extraction pdf-python pdf-automation python-pdf-tools pdf-document-parser pdf-regex

Updated Mar 20, 2025
Python

rithulkamesh / docproc

Opinionated and Sophisticated Document Region Analyzer.

python machine-learning ocr text-classification text-extraction data-extraction region-detection content-extraction document-analysis layout-analysis pdf-processing pdf-text-extraction document-parsing equation-detection mathematical-symbols

Updated Apr 13, 2025
Python

VirajMadhu / pdf_key_matcher

Highlights the key matches between a given PDF and the description text

python open-source pdf cv python-script python3 text-extraction terminal-based ats text-compression pdf-text-extraction virajmadhu

Updated Dec 4, 2024
Python
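
The idea of matching a PDF against a description text can be illustrated with a minimal sketch. It assumes the PDF text has already been extracted to a string, and it is not taken from the pdf_key_matcher repository: the function name `key_matches`, the `min_len` parameter, and the sample strings are all hypothetical.

```python
import re


def key_matches(pdf_text: str, description: str, min_len: int = 5) -> list[str]:
    """Return the lowercased words that occur in both texts.

    Hypothetical helper: words shorter than min_len are dropped as a
    crude stand-in for stop-word filtering.
    """
    def words(text: str) -> set[str]:
        tokens = re.findall(r"[a-z0-9+#]+", text.lower())
        return {t for t in tokens if len(t) >= min_len}

    return sorted(words(pdf_text) & words(description))


# Sample strings invented for illustration.
resume_text = "Built Flask services in Python and deployed them with Docker."
job_text = "Looking for a Python developer with Docker and Kubernetes experience."
print(key_matches(resume_text, job_text))  # ['docker', 'python']
```

A set intersection keeps the sketch short; a real tool would likely add stemming or phrase matching.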

rmottanet / unchainedtext

UnchainedText: Break free from PDFs! Easily extract raw text to .txt for preprocessing.

extractor text-extraction data-extraction text-processing pdf-text-extraction text-extraction-tool

Updated Apr 2, 2024
Python

Dinesh-Sharma2004 / Resume_Screener

An AI-powered tool that extracts text from PDF resumes, predicts the most suitable job role using Hugging Face BART MNLI, and rewrites the resume in a professional LaTeX format using Google FLAN-T5. Built with Flask for the backend and Streamlit for the frontend, it offers a fast, user-friendly way to analyze and improve resumes in real time.

api flask machine-learning natural-language-processing sentiment-analysis webapp resume-builder pdf-parser resume-analysis latex-resume huggingface pdf-processing streamlit pdf-text-extraction resume-improvement

Updated Aug 6, 2025
Python

simonpierreboucher / Crawler

A robust, modular web crawler built in Python for extracting and saving content from websites. This crawler is specifically designed to extract text content from both HTML and PDF files, saving them in a structured format with metadata.

rate-limiting http-requests error-handling html-parsing data-collection text-processing web-crawling content-extraction yaml-configuration data-scraping python-crawler modular-design metadata-storage url-normalization pdf-text-extraction structured-data-storage concurrent-crawling data-extraction-pipeline data-preservation-and-recovery

Updated Nov 18, 2024
Python
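
Two of the steps named in this entry's description and tags, URL normalization and handling HTML and PDF content differently, might look roughly like the sketch below. It uses only the standard library and is not the Crawler repository's code: `normalize_url`, `pick_extractor`, and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop the fragment, default the path to '/'."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def pick_extractor(content_type: str, url: str) -> str:
    """Return 'pdf', 'html', or 'skip' for a fetched response (hypothetical labels)."""
    # Strip parameters such as '; charset=utf-8' from the header value.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime == "application/pdf" or urlsplit(url).path.lower().endswith(".pdf"):
        return "pdf"
    if mime in ("text/html", "application/xhtml+xml"):
        return "html"
    return "skip"


print(normalize_url("HTTP://Example.com/Docs/report.pdf#page=2"))
print(pick_extractor("text/html; charset=utf-8", "https://example.com/"))
```

Normalizing before deduplication keeps the same page from being fetched twice under trivially different URLs; falling back to the file extension covers servers that send a generic content type for PDFs.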

Spikes2012 / DjangoBusPriority

This is for the Technology Application Project at Swinburne University of Technology

django file-upload text-extraction image-to-text webapplication pdf-text-extraction

Updated Jun 6, 2023
Python
