This repository contains a Retrieval-Augmented Generation (RAG) chatbot designed to assist users by answering questions based on the context of uploaded documents. It combines state-of-the-art technologies for document processing, embeddings generation, vector search, and natural language generation.
- OCR and PDF Parsing: Extracts text from text-based or scanned PDFs using `PyPDFLoader` and `pytesseract` (OCR).
- Document Embeddings: Uses `SentenceTransformer` to generate embeddings for documents and queries.
- Vector Search: Implements FAISS for storing and retrieving document embeddings via similarity search.
- Query Refinement: An LLM (`llama3.2`) refines user queries for better retrieval performance.
- Customizable Prompt Templates: Prompts can be tailored for context-specific assistance.
- Streamlit Interface: Lets the admin upload PDFs, process them, and update the vector store.
- FastAPI Backend: Provides REST APIs for asking questions and refining queries.
- Sources for Answers: Responses include document names and page numbers for better traceability.
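To illustrate the answer-with-sources behavior, here is a minimal sketch of attaching traceability metadata to a response. The field names (`answer`, `sources`, `document`, `page`) are illustrative assumptions, not the repository's actual response schema:

```python
# Hypothetical sketch of how an answer can carry its sources; the field
# names ("answer", "sources", "document", "page") are illustrative, not
# the repository's actual schema.
def build_response(answer: str, hits: list[dict]) -> dict:
    """Attach the document name and page number of each retrieved chunk."""
    sources = [{"document": h["document"], "page": h["page"]} for h in hits]
    return {"answer": answer, "sources": sources}

response = build_response(
    "The warranty period is two years.",
    [{"document": "manual.pdf", "page": 12, "text": "Warranty: 2 years..."}],
)
# response["sources"] → [{"document": "manual.pdf", "page": 12}]
```

Returning the sources alongside the generated text lets the frontend render citations without a second round trip to the backend.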
- LangChain: For document loaders, prompts, and retrieval-based question answering.
- SentenceTransformers: For generating embeddings.
- FAISS: For efficient similarity search.
- pytesseract: For OCR text extraction.
- FastAPI: For creating the API server.
- Streamlit: For creating the Admin interface.
- Next.js: For building the frontend of the web application, providing a dynamic and interactive user interface.
- `dunzhang/stella_en_1.5B_v5`: For generating embeddings.
- `llama-3.2`: For query refinement and response generation.
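Conceptually, the vector-search step above reduces to comparing embedding vectors. Below is a stdlib-only sketch using cosine similarity with toy 3-dimensional vectors; in the real pipeline the vectors come from `stella_en_1.5B_v5` and FAISS performs the same comparison over an optimized index:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings"; the real model emits much larger vectors,
# and the chunk labels here are made up for illustration.
chunks = {
    "intro.pdf p.1": [0.9, 0.1, 0.0],
    "specs.pdf p.4": [0.0, 1.0, 0.1],
    "faq.pdf p.2": [0.1, 0.0, 0.9],
}
query = [0.8, 0.2, 0.0]

# Rank chunks by similarity to the query; the top hits are passed to the
# LLM as context for answer generation.
ranked = sorted(chunks, key=lambda k: cosine(chunks[k], query), reverse=True)
# ranked[0] → "intro.pdf p.1"
```

FAISS replaces the linear scan above with an index structure, which is what keeps retrieval fast as the number of document chunks grows.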
- To clone the repository, run the following commands in your terminal:

```bash
git clone https://github.com/Shridhar7-8/RAG-Chatbot.git
cd RAG-Chatbot
```
- Set up a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```
- Install dependencies:

```bash
pip install -r requirements.txt
```
- Start the Streamlit app:

```bash
streamlit run admin.py
```
- Open the app in your browser at http://localhost:8501.
- Upload PDF files and update the vector store.
- Once the vector store has been created, stop the script with Ctrl+C.
- Navigate to the `fast_server` folder inside the `admin` directory:

```bash
cd fast_server
```

- Run `main.py` to start the FastAPI server:

```bash
python main.py
```
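The backend's query-refinement step passes the raw question through a prompt before retrieval. The template and helper below are hypothetical sketches of that step, not the actual prompt shipped in the repository:

```python
# Hypothetical refinement prompt; the actual template in the backend may
# differ. The filled prompt is what would be sent to the llama-3.2 model.
REFINE_TEMPLATE = (
    "Rewrite the user's question so that it retrieves the most relevant "
    "passages from a document index. Preserve the original intent.\n\n"
    "Question: {question}\n"
    "Refined question:"
)

def build_refine_prompt(question: str) -> str:
    """Fill the refinement template with the raw user question."""
    return REFINE_TEMPLATE.format(question=question)

prompt = build_refine_prompt("warranty length?")
```

The model's rewritten question, rather than the user's original phrasing, is then embedded and used for the FAISS lookup.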
- Open a new terminal and navigate to the chatbot directory:

```bash
cd chatbot
```

- Install dependencies (if not already installed):

```bash
npm install
```

- Start the development server:

```bash
npm run dev
```
- Open the chatbot interface in your browser. The default URL is usually http://localhost:3000.