This repository contains a scalable and modular pipeline for ingesting large-scale datasets into vector databases to power Retrieval-Augmented Generation (RAG) applications.
distributed-systems machine-learning natural-language-processing ai postgresql distributed-computing ray etl-pipeline rag mlops llm open-search ai-infrastructure
Updated Mar 7, 2025 - Python