---
title: RAG Research Assistant API
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
# RAG Research Assistant API
This backend is a FastAPI-based service for a Retrieval-Augmented Generation (RAG) system that assists with research-paper search and information retrieval.
## Features
- ArXiv paper search with customizable filtering
- Document chunking and processing
- Embedding generation using Sentence Transformers
- Vector search using FAISS
- LLM response generation with multiple model options
- Markdown-formatted research results
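To illustrate the chunking step, here is a minimal character-based chunker with overlap. The real `DocumentService` may use a different strategy (token- or sentence-based splitting), so the function name and parameters are assumptions, not the project's actual API:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks, each overlapping the previous one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.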
## Requirements
- Python 3.8+
- FastAPI
- Sentence Transformers
- FAISS
- Hugging Face API access
## Installation

1. Clone the repository
2. Navigate to the backend directory
3. Install the dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Create a `.env` file based on `.env.example` and add your Hugging Face API key
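A minimal `.env` might look like the following. The `HF_API_KEY` variable name is taken from the Docker example later in this README; check `.env.example` for the authoritative list of variables:

```env
# Hugging Face API key (value is a placeholder)
HF_API_KEY=your_key_here
```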
## Running the Application

To run the development server:

```bash
python run.py
```

Or with uvicorn directly:

```bash
uvicorn app.main:app --reload
```
The API will be available at http://localhost:8000, and the API documentation at http://localhost:8000/docs.
## API Endpoints

- `/rag/query`: Process a query through the RAG pipeline
- `/rag/search`: Search for papers without LLM processing
- `/rag/models`: Get available LLM models
- `/rag/stats`: Get system statistics
- `/rag/clear/cache`: Clear the paper cache
- `/rag/clear/database`: Clear the vector database
- `/health`: Simple health check endpoint
## Docker

You can also run the application using Docker:

```bash
docker build -t rag-backend .
docker run -p 8000:8000 -e HF_API_KEY=your_key_here rag-backend
```
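A Dockerfile along these lines would support the commands above. This is a hedged sketch, not the repository's actual Dockerfile; the base image, layout, and port are assumptions:

```dockerfile
# Sketch only -- see the repository's Dockerfile for the real build
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```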
## Architecture

The application follows a service-oriented architecture:

- `ArxivService`: Interface to the ArXiv API
- `DocumentService`: Process papers into chunks
- `EmbeddingService`: Generate embeddings
- `VectorService`: Store and search vectors
- `LlmService`: Generate responses using LLMs
- `FormatterService`: Format results
- `RagService`: Orchestrate the entire RAG pipeline
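The orchestration can be sketched with stub services. The class names follow the list above, but every method signature here is an assumption, and the embedding and formatting steps are omitted for brevity:

```python
class ArxivService:
    def search(self, query: str) -> list[str]:
        return [f"paper about {query}"]  # stub: would call the ArXiv API

class DocumentService:
    def chunk(self, papers: list[str]) -> list[str]:
        return [c for p in papers for c in p.split()]  # stub: trivial word split

class VectorService:
    def __init__(self) -> None:
        self.store: list[str] = []

    def add(self, chunks: list[str]) -> None:
        self.store.extend(chunks)

    def search(self, query: str, k: int = 3) -> list[str]:
        # stub: substring match in place of FAISS similarity search
        return [c for c in self.store if query in c][:k]

class LlmService:
    def generate(self, query: str, context: list[str]) -> str:
        return f"Answer to '{query}' using {len(context)} chunks"  # stub LLM call

class RagService:
    """Wires the services into the retrieve-then-generate pipeline."""

    def __init__(self) -> None:
        self.arxiv = ArxivService()
        self.docs = DocumentService()
        self.vectors = VectorService()
        self.llm = LlmService()

    def query(self, q: str) -> str:
        papers = self.arxiv.search(q)          # 1. fetch candidate papers
        self.vectors.add(self.docs.chunk(papers))  # 2. chunk and index
        context = self.vectors.search(q)       # 3. retrieve relevant chunks
        return self.llm.generate(q, context)   # 4. generate the answer

print(RagService().query("transformers"))
```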