---
title: RAG Research Assistant API
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
# RAG Research Assistant API
This backend is a FastAPI-based service for a Retrieval-Augmented Generation (RAG) system that assists with research-paper search and information retrieval.
## Features
- ArXiv paper search with customizable filtering
- Document chunking and processing
- Embedding generation using Sentence Transformers
- Vector search using FAISS
- LLM response generation with multiple model options
- Markdown-formatted research results
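To illustrate the chunking step, here is a minimal character-based chunker with overlap. The real `DocumentService` may use a different strategy (token- or sentence-based splitting), so the function name and parameters are assumptions, not the project's actual API:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks, each overlapping the previous one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.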
## Requirements
- Python 3.8+
- FastAPI
- Sentence Transformers
- FAISS
- Hugging Face API access
## Installation

1. Clone the repository
2. Navigate to the backend directory
3. Install the dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Create a `.env` file based on `.env.example` and add your Hugging Face API key
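A minimal `.env` might look like the following. The `HF_API_KEY` variable name is taken from the Docker example later in this README; check `.env.example` for the authoritative list of variables:

```env
# Hugging Face API key (value is a placeholder)
HF_API_KEY=your_key_here
```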
## Running the Application

To run the development server:

```bash
python run.py
```

Or with uvicorn directly:

```bash
uvicorn app.main:app --reload
```
The API will be available at http://localhost:8000, and the API documentation at http://localhost:8000/docs.
## API Endpoints

- `/rag/query`: Process a query through the RAG pipeline
- `/rag/search`: Search for papers without LLM processing
- `/rag/models`: Get available LLM models
- `/rag/stats`: Get system statistics
- `/rag/clear/cache`: Clear the paper cache
- `/rag/clear/database`: Clear the vector database
- `/health`: Simple health check endpoint
## Docker

You can also run the application using Docker:

```bash
docker build -t rag-backend .
docker run -p 8000:8000 -e HF_API_KEY=your_key_here rag-backend
```
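A Dockerfile along these lines would support the commands above. This is a hedged sketch, not the repository's actual Dockerfile; the base image, layout, and port are assumptions:

```dockerfile
# Sketch only -- see the repository's Dockerfile for the real build
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```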
## Architecture

The application follows a service-oriented architecture:

- `ArxivService`: Interface to the ArXiv API
- `DocumentService`: Process papers into chunks
- `EmbeddingService`: Generate embeddings
- `VectorService`: Store and search vectors
- `LlmService`: Generate responses using LLMs
- `FormatterService`: Format results
- `RagService`: Orchestrate the entire RAG pipeline
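The orchestration can be sketched with stub services. The class names follow the list above, but every method signature here is an assumption, and the embedding and formatting steps are omitted for brevity:

```python
class ArxivService:
    def search(self, query: str) -> list[str]:
        return [f"paper about {query}"]  # stub: would call the ArXiv API

class DocumentService:
    def chunk(self, papers: list[str]) -> list[str]:
        return [c for p in papers for c in p.split()]  # stub: trivial word split

class VectorService:
    def __init__(self) -> None:
        self.store: list[str] = []

    def add(self, chunks: list[str]) -> None:
        self.store.extend(chunks)

    def search(self, query: str, k: int = 3) -> list[str]:
        # stub: substring match in place of FAISS similarity search
        return [c for c in self.store if query in c][:k]

class LlmService:
    def generate(self, query: str, context: list[str]) -> str:
        return f"Answer to '{query}' using {len(context)} chunks"  # stub LLM call

class RagService:
    """Wires the services into the retrieve-then-generate pipeline."""

    def __init__(self) -> None:
        self.arxiv = ArxivService()
        self.docs = DocumentService()
        self.vectors = VectorService()
        self.llm = LlmService()

    def query(self, q: str) -> str:
        papers = self.arxiv.search(q)          # 1. fetch candidate papers
        self.vectors.add(self.docs.chunk(papers))  # 2. chunk and index
        context = self.vectors.search(q)       # 3. retrieve relevant chunks
        return self.llm.generate(q, context)   # 4. generate the answer

print(RagService().query("transformers"))
```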