# aboodmed: bilingual medical study copilot

## Elevator pitch
A bilingual medical study copilot that answers questions with cited evidence, lets users upload their own sources, and auto-generates high-yield explanations, flashcards, and quizzes.

## Target users
- Medical students (preclinical and clinical)
- Interns/residents
- Young doctors preparing for exams or refreshing protocols

## Core value
- Trust: every answer is traceable to a source
- Speed: concise, exam-style outputs (high yield)
- Flexibility: extend the knowledge base with your own PDFs/notes
- Accessibility: English/Arabic, simple UX

## Feature set (refined)

### Smart search (RAG)
- Query in EN/AR
- Answers: concise summary, inline superscript citations, reference list with links/DOI
- Modes: overview | step-by-step | exam-focused | emergency

### Structured medical database
- Curated topics with learning objectives, key points, and per-audience levels (M1/M3/intern)
- Versioned content with a "last reviewed" date

### Custom source upload
- Upload PDF/HTML/DOCX
- Automatic extraction, chunking, embedding, and metadata capture (title, year, license)
- Per-user/private vs. shared collections
- License field plus an "own/authorized" confirmation

### Study tools
- Explanations: short, high-yield sections
- Flashcards: Anki-ready, cloze and Q/A, tagged
- Quizzes: single-best-answer MCQs with rationales and citations, difficulty ratings, and objective mapping

### Emergency & First Aid
- Condensed checklists with strong guardrails
- Prominent disclaimers plus "call emergency services" triggers for red flags
- Links to official guidelines (WHO/CDC/NICE/ACLS)

### Bilingual interface (EN/AR)
- UI toggles the language; results can show bilingual terms (Arabic term [English equivalent])
- Search supports both languages; transliteration dictionary for medication names

## Primary references and licensing
First Aid Step 1/2 and similar titles are copyrighted: do not scrape or ship them. You can:
- Allow users to upload their legally obtained copies to a private workspace.
- Use public/redistributable sources in the default knowledge base (WHO, CDC, NICE, Cochrane, and AHA/ACC summaries where allowed).
- Record the license and source type for every document.
- Show "Source provided by user" labels.

## Safety and compliance
- Educational use only; not personal medical advice
- Show the last review date and guideline year
- For drugs/doses/contraindications, require explicit citations and add lint checks
- Protected content: per-user isolation; no cross-user exposure of uploads
- Logging and review: SME approval before shared content is published

## Answer format (recommended)
- Short answer (5–8 lines)
- Bulleted key points
- Red-flag note if applicable
- Citations: [1], [2] inline; full references below
- "Last updated: YYYY-MM-DD"

## Example answer policy prompt (system)
"You are a medical education assistant. Answer concisely in the user's language (EN/AR). Use only the retrieved snippets. For each clinical claim, include a citation [n]. If evidence is weak/uncertain, say so. Add a brief safety note when applicable. Do not give individualized treatment advice. Return JSON: { answer, bullets[], red_flags[], citations:[{id,title,year,url_or_doi}], safety_note, last_updated }"
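A minimal sketch of that answer contract as a Pydantic (v2) model. The field names follow the "Return JSON" clause above; the validator only enforces a non-empty citation list, since the per-claim check described later under "Validate" needs a separate linting pass:

```python
# Sketch of the /qa answer contract; field names follow the system prompt above.
from pydantic import BaseModel, field_validator


class Citation(BaseModel):
    id: int
    title: str
    year: int
    url_or_doi: str


class Answer(BaseModel):
    answer: str
    bullets: list[str]
    red_flags: list[str] = []
    citations: list[Citation]
    safety_note: str | None = None
    last_updated: str  # "YYYY-MM-DD"

    @field_validator("citations")
    @classmethod
    def require_citations(cls, v: list[Citation]) -> list[Citation]:
        # Reject evidence-free answers; per-claim citation checks
        # would need an extra linting pass over `answer`.
        if not v:
            raise ValueError("every answer must cite at least one source")
        return v
```

Calling `Answer.model_validate_json(raw)` on the model output then rejects citation-free answers server-side before they reach the client.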
## High-level architecture
- Frontend: Next.js/React with i18n (EN/AR) and RTL support
- Backend: FastAPI or Node (Express)
- DB: Postgres (users, sources, lessons, cards, quizzes)
- Object storage: S3-compatible (uploads)
- Vector search: lightweight embeddings via API (OpenAI text-embedding-3-small/large) + cosine similarity; add BM25 for hybrid retrieval
- Orchestration: LlamaIndex/LangChain, or a light custom layer
- TTS/video (phase 2): Azure/Google/ElevenLabs + Remotion/moviepy
- Auth: email/password, optional SSO later

## Data model (minimal)
- users: id, email, role, created_at
- sources: id, owner_id (null = public), title, year, url_or_doi, license, source_type, visibility, created_at
- chunks: id, source_id, text, meta (page, section), embedding vector (stored separately), created_at
- lessons: id, title, audience_level, objectives[], last_reviewed_at, status (draft/published)
- lesson_artifacts: lesson_id, explanation_json, flashcards_json, quiz_json
- quizzes: id, lesson_id, questions_json
- jobs: id, type (ingest/generate), status, error, created_at

A SQLAlchemy sketch of the first two tables appears under "Example sketches" below.

## Guardrails for Emergency mode
- The template adds: "If severe symptoms (chest pain, dyspnea, syncope), call emergency services immediately."
- No doses unless cited; show the source year prominently
- Prefer algorithmic steps (ABCs) with links to ACLS/BLS/open-access protocols

A red-flag trigger sketch appears under "Example sketches" below.

## RAG pipeline (practical)
1. Ingest: extract text (PyMuPDF/trafilatura), clean it, and segment by headings into 600–1000-token chunks with overlap
2. Embed: API embeddings (normalize the vectors); store them as .npy files or in a vector DB
3. Retrieve: top-k dense + optional keyword filter; deduplicate; re-rank
4. Generate: pass only the retrieved chunks; enforce "no outside info"; request JSON
5. Validate: auto-check that every claim has a citation; flag any that are missing
6. Human review: approve before publishing anything shared

Ingest/embed and hybrid-retrieval sketches appear under "Example sketches" below.

## MVP scope (2 weeks)
Week 1:
- Upload + ingest PDFs (private)
- Smart Q&A with citations (EN/AR)
- Generate explanations + flashcards + 6 MCQs as JSON
- Basic UI (search box, answer panel, citations, file upload)

Week 2:
- User accounts + private/public sources
- High-yield mode toggle (shorter answers)
- Export: Anki .apkg or JSON, quiz CSV/JSON (an export sketch appears under "Example sketches" below)
- Emergency section (guardrails + curated public sources)
- Basic analytics (query counts, latency, failure rate)

## Success metrics
- Trust: % of answers with ≥2 citations; SME approval rate
- Learning: average change in quiz accuracy after studying
- Engagement: DAU, queries/user/day, time to first correct answer
- Cost: tokens per query; embedding/storage cost per source

## Cost control
- Cache embeddings per chunk checksum
- Cap answer length; offer the shorter "high-yield" mode
- Batch embedding requests
- Reject large PDFs that are not authorized; throttle upload size

## Internationalization
- Store a language field per lesson/artifact
- On generation: prompt for EN or AR; optionally add a bilingual glossary
- UI: next-translate or react-i18next; ensure RTL styling and Arabic-safe fonts

## What to do next (based on your current repo)
Add these endpoints (a skeleton follows this list):
- POST /upload (PDF) -> returns source_id
- POST /ingest?source_id -> chunk + embed + index
- POST /qa -> {query, lang, mode} -> {answer_json}
- POST /lesson -> {topic, audience, lang} -> artifacts JSON
- GET /download/flashcards.apkg for export (phase 2)

Also:
- Add an "Answer contract" and validate it server-side: reject JSON that lacks citations for its claims
- Add a setting to choose English, Arabic, or bilingual output
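A minimal FastAPI skeleton matching that endpoint list; the handler bodies are stubs, and the request shapes are assumptions drawn from the arrows above:

```python
# Hypothetical endpoint skeleton for the routes listed above; ingest/QA
# internals are stubbed out.
from fastapi import FastAPI, UploadFile
from pydantic import BaseModel

app = FastAPI()


class QARequest(BaseModel):
    query: str
    lang: str = "en"        # "en" | "ar"
    mode: str = "overview"  # overview | step-by-step | exam-focused | emergency


class LessonRequest(BaseModel):
    topic: str
    audience: str           # e.g. M1 | M3 | intern
    lang: str = "en"


@app.post("/upload")
async def upload(file: UploadFile) -> dict:
    # Store the raw PDF in object storage and register a `sources` row.
    return {"source_id": "..."}


@app.post("/ingest")
async def ingest(source_id: str) -> dict:
    # Enqueue a chunk + embed + index job for this source.
    return {"job_id": "...", "status": "queued"}


@app.post("/qa")
async def qa(req: QARequest) -> dict:
    # Retrieve chunks, call the LLM, validate against the Answer contract.
    return {"answer_json": {}}


@app.post("/lesson")
async def lesson(req: LessonRequest) -> dict:
    # Generate explanation + flashcards + quiz artifacts as JSON.
    return {"artifacts": {}}
```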
## Legal note (important)
- Do not ship copyrighted books (First Aid Step 1/2) in the default dataset.
- Allow user uploads behind an "I own the rights" confirmation.
- Keep uploaded content private per user/organization.
- Prefer public/open sources for the shared knowledge base.

If you want, I can:
- Turn this into a polished README.md and simple landing-page copy.
- Provide a minimal OpenAPI spec for the endpoints above.
- Generate example JSON outputs for a couple of topics (e.g., DKA, ACS) in both EN and AR.
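## Example sketches (non-normative)

The sketches below illustrate earlier sections; they are not a reference implementation, and any table name, ID, or helper not named in the spec above is an assumption. First, the minimal data model as SQLAlchemy declarative models, showing two representative tables; `embedding_path` reflects the "embedding vector stored separately" note:

```python
# Hypothetical SQLAlchemy models for the `sources` and `chunks` tables
# from "Data model (minimal)"; the other tables follow the same pattern.
from datetime import datetime

from sqlalchemy import JSON, Column, DateTime, ForeignKey, Integer, String, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Source(Base):
    __tablename__ = "sources"
    id = Column(Integer, primary_key=True)
    owner_id = Column(Integer, nullable=True)  # NULL = public source
    title = Column(String, nullable=False)
    year = Column(Integer)
    url_or_doi = Column(String)
    license = Column(String)      # e.g. "user-provided", "CC-BY", "public"
    source_type = Column(String)  # pdf | html | docx | guideline
    visibility = Column(String, default="private")  # private | shared
    created_at = Column(DateTime, default=datetime.utcnow)


class Chunk(Base):
    __tablename__ = "chunks"
    id = Column(Integer, primary_key=True)
    source_id = Column(Integer, ForeignKey("sources.id"), nullable=False)
    text = Column(Text, nullable=False)
    meta = Column(JSON)              # {"page": ..., "section": ...}
    embedding_path = Column(String)  # vector stored separately (.npy / vector-DB key)
    created_at = Column(DateTime, default=datetime.utcnow)
```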
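Next, the ingest and embed steps of the RAG pipeline, including the per-chunk checksum cache from "Cost control". PyMuPDF imports as `fitz`; the embedding call assumes the current `openai` Python SDK, and the word-based chunk size is a rough stand-in for the 600–1000-token target:

```python
# Hypothetical ingest/embed sketch: PDF text extraction, overlapping chunks,
# and embeddings cached by chunk checksum so re-ingestion costs nothing.
import hashlib

import fitz  # PyMuPDF
import numpy as np
from openai import OpenAI

client = OpenAI()
_cache: dict[str, np.ndarray] = {}  # checksum -> normalized embedding


def extract_text(pdf_path: str) -> str:
    doc = fitz.open(pdf_path)
    return "\n".join(page.get_text() for page in doc)


def chunk(text: str, size: int = 700, overlap: int = 100) -> list[str]:
    # Word-count proxy for the 600-1000 token target, with overlap.
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]


def embed(chunks: list[str]) -> list[np.ndarray]:
    # Per-chunk calls keep the cache logic simple; a production version
    # would batch the uncached chunks into one embeddings request.
    out = []
    for c in chunks:
        key = hashlib.sha256(c.encode()).hexdigest()
        if key not in _cache:
            resp = client.embeddings.create(model="text-embedding-3-small", input=c)
            v = np.array(resp.data[0].embedding)
            _cache[key] = v / np.linalg.norm(v)  # normalize for cosine search
        out.append(_cache[key])
    return out
```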
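The retrieval step as a dense + BM25 hybrid merged with reciprocal rank fusion. The `rank_bm25` package, whitespace tokenization, and the fusion constant are assumptions; deduplication and re-ranking are left out:

```python
# Hypothetical hybrid retrieval: dense cosine + BM25, merged with
# reciprocal rank fusion (RRF).
import numpy as np
from rank_bm25 import BM25Okapi


def hybrid_top_k(query_vec, chunk_vecs, chunk_texts, query_text, k=5):
    # Dense ranking: vectors are pre-normalized, so dot product = cosine.
    dense_rank = np.argsort(-(chunk_vecs @ query_vec))
    # Sparse ranking via BM25 over whitespace tokens.
    bm25 = BM25Okapi([t.split() for t in chunk_texts])
    sparse_rank = np.argsort(-bm25.get_scores(query_text.split()))
    # Reciprocal rank fusion; 60 is the conventional RRF constant.
    scores: dict[int, float] = {}
    for rank_list in (dense_rank, sparse_rank):
        for pos, idx in enumerate(rank_list):
            scores[idx] = scores.get(idx, 0.0) + 1.0 / (60 + pos)
    best = sorted(scores, key=scores.get, reverse=True)[:k]
    return [chunk_texts[i] for i in best]
```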
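A small Emergency-mode guardrail: a red-flag keyword trigger plus the extra template line. The keyword list and wording are placeholders; real triage logic needs clinical review:

```python
# Hypothetical Emergency-mode guardrail: prepend the safety line and
# surface a "call emergency services" banner on red-flag queries.
RED_FLAGS = {"chest pain", "dyspnea", "syncope", "unresponsive"}  # placeholder list

EMERGENCY_NOTE = (
    "If severe symptoms (chest pain, dyspnea, syncope), "
    "call emergency services immediately."
)


def apply_emergency_guardrails(query: str, answer: str, source_year: int) -> str:
    banner = EMERGENCY_NOTE if any(f in query.lower() for f in RED_FLAGS) else ""
    # Show the source year prominently; doses are only allowed with citations,
    # which the answer-contract validator enforces upstream.
    return f"{banner}\n{answer}\n(Guideline year: {source_year})".strip()
```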
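Finally, the Week 2 Anki export. One library that writes .apkg files from Python is genanki; the model and deck IDs below are arbitrary but must stay stable across exports so Anki can merge updates:

```python
# Hypothetical flashcard export to Anki .apkg using genanki.
import genanki

CARD_MODEL = genanki.Model(
    1607392319,  # arbitrary but stable model ID
    "aboodmed Q/A",
    fields=[{"name": "Front"}, {"name": "Back"}],
    templates=[{
        "name": "Card 1",
        "qfmt": "{{Front}}",
        "afmt": "{{FrontSide}}<hr id='answer'>{{Back}}",
    }],
)


def export_apkg(cards: list[dict], path: str = "flashcards.apkg") -> None:
    # cards: [{"q": ..., "a": ...}, ...] as produced by the lesson generator.
    deck = genanki.Deck(2059400110, "aboodmed deck")  # arbitrary stable deck ID
    for card in cards:
        deck.add_note(genanki.Note(model=CARD_MODEL, fields=[card["q"], card["a"]]))
    genanki.Package(deck).write_to_file(path)
```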