RAG System & Infrastructure

What is RAG?

Retrieval-Augmented Generation (RAG) is an AI architecture that combines the power of large language models with real-time data retrieval. Instead of relying solely on pre-trained knowledge, RAG systems retrieve relevant information from databases and knowledge bases to provide accurate, context-aware responses.

In MedSense Analytics, our RAG system enables the AI assistant to answer questions about patients, sensor data, anomalies, and system architecture by dynamically retrieving relevant information from the database and project knowledge base.

RAG Architecture

User Query

"Show patient 1 chart"

→

Query Classifier

Extract Intent

Patient ID, Chart Type

→

Data Retrieval

SQLAlchemy

Patient Data, Trends

→

LLM Generation

OpenAI GPT

Context + Response

→

Response

Text + Charts

System Components

1. Query Classifier

Analyzes user queries to determine:

Whether database access is needed
Patient ID extraction from queries
Query intent (summary, chart, anomaly, etc.)
Chart type detection (cough, steps, keywords)

                        Keywords: "patient", "chart", "anomaly", "trend"

                        Patterns: "patient 1", "room 101", "show chart"

2. Data Retriever

Retrieves relevant data from database:

Patient summaries and statistics
Time-series sensor data
Anomaly flags and severity
Medical history (medicines, past diseases)
Global analytics and trends

                        SQLAlchemy ORM

                        SQLite / PostgreSQL

3. Context Builder

Constructs context for LLM:

Formats patient data into readable text
Includes relevant anomalies and analysis
Adds project knowledge for system questions
Structures data for optimal LLM understanding

                        Context includes:

                        - Patient summary

                        - Anomaly analysis

                        - Medical history

                        - Project knowledge

4. LLM Generator

Generates intelligent responses:

OpenAI GPT model (configurable)
System prompts for medical context
Clinically relevant interpretations
Chart descriptions and insights

                        Model: GPT-3.5-turbo / GPT-4

                        Temperature: 0.7

                        Max tokens: 600

Knowledge Base

Project Knowledge Base

The system maintains a comprehensive knowledge base about:

Hardware & Sensors

Arduino Nano 33 BLE Sense Rev2 specifications
Sensor details (BMI270, MP34DT06JTR, HS3003)
Sensor to metric mappings
Bluetooth Low Energy (BLE) protocol

ML Models & Detection

Z-Score anomaly detection algorithm
Severity levels and thresholds
Detection pipeline workflow

API & Database

REST API endpoints
Database schema (SQLAlchemy)
Data flow architecture
Technology stack details

Data Processing

Time-series aggregation
Trend analysis methods
Anomaly flagging logic

Query Processing Workflow

1

Query Reception

User submits query via chat interface. Query is sent to `/api/v1/chat` endpoint.

2

Query Classification

QueryClassifier analyzes the query to determine if database access is needed and extracts patient IDs, chart types, and intent.

3

Data Retrieval

PatientDataRetriever fetches relevant data from database: patient summaries, time-series data, anomalies, medical history.

4

Context Building

Context builder formats retrieved data and adds project knowledge (if needed) into a structured context for the LLM.

5

LLM Generation

OpenAI GPT model generates response using system prompt, context, and user query. Response includes clinically relevant insights.

6

Response Delivery

Response is returned with optional chart data. Frontend renders markdown-formatted text and interactive charts.

Key Features

🎯

Intelligent Query Understanding

Automatically extracts patient IDs, chart types, and intent from natural language queries.

📊

Dynamic Chart Generation

Detects chart requests and generates interactive visualizations using Plotly.js.

🏥

Clinical Context Awareness

Considers patient medications, medical history, and anomalies for contextually relevant responses.

🔍

Anomaly Analysis

Provides concern assessments based on anomaly severity, z-scores, and clinical significance.

📚

Project Knowledge Integration

Answers system architecture, hardware, and technical questions using embedded knowledge base.

⚡

Real-time Data Access

Retrieves live data from database, ensuring responses reflect current patient status.

Technical Implementation

Component	Technology	Purpose
Query Classification	Python Regex, Keyword Matching	Extract intent and parameters from natural language
Data Retrieval	SQLAlchemy ORM	Query database for patient data, anomalies, trends
LLM Integration	OpenAI API (GPT-3.5/4)	Generate contextual, clinically relevant responses
Chart Rendering	Plotly.js	Interactive visualizations in chat interface
Markdown Rendering	Custom JavaScript Parser	Format LLM responses with bold, lists, code blocks
Context Management	Python Dict, JSON	Structure data for optimal LLM understanding

Example Queries

Patient Data Queries

"Show me patient 1's summary"
"What medicines is patient 2 taking?"
"Show cough chart for patient 3"
"What is patient 1's medical history?"
"Should I be concerned about patient 2?"

System & Technical Queries

"What sensors are used?"
"How does anomaly detection work?"
"Explain the Z-score algorithm"
"What is the data flow architecture?"
"Tell me about the Arduino hardware"