helia/IMPLEMENTATION_SUMMARY.md
Santiago Martinez-Avial 5ce6d7e1d3 WIP
2025-12-23 13:35:15 +01:00


# Implementation Summary: Modular Agentic Framework
## Overview
We have implemented the core agentic framework for the PHQ-8 assessment benchmark. The architecture uses **LangGraph** to orchestrate a multi-stage reasoning process ("RISEN") and supports dynamic switching between Local (Tier 1) and Cloud (Tier 3) models. Transcripts and prompt templates are both read from MongoDB; writing results back is still pending (see Next Steps).
## Completed Components
### 1. Agent Architecture (`src/helia/agent/graph.py`)
- Implemented a `StateGraph` that manages the workflow lifecycle.
- **Nodes**: `ingestion`, `extract_evidence`, `map_criteria`, `score_item`, `human_review`, `persistence`.
- **Routing**: Conditional edges loop through the 8 PHQ-8 items before proceeding to review.
- **HITL**: Configured `MemorySaver` to allow human-in-the-loop interrupts at the `human_review` stage.
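
The routing described above can be sketched without LangGraph as a plain function that picks the next node after `score_item`. This is a minimal illustration, not the code in `graph.py`: the names `route_after_score`, `LoopState`, and `walk` are hypothetical, and it assumes `current_item_index` is advanced once per scored item. In LangGraph, a function of this shape is typically registered with `add_conditional_edges`.

```python
from typing import TypedDict

NUM_PHQ8_ITEMS = 8  # the PHQ-8 questionnaire has eight items


class LoopState(TypedDict):
    # Hypothetical minimal state; the real graph state carries more fields.
    current_item_index: int


def route_after_score(state: LoopState) -> str:
    """Return the name of the node to run after `score_item`."""
    if state["current_item_index"] < NUM_PHQ8_ITEMS:
        return "extract_evidence"  # more items left: loop again
    return "human_review"  # all items scored: hand over to review


def walk(state: LoopState) -> list[str]:
    """Simulate the node order for one transcript (no LangGraph needed)."""
    visited = ["ingestion"]
    next_node = "extract_evidence"
    while next_node != "human_review":
        visited += ["extract_evidence", "map_criteria", "score_item"]
        # Assumption: the index is incremented once the item has been scored.
        state = {"current_item_index": state["current_item_index"] + 1}
        next_node = route_after_score(state)
    visited += ["human_review", "persistence"]
    return visited
```

Calling `walk({"current_item_index": 0})` visits the three assessment nodes once per item before reaching `human_review` and `persistence`.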
### 2. State Management (`src/helia/agent/state.py`)
- Created `ClinicalState` Pydantic model.
- Strictly types the workflow memory, including:
  - `transcript_text`: The input data.
  - `scores`: A list of `PHQ8ItemScore` objects (accumulated via reducer).
  - `current_item_index`: Tracks progress through the 8 items.
  - `current_evidence` / `current_reasoning`: Transient fields for the RISEN loop.
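
To show what "accumulated via reducer" means, here is a stdlib-only sketch. It uses dataclasses as a stand-in for the Pydantic model and assumes the reducer is attached in the `Annotated[..., operator.add]` style that LangGraph supports. The fields of `PHQ8ItemScore`, the class `ClinicalStateSketch`, and the helper `apply_update` are hypothetical; `apply_update` only mimics how a graph runtime would merge a node's partial update into the state.

```python
import operator
from dataclasses import dataclass, field
from typing import Annotated, get_type_hints


@dataclass
class PHQ8ItemScore:
    # Hypothetical fields; the real model may differ.
    item_index: int  # 0..7
    value: int  # 0..3


@dataclass
class ClinicalStateSketch:
    transcript_text: str = ""
    # The reducer (operator.add) appends new scores instead of overwriting.
    scores: Annotated[list, operator.add] = field(default_factory=list)
    current_item_index: int = 0


def apply_update(state: ClinicalStateSketch, update: dict) -> ClinicalStateSketch:
    """Merge a partial update: reduce annotated fields, overwrite the rest."""
    hints = get_type_hints(type(state), include_extras=True)
    for key, new_value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata:
            reducer = metadata[0]
            new_value = reducer(getattr(state, key), new_value)
        setattr(state, key, new_value)
    return state
```

With this shape, a scoring node can return only the new score, for example `{"scores": [PHQ8ItemScore(0, 2)], "current_item_index": 1}`, and earlier scores are preserved.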
### 3. RISEN Logic (`src/helia/agent/nodes/assessment.py`)
- Refactored the monolithic evaluation logic into three granular nodes:
  1. **Extract**: Finds verbatim quotes for the specific symptom.
  2. **Map**: Aligns evidence to the 0-3 scoring criteria.
  3. **Score**: Assigns the final value and structured reasoning.
- **Prompt Management**: Fetches prompts dynamically from the MongoDB `Prompt` collection using `Prompt.find_one`.
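
The three-step split can be illustrated with a small, self-contained sketch. It is not the code in `assessment.py`: the template texts are invented placeholders (the real ones live in MongoDB), the model is passed in as a plain callable so the example runs without any LLM client, and all function names are hypothetical. Only the prompt names come from the migration described in this document.

```python
from typing import Callable

LLM = Callable[[str], str]  # any function mapping a prompt to a completion

# Placeholder templates keyed by the seeded prompt names (texts are assumptions).
PROMPTS = {
    "phq8-extract": "List verbatim quotes about '{symptom}' in:\n{transcript}",
    "phq8-map": "Map this evidence to the 0-3 criteria:\n{evidence}",
    "phq8-score": "Give a single 0-3 score for:\n{mapping}",
}


def extract(llm: LLM, transcript: str, symptom: str) -> str:
    """Step 1: collect quotes relevant to one symptom."""
    return llm(PROMPTS["phq8-extract"].format(symptom=symptom, transcript=transcript))


def map_criteria(llm: LLM, evidence: str) -> str:
    """Step 2: align the evidence with the scoring criteria."""
    return llm(PROMPTS["phq8-map"].format(evidence=evidence))


def score(llm: LLM, mapping: str) -> int:
    """Step 3: parse the final value and reject anything outside 0-3."""
    value = int(llm(PROMPTS["phq8-score"].format(mapping=mapping)).strip())
    if not 0 <= value <= 3:
        raise ValueError(f"score out of range: {value}")
    return value


def assess_item(llm: LLM, transcript: str, symptom: str) -> dict:
    """Run the three steps in order for a single PHQ-8 item."""
    evidence = extract(llm, transcript, symptom)
    mapping = map_criteria(llm, evidence)
    return {"evidence": evidence, "reasoning": mapping, "score": score(llm, mapping)}
```

Keeping each step in its own function mirrors the node split: each one can be tested with a stub model and retried independently.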
### 4. Runner & Config (`src/helia/agent/runner.py`)
- Created a CLI entry point: `python -m helia.agent.runner <run_id>`.
- Initializes the MongoDB connection via `init_db`.
- Fetches all available `Transcript` documents from the database to run the benchmark.
- Injects the specific `RunConfig` (Tier 1/2/3) into the graph's runtime configuration.
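
As a rough sketch of the runner's argument handling, the snippet below resolves a `<run_id>` to a configuration and wraps it in a `configurable` mapping, the shape LangGraph accepts for runtime configuration. Everything except the `gemini-flash` run ID is an assumption: `RunConfigSketch`, its fields, the `local-small` entry, and the `run_config` key are hypothetical, and the real configurations are defined in `config.yaml`.

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfigSketch:
    # Hypothetical stand-in for the project's RunConfig.
    run_id: str
    tier: int
    model: str


# Hypothetical registry; the real entries come from config.yaml.
RUN_CONFIGS = {
    "gemini-flash": RunConfigSketch("gemini-flash", tier=3, model="gemini-flash"),
    "local-small": RunConfigSketch("local-small", tier=1, model="local-small"),
}


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the single positional <run_id> argument."""
    parser = argparse.ArgumentParser(prog="helia.agent.runner")
    parser.add_argument("run_id", choices=sorted(RUN_CONFIGS))
    return parser.parse_args(argv)


def build_runtime_config(run_id: str) -> dict:
    """Wrap the selected run config for injection into the graph."""
    return {"configurable": {"run_config": RUN_CONFIGS[run_id]}}
```

A node could then read the tier from the injected configuration to decide which model client to call.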
### 5. Ingestion (`src/helia/ingestion/loader.py`)
- Added `ClinicalDataLoader` to abstract transcript fetching.
- Loads directly from the `Transcript` Beanie document model in MongoDB.
### 6. Database Migrations
- Created `migrations/init_risen_prompts.py` to seed the database with the required "RISEN" prompt templates (`phq8-extract`, `phq8-map`, `phq8-score`).
## Usage
1. **Seed Prompts**:
   ```bash
   python migrations/init_risen_prompts.py
   ```
2. **Run Agent**:
   ```bash
   # Run with the default Tier 3 (Cloud) config (defined in config.yaml)
   python -m helia.agent.runner gemini-flash
   ```
## Next Steps
1. **Safety**: Implement the `Safety Guardrail` (parallel node) as designed in `plans/safety-guardrail-architecture.md`.
2. **Persistence**: Uncomment the DB save logic in `persistence_node` to save `AssessmentResult` documents.