This commit is contained in:
Santiago Martinez-Avial
2025-12-20 17:38:10 +01:00
parent 1180b2a64e
commit 5ef0fc0ccc
15 changed files with 1454 additions and 0 deletions

View File

@@ -0,0 +1,95 @@
# Plan: Modular Agentic Framework for Clinical Assessment (Helia)
## Overview
Implement a production-grade, privacy-first Agentic Framework using LangGraph to automate PHQ-8 clinical assessments. The system allows dynamic switching between Local (Tier 1), Self-Hosted (Tier 2), and Cloud (Tier 3) models to benchmark performance.
## Problem Statement
The current system relies on a monolithic script (`src/helia/agent/workflow.py` is a placeholder) and single-pass evaluation logic that likely underperforms on smaller local models. To prove the thesis hypothesis that local models can match cloud performance, we need a sophisticated **Stateful Architecture** that implements Multi-Stage Reasoning (the "RISEN" pattern) and robust Human-in-the-Loop (HITL) workflows.
## Proposed Solution
A **Hierarchical Agent Supervisor** architecture built with **LangGraph**:
1. **Supervisor**: Orchestrates the workflow and manages state.
2. **Assessment Agent**: Implements the "RISEN" (Reasoning Improvement via Stage-wise Evaluation Network) pattern:
* **Extract**: Quote relevant patient text.
* **Map**: Align quotes to PHQ-8 criteria.
* **Score**: Assign 0-3 value.
3. **Ingestion**: Standardizes data from S3/Local into a `ClinicalState`.
4. **Benchmarking**: Automates the comparison of Generated Scores against Ground Truth (DAIC-WOZ labels).
**Note:** A dedicated **Safety Guardrail** agent has been designed but is scoped out of this MVP. See `plans/safety-guardrail-architecture.md` for details.
## Technical Approach
### Architecture: The "Helia Graph"
```mermaid
graph TD
Start --> Ingestion
Ingestion --> Router{Router}
subgraph "Assessment Agent (RISEN)"
Router --> Extract[Extract Evidence]
Extract --> Map[Map to Criteria]
Map --> Score[Score Item]
Score --> NextItem{Next Item?}
NextItem -- Yes --> Extract
end
NextItem -- No --> HumanReview["Human Review (HITL)"]
HumanReview --> Finalize[Finalize & Persist]
```
### Implementation Phases
#### Phase 1: Core Graph & State Management (Foundation)
* **Goal**: Establish the LangGraph structure and Pydantic state (a minimal sketch follows the deliverables below).
* **Deliverables**:
* `src/helia/agent/state.py`: Define `ClinicalState` (transcript, current_item, scores).
* `src/helia/agent/graph.py`: Define the main `StateGraph` with Ingestion -> Assessment -> Persistence nodes.
* `src/helia/ingestion/loader.py`: Add "Ground Truth" loading for DAIC-WOZ labels (critical for benchmarking).
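A minimal sketch of these deliverables, assuming LangGraph's `StateGraph` API and Pydantic; fields beyond `transcript`, `current_item`, and `scores` (such as `evidence`, `rationale`, `ground_truth`) and the node bodies are illustrative placeholders, not the final design:
```python
# Illustrative Phase 1 sketch: state schema plus the Ingestion -> Assessment ->
# Persistence skeleton. Node bodies are placeholders.
from typing import Dict, List, Optional

from pydantic import BaseModel
from langgraph.graph import StateGraph, START, END


class ClinicalState(BaseModel):
    transcript: str = ""
    current_item: int = 0                    # PHQ-8 item currently being assessed (0-7)
    evidence: List[str] = []                 # quotes extracted for the current item (Phase 2)
    rationale: str = ""                      # criteria mapping for the current item (Phase 2)
    scores: Dict[int, int] = {}              # item index -> assigned score (0-3)
    ground_truth: Optional[Dict[int, int]] = None  # DAIC-WOZ labels, for benchmarking


def ingest_node(state: ClinicalState) -> dict:
    # Placeholder: load and normalize the transcript into the state.
    return {"transcript": state.transcript.strip()}


def assess_node(state: ClinicalState) -> dict:
    # Placeholder: replaced by the Extract -> Map -> Score nodes in Phase 2.
    return {}


def persist_node(state: ClinicalState) -> dict:
    # Placeholder: write scores and run metadata to storage.
    return {}


builder = StateGraph(ClinicalState)
builder.add_node("ingest", ingest_node)
builder.add_node("assess", assess_node)
builder.add_node("persist", persist_node)
builder.add_edge(START, "ingest")
builder.add_edge("ingest", "assess")
builder.add_edge("assess", "persist")
builder.add_edge("persist", END)
graph = builder.compile()
```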
#### Phase 2: The "RISEN" Assessment Logic
* **Goal**: Replace the monolithic `PHQ8Evaluator` with granular nodes (see the sketch after this list).
* **Deliverables**:
* `src/helia/agent/nodes/assessment.py`: Implement `extract_node`, `map_node`, `score_node`.
* `src/helia/prompts/`: Create specialized prompt templates for each stage (optimized for Llama 3).
* **Refactor**: Update `PHQ8Evaluator` to be callable as a tool/node rather than a standalone class.
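A minimal sketch of the three stages as separate nodes, reusing the `ClinicalState` sketch from Phase 1; `call_llm` stands in for the tier-specific client wired up in Phase 3, and the prompts are illustrative rather than the final templates:
```python
# Illustrative Phase 2 sketch: one node per RISEN stage (Extract -> Map -> Score).
def call_llm(prompt: str) -> str:
    # Placeholder for the tier-specific model invocation (see Phase 3).
    raise NotImplementedError


def extract_node(state: ClinicalState) -> dict:
    # Stage 1: quote patient statements relevant to the current PHQ-8 item.
    quotes = call_llm(
        f"Quote patient statements relevant to PHQ-8 item {state.current_item}:\n"
        f"{state.transcript}"
    )
    return {"evidence": [quotes]}


def map_node(state: ClinicalState) -> dict:
    # Stage 2: align the extracted quotes with the item's diagnostic criteria.
    rationale = call_llm(
        f"Explain how these quotes map to PHQ-8 item {state.current_item}:\n"
        + "\n".join(state.evidence)
    )
    return {"rationale": rationale}


def score_node(state: ClinicalState) -> dict:
    # Stage 3: assign the 0-3 score from the mapped evidence.
    raw = call_llm(f"Score PHQ-8 item {state.current_item} (0-3):\n{state.rationale}")
    return {"scores": {**state.scores, state.current_item: int(raw)}}
```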
#### Phase 3: Tier Switching & Execution
* **Goal**: Implement dynamic model configuration (see the sketch after this list).
* **Deliverables**:
* `src/helia/configuration.py`: Ensure `RunConfig` (Tier 1/2/3) propagates to LangGraph `configurable` params.
* `src/helia/agent/runner.py`: CLI entry point to run batch benchmarks.
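A minimal sketch of how the tier selection could flow through LangGraph's `configurable` dict; `resolve_model_name`, `initial_state`, and the key names are assumptions, and the exact mapping from `RunConfig` to a concrete model is left open:
```python
# Illustrative Phase 3 sketch: the tier travels in the run config, so the same
# compiled graph never hard-codes a model.
from langchain_core.runnables import RunnableConfig


def resolve_model_name(config: RunnableConfig) -> str:
    # Hypothetical mapping from the RunConfig tier to a concrete model identifier.
    return config.get("configurable", {}).get("model", "llama3")


def tiered_score_node(state: ClinicalState, config: RunnableConfig) -> dict:
    model_name = resolve_model_name(config)  # e.g. "llama3" (Tier 1) or "gpt-4" (Tier 3)
    # ... invoke the selected model and return the score update ...
    return {}


# The exact same graph, benchmarked on two tiers by changing only the config:
graph.invoke(initial_state, config={"configurable": {"tier": "tier1", "model": "llama3"}})
graph.invoke(initial_state, config={"configurable": {"tier": "tier3", "model": "gpt-4"}})
```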
#### Phase 4: Human-in-the-Loop & Persistence
* **Goal**: Enable clinician review and data persistence (see the sketch after this list).
* **Deliverables**:
* **Checkpointing**: Configure MongoDB/Postgres checkpointer for LangGraph.
* **Review Flow**: Implement the `interrupt_before` logic for the "Finalize" node.
* **Metrics**: Calculate "Item-Level Agreement" (MAE/Kappa) between Agent and Ground Truth.
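A minimal sketch of the HITL flow, assuming the graph exposes a `finalize` node as in the diagram above; `MemorySaver` stands in for the MongoDB/Postgres checkpointer, which would be a drop-in replacement:
```python
# Illustrative Phase 4 sketch: checkpointing plus an interrupt before finalization.
from langgraph.checkpoint.memory import MemorySaver

app = builder.compile(
    checkpointer=MemorySaver(),
    interrupt_before=["finalize"],       # pause here for clinician review (HITL)
)

thread = {"configurable": {"thread_id": "session-001"}}
app.invoke(initial_state, config=thread)            # runs up to the interrupt
app.update_state(thread, {"scores": {0: 2, 1: 1}})  # example clinician corrections
app.invoke(None, config=thread)                     # resume from the checkpoint and persist
```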
## Acceptance Criteria
### Functional Requirements
- [ ] **Stateful Workflow**: System successfully transitions Ingest -> Assess -> Persist using LangGraph.
- [ ] **Multi-Stage Scoring**: Each PHQ-8 item is scored using the Extract -> Map -> Score pattern.
- [ ] **Model Swapping**: Can run the *exact same graph* with `gpt-4` (Tier 3) and `llama3` (Tier 1) just by changing config.
- [ ] **Benchmarking**: Automatically output a CSV comparing `Model_Score` vs `Human_Label` for all 8 items.
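A minimal sketch of that benchmarking output together with the Phase 4 agreement metrics; the column names and helper signature are assumptions, and `cohen_kappa_score` comes from scikit-learn:
```python
# Illustrative benchmarking sketch: per-item CSV plus MAE and Cohen's kappa.
import csv

from sklearn.metrics import cohen_kappa_score


def write_benchmark_csv(path: str, model_scores: dict, human_labels: dict) -> tuple:
    # model_scores / human_labels: {item_index: score} for PHQ-8 items 0-7.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Item", "Model_Score", "Human_Label"])
        for item in range(8):
            writer.writerow([item, model_scores[item], human_labels[item]])

    preds = [model_scores[i] for i in range(8)]
    truth = [human_labels[i] for i in range(8)]
    mae = sum(abs(p - t) for p, t in zip(preds, truth)) / len(preds)
    kappa = cohen_kappa_score(truth, preds, weights="linear")
    return mae, kappa
```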
### Non-Functional Requirements
- [ ] **Privacy**: Tier 1 execution sends ZERO bytes to external APIs.
- [ ] **Reproducibility**: Every run logs the exact prompts used and model version to MongoDB.
## Dependencies & Risks
- **Risk**: Local models (Tier 1) may hallucinate formatting in the "Map" stage.
* *Mitigation*: Use `instructor` or constrained decoding (JSON mode) for Tier 1.
- **Dependency**: Requires DAIC-WOZ dataset (assumed available locally or mocked).
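A minimal sketch of the constrained-decoding mitigation above, assuming Tier 1 is served through an OpenAI-compatible local endpoint (e.g. Ollama); the endpoint, model name, and `MappedEvidence` schema are illustrative:
```python
# Illustrative mitigation sketch: `instructor` forces the Map stage to return a
# validated Pydantic object instead of free text. Endpoint and model are assumptions.
import instructor
from openai import OpenAI
from pydantic import BaseModel


class MappedEvidence(BaseModel):
    phq8_item: int
    quotes: list[str]


client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON,
)

mapped = client.chat.completions.create(
    model="llama3",
    response_model=MappedEvidence,   # retried until the output validates against the schema
    messages=[{"role": "user", "content": "Map these quotes to a PHQ-8 item: ..."}],
)
```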
## References
- **LangGraph**: [State Management](https://langchain-ai.github.io/langgraph/concepts/high_level/#state)
- **Clinical Best Practice**: [RISEN Framework (2025)](https://pubmed.ncbi.nlm.nih.gov/40720397/)
- **Project Config**: `src/helia/configuration.py`

View File

@@ -0,0 +1,69 @@
# Plan: Safety Guardrail Architecture (Post-MVP)
## Overview
A dedicated, parallel **Safety Guardrail Agent** designed to monitor clinical sessions for immediate risks (self-harm, suicidal ideation) and intervene regardless of the primary assessment agent's state. This component is critical for "Duty of Care" compliance but is scoped out of the initial MVP to focus on the core scoring pipeline.
## Problem Statement
General-purpose reasoning agents (like the PHQ-8 scorer) often exhibit "tunnel vision," focusing exclusively on their analytical task while missing or delaying the flagging of critical safety signals. In a clinical context, waiting for a 60-second reasoning loop to finish before flagging a suicide risk is unacceptable.
## Proposed Solution
A **Parallel Supervisor** pattern where the Safety Agent runs asynchronously alongside the main Assessment Agent.
### Architecture
```mermaid
graph TD
Router{Router}
subgraph "Main Flow"
Router --> Assessment[Assessment Agent]
end
subgraph "Safety Layer"
Router --> Safety[Safety Guardrail]
Safety --> |Risk Detected| Interrupt[Interrupt Signal]
end
Assessment --> Merger
Interrupt --> Merger
Merger --> Handler{Risk Handling}
```
## Technical Approach
### 1. The Safety Agent Node
* **Model**: Uses a smaller, faster model (e.g., Llama-3-8B-Instruct or a specialized BERT classifier) optimized for classification, not reasoning.
* **Prompting**: Few-shot prompted specifically for:
* Suicidal Ideation (Passive vs Active)
* Self-Harm Intent
* Harm to Others
* **Output**: Boolean flag (`risk_detected`) + `risk_category` + `evidence_snippet`.
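A minimal sketch of that output as a Pydantic schema; field names follow the description above, while the exact categories are placeholders pending the `SafetyAlert` model defined in the implementation plan below:
```python
# Illustrative output schema for the Safety Guardrail node.
from enum import Enum
from typing import Optional

from pydantic import BaseModel


class RiskCategory(str, Enum):
    SUICIDAL_IDEATION_PASSIVE = "suicidal_ideation_passive"
    SUICIDAL_IDEATION_ACTIVE = "suicidal_ideation_active"
    SELF_HARM = "self_harm"
    HARM_TO_OTHERS = "harm_to_others"


class SafetyAlert(BaseModel):
    risk_detected: bool
    risk_category: Optional[RiskCategory] = None
    evidence_snippet: str = ""
```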
### 2. Parallel Execution in LangGraph
* **Fan-Out**: The Supervisor node spawns *both* `assessment_node` and `safety_node` for every transcript chunk.
* **Race Condition Handling**:
* If `safety_node` returns `risk_detected=True`, it must trigger a **`NodeInterrupt`** or inject a high-priority state update that overrides the Assessment Agent's output.
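A minimal sketch of the fan-out and interrupt behavior; node names mirror the diagram, `classify_risk` is a placeholder for the small classifier model, and the `NodeInterrupt` import assumes LangGraph's dynamic-interrupt mechanism:
```python
# Illustrative sketch: the router fans out to both branches; the safety branch can
# halt the run the moment a risk is classified.
from langgraph.errors import NodeInterrupt


def classify_risk(text: str) -> SafetyAlert:
    # Placeholder for the small classifier model described above.
    raise NotImplementedError


def safety_node(state: ClinicalState) -> dict:
    alert = classify_risk(state.transcript)
    if alert.risk_detected:
        # Halts execution immediately, independent of the assessment branch.
        raise NodeInterrupt(f"Risk detected: {alert.risk_category}")
    return {"safety_flags": [alert]}


builder.add_node("safety", safety_node)
builder.add_edge("router", "assessment")   # main analytical branch
builder.add_edge("router", "safety")       # parallel safety branch
builder.add_edge("assessment", "merger")
builder.add_edge("safety", "merger")
```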
### 3. Integration Points (Post-MVP)
* **State Schema**:
```python
from typing import List
from pydantic import BaseModel

class ClinicalState(BaseModel):
    # ... existing fields ...
    safety_flags: List[SafetyAlert] = []   # SafetyAlert defined in the plan below
    is_session_halted: bool = False
```
* **Transition Logic**:
If `is_session_halted` becomes True, the graph routes immediately to a "Crisis Protocol" node, bypassing all remaining PHQ-8 items.
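A minimal sketch of that routing, assuming `merger`, `crisis_protocol`, and `assessment` nodes have been registered on the graph:
```python
# Illustrative sketch: conditional routing after the merge point.
def route_after_merge(state: ClinicalState) -> str:
    return "crisis_protocol" if state.is_session_halted else "continue_assessment"


builder.add_conditional_edges(
    "merger",
    route_after_merge,
    {"crisis_protocol": "crisis_protocol", "continue_assessment": "assessment"},
)
```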
## Implementation Plan
1. **Define Safety Schema**: Create `SafetyAlert` Pydantic model.
2. **Implement Guardrail Node**: Create `src/helia/agent/nodes/safety.py`.
3. **Update Graph**: Modify `src/helia/agent/graph.py` to add the parallel edge.
4. **Test Scenarios**: Create synthetic transcripts with hidden self-harm indicators to verify interruption works.
## References
* [EmoAgent: Assessing and Safeguarding Human-AI Interaction (2025)](https://www.semanticscholar.org/paper/110ab0beb74ffb7ab1efe55ad36b4732835fa5c9)