This commit is contained in:
Santiago Martinez-Avial
2025-12-23 13:35:15 +01:00
parent a9346ccb34
commit 5ce6d7e1d3
12 changed files with 734 additions and 22 deletions


@@ -17,7 +17,7 @@ A **Hierarchical Agent Supervisor** architecture built with **LangGraph**:
* **Extract**: Quote relevant patient text.
* **Map**: Align quotes to PHQ-8 criteria.
* **Score**: Assign a 0-3 severity value.
- 3. **Ingestion**: Standardizes data from S3/Local into a `ClinicalState`.
+ 3. **Ingestion**: Standardizes data from MongoDB into a `ClinicalState`.
4. **Benchmarking**: Automates the comparison of generated scores against ground truth (DAIC-WOZ labels).
**Note:** A dedicated **Safety Guardrail** agent has been designed but is scoped out of this MVP. See `plans/safety-guardrail-architecture.md` for details.
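The Extract → Map → Score decomposition above can be sketched as three small functions over a transcript. This is an illustrative stand-in, not the project's implementation: the `Quote` structure, the `Participant:` speaker convention, and the stubbed return values are all assumptions (the real stages would be LLM calls).

```python
from dataclasses import dataclass

@dataclass
class Quote:
    text: str        # verbatim patient utterance
    phq8_item: int   # PHQ-8 criterion (1-8) the quote supports
    score: int = -1  # 0-3 severity, filled in by the Score stage

def extract(transcript: str) -> list[str]:
    # Stage 1: pull candidate patient quotes. Stubbed here as keeping
    # participant lines; in practice this is a model call.
    return [ln.split(":", 1)[1].strip()
            for ln in transcript.splitlines()
            if ln.startswith("Participant:")]

def map_to_criteria(quotes: list[str]) -> list[Quote]:
    # Stage 2: align each quote to a PHQ-8 item (stubbed to item 1).
    return [Quote(text=q, phq8_item=1) for q in quotes]

def score(mapped: list[Quote]) -> dict[int, int]:
    # Stage 3: assign a 0-3 value per item (stubbed to 0).
    for q in mapped:
        q.score = 0
    return {q.phq8_item: q.score for q in mapped}
```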
@@ -50,20 +50,20 @@ graph TD
* **Deliverables**:
* `src/helia/agent/state.py`: Define `ClinicalState` (transcript, current_item, scores).
* `src/helia/agent/graph.py`: Define the main `StateGraph` with Ingestion -> Assessment -> Persistence nodes.
- * `src/helia/ingestion/loader.py`: Add "Ground Truth" loading for DAIC-WOZ labels (critical for benchmarking).
+ * `src/helia/ingestion/loader.py`: Refactor to load Transcript documents from MongoDB.
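One way the Phase 1 `ClinicalState` could look, given the fields the plan names (transcript, current_item, scores); the `ground_truth` field and the node body are assumptions added for illustration. LangGraph nodes are plain callables that take the state and return a partial state update:

```python
from typing import TypedDict

class ClinicalState(TypedDict, total=False):
    transcript: str               # normalized interview text
    current_item: int             # PHQ-8 item currently being assessed (1-8)
    scores: dict[int, int]        # item -> assigned 0-3 score
    ground_truth: dict[int, int]  # DAIC-WOZ labels, for benchmarking

def ingestion_node(state: ClinicalState) -> ClinicalState:
    # In the real pipeline this would load a Transcript document from
    # MongoDB; here it just normalizes whatever transcript is present
    # and positions the assessment at the first PHQ-8 item.
    return {"transcript": state.get("transcript", "").strip(),
            "current_item": 1}
```

The Ingestion → Assessment → Persistence wiring in `graph.py` would then register such callables as nodes on a `StateGraph(ClinicalState)`.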
#### Phase 2: The "RISEN" Assessment Logic
* **Goal**: Replace monolithic `PHQ8Evaluator` with granular nodes.
* **Deliverables**:
- * `src/helia/agent/nodes/assessment.py`: Implement `extract_node`, `map_node`, `score_node`.
- * `src/helia/prompts/`: Create specialized prompt templates for each stage (optimized for Llama 3).
+ * `src/helia/agent/nodes/assessment.py`: Implement `extract_node`, `map_node`, `score_node` that fetch prompts from DB.
+ * `migrations/init_risen_prompts.py`: Database migration to seed the Extract/Map/Score prompts.
* **Refactor**: Update `PHQ8Evaluator` to be callable as a tool/node rather than a standalone class.
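A hedged sketch of how a node might fetch its prompt from the database rather than a static template. The prompt keys, collection shape, and in-memory store are assumptions standing in for the MongoDB collection that `init_risen_prompts.py` would seed:

```python
# Stand-in for a MongoDB "prompts" collection seeded by the migration
# (keys and text are illustrative, not from the codebase).
PROMPT_STORE = {
    "risen/extract": "Quote all patient statements relevant to PHQ-8.",
}

def get_prompt(key: str) -> str:
    # Real code would query MongoDB, e.g. db.prompts.find_one({"key": key}).
    try:
        return PROMPT_STORE[key]
    except KeyError:
        raise LookupError(f"Prompt {key!r} not seeded; run the migration.")

def extract_node(state: dict) -> dict:
    prompt = get_prompt("risen/extract")
    # ... invoke the model with `prompt` + state["transcript"] ...
    return {"quotes": []}  # placeholder for model output
```

Failing loudly when a prompt is missing makes a skipped migration an immediate, diagnosable error rather than a silent bad assessment.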
#### Phase 3: Tier Switching & Execution
* **Goal**: Implement dynamic model config.
* **Deliverables**:
* `src/helia/configuration.py`: Ensure `RunConfig` (Tier 1/2/3) propagates to LangGraph `configurable` params.
- * `src/helia/agent/runner.py`: CLI entry point to run batch benchmarks.
+ * `src/helia/agent/runner.py`: CLI entry point to run batch benchmarks using MongoDB transcripts.
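How the tier propagation in Phase 3 might look: LangGraph passes a `config` dict whose `"configurable"` entry carries per-run values down to every node. The `RunConfig` fields and tier-to-model mapping below are assumptions for illustration:

```python
from dataclasses import dataclass

# Assumed tier -> model mapping; the real values live in configuration.py.
TIER_MODELS = {1: "llama3:8b", 2: "llama3:70b", 3: "gpt-4o"}

@dataclass
class RunConfig:
    tier: int = 1

def to_langgraph_config(run_config: RunConfig) -> dict:
    # Shape expected by graph.invoke(state, config=...).
    return {"configurable": {"model": TIER_MODELS[run_config.tier],
                             "tier": run_config.tier}}

def score_node(state: dict, config: dict) -> dict:
    # Any node can read the active model from config["configurable"],
    # so switching tiers never requires rebuilding the graph.
    model = config["configurable"]["model"]
    return {"model_used": model}
```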
#### Phase 4: Human-in-the-Loop & Persistence
* **Goal**: Enable clinician review and data saving.
@@ -87,7 +87,7 @@ graph TD
## Dependencies & Risks
- **Risk**: Local models (Tier 1) may hallucinate formatting in the "Map" stage.
* *Mitigation*: Use `instructor` or constrained decoding (JSON mode) for Tier 1.
- - **Dependency**: Requires DAIC-WOZ dataset (assumed available locally or mocked).
+ - **Dependency**: Requires DAIC-WOZ dataset (loaded in MongoDB).
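The JSON-mode mitigation for the Map stage can be sketched without `instructor` as a validate-and-retry loop. The expected schema (quote id → PHQ-8 item) and the retry count are assumptions:

```python
import json

def parse_map_output(raw: str) -> dict[str, int]:
    # Validate Tier-1 model output for the Map stage: a JSON object
    # mapping quote ids to PHQ-8 item numbers in the 1-8 range.
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    out = {}
    for quote_id, item in data.items():
        if not isinstance(item, int) or not 1 <= item <= 8:
            raise ValueError(f"item {item!r} for {quote_id!r} out of range")
        out[quote_id] = item
    return out

def map_with_retries(generate, max_attempts: int = 3) -> dict[str, int]:
    # `generate` wraps the Tier-1 model call (in JSON mode); on a
    # formatting hallucination we simply re-sample up to max_attempts.
    last_err = None
    for _ in range(max_attempts):
        try:
            return parse_map_output(generate())
        except ValueError as err:
            last_err = err
    raise RuntimeError(f"Map stage failed after {max_attempts} attempts: {last_err}")
```

With `instructor` or a constrained decoder, the validation step would instead be a Pydantic schema enforced at generation time; the fallback loop above is the library-free equivalent.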
## References
- **LangGraph**: [State Management](https://langchain-ai.github.io/langgraph/concepts/high_level/#state)