Santiago Martinez-Avial
2025-12-23 13:35:15 +01:00
parent a9346ccb34
commit 5ce6d7e1d3
12 changed files with 734 additions and 22 deletions


@@ -18,6 +18,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
- **Install Dependencies**: `uv sync`
- **Run Agent**: `python -m helia.main "Your query here"`
- **Verify Prompts**: `python scripts/verify_prompt_db.py`
- **Lint**: `uv run ruff check .`
- **Format**: `uv run ruff format .`
- **Type Check**: `uv run ty check`
@@ -25,32 +26,34 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
## Architecture
Helia is a modular ReAct-style agent framework designed for clinical interview analysis:
Helia is a modular ReAct-style agent framework designed for clinical interview analysis.
### Core Modules
1. **Ingestion** (`src/helia/ingestion/`):
- Parses clinical interview transcripts (e.g., DAIC-WOZ dataset).
- Standardizes raw text/audio into `Utterance` objects.
- **Parser**: `TranscriptParser` parses clinical interview transcripts (e.g., DAIC-WOZ dataset).
- **Loader**: `ClinicalDataLoader` in `loader.py` retrieves `Transcript` documents from MongoDB.
- **Legacy**: `S3DatasetLoader` (deprecated for runtime use; retained only for initial database population).
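
For orientation, a minimal sketch of the parsing step in the ingestion module above, assuming a simple "speaker<TAB>text" line layout (an assumption; the real `TranscriptParser` and `Utterance` definitions in `src/helia/ingestion/` and `src/helia/models/` will differ in detail):

```python
# Hypothetical sketch of the ingestion flow described above.
from dataclasses import dataclass


@dataclass
class Utterance:  # stand-in for helia's Utterance model
    speaker: str
    text: str


class TranscriptParser:
    """Parses a raw interview transcript into Utterance objects."""

    def parse(self, raw: str) -> list[Utterance]:
        utterances: list[Utterance] = []
        for line in raw.splitlines():
            if not line.strip():
                continue
            # assumed "SPEAKER<TAB>text" layout per line (DAIC-WOZ-like)
            speaker, _, text = line.partition("\t")
            utterances.append(Utterance(speaker=speaker.strip(), text=text.strip()))
        return utterances
```
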
2. **Analysis & Enrichment** (`src/helia/analysis/`):
- **MetadataExtractor**: Enriches utterances with sentiment, tone, and speech acts.
- **Model Agnostic**: Designed to swap backend LLMs (OpenAI vs. Local/Quantized models).
2. **Data Models** (`src/helia/models/`):
- **Transcript**: Document model for interview transcripts.
- **Utterance/Turn**: Standardized conversation units.
- **Prompt**: Manages prompt templates and versioning.
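 
As an illustration only, the three document models above could look roughly like the Beanie/Pydantic sketch below; every field and collection name (beyond the class names taken from this list) is an assumption, not the actual schema:

```python
# Illustrative shapes for the documents named above; fields are assumptions.
from beanie import Document
from pydantic import BaseModel


class Utterance(BaseModel):  # embedded conversation unit
    speaker: str
    text: str


class Transcript(Document):  # stored in MongoDB via Beanie
    participant_id: str  # hypothetical identifier field
    utterances: list[Utterance] = []

    class Settings:
        name = "transcripts"  # collection name (assumed)


class Prompt(Document):  # versioned prompt template
    name: str
    version: int
    template: str
```
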
3. **Assessment** (`src/helia/assessment/`):
- Implements clinical logic for standard instruments (e.g., PHQ-8).
- Maps unstructured dialogue to structured clinical scores.
- **Evaluator**: `PHQ8Evaluator` (in `core.py`) orchestrates the LLM interaction.
- **Logic**: Implements clinical logic for standard instruments (e.g., PHQ-8).
- **Schema**: `src/helia/assessment/schema.py` defines `AssessmentResult` and `Evidence`.
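
A hedged sketch of how these assessment pieces might fit together; the actual `PHQ8Evaluator` interface and `AssessmentResult` fields in `core.py`/`schema.py` are not shown in this commit, so the call shape and field names below are assumptions:

```python
# Assumed shapes only; see src/helia/assessment/schema.py for the real models.
from pydantic import BaseModel


class Evidence(BaseModel):  # quote + reasoning behind one item score
    quote: str
    reasoning: str


class ItemScore(BaseModel):  # hypothetical per-item container
    item: str  # e.g. "PHQ8_Sleep"
    score: int  # 0-3, per PHQ-8 convention
    evidence: list[Evidence]


async def evaluate_transcript(evaluator, transcript):
    """Map unstructured dialogue onto structured PHQ-8 scores."""
    # assumed call shape: the evaluator prompts the LLM and returns a
    # validated AssessmentResult
    result = await evaluator.evaluate(transcript)
    total = sum(item.score for item in result.item_scores)  # assumed field
    return result, total
```
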
4. **Persistence Layer** (`src/helia/db.py`):
- **Document-Based Storage**: Uses MongoDB with Beanie (ODM).
- **Core Model**: `AssessmentResult` (in `src/helia/assessment/schema.py`) acts as the single source of truth for experimental results.
- **Data Capture**: Stores the full context of each run:
- **Configuration**: Model version, prompts, temperature (critical for comparing tiers).
- **Evidence**: Specific quotes and reasoning supporting each PHQ-8 score.
- **Outcome**: Final diagnosis and total scores.
- **Data Capture**: Stores full context (Configuration, Evidence, Outcomes) to support comparative analysis.
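
The persistence layer builds on standard Beanie initialization; a minimal sketch, assuming a local MongoDB instance, a database named `helia`, and a `helia.models` import path (all assumptions), would be:

```python
# Minimal sketch of the setup implied by db.py; URI and names are assumptions.
from beanie import init_beanie
from motor.motor_asyncio import AsyncIOMotorClient

from helia.assessment.schema import AssessmentResult  # single source of truth
from helia.models import Prompt, Transcript  # assumed import path


async def init_db(uri: str = "mongodb://localhost:27017") -> None:
    client = AsyncIOMotorClient(uri)
    await init_beanie(
        database=client["helia"],  # database name assumed
        document_models=[Transcript, Prompt, AssessmentResult],
    )


# Once initialised, a run's full context can be persisted in one document, e.g.:
#   await AssessmentResult(config=..., evidence=..., outcome=...).insert()
```
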
5. **Agent Workflow** (`src/helia/agent/`):
- Built with **LangGraph**.
- **Router Pattern**: Decides when to call specific tools (search, scoring).
- **Tools**: Clinical scoring utilities, Document retrieval.
- **Graph Architecture**: Implements RISEN pattern (Extract -> Map -> Score) using LangGraph in `src/helia/agent/graph.py`.
- **State**: `ClinicalState` (in `state.py`) manages transcript, scores, and execution status.
- **Nodes**: Specialized logic in `src/helia/agent/nodes/` (assessment, persistence).
- **Execution**: Run benchmarks via `python -m helia.agent.runner <run_id>`.
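
A compact sketch of such an Extract -> Map -> Score pipeline using LangGraph's public `StateGraph` API; the node names, `ClinicalState` fields, and placeholder node bodies below are assumptions rather than the contents of `graph.py` or `state.py`:

```python
# Sketch only: node bodies are placeholders, not helia's assessment logic.
from typing import TypedDict

from langgraph.graph import END, START, StateGraph


class ClinicalState(TypedDict, total=False):  # stand-in for state.py
    transcript: str
    mapped_items: dict
    scores: dict
    status: str


def extract(state: ClinicalState) -> dict:
    return {"status": "extracted"}


def map_items(state: ClinicalState) -> dict:
    return {"mapped_items": {}, "status": "mapped"}


def score(state: ClinicalState) -> dict:
    return {"scores": {}, "status": "scored"}


builder = StateGraph(ClinicalState)
builder.add_node("extract", extract)
builder.add_node("map", map_items)
builder.add_node("score", score)
builder.add_edge(START, "extract")
builder.add_edge("extract", "map")
builder.add_edge("map", "score")
builder.add_edge("score", END)
graph = builder.compile()

# e.g. graph.invoke({"transcript": "..."}) runs the full pipeline
```
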
## Development Standards