# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Context: Bachelor Thesis

**Title:** A Modular Agent Framework for Therapeutic Interview Analysis

**Goal:** Systematically compare local-first/on-premise LLMs against cloud-based state-of-the-art models on a specific therapeutic task (PHQ-8 assessment).

**Core Hypothesis:** Small quantized language models running locally can provide analytical performance comparable to large cloud models when supported by an appropriate agentic framework.

**Key Requirements:**

- **Privacy-First:** The architecture must support local/on-premise execution to address clinical data privacy concerns.
- **Modularity:** The system must allow easy swapping of the underlying models (Tier 1: Local, Tier 2: Self-hosted, Tier 3: Cloud).
- **Benchmark:** The system is evaluated on its ability to accurately map therapy transcripts to PHQ-8 (Patient Health Questionnaire) scores using the DAIC-WOZ dataset.

## Commands

- **Install Dependencies**: `uv sync`
- **Run Agent**: `python -m helia.main "Your query here"`
- **Verify Prompts**: `python scripts/verify_prompt_db.py`
- **Lint**: `uv run ruff check .`
- **Format**: `uv run ruff format .`
- **Type Check**: `uv run ty check`
- **Test**: *No test suite currently exists (priority roadmap item).*

## Architecture

Helia is a modular ReAct-style agent framework designed for clinical interview analysis.

### Core Modules

1. **Ingestion** (`src/helia/ingestion/`):
   - **Parser**: `TranscriptParser` parses clinical interview transcripts (e.g., the DAIC-WOZ dataset).
   - **Loader**: `ClinicalDataLoader` in `loader.py` retrieves `Transcript` documents from MongoDB.
   - **Legacy**: `S3DatasetLoader` (deprecated for runtime use; retained for initial data population).
2. **Data Models** (`src/helia/models/`):
   - **Transcript**: Document model for interview transcripts.
   - **Utterance/Turn**: Standardized conversation units.
   - **Prompt**: Manages prompt templates and versioning.
3. **Assessment** (`src/helia/assessment/`):
   - **Evaluator**: `PHQ8Evaluator` (in `core.py`) orchestrates the LLM interaction.
   - **Logic**: Implements clinical logic for standardized instruments (e.g., PHQ-8).
   - **Schema**: `src/helia/assessment/schema.py` defines `AssessmentResult` and `Evidence`.
4. **Persistence Layer** (`src/helia/db.py`):
   - **Document-Based Storage**: Uses MongoDB with the Beanie ODM.
   - **Data Capture**: Stores full context (configuration, evidence, outcomes) to support comparative analysis.
5. **Agent Workflow** (`src/helia/agent/`):
   - **Graph Architecture**: Implements the RISEN pattern (Extract -> Map -> Score) using LangGraph in `src/helia/agent/graph.py`.
   - **State**: `ClinicalState` (in `state.py`) manages the transcript, scores, and execution status.
   - **Nodes**: Specialized logic in `src/helia/agent/nodes/` (assessment, persistence).
   - **Execution**: Run benchmarks via `python -m helia.agent.runner`.

## Development Standards

- **Environment**: Requires `OPENAI_API_KEY` and MongoDB credentials.
- **Configuration**: Managed via Pydantic models in `src/helia/configuration.py`.
- **Python**: Uses implicit namespace packages; `__init__.py` files may be intentionally absent in some subdirectories.
- **Code Style**: Follows PEP 8, enforced via `ruff`.
- **Security**: Do not commit secrets. Avoid hardcoding model parameters; use configuration injection to support the comparative benchmark (Tiers 1-3).
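To illustrate the Extract -> Map -> Score flow that the agent graph implements, here is a minimal, dependency-free sketch of state passing between nodes. The `ClinicalState` field names, the node bodies, and the keyword-matching "extraction" are illustrative assumptions, not the real implementation (which uses LangGraph and LLM calls):

```python
from typing import TypedDict

class ClinicalState(TypedDict, total=False):
    """Illustrative state dict passed between nodes (field names are assumed)."""
    transcript: str
    evidence: list[str]          # Extract: relevant utterances pulled from the transcript
    item_scores: dict[str, int]  # Map: evidence mapped to per-item scores (0-3)
    total_score: int             # Score: aggregated PHQ-8 total
    status: str

def extract(state: ClinicalState) -> ClinicalState:
    # In Helia this step would invoke the LLM; here we fake it with keyword matching.
    state["evidence"] = [
        line for line in state["transcript"].splitlines() if "sleep" in line.lower()
    ]
    return state

def map_items(state: ClinicalState) -> ClinicalState:
    # Map extracted evidence onto a (single, illustrative) PHQ-8 item score.
    state["item_scores"] = {"sleep": 2 if state["evidence"] else 0}
    return state

def score(state: ClinicalState) -> ClinicalState:
    state["total_score"] = sum(state["item_scores"].values())
    state["status"] = "complete"
    return state

state: ClinicalState = {"transcript": "Participant: I barely sleep at night."}
for node in (extract, map_items, score):
    state = node(state)
```

The design point is that each node reads and writes only the shared state, so nodes stay independently testable and swappable, which is what the LangGraph wiring in `src/helia/agent/graph.py` relies on.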
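For reference, the clinical scoring logic itself is simple: the PHQ-8 sums eight item scores (each 0-3) into a 0-24 total, with conventional severity cutoffs at 5, 10, 15, and 20. A sketch (the item keys are shorthand labels chosen here, not names from the codebase):

```python
def phq8_total(item_scores: dict[str, int]) -> int:
    """Sum eight PHQ-8 item scores (each 0-3) into a 0-24 total."""
    if len(item_scores) != 8 or not all(0 <= s <= 3 for s in item_scores.values()):
        raise ValueError("PHQ-8 expects exactly eight item scores in the range 0-3")
    return sum(item_scores.values())

def phq8_severity(total: int) -> str:
    """Map a PHQ-8 total to the conventional severity bands."""
    for cutoff, label in ((20, "severe"), (15, "moderately severe"),
                          (10, "moderate"), (5, "mild")):
        if total >= cutoff:
            return label
    return "none/minimal"

# Example: shorthand labels for the eight PHQ-8 items (anhedonia, depressed mood,
# sleep, fatigue, appetite, self-worth, concentration, psychomotor changes).
items = {"interest": 1, "mood": 2, "sleep": 2, "energy": 1,
         "appetite": 0, "self_worth": 1, "concentration": 2, "psychomotor": 1}
total = phq8_total(items)  # 10 -> "moderate"
```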
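The "configuration injection" rule from Development Standards can be sketched as follows. The real configuration lives in Pydantic models in `src/helia/configuration.py`; this dependency-free version uses stdlib dataclasses, and the field names, model names, and endpoints are invented examples:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelConfig:
    """Illustrative model configuration (the real code uses Pydantic models)."""
    tier: int          # 1 = local, 2 = self-hosted, 3 = cloud
    model_name: str
    base_url: str
    temperature: float = 0.0

# Hypothetical tier registry; names and URLs are placeholders.
TIERS = {
    1: ModelConfig(1, "local-8b-q4", "http://localhost:11434/v1"),
    3: ModelConfig(3, "cloud-frontier-model", "https://api.example.com/v1"),
}

def build_evaluator(config: ModelConfig) -> str:
    # The evaluator receives its model via injection rather than a hardcoded
    # constant, so the same benchmark code runs unchanged against any tier.
    return f"PHQ8Evaluator(model={config.model_name}, endpoint={config.base_url})"
```

Keeping model parameters out of the evaluator code is what makes the Tier 1 vs. Tier 3 comparison in the thesis a like-for-like benchmark: only the injected configuration changes between runs.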