# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Context: Bachelor Thesis

**Title:** A Modular Agent Framework for Therapeutic Interview Analysis
**Goal:** Systematically compare local-first/on-premise LLMs against cloud-based state-of-the-art models on a specific therapeutic task (PHQ-8 assessment).

**Core Hypothesis:** Small, quantized language models running locally can provide analytical performance comparable to large cloud models when supported by an appropriate agentic framework.
**Key Requirements:**
- **Privacy-First:** The architecture must support local/on-premise execution to address clinical data privacy concerns.
- **Modularity:** The system must allow easy swapping of the underlying model (Tier 1: local, Tier 2: self-hosted, Tier 3: cloud).
- **Benchmark:** The system is evaluated on how accurately it maps therapy transcripts from the DAIC-WOZ dataset to PHQ-8 (Patient Health Questionnaire) scores.
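For orientation, PHQ-8 scoring sums eight items rated 0-3 (total 0-24) and maps the total to the standard severity bands. A minimal, stdlib-only sketch (illustrative, not part of the Helia codebase):

```python
# Minimal PHQ-8 scoring sketch (illustrative only, not Helia's implementation).
# Each of the 8 items is rated 0-3, so the total ranges from 0 to 24.

SEVERITY_BANDS = [
    (0, 4, "none/minimal"),
    (5, 9, "mild"),
    (10, 14, "moderate"),
    (15, 19, "moderately severe"),
    (20, 24, "severe"),
]

def phq8_total(item_scores: list[int]) -> int:
    """Sum the eight PHQ-8 item scores after validating their range."""
    if len(item_scores) != 8:
        raise ValueError("PHQ-8 requires exactly 8 item scores")
    if any(not 0 <= s <= 3 for s in item_scores):
        raise ValueError("each item score must be between 0 and 3")
    return sum(item_scores)

def phq8_severity(total: int) -> str:
    """Map a PHQ-8 total score to its conventional severity band."""
    for low, high, label in SEVERITY_BANDS:
        if low <= total <= high:
            return label
    raise ValueError("total must be between 0 and 24")
```

Example: `phq8_severity(phq8_total([1, 2, 0, 3, 1, 1, 2, 1]))` yields `"moderate"` (total 11).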
## Commands

- Install dependencies: `uv sync`
- Run agent: `python -m helia.main "Your query here"`
- Verify prompts: `python scripts/verify_prompt_db.py`
- Lint: `uv run ruff check .`
- Format: `uv run ruff format .`
- Type check: `uv run ty check`
- Test: no test suite currently exists (priority roadmap item).
## Architecture

Helia is a modular ReAct-style agent framework designed for clinical interview analysis.

### Core Modules
- **Ingestion** (`src/helia/ingestion/`):
  - Parser: `TranscriptParser` parses clinical interview transcripts (e.g., the DAIC-WOZ dataset).
  - Loader: `ClinicalDataLoader` in `loader.py` retrieves `Transcript` documents from MongoDB.
  - Legacy: `S3DatasetLoader` (deprecated for runtime use; retained for initial dataset population).
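As a rough illustration of what parsing involves, here is a stdlib-only sketch assuming the tab-separated `start_time`/`stop_time`/`speaker`/`value` layout of DAIC-WOZ transcript files; the real `TranscriptParser` API may differ.

```python
# Illustrative sketch of parsing a DAIC-WOZ-style transcript line format;
# the actual TranscriptParser in src/helia/ingestion/ may expose a different API.
from dataclasses import dataclass

@dataclass
class ParsedTurn:
    start: float     # utterance start time in seconds
    stop: float      # utterance stop time in seconds
    speaker: str     # e.g., "Ellie" (interviewer) or "Participant"
    text: str

def parse_transcript(raw: str) -> list[ParsedTurn]:
    """Parse tab-separated lines of start_time, stop_time, speaker, value."""
    turns = []
    for line in raw.strip().splitlines():
        fields = line.split("\t")
        if fields[0] == "start_time":   # skip the header row
            continue
        start, stop, speaker, text = fields
        turns.append(ParsedTurn(float(start), float(stop), speaker, text))
    return turns
```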
- **Data Models** (`src/helia/models/`):
  - `Transcript`: document model for interview transcripts.
  - `Utterance`/`Turn`: standardized conversation units.
  - `Prompt`: manages prompt templates and versioning.
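A plain-dataclass sketch of the document shapes; the real models are Beanie documents in `src/helia/models/`, and the field names here are assumptions.

```python
# Stand-in sketch of the Transcript/Utterance shapes; the real classes are
# Beanie documents, and session_id and the field names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Utterance:
    speaker: str
    text: str

@dataclass
class Transcript:
    session_id: str                       # hypothetical identifier field
    utterances: list[Utterance] = field(default_factory=list)

    def participant_text(self) -> str:
        """Concatenate everything the participant said, for downstream analysis."""
        return " ".join(u.text for u in self.utterances if u.speaker == "Participant")
```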
- **Assessment** (`src/helia/assessment/`):
  - Evaluator: `PHQ8Evaluator` (in `core.py`) orchestrates the LLM interaction.
  - Logic: implements clinical logic for standard instruments (e.g., PHQ-8).
  - Schema: `src/helia/assessment/schema.py` defines `AssessmentResult` and `Evidence`.
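A hypothetical sketch of what the schema might look like; the actual `AssessmentResult` and `Evidence` definitions in `src/helia/assessment/schema.py` may differ in names and fields.

```python
# Hypothetical schema sketch; all field names here are assumptions, not the
# actual definitions from src/helia/assessment/schema.py.
from dataclasses import dataclass

@dataclass
class Evidence:
    item: int          # PHQ-8 item index
    quote: str         # supporting utterance quoted from the transcript
    score: int         # assigned item score, 0-3

@dataclass
class AssessmentResult:
    evidence: list[Evidence]

    @property
    def total_score(self) -> int:
        """Total PHQ-8 score derived from the per-item evidence."""
        return sum(e.score for e in self.evidence)
```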
- **Persistence Layer** (`src/helia/db.py`):
  - Document-based storage: uses MongoDB with the Beanie ODM.
  - Data capture: stores the full context (configuration, evidence, outcomes) to support comparative analysis.
- **Agent Workflow** (`src/helia/agent/`):
  - Graph architecture: implements the RISEN pattern (Extract -> Map -> Score) using LangGraph in `src/helia/agent/graph.py`.
  - State: `ClinicalState` (in `state.py`) manages the transcript, scores, and execution status.
  - Nodes: specialized logic in `src/helia/agent/nodes/` (assessment, persistence).
  - Execution: run benchmarks via `python -m helia.agent.runner <run_id>`.
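The Extract -> Map -> Score flow can be sketched with plain functions over a state dict; the real implementation wires these as LangGraph nodes over a `ClinicalState`, and the node bodies below are placeholder logic.

```python
# Stdlib-only sketch of the Extract -> Map -> Score flow; the real graph uses
# LangGraph and ClinicalState. All function bodies are placeholder logic.

def extract(state: dict) -> dict:
    """Extract candidate evidence utterances from the transcript."""
    state["evidence"] = [u for u in state["transcript"] if "sleep" in u]
    return state

def map_items(state: dict) -> dict:
    """Map each piece of evidence to a PHQ-8 item (placeholder: one sleep item)."""
    state["items"] = {2: state["evidence"]}
    return state

def score(state: dict) -> dict:
    """Score each mapped item (placeholder: 1 point per evidence utterance, capped at 3)."""
    state["scores"] = {item: min(len(ev), 3) for item, ev in state["items"].items()}
    return state

def run_pipeline(transcript: list[str]) -> dict:
    state = {"transcript": transcript}
    for node in (extract, map_items, score):   # linear graph: Extract -> Map -> Score
        state = node(state)
    return state
```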
## Development Standards

- Environment: requires `OPENAI_API_KEY` and MongoDB credentials.
- Configuration: managed via Pydantic models in `src/helia/configuration.py`.
- Python: uses implicit namespace packages; `__init__.py` files may be missing by design in some subdirectories.
- Code style: follows PEP 8, enforced via `ruff`.
- Security: do not commit secrets. Avoid hardcoding model parameters; use configuration injection to support the comparative benchmark (Tiers 1-3).
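Configuration injection for the three tiers can be sketched as follows; the real settings are Pydantic models in `src/helia/configuration.py`, and every field name, model name, and URL below is an assumed placeholder.

```python
# Sketch of tier-swappable model configuration; the actual settings live as
# Pydantic models in src/helia/configuration.py. Fields, model names, and
# endpoints here are illustrative assumptions only.
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelConfig:
    tier: str          # "local", "self-hosted", or "cloud"
    model_name: str
    base_url: str      # OpenAI-compatible endpoint
    temperature: float = 0.0

# Hypothetical presets for the three benchmark tiers.
TIERS = {
    "local": ModelConfig("local", "llama-3.1-8b-q4", "http://localhost:11434/v1"),
    "self-hosted": ModelConfig("self-hosted", "llama-3.1-70b", "http://gpu-server:8000/v1"),
    "cloud": ModelConfig("cloud", "gpt-4o", "https://api.openai.com/v1"),
}

def build_evaluator(config: ModelConfig) -> dict:
    """Inject the configuration instead of hardcoding model parameters."""
    return {"endpoint": config.base_url, "model": config.model_name}
```

Swapping tiers then only requires passing a different `ModelConfig`, keeping evaluator code identical across the benchmark.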