# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Context: Bachelor Thesis
**Title:** A Modular Agent Framework for Therapeutic Interview Analysis

**Goal:** Systematically compare local-first/on-premise LLMs against cloud-based state-of-the-art models for a specific therapeutic task (PHQ-8 assessment).

**Core Hypothesis:** Small quantized language models running locally can provide analytical performance comparable to large cloud models when supported by an appropriate agentic framework.

**Key Requirements:**
- **Privacy-First:** The architecture must support local/on-premise execution to address clinical data privacy concerns.
- **Modularity:** The system must allow easy swapping of the underlying model (Tier 1: Local, Tier 2: Self-hosted, Tier 3: Cloud); see the interface sketch after this list.
- **Benchmark:** The system is evaluated on how accurately it maps therapy transcripts to PHQ-8 (Patient Health Questionnaire) scores, using the DAIC-WOZ dataset.
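The tier-swapping requirement is easiest to picture as a shared interface that every backend implements. The sketch below is illustrative only: `LLMBackend` and the backend class names are hypothetical and do not correspond to code in this repository.

```python
# Hypothetical sketch of the tier-swapping idea: each tier implements the same
# minimal interface, so the analysis pipeline never depends on a concrete backend.
from typing import Protocol


class LLMBackend(Protocol):
    """Anything that can turn a prompt into a completion."""

    def complete(self, prompt: str) -> str: ...


class LocalQuantizedBackend:  # Tier 1: runs on-premise (placeholder stub)
    def complete(self, prompt: str) -> str:
        raise NotImplementedError


class CloudBackend:  # Tier 3: calls a hosted API such as OpenAI (placeholder stub)
    def complete(self, prompt: str) -> str:
        raise NotImplementedError


def analyze(transcript: str, backend: LLMBackend) -> str:
    """The pipeline only sees the protocol, so tiers stay interchangeable."""
    return backend.complete(f"Assess the following transcript:\n{transcript}")
```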
## Commands
- **Install Dependencies:** `uv sync`
- **Run Agent:** `python -m helia.main "Your query here"`
- **Lint:** `uv run ruff check .`
- **Format:** `uv run ruff format .`
- **Type Check:** `uv run ty check`
- **Test:** No test suite currently exists (priority roadmap item).
## Architecture
Helia is a modular ReAct-style agent framework designed for clinical interview analysis:
- **Ingestion** (`src/helia/ingestion/`):
  - Parses clinical interview transcripts (e.g., the DAIC-WOZ dataset).
  - Standardizes raw text/audio into `Utterance` objects.
- **Analysis & Enrichment** (`src/helia/analysis/`):
  - `MetadataExtractor`: enriches utterances with sentiment, tone, and speech acts.
  - Model agnostic: designed to swap backend LLMs (OpenAI vs. local/quantized models).
- **Assessment** (`src/helia/assessment/`):
  - Implements clinical logic for standard instruments (e.g., PHQ-8).
  - Maps unstructured dialogue to structured clinical scores.
- **Persistence Layer** (`src/helia/db.py`):
  - Document-based storage: MongoDB with Beanie (ODM).
  - Core model: `AssessmentResult` (in `src/helia/assessment/schema.py`) is the single source of truth for experimental results; a schema sketch follows this list.
  - Data capture: stores the full context of each run:
    - Configuration: model version, prompts, temperature (critical for comparing tiers).
    - Evidence: specific quotes and reasoning supporting each PHQ-8 score.
    - Outcome: final diagnosis and total scores.
- **Agent Workflow** (`src/helia/agent/`):
  - Built with LangGraph; a minimal graph sketch follows this list.
  - Router pattern: decides when to call specific tools (search, scoring).
  - Tools: clinical scoring utilities, document retrieval.
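The actual persistence schema lives in `src/helia/assessment/schema.py`; the Beanie document below is only an illustrative guess at the run context described under "Data capture" (configuration, evidence, outcome), with assumed field names.

```python
# Illustrative sketch of a Beanie document capturing one experimental run.
# Field names are assumptions based on the data-capture list above, not the
# actual schema in src/helia/assessment/schema.py.
from beanie import Document
from pydantic import BaseModel


class ItemEvidence(BaseModel):
    item: str          # PHQ-8 item, e.g. "little_interest"
    score: int         # 0-3, per PHQ-8 convention
    quotes: list[str]  # supporting utterances from the transcript
    reasoning: str     # model's justification for the score


class AssessmentResult(Document):
    # Configuration: needed to compare Tier 1/2/3 runs fairly
    model_name: str
    prompt_version: str
    temperature: float

    # Evidence: per-item scores with supporting quotes
    items: list[ItemEvidence]

    # Outcome: aggregate result
    total_score: int
    diagnosis: str

    class Settings:
        name = "assessment_results"  # MongoDB collection name
```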
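For the agent workflow, the following is a minimal LangGraph router sketch under assumed node names and a placeholder routing heuristic; it is not the actual graph in `src/helia/agent/`.

```python
# Minimal sketch of the router pattern with LangGraph; node names and the
# routing heuristic are hypothetical.
from typing import TypedDict

from langgraph.graph import END, START, StateGraph


class AgentState(TypedDict):
    transcript: str
    next_step: str
    result: str


def router(state: AgentState) -> AgentState:
    # Decide which tool is needed next (placeholder heuristic).
    state["next_step"] = "score" if "PHQ" in state["transcript"] else "retrieve"
    return state


def score_node(state: AgentState) -> AgentState:
    state["result"] = "clinical scoring tool called"
    return state


def retrieve_node(state: AgentState) -> AgentState:
    state["result"] = "document retrieval tool called"
    return state


graph = StateGraph(AgentState)
graph.add_node("router", router)
graph.add_node("score", score_node)
graph.add_node("retrieve", retrieve_node)
graph.add_edge(START, "router")
graph.add_conditional_edges("router", lambda s: s["next_step"], {"score": "score", "retrieve": "retrieve"})
graph.add_edge("score", END)
graph.add_edge("retrieve", END)
app = graph.compile()
```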
## Development Standards
- **Environment:** Requires `OPENAI_API_KEY` and MongoDB credentials.
- **Configuration:** Managed via Pydantic models in `src/helia/configuration.py` (see the sketch after this list).
- **Python:** Uses implicit namespace packages; `__init__.py` files may be missing by design in some subdirectories.
- **Code Style:** Follows PEP 8, enforced via `ruff`.
- **Security:** Do not commit secrets. Avoid hardcoding model parameters; use configuration injection to support the comparative benchmark (Tiers 1-3).
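A minimal sketch of the configuration-injection idea, assuming plain Pydantic models: parameters come from configuration (cf. `src/helia/configuration.py`) rather than being hardcoded, so Tier 1-3 runs differ only in the injected values. Field names and defaults here are placeholders, not the real configuration schema.

```python
# Hypothetical configuration-injection sketch; field names are assumptions.
import os

from pydantic import BaseModel


class ModelConfig(BaseModel):
    tier: int = 1                            # 1 = local, 2 = self-hosted, 3 = cloud
    model_name: str = "local-quantized-model"  # placeholder identifier
    temperature: float = 0.0
    api_key: str | None = None               # only needed for cloud tiers


def load_config() -> ModelConfig:
    # Secrets stay in the environment, never in code or version control.
    return ModelConfig(
        tier=int(os.environ.get("HELIA_TIER", "1")),
        api_key=os.environ.get("OPENAI_API_KEY"),
    )
```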