helia/CLAUDE.md
Santiago Martinez-Avial 5ce6d7e1d3 WIP
2025-12-23 13:35:15 +01:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Context: Bachelor Thesis

Title: A Modular Agent Framework for Therapeutic Interview Analysis

Goal: Systematically compare local-first/on-premise LLMs against cloud-based state-of-the-art models on a specific therapeutic task (PHQ-8 assessment).

Core Hypothesis: Small quantized language models running locally can provide analytical performance comparable to large cloud models when supported by an appropriate agentic framework.

Key Requirements:

  • Privacy-First: Architecture must support local/on-premise execution to address clinical data privacy concerns.
  • Modularity: The system must allow easy swapping of underlying models (Tier 1: Local, Tier 2: Self-hosted, Tier 3: Cloud).
  • Benchmark: The system is evaluated on how accurately it maps therapy transcripts to PHQ-8 (eight-item Patient Health Questionnaire) scores, using the DAIC-WOZ dataset.
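
As a reference point for the benchmark target, the sketch below shows how a PHQ-8 total is conventionally derived: eight items, each scored 0 to 3, summed to a total between 0 and 24, with a total of 10 or more commonly used as the depression cutoff. The function names and the band table layout are illustrative assumptions and are not taken from this repository.

```python
from typing import Sequence

PHQ8_ITEM_COUNT = 8
PHQ8_MAX_ITEM_SCORE = 3

# Conventional PHQ-8 severity bands as (inclusive upper bound, label).
SEVERITY_BANDS = (
    (4, "none/minimal"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (24, "severe"),
)


def phq8_total(item_scores: Sequence[int]) -> int:
    """Validate and sum eight item scores, each in the range 0-3."""
    if len(item_scores) != PHQ8_ITEM_COUNT:
        raise ValueError(
            f"expected {PHQ8_ITEM_COUNT} item scores, got {len(item_scores)}"
        )
    for score in item_scores:
        if not 0 <= score <= PHQ8_MAX_ITEM_SCORE:
            raise ValueError(f"item score out of range: {score}")
    return sum(item_scores)


def phq8_severity(total: int) -> str:
    """Map a total score (0-24) to its conventional severity label."""
    if not 0 <= total <= PHQ8_ITEM_COUNT * PHQ8_MAX_ITEM_SCORE:
        raise ValueError(f"total out of range: {total}")
    for upper_bound, label in SEVERITY_BANDS:
        if total <= upper_bound:
            return label
    raise AssertionError("unreachable: bands cover 0-24")


def phq8_binary(total: int, cutoff: int = 10) -> bool:
    """Return True when the total meets the commonly used cutoff."""
    return total >= cutoff
```

A model's eight predicted item scores can be passed through `phq8_total` and compared with the dataset's reference total, which keeps range validation in one place.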

Commands

  • Install Dependencies: uv sync
  • Run Agent: python -m helia.main "Your query here"
  • Verify Prompts: python scripts/verify_prompt_db.py
  • Lint: uv run ruff check .
  • Format: uv run ruff format .
  • Type Check: uv run ty check
  • Test: No test suite exists yet; adding one is a priority roadmap item.

Architecture

Helia is a modular ReAct-style agent framework designed for clinical interview analysis.

Core Modules

  1. Ingestion (src/helia/ingestion/):

    • Parser: TranscriptParser parses clinical interview transcripts (e.g., DAIC-WOZ dataset).
    • Loader: ClinicalDataLoader in loader.py retrieves Transcript documents from MongoDB.
    • Legacy: S3DatasetLoader is deprecated for runtime use and kept only for the initial population of the database.
  2. Data Models (src/helia/models/):

    • Transcript: Document model for interview transcripts.
    • Utterance/Turn: Standardized conversation units.
    • Prompt: Manages prompt templates and versioning.
  3. Assessment (src/helia/assessment/):

    • Evaluator: PHQ8Evaluator (in core.py) orchestrates the LLM interaction.
    • Logic: Implements clinical logic for standard instruments (e.g., PHQ-8).
    • Schema: src/helia/assessment/schema.py defines AssessmentResult and Evidence.
  4. Persistence Layer (src/helia/db.py):

    • Document-Based Storage: Uses MongoDB with Beanie (ODM).
    • Data Capture: Stores full context (Configuration, Evidence, Outcomes) to support comparative analysis.
  5. Agent Workflow (src/helia/agent/):

    • Graph Architecture: Implements the RISEN pattern (Extract -> Map -> Score) using LangGraph in src/helia/agent/graph.py.
    • State: ClinicalState (in state.py) manages transcript, scores, and execution status.
    • Nodes: Specialized logic in src/helia/agent/nodes/ (assessment, persistence).
    • Execution: Run benchmarks via python -m helia.agent.runner <run_id>.
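
The Extract -> Map -> Score flow described under Agent Workflow can be sketched in plain Python, without LangGraph, as three functions that each take and return a shared state object. This is a minimal illustration under stated assumptions, not the repository's implementation: the field names on `ClinicalState`, the `Utterance` shape, the "Participant" speaker label, and the keyword table that stands in for the LLM calls are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Utterance:
    """One conversational turn (hypothetical shape)."""
    speaker: str
    text: str


@dataclass
class ClinicalState:
    """Hypothetical stand-in for the state passed between graph nodes."""
    transcript: list[Utterance]
    evidence: list[str] = field(default_factory=list)
    item_evidence: dict[str, list[str]] = field(default_factory=dict)
    scores: dict[str, int] = field(default_factory=dict)
    status: str = "pending"


# Toy keyword table standing in for an LLM call; not a clinical instrument.
ITEM_KEYWORDS = {
    "sleep": ("sleep", "insomnia"),
    "fatigue": ("tired", "energy"),
}


def extract(state: ClinicalState) -> ClinicalState:
    """Extract: keep only participant utterances as candidate evidence."""
    state.evidence = [
        u.text for u in state.transcript if u.speaker == "Participant"
    ]
    state.status = "extracted"
    return state


def map_items(state: ClinicalState) -> ClinicalState:
    """Map: attach each evidence quote to the items whose keywords it mentions."""
    for item, keywords in ITEM_KEYWORDS.items():
        state.item_evidence[item] = [
            text for text in state.evidence
            if any(keyword in text.lower() for keyword in keywords)
        ]
    state.status = "mapped"
    return state


def score(state: ClinicalState) -> ClinicalState:
    """Score: placeholder rule, number of quotes per item capped at 3."""
    state.scores = {
        item: min(len(quotes), 3)
        for item, quotes in state.item_evidence.items()
    }
    state.status = "scored"
    return state


def run_pipeline(state: ClinicalState) -> ClinicalState:
    """Run the three stages in order, passing the state along."""
    for node in (extract, map_items, score):
        state = node(state)
    return state
```

Keeping every stage a function of the state alone means each one can be swapped for a model-backed version, or registered as a graph node, without changing the others.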

Development Standards

  • Environment: Requires OPENAI_API_KEY and MongoDB credentials.
  • Configuration: Managed via Pydantic models in src/helia/configuration.py.
  • Python: Uses implicit namespace packages. __init__.py files may be missing by design in some subdirectories.
  • Code Style: Follows PEP 8. Enforced via ruff.
  • Security: Do not commit secrets. Avoid hardcoding model parameters; use configuration injection to support the comparative benchmark (Tiers 1-3).
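
The configuration-injection guideline above can be illustrated with a short sketch. It uses standard-library dataclasses to stay dependency-free, whereas the repository's configuration uses Pydantic models. `Tier`, `ModelConfig`, `Evaluator`, `api_key_from_env`, and the model name are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Mapping, Optional


class Tier(Enum):
    """The three deployment tiers compared in the benchmark."""
    LOCAL = 1
    SELF_HOSTED = 2
    CLOUD = 3


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical model settings, injected rather than hardcoded."""
    tier: Tier
    model_name: str
    temperature: float = 0.0
    base_url: Optional[str] = None


class Evaluator:
    """Receives its model settings from the caller (hypothetical class)."""

    def __init__(self, config: ModelConfig) -> None:
        self.config = config

    def describe(self) -> str:
        return (
            f"tier={self.config.tier.name} "
            f"model={self.config.model_name} "
            f"temperature={self.config.temperature}"
        )


def api_key_from_env(env: Mapping[str, str]) -> str:
    """Read the API key from an environment mapping; never hardcode it."""
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key
```

Passing `os.environ` to `api_key_from_env` at startup keeps the secret out of source control, and building one `ModelConfig` per tier lets the same evaluator code run across all three tiers.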