Helia
A Modular Agent Framework for Therapeutic Interview Analysis
This project is the core implementation for the Bachelor Thesis: "Comparing Local, Self-Hosted, and Cloud LLM Deployments for Therapeutic Interview Analysis".
Project Context
Helia aims to bridge the gap between AI and mental healthcare by automating the analysis of diagnostic interviews. Specifically, it tests whether local, privacy-first LLMs can match the performance of cloud-based models when mapping therapy transcripts to standardized PHQ-8 (Patient Health Questionnaire) scores.
Project Structure
src/helia/
├── agent/
│ └── workflow.py # LangGraph agent & router
├── analysis/
│ └── extractor.py # Metadata extraction (LLM-agnostic)
├── assessment/
│ ├── core.py # Clinical assessment logic (PHQ-8)
│ └── schema.py # Data models (AssessmentResult, PHQ8Item)
├── ingestion/
│ └── parser.py # Transcript parsing (DAIC-WOZ support)
├── db.py # MongoDB persistence layer
└── main.py # CLI entry point
Data Flow
graph TD
A[Transcript File<br/>TSV/TXT] -->|TranscriptParser| B(Utterance Objects)
B -->|MetadataExtractor<br/>+ LLM| C(Enriched Utterances)
C -->|Assessment Engine<br/>+ Clinical Logic| D(PHQ-8 Scores & Evidence)
D -->|Persistence Layer| E[(MongoDB / Beanie)]
U[User Query] -->|LangGraph Agent| R{Router}
R -->|Assessment Tool| D
R -->|Search Tool| E
- Ingestion: TranscriptParser reads clinical interview files (e.g., DAIC-WOZ).
- Analysis: MetadataExtractor enriches data with sentiment/tone using interchangeable LLMs.
- Assessment: The core engine maps dialogue to clinical criteria (PHQ-8), generating scores and citing evidence.
- Persistence: Results are stored as structured AssessmentResult documents in MongoDB for analysis and benchmarking.
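As a concrete sketch of the ingestion step above: a DAIC-WOZ transcript is a tab-separated file with start_time, stop_time, speaker, and value columns, and TranscriptParser turns each row into an Utterance object. The function and field names below are illustrative, not Helia's actual API:

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Utterance:
    # Illustrative stand-in for Helia's Utterance objects
    speaker: str
    text: str
    start_time: float
    stop_time: float


def parse_transcript(tsv_text: str) -> list[Utterance]:
    """Parse a DAIC-WOZ-style TSV (start_time, stop_time, speaker, value)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        Utterance(
            speaker=row["speaker"].strip(),
            text=row["value"].strip(),
            start_time=float(row["start_time"]),
            stop_time=float(row["stop_time"]),
        )
        for row in reader
        if row["value"].strip()  # skip empty utterances
    ]


sample = (
    "start_time\tstop_time\tspeaker\tvalue\n"
    "12.5\t14.0\tParticipant\ti've been feeling down lately\n"
)
utterances = parse_transcript(sample)
```

The downstream stages then attach metadata to these objects and score them, so keeping the parsed unit small and typed is what makes the rest of the pipeline LLM-agnostic.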
Implemented Features
- Modular LLM Backend: switches between cloud (OpenAI) and local models for the Tier 1-3 comparative benchmark.
- Clinical Parsing: Native support for DAIC-WOZ transcript formats.
- Structured Assessment: Maps unstructured conversation to validatable PHQ-8 scores.
- Document Persistence: Stores full experimental context (config + evidence + scores) in MongoDB using Beanie.
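The PHQ-8 arithmetic behind the structured assessment is simple: each of the eight items is rated 0-3, the total ranges 0-24, and a total of 10 or more is the standard screening cutoff. A minimal sketch of the data shapes hinted at by schema.py (these names and fields are assumptions for illustration, not Helia's real models, which are Beanie documents):

```python
from dataclasses import dataclass, field


@dataclass
class PHQ8Item:
    # Illustrative stand-in for Helia's PHQ8Item
    item_number: int                                   # 1-8
    score: int                                         # 0 (not at all) .. 3 (nearly every day)
    evidence: list[str] = field(default_factory=list)  # supporting transcript quotes


@dataclass
class AssessmentResult:
    # Illustrative stand-in for Helia's AssessmentResult
    items: list[PHQ8Item]

    @property
    def total_score(self) -> int:
        return sum(item.score for item in self.items)

    @property
    def screens_positive(self) -> bool:
        # PHQ-8 total >= 10 is the standard cutoff for significant depressive symptoms
        return self.total_score >= 10


# Eight items scored 2 each -> total 16, which screens positive
result = AssessmentResult(items=[PHQ8Item(n, 2) for n in range(1, 9)])
```

Storing per-item scores alongside their evidence quotes is what lets the benchmark compare model outputs against ground-truth labels item by item rather than only by total.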
Roadmap
- Comparative Benchmark: Run full evaluation across Local vs. Cloud tiers.
- Vector Search: Implement semantic search over transcript evidence.
- Test Suite: Add comprehensive tests for the assessment logic.
Installation
Install the package using uv.
uv sync
Quick Start
1. Environment Setup:
export OPENAI_API_KEY=sk-...
# Ensure MongoDB is running (e.g., via Docker)
2. Run an Assessment:
python -m helia.main "assess --input data/transcript.tsv"
Development
- Linting: uv run ruff check .
- Formatting: uv run ruff format .
- Type Checking: uv run pyrefly
License
This project is available as open source under the terms of the MIT License.