# Helia

**A Modular Agent Framework for Therapeutic Interview Analysis**

This project is the core implementation for the Bachelor Thesis: *"Comparing Local, Self-Hosted, and Cloud LLM Deployments for Therapeutic Interview Analysis"*.
## Project Context
Helia aims to bridge the gap between AI and mental healthcare by automating the analysis of diagnostic interviews. Specifically, it tests whether local, privacy-first LLMs can match the performance of cloud-based models when mapping therapy transcripts to standardized PHQ-8 (Patient Health Questionnaire) scores.
## Project Structure

```
src/helia/
├── agent/
│   └── workflow.py     # LangGraph agent & router
├── analysis/
│   └── extractor.py    # Metadata extraction (LLM-agnostic)
├── assessment/
│   ├── core.py         # Clinical assessment logic (PHQ-8)
│   └── schema.py       # Data models (AssessmentResult, PHQ8Item)
├── ingestion/
│   └── parser.py       # Transcript parsing (DAIC-WOZ support)
├── db.py               # MongoDB persistence layer
└── main.py             # CLI entry point
```
## Data Flow

```mermaid
graph TD
    A[Transcript File<br/>TSV/TXT] -->|TranscriptParser| B(Utterance Objects)
    B -->|MetadataExtractor<br/>+ LLM| C(Enriched Utterances)
    C -->|Assessment Engine<br/>+ Clinical Logic| D(PHQ-8 Scores & Evidence)
    D -->|Persistence Layer| E[(MongoDB / Beanie)]
    U[User Query] -->|LangGraph Agent| R{Router}
    R -->|Assessment Tool| D
    R -->|Search Tool| E
```
- **Ingestion**: `TranscriptParser` reads clinical interview files (e.g., DAIC-WOZ).
- **Analysis**: `MetadataExtractor` enriches data with sentiment/tone using interchangeable LLMs.
- **Assessment**: The core engine maps dialogue to clinical criteria (PHQ-8), generating scores and citing evidence.
- **Persistence**: Results are stored as structured `AssessmentResult` documents in MongoDB for analysis and benchmarking.
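As a rough illustration of the ingestion step, here is a minimal, dependency-free sketch of parsing a DAIC-WOZ-style TSV transcript into utterance objects. The real `TranscriptParser` lives in `src/helia/ingestion/parser.py`; the `Utterance` field names below are assumptions based on the DAIC-WOZ column layout, not Helia's actual API.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Utterance:
    """Hypothetical utterance shape; Helia's actual model may differ."""
    start_time: float
    stop_time: float
    speaker: str
    text: str


def parse_transcript(tsv_text: str) -> list[Utterance]:
    """Parse a DAIC-WOZ-style TSV transcript into Utterance objects."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        Utterance(
            start_time=float(row["start_time"]),
            stop_time=float(row["stop_time"]),
            speaker=row["speaker"],
            text=row["value"],
        )
        for row in reader
    ]


sample = (
    "start_time\tstop_time\tspeaker\tvalue\n"
    "12.5\t14.1\tEllie\thow have you been feeling lately\n"
    "14.8\t19.3\tParticipant\tpretty down most days to be honest\n"
)
utterances = parse_transcript(sample)
```

From here, the analysis stage would enrich each `Utterance` with LLM-derived metadata before assessment.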
## Implemented Features

- **Modular LLM Backend**: Designed to switch between Cloud (OpenAI) and Local models (Tier 1-3 comparative benchmark).
- **Clinical Parsing**: Native support for DAIC-WOZ transcript formats.
- **Structured Assessment**: Maps unstructured conversation to validatable PHQ-8 scores.
- **Document Persistence**: Stores full experimental context (config + evidence + scores) in MongoDB using Beanie.
- **Unified Configuration**: Single YAML config with environment variable fallbacks for system settings.
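To make the structured-assessment output concrete, here is a dependency-free sketch of what the `PHQ8Item` and `AssessmentResult` models might look like. Only those two class names come from this README (`assessment/schema.py`, which likely uses Pydantic/Beanie rather than plain dataclasses); every field name below is an assumption. The PHQ-8 itself scores eight items from 0 to 3, for a total of 0 to 24.

```python
from dataclasses import dataclass, field


@dataclass
class PHQ8Item:
    """One of the eight PHQ-8 criteria, scored 0-3 (hypothetical fields)."""
    criterion: str                                       # e.g. "Little interest or pleasure"
    score: int                                           # 0 = not at all ... 3 = nearly every day
    evidence: list[str] = field(default_factory=list)    # supporting utterances


@dataclass
class AssessmentResult:
    """Aggregated PHQ-8 assessment for one transcript (hypothetical fields)."""
    transcript_id: str
    items: list[PHQ8Item]

    @property
    def total_score(self) -> int:
        # PHQ-8 totals range from 0 to 24 (8 items x max score of 3).
        return sum(item.score for item in self.items)


result = AssessmentResult(
    transcript_id="daic_300",
    items=[PHQ8Item(criterion=f"item_{i}", score=1) for i in range(8)],
)
```

Persisting the full object (scores plus cited evidence) is what makes the stored documents usable for later benchmarking.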
## Roadmap
- Comparative Benchmark: Run full evaluation across Local vs. Cloud tiers.
- Vector Search: Implement semantic search over transcript evidence.
- Test Suite: Add comprehensive tests for the assessment logic.
## Installation

Install the package using `uv`:

```shell
uv sync
```
## Quick Start

1. **Configuration**: Copy `example.config.yaml` to `config.yaml` and edit it.

   ```shell
   cp example.config.yaml config.yaml
   ```

   You can set system settings (MongoDB, S3) in the YAML file, or use environment variables (e.g., `HELIA_MONGO_URI`, `HELIA_S3_BUCKET`). Provider API keys can be set in YAML or via `{PROVIDER_NAME}_API_KEY` (uppercased), e.g. `ANTHROPIC_API_KEY` for `providers.anthropic`.
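   An illustrative `config.yaml` fragment under these conventions (only `providers.anthropic` and the `HELIA_MONGO_URI`/`HELIA_S3_BUCKET` variables are named in this README; all other keys are guesses, so defer to `example.config.yaml` for the real layout):

   ```yaml
   # Illustrative only -- see example.config.yaml for the actual keys.
   mongo:
     uri: mongodb://localhost:27017   # or set HELIA_MONGO_URI
   s3:
     bucket: helia-transcripts        # or set HELIA_S3_BUCKET
   providers:
     anthropic:
       api_key: sk-...                # or set ANTHROPIC_API_KEY
   ```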
2. **Run an Assessment**:

   ```shell
   # Ensure MongoDB is running
   uv run python -m helia.main config.yaml
   ```
## Development

- **Linting**: `uv run ruff check .`
- **Formatting**: `uv run ruff format .`
- **Type Checking**: `uv run ty check`
## License
This project is available as open source under the terms of the MIT License.