helia/IMPLEMENTATION_SUMMARY.md
Santiago Martinez-Avial 5ce6d7e1d3 WIP
2025-12-23 13:35:15 +01:00

Implementation Summary: Modular Agentic Framework

Overview

We have implemented the core Agentic Framework for the PHQ-8 assessment benchmark. The architecture uses LangGraph to orchestrate a multi-stage reasoning process ("RISEN") and supports switching between Local (Tier 1) and Cloud (Tier 3) models at run time. Transcripts and prompts are read from MongoDB; writing assessment results back to the database is not yet enabled (see Next Steps).

Completed Components

1. Agent Architecture (src/helia/agent/graph.py)

  • Implemented a StateGraph that manages the workflow lifecycle.
  • Nodes: ingestion, extract_evidence, map_criteria, score_item, human_review, persistence.
  • Routing: Conditional edges loop through the 8 PHQ-8 items before proceeding to review.
  • HITL: Configured a MemorySaver checkpointer so the graph can pause at the human_review stage for human-in-the-loop (HITL) review.
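
The routing rule above can be sketched in plain Python. This is a minimal illustration, not the code in graph.py: the function name route_after_score, the LoopState type, and the assumption that score_item advances current_item_index before routing are all hypothetical.

```python
from typing import TypedDict

NUM_PHQ8_ITEMS = 8  # the PHQ-8 questionnaire has eight items


class LoopState(TypedDict):
    """Hypothetical minimal slice of the workflow state."""
    current_item_index: int


def route_after_score(state: LoopState) -> str:
    """Return the name of the node to run after score_item.

    Assumes score_item has already advanced current_item_index.
    """
    if state["current_item_index"] < NUM_PHQ8_ITEMS:
        return "extract_evidence"  # more items left: loop again
    return "human_review"  # all eight items scored


def simulate_loop() -> list[str]:
    """Walk the loop without LangGraph to show the resulting node order."""
    state: LoopState = {"current_item_index": 0}
    visited: list[str] = []
    next_node = "extract_evidence"
    while next_node != "human_review":
        visited += ["extract_evidence", "map_criteria", "score_item"]
        state["current_item_index"] += 1  # stands in for score_item's update
        next_node = route_after_score(state)
    visited.append("human_review")
    return visited
```

In LangGraph, a function of this shape is what would typically be passed to StateGraph.add_conditional_edges with score_item as the source node.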

2. State Management (src/helia/agent/state.py)

  • Created the ClinicalState Pydantic model, which strictly types the workflow memory, including:
    • transcript_text: The input data.
    • scores: A list of PHQ8ItemScore objects (accumulated via reducer).
    • current_item_index: Tracks progress through the 8 items.
    • current_evidence / current_reasoning: Transient fields for the RISEN loop.
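
To make "accumulated via reducer" concrete, here is a rough sketch of the merge behaviour using only the standard library. It uses dataclasses as a stand-in for the Pydantic model; the PHQ8ItemScore fields and the apply_update helper are assumptions for illustration, and the real reducer wiring is handled by LangGraph.

```python
import operator
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class PHQ8ItemScore:
    """Hypothetical fields; the real model may differ."""
    item_index: int
    value: int  # 0-3
    reasoning: str


@dataclass(frozen=True)
class ClinicalStateSketch:
    """Dataclass stand-in for the ClinicalState Pydantic model."""
    transcript_text: str
    scores: list[PHQ8ItemScore] = field(default_factory=list)
    current_item_index: int = 0
    current_evidence: str = ""
    current_reasoning: str = ""


def apply_update(state: ClinicalStateSketch, update: dict) -> ClinicalStateSketch:
    """Merge a node's partial update into the state.

    scores uses an additive reducer (operator.add), so new scores are
    appended; every other field is simply overwritten.
    """
    merged = dict(update)
    if "scores" in merged:
        merged["scores"] = operator.add(state.scores, merged["scores"])
    return replace(state, **merged)
```

With this shape, each pass through the loop can return only the newly scored item, and the full list builds up across the eight iterations.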

3. RISEN Logic (src/helia/agent/nodes/assessment.py)

  • Refactored the monolithic evaluation logic into three granular nodes:
    1. Extract: Finds verbatim quotes for the specific symptom.
    2. Map: Aligns evidence to the 0-3 scoring criteria.
    3. Score: Assigns the final value and structured reasoning.
  • Prompt Management: Fetches prompts dynamically from the MongoDB Prompt collection using Prompt.find_one.
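
The three-step chain might look roughly like the sketch below. The LLM callable, the assess_item function, and the template texts are placeholders invented for illustration; the real templates live in MongoDB, and the real nodes in assessment.py may parse model output differently.

```python
from typing import Callable

LLM = Callable[[str], str]  # hypothetical interface: prompt in, completion out

# Placeholder templates keyed by the prompt names used in this project.
TEMPLATES = {
    "phq8-extract": "Quote passages about {symptom}:\n{transcript}",
    "phq8-map": "Map this evidence to the 0-3 criteria:\n{evidence}",
    "phq8-score": "Give a single score from 0 to 3:\n{mapping}",
}


def assess_item(symptom: str, transcript: str, llm: LLM) -> dict:
    """Run Extract -> Map -> Score for one PHQ-8 item."""
    evidence = llm(
        TEMPLATES["phq8-extract"].format(symptom=symptom, transcript=transcript)
    )
    mapping = llm(TEMPLATES["phq8-map"].format(evidence=evidence))
    raw_score = llm(TEMPLATES["phq8-score"].format(mapping=mapping))
    value = int(raw_score.strip())
    if not 0 <= value <= 3:
        raise ValueError(f"score out of range: {value}")
    return {"evidence": evidence, "reasoning": mapping, "value": value}


def fake_llm(prompt: str) -> str:
    """Deterministic stub so the sketch runs without a model."""
    if prompt.startswith("Quote"):
        return '"I can\'t sleep"'
    if prompt.startswith("Map"):
        return "Evidence is consistent with the 'several days' criterion."
    return " 1 "
```

Splitting the steps this way keeps each prompt small and makes the intermediate evidence and mapping available for human review.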

4. Runner & Config (src/helia/agent/runner.py)

  • Created a CLI entry point: python -m helia.agent.runner <run_id>.
  • Initializes the MongoDB connection via init_db.
  • Fetches all available Transcript documents from the database to run the benchmark.
  • Injects the specific RunConfig (Tier 1/2/3) into the graph's runtime configuration.
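
As a sketch of the run_id lookup and config injection, the snippet below resolves a run name and wraps it in a runtime config. RunConfigSketch, the RUNS table, and the "run_config" key are hypothetical; the real runs come from config.yaml, and the database setup via init_db is omitted here.

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfigSketch:
    """Hypothetical stand-in for RunConfig."""
    run_id: str
    tier: int


# Hypothetical table; the real runs are defined in config.yaml.
RUNS = {
    "gemini-flash": RunConfigSketch(run_id="gemini-flash", tier=3),
}


def parse_run(argv: list[str]) -> RunConfigSketch:
    """Resolve the positional run_id argument to a run configuration."""
    parser = argparse.ArgumentParser(prog="helia.agent.runner")
    parser.add_argument("run_id", help="name of a run defined in the config")
    args = parser.parse_args(argv)
    if args.run_id not in RUNS:
        parser.error(f"unknown run_id: {args.run_id}")  # exits with status 2
    return RUNS[args.run_id]


def build_runtime_config(run: RunConfigSketch, thread_id: str) -> dict:
    """Build a LangGraph-style config dict.

    thread_id identifies the checkpointed thread; the "run_config" key
    is an assumed name for where the run settings are carried.
    """
    return {"configurable": {"thread_id": thread_id, "run_config": run}}
```

Nodes could then read the run settings from the config they receive at invocation time, which keeps model choice out of the graph definition.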

5. Ingestion (src/helia/ingestion/loader.py)

  • Added ClinicalDataLoader to abstract transcript fetching.
  • Loads directly from the Transcript Beanie document model in MongoDB.

6. Database Migrations

  • Created migrations/init_risen_prompts.py to seed the database with the required "RISEN" prompt templates (phq8-extract, phq8-map, phq8-score).
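
One way to make such a seed script safe to re-run is to insert only the prompts that are missing. The sketch below shows that pattern against an in-memory dict; whether the actual migration is idempotent is an assumption, and seed_prompts and the dict-based store are illustrative stand-ins for the MongoDB collection.

```python
PROMPT_NAMES = ("phq8-extract", "phq8-map", "phq8-score")


def seed_prompts(store: dict[str, str], templates: dict[str, str]) -> list[str]:
    """Insert any missing prompts and return the names that were added.

    Existing entries are left untouched, so running this twice is harmless.
    """
    inserted: list[str] = []
    for name in PROMPT_NAMES:
        if name not in store:
            store[name] = templates[name]
            inserted.append(name)
    return inserted
```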

Usage

  1. Seed Prompts:

    python migrations/init_risen_prompts.py
    
  2. Run Agent:

    # Run with the default Tier 3 (Cloud) config (defined in config.yaml)
    python -m helia.agent.runner gemini-flash
    

Next Steps

  1. Safety: Implement the Safety Guardrail (parallel node) as designed in plans/safety-guardrail-architecture.md.
  2. Persistence: Uncomment the DB save logic in persistence_node to save AssessmentResult documents.