# Helia

Agentic Interview Framework for ingesting, analyzing, and querying transcript data.

## Project Structure

```
src/helia/
├── agent/
│   └── workflow.py    # LangGraph agent workflow
├── analysis/
│   └── extractor.py   # LLM metadata extraction
├── graph/
│   ├── loader.py      # Neo4j data loading
│   └── schema.py      # Pydantic graph models
├── ingestion/
│   └── parser.py      # Transcript parsing logic
└── main.py            # CLI entry point
```

## Data Flow

```mermaid
graph TD
    A[Transcript File<br/>TSV/TXT] -->|TranscriptParser| B(Utterance Objects)
    B -->|MetadataExtractor<br/>+ OpenAI LLM| C(Enriched UtteranceNodes)
    C -->|GraphLoader| D[(Neo4j Database)]
    E[User Question] -->|LangGraph Agent| F{Router}
    F -->|Graph Tool| D
    F -->|Vector Tool| G[(Vector Store)]
    D --> H[Context]
    G --> H
    H -->|Synthesizer| I[Answer]
```

1. **Ingestion**: `TranscriptParser` reads TSV/TXT files into `Utterance` objects.
2. **Analysis**: `MetadataExtractor` enriches utterances with sentiment and tone using LLMs.
3. **Graph**: `GraphLoader` pushes nodes and relationships to the Neo4j database.
4. **Agent**: A ReAct workflow queries graph and vector data to answer user questions.
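The four stages above correspond one-to-one to the classes documented in the Usage section. As a rough end-to-end sketch (assuming the default constructors shown there and a Neo4j instance reachable via the Quick Start environment variables), the pipeline can be driven like this:

```python
from pathlib import Path

from helia.ingestion.parser import TranscriptParser
from helia.analysis.extractor import MetadataExtractor
from helia.graph.loader import GraphLoader

# 1. Ingestion: parse the raw transcript into Utterance objects.
utterances = TranscriptParser().parse(Path("transcript.tsv"))

# 2. Analysis: enrich each utterance with sentiment/tone metadata via the LLM.
nodes = MetadataExtractor().extract(utterances)

# 3. Graph: push the enriched UtteranceNodes into Neo4j.
loader = GraphLoader()
loader.connect()
loader.load_utterances(nodes)
loader.close()

# 4. Agent: questions are answered through the CLI (see Quick Start) or the
#    LangGraph workflow in src/helia/agent/workflow.py.
```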
## Implemented Features

- Parse DAIC-WOZ transcripts and simple text formats.
- Extract metadata (sentiment, tone, speech acts) via OpenAI.
- Load `Utterance` and `Speaker` nodes into Neo4j.
- Run a basic LangGraph agent with planner and router.

## Roadmap

- Add robust error handling for LLM API failures.
- Implement real `graph_tool` and `vector_tool` logic.
- Enhance agent planning capabilities.
- Add a comprehensive test suite.

## Installation

Install the package using `uv`.

```sh
uv pip install helia
```

## Quick Start

Run the agent directly from the command line.

```sh
export OPENAI_API_KEY=sk-...
export NEO4J_URI=bolt://localhost:7687
export NEO4J_PASSWORD=password

python -m helia.main "How many interruptions occurred?"
```

## Usage

Parse a transcript file programmatically.

```python
from pathlib import Path

from helia.ingestion.parser import TranscriptParser

parser = TranscriptParser()
utterances = parser.parse(Path("transcript.tsv"))
```

Extract metadata from utterances.

```python
from helia.analysis.extractor import MetadataExtractor

extractor = MetadataExtractor()
nodes = extractor.extract(utterances)
```

Load data into Neo4j.

```python
from helia.graph.loader import GraphLoader

loader = GraphLoader()
loader.connect()
loader.load_utterances(nodes)
loader.close()
```
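After loading, you can sanity-check the graph with the official `neo4j` Python driver. The snippet below is only an illustrative query, not part of the Helia API; it assumes the default `neo4j` username, the connection details from Quick Start, and the `Utterance` node label described above.

```python
import os

from neo4j import GraphDatabase

driver = GraphDatabase.driver(
    os.environ["NEO4J_URI"],
    auth=("neo4j", os.environ["NEO4J_PASSWORD"]),  # default username assumed
)

with driver.session() as session:
    # Count the Utterance nodes written by GraphLoader.
    result = session.run("MATCH (u:Utterance) RETURN count(u) AS n")
    print("Utterance nodes in the graph:", result.single()["n"])

driver.close()
```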
## Contributing

Fork the project and submit a pull request.

## License

This project is available as open source under the terms of the [MIT License](LICENSE).