Helia
Agentic Interview Framework for ingesting, analyzing, and querying transcript data.
Project Structure
src/helia/
├── agent/
│ └── workflow.py # LangGraph agent workflow
├── analysis/
│ └── extractor.py # LLM metadata extraction
├── graph/
│ ├── loader.py # Neo4j data loading
│ └── schema.py # Pydantic graph models
├── ingestion/
│ └── parser.py # Transcript parsing logic
└── main.py # CLI entry point
Data Flow
graph TD
A[Transcript File<br/>TSV/TXT] -->|TranscriptParser| B(Utterance Objects)
B -->|MetadataExtractor<br/>+ OpenAI LLM| C(Enriched UtteranceNodes)
C -->|GraphLoader| D[(Neo4j Database)]
E[User Question] -->|LangGraph Agent| F{Router}
F -->|Graph Tool| D
F -->|Vector Tool| G[(Vector Store)]
D --> H[Context]
G --> H
H -->|Synthesizer| I[Answer]
- Ingestion: TranscriptParser reads TSV/TXT files into Utterance objects.
- Analysis: MetadataExtractor enriches utterances with sentiment and tone using LLMs.
- Graph: GraphLoader pushes nodes and relationships to the Neo4j database.
- Agent: ReAct workflow queries graph/vector data to answer user questions.
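The agent half of the diagram (question, router, tool, synthesizer) can be pictured with a small sketch. This is not Helia's workflow code: the keyword heuristic, the GRAPH_HINTS tuple, and the stub tool bodies are all assumptions made for illustration, and the real agent is built on LangGraph.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for the graph and vector tools; they only echo
# the question so the routing step can be shown in isolation.
def graph_tool(question: str) -> str:
    return f"[graph context for: {question}]"

def vector_tool(question: str) -> str:
    return f"[vector context for: {question}]"

# Assumed heuristic: counting or relational questions go to the graph,
# everything else goes to the vector store.
GRAPH_HINTS = ("how many", "count", "who", "between")

def route(question: str) -> str:
    q = question.lower()
    return "graph" if any(hint in q for hint in GRAPH_HINTS) else "vector"

TOOLS: Dict[str, Callable[[str], str]] = {
    "graph": graph_tool,
    "vector": vector_tool,
}

def answer(question: str) -> str:
    # Pick a tool, fetch context, then "synthesize". A real synthesizer
    # would pass the context to an LLM instead of formatting a string.
    context = TOOLS[route(question)](question)
    return f"Answer based on {context}"
```

An LLM-driven router would replace the keyword match, but the shape stays the same: one decision, one tool call, one synthesis step.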
Implemented Features
- Parse DAIC-WOZ transcripts and simple text formats.
- Extract metadata (sentiment, tone, speech acts) via OpenAI.
- Load Utterance and Speaker nodes into Neo4j.
- Run basic LangGraph agent with planner and router.
Roadmap
- Add robust error handling for LLM API failures.
- Implement real graph_tool and vector_tool logic.
- Enhance agent planning capabilities.
- Add comprehensive test suite.
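For the first roadmap item, one common approach is retrying with exponential backoff. The sketch below is a generic helper, not existing Helia code: with_retries and its parameters are hypothetical names, and it catches every Exception, where a real version would probably narrow that to the transient errors the LLM client raises.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on failure with exponentially growing delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            # Out of retries: re-raise the last error to the caller.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The sleep parameter is injectable so tests can record delays instead of actually waiting.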
Installation
Install the package using uv.
uv pip install helia
Quick Start
Run the agent directly from the command line.
export OPENAI_API_KEY=sk-...
export NEO4J_URI=bolt://localhost:7687
export NEO4J_PASSWORD=password
python -m helia.main "How many interruptions occurred?"
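Before running the command above, it can help to confirm the environment is set up. This is a small sketch that assumes the three variables exported in this section are the ones the CLI needs; REQUIRED_VARS and missing_vars are hypothetical names, not part of Helia.

```python
import os
from typing import List, Mapping

# Assumption: these are the variables the Quick Start exports.
REQUIRED_VARS = ("OPENAI_API_KEY", "NEO4J_URI", "NEO4J_PASSWORD")

def missing_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```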
Usage
Parse a transcript file programmatically.
from helia.ingestion.parser import TranscriptParser
from pathlib import Path
parser = TranscriptParser()
utterances = parser.parse(Path("transcript.tsv"))
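To give a feel for what parsing a TSV transcript involves, here is a standalone sketch. It is not TranscriptParser's implementation. It assumes a header row with the columns start_time, stop_time, speaker, and value, and the Utterance dataclass shown here is a hypothetical stand-in whose fields may differ from Helia's.

```python
import csv
from dataclasses import dataclass
from typing import Iterable, List

@dataclass
class Utterance:
    # Hypothetical fields; the real Utterance model may differ.
    start: float
    stop: float
    speaker: str
    text: str

def parse_tsv(lines: Iterable[str]) -> List[Utterance]:
    """Parse tab-separated transcript lines with a header row."""
    # QUOTE_NONE keeps stray quote characters in speech text literal.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        Utterance(
            start=float(row["start_time"]),
            stop=float(row["stop_time"]),
            speaker=row["speaker"],
            text=row["value"].strip(),
        )
        for row in reader
    ]
```

Passing an open file handle works as well as a list of lines, since csv.DictReader accepts any iterable of strings.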
Extract metadata from utterances.
from helia.analysis.extractor import MetadataExtractor
extractor = MetadataExtractor()
nodes = extractor.extract(utterances)
Load data into Neo4j.
from helia.graph.loader import GraphLoader
loader = GraphLoader()
loader.connect()
loader.load_utterances(nodes)
loader.close()
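If load_utterances raises, the snippet above never reaches close(). One way to guarantee cleanup is a small context manager wrapped around the connect() and close() calls shown above. The connected helper is a hypothetical addition, not part of Helia.

```python
from contextlib import contextmanager

@contextmanager
def connected(loader):
    """Connect the loader, yield it, and always close it afterwards."""
    loader.connect()
    try:
        yield loader
    finally:
        # Runs even if the body of the with-block raises.
        loader.close()

# Usage, assuming GraphLoader and nodes from the snippets above:
#
#     with connected(GraphLoader()) as loader:
#         loader.load_utterances(nodes)
```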
Contributing
Fork the project and submit a pull request.
License
This project is available as open source under the terms of the MIT License.