diff --git a/.claude/commands/helia-dev.md b/.claude/commands/helia-dev.md new file mode 100644 index 0000000..053c5b5 --- /dev/null +++ b/.claude/commands/helia-dev.md @@ -0,0 +1,7 @@ +--- +description: Work on the Helia codebase with thesis alignment and strict standards +argument-hint: [task description] +allowed-tools: Skill(helia-dev) +--- + +Invoke the helia-dev skill for: $ARGUMENTS diff --git a/.claude/skills/helia-dev/SKILL.md b/.claude/skills/helia-dev/SKILL.md new file mode 100644 index 0000000..f777955 --- /dev/null +++ b/.claude/skills/helia-dev/SKILL.md @@ -0,0 +1,54 @@ +--- +name: helia-dev +description: Aligns development tasks with Helia's thesis goals (Local vs Cloud) and enforces project standards (Ruff, Pyrefly). Use when working on the Helia codebase. +--- + + +To ensure all development work on the Helia codebase aligns with the Bachelor Thesis goals (Local vs. Cloud benchmark, Privacy-First) and adheres to strict code quality standards (ruff, ty). + + + +Always invoke this skill when starting a new task in the Helia repository. It grounds the agent in the research context and enforces the "Quality Contract". + + + +## Research Context Awareness +Before performing any task, the agent MUST understand: +1. **Goal**: We are benchmarking Local Quantized LLMs vs. Cloud LLMs for PHQ-8 assessment. +2. **Constraint 1 (Privacy)**: Data processing must support on-premise execution. +3. **Constraint 2 (Modularity)**: The system must allow easy swapping of Model Tiers (1-3). +4. **Constraint 3 (Persistence)**: `AssessmentResult` is the source of truth for experiments. + +## Tooling Standards +All code changes must pass: +1. **Linting/Formatting**: `uv run ruff check .` and `uv run ruff format .` +2. **Type Checking**: `uv run ty check` + + + +1. **Read Context**: + - Read `CLAUDE.md` to load the latest project status and thesis requirements. + - (Optional) Read `documents/Bachelor Thesis Exposé - Santiago Martinez-Avial.md` if deep research context is needed. + +2. **Execute Task**: + - Perform the requested engineering task (Feature, Bugfix, Refactor). + - **Critical**: Ensure any architectural changes support the 3-Tier Model Strategy (Local, Self-Hosted, Cloud). + +3. **Enforce Quality**: + - Run `uv run ruff format .` to fix formatting. + - Run `uv run ruff check . --fix` to fix linting errors. + - Run `uv run ty check` to ensure type safety. + - **Fix any errors** found by these tools before declaring the task complete. + +4. **Verify Alignment**: + - Check: Does the change break the "swappable model" architecture? + - Check: Does the change introduce any hard dependencies on cloud services (violating Privacy-First)? + + + +- Task is completed. +- Code passes `ruff` (lint/format). +- Code passes `ty` (types). +- Architecture remains modular (supports Tiers 1-3). +- `CLAUDE.md` is updated if the task changed the architecture or standards. +