diff --git a/.claude/commands/helia-dev.md b/.claude/commands/helia-dev.md
new file mode 100644
index 0000000..053c5b5
--- /dev/null
+++ b/.claude/commands/helia-dev.md
@@ -0,0 +1,7 @@
+---
+description: Work on the Helia codebase with thesis alignment and strict standards
+argument-hint: [task description]
+allowed-tools: Skill(helia-dev)
+---
+
+Invoke the helia-dev skill for: $ARGUMENTS
diff --git a/.claude/skills/helia-dev/SKILL.md b/.claude/skills/helia-dev/SKILL.md
new file mode 100644
index 0000000..f777955
--- /dev/null
+++ b/.claude/skills/helia-dev/SKILL.md
@@ -0,0 +1,54 @@
+---
+name: helia-dev
+description: Aligns development tasks with Helia's thesis goals (Local vs Cloud) and enforces project standards (Ruff, Pyrefly). Use when working on the Helia codebase.
+---
+
+<objective>
+To ensure all development work on the Helia codebase aligns with the Bachelor Thesis goals (Local vs. Cloud benchmark, Privacy-First) and adheres to strict code quality standards (ruff, ty).
+</objective>
+
+<quick_start>
+Always invoke this skill when starting a new task in the Helia repository. It grounds the agent in the research context and enforces the "Quality Contract".
+</quick_start>
+
+<context_alignment>
+## Research Context Awareness
+Before performing any task, the agent MUST understand:
+1.  **Goal**: We are benchmarking Local Quantized LLMs vs. Cloud LLMs for PHQ-8 assessment.
+2.  **Constraint 1 (Privacy)**: Data processing must support on-premise execution.
+3.  **Constraint 2 (Modularity)**: The system must allow easy swapping of Model Tiers (1-3).
+4.  **Constraint 3 (Persistence)**: `AssessmentResult` is the source of truth for experiments.
+
+## Tooling Standards
+All code changes must pass:
+1.  **Linting/Formatting**: `uv run ruff check .` and `uv run ruff format .`
+2.  **Type Checking**: `uv run ty check`
+</context_alignment>
+
+<process>
+1.  **Read Context**:
+    -   Read `CLAUDE.md` to load the latest project status and thesis requirements.
+    -   (Optional) Read `documents/Bachelor Thesis Exposé - Santiago Martinez-Avial.md` if deep research context is needed.
+
+2.  **Execute Task**:
+    -   Perform the requested engineering task (Feature, Bugfix, Refactor).
+    -   **Critical**: Ensure any architectural changes support the 3-Tier Model Strategy (Local, Self-Hosted, Cloud).
+
+3.  **Enforce Quality**:
+    -   Run `uv run ruff format .` to fix formatting.
+    -   Run `uv run ruff check . --fix` to fix linting errors.
+    -   Run `uv run ty check` to ensure type safety.
+    -   **Fix any errors** found by these tools before declaring the task complete.
+
+4.  **Verify Alignment**:
+    -   Check: Does the change break the "swappable model" architecture?
+    -   Check: Does the change introduce any hard dependencies on cloud services (violating Privacy-First)?
+</process>
+
+<success_criteria>
+-   Task is completed.
+-   Code passes `ruff` (lint/format).
+-   Code passes `ty` (types).
+-   Architecture remains modular (supports Tiers 1-3).
+-   `CLAUDE.md` is updated if the task changed the architecture or standards.
+</success_criteria>