Hands-On Harness Engineering
AI coding agents are capable. The problem is making them reliable. This course teaches you how — by building a complete harness around a real Node.js CLI, module by module, from scratch.
No theory without practice. Every concept ships as working code.
Start building
Course
12 hands-on modules. Build a complete harness around a real project. Start here.
Projects
6 standalone practice environments. Apply harness patterns to real-world scenarios on your own.
Resource Library
Copy-ready templates — AGENTS.md, feature_list.json, claude-progress.md — drop them into any repo today.
Course modules
12 modules, each one shipping working code:
- Module 00 — Onboarding: Set up your environment. Clone the starter repo. Run the baseline agent.
- Module 01 — Why Models Need Harnesses: See a capable model fail on a simple task without constraints.
- Module 02 — Anatomy of a Harness: The four components every harness needs and why.
- Module 03 — Repo as Source of Truth: Make the repository the agent's single source of truth.
- Module 04 — Splitting Instructions: Separate what the agent must do from how it should work.
- Module 05 — Multi-Session Continuity: Keep state across sessions so the agent never loses context.
- Module 06 — Bootstrap Phase: Deterministic initialization before every agent run.
- Module 07 — Scope Boundaries & WIP=1: Prevent agents from working on multiple things at once.
- Module 08 — Feature Lists as Primitives: Atomic, verifiable work units the agent cannot misinterpret.
- Module 09 — Three-Layer Termination: Stop the agent from declaring victory prematurely.
- Module 10 — Observability & Clean State: Make the agent's runtime visible and debuggable.
- Module 11 — Capstone: Ablation Study: Remove harness components one by one and measure what breaks.
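To make the WIP=1 and feature-list ideas from Modules 07 and 08 concrete, here is a minimal sketch in TypeScript. The `Feature` shape, the status values, and the `nextFeature` helper are illustrative assumptions, not the course's actual `feature_list.json` schema.

```typescript
// Hypothetical shape for one feature-list entry; field names are assumptions.
type FeatureStatus = "todo" | "in_progress" | "done";

interface Feature {
  id: string;
  description: string;
  status: FeatureStatus;
}

// Returns the single feature the agent may work on, enforcing WIP=1.
// Throws if more than one feature is already in progress.
function nextFeature(features: Feature[]): Feature | undefined {
  const active = features.filter((f) => f.status === "in_progress");
  if (active.length > 1) {
    throw new Error(`WIP limit exceeded: ${active.map((f) => f.id).join(", ")}`);
  }
  // Resume the active feature if there is one; otherwise take the first open item.
  return active[0] ?? features.find((f) => f.status === "todo");
}

const exampleFeatures: Feature[] = [
  { id: "F1", description: "Parse --help flag", status: "done" },
  { id: "F2", description: "Add --version flag", status: "in_progress" },
  { id: "F3", description: "Add config file support", status: "todo" },
];

console.log(nextFeature(exampleFeatures)?.id); // prints "F2"
```

Because the function resumes an in-progress item before picking a new one, the agent cannot start F3 until F2 is marked done.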
What you will build
By the end of Module 11, you will have a working harness that:
- Constrains agent behavior with explicit, version-controlled rules — no guesswork.
- Maintains context across long-running, multi-session tasks without losing state.
- Stops the agent from declaring victory before the work is actually done.
- Verifies every change through a full-pipeline test suite before handoff.
- Makes runtime observable — logs, progress files, and rollback paths built in.
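The "no premature victory" and "verify before handoff" goals above can be sketched as a completion gate that requires several independent signals to agree. This page does not define the course's termination layers, so the three checks below and every name in the code (`RunEvidence`, `evaluateCompletion`) are assumptions chosen for illustration.

```typescript
// Evidence the harness collects at the end of a run.
// All field names are illustrative assumptions, not the course's schema.
interface RunEvidence {
  agentClaimsDone: boolean; // what the agent reports
  testsPassed: boolean; // result of the test suite, run by the harness itself
  openFeatureIds: string[]; // feature-list items not yet marked done
}

interface Verdict {
  done: boolean;
  reasons: string[]; // why the run is not considered finished
}

// The run counts as finished only if every check passes;
// the agent's own claim is never sufficient by itself.
function evaluateCompletion(evidence: RunEvidence): Verdict {
  const reasons: string[] = [];
  if (!evidence.agentClaimsDone) reasons.push("agent has not reported completion");
  if (!evidence.testsPassed) reasons.push("test suite is failing");
  if (evidence.openFeatureIds.length > 0) {
    reasons.push(`open features: ${evidence.openFeatureIds.join(", ")}`);
  }
  return { done: reasons.length === 0, reasons };
}

const verdict = evaluateCompletion({
  agentClaimsDone: true,
  testsPassed: false,
  openFeatureIds: ["F3"],
});
console.log(verdict.done); // prints false
console.log(verdict.reasons); // lists the failing suite and the open feature F3
```

Returning the reasons alongside the boolean gives the harness something to log and to feed back to the agent on the next attempt.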
The core mechanism
A harness doesn't make the model smarter. It builds a closed-loop working system around it.
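One way to picture such a closed loop is an act, verify, feed-back cycle with a bounded number of attempts. The sketch below is a simplified assumption, not the course's implementation: `act` stands in for one agent attempt, `verify` for an independent check such as a test run, and both are hypothetical hooks.

```typescript
// Result of an independent check, for example running the project's tests.
interface CheckResult {
  ok: boolean;
  feedback: string;
}

// Hypothetical hooks: `act` is one agent attempt, `verify` is the harness check.
type Act = (feedback: string | null) => string;
type Verify = (output: string) => CheckResult;

interface LoopOutcome {
  accepted: boolean;
  attempts: number;
  output: string | null;
}

// Runs act -> verify repeatedly, passing each failure back into the next attempt.
function runClosedLoop(act: Act, verify: Verify, maxAttempts: number): LoopOutcome {
  let feedback: string | null = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const output = act(feedback);
    const result = verify(output);
    if (result.ok) return { accepted: true, attempts: attempt, output };
    feedback = result.feedback; // close the loop: the failure informs the next try
  }
  return { accepted: false, attempts: maxAttempts, output: null };
}

// Toy stand-ins: the "agent" only gets it right after seeing feedback.
const toyAct: Act = (feedback) => (feedback === null ? "draft" : "fixed");
const toyVerify: Verify = (output) =>
  output === "fixed" ? { ok: true, feedback: "" } : { ok: false, feedback: "tests failed" };

console.log(runClosedLoop(toyAct, toyVerify, 3)); // accepted on the second attempt
```

The model call itself is unchanged in this sketch; what changes is that its output is only accepted after an external check, and a failed check becomes input to the next attempt.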
Reference material
For engineers who want the theory behind what the course builds:
- Lectures: 12 theoretical lectures covering the research and first principles behind each harness pattern.
- OpenAI: Harness engineering — leveraging Codex in an agent-first world
- Anthropic: Effective harnesses for long-running agents
- Anthropic: Harness design for long-running application development
- Awesome Harness Engineering
