Hands-On Harness Engineering

AI coding agents are capable. The problem is making them reliable. This course teaches you how — by building a complete harness around a real Node.js CLI, module by module, from scratch.

No theory without practice. Every concept ships as working code.

Start building

Course

12 hands-on modules. Build a complete harness around a real project. Start here.

Projects

6 standalone practice environments. Apply harness patterns to real-world scenarios on your own.

Resource Library

Copy-ready templates — AGENTS.md, feature_list.json, claude-progress.md — drop them into any repo today.

Course modules

12 modules, each one shipping working code:

Module 00 — Onboarding: Set up your environment. Clone the starter repo. Run the baseline agent.
Module 01 — Why Models Need Harnesses: See a capable model fail on a simple task without constraints.
Module 02 — Anatomy of a Harness: The four components every harness needs and why.
Module 03 — Repo as Source of Truth: Make the repository the agent's single source of truth.
Module 04 — Splitting Instructions: Separate what the agent must do from how it should work.
Module 05 — Multi-Session Continuity: Keep state across sessions so the agent never loses context.
Module 06 — Bootstrap Phase: Deterministic initialization before every agent run.
Module 07 — Scope Boundaries & WIP=1: Prevent agents from working on multiple things at once.
Module 08 — Feature Lists as Primitives: Atomic, verifiable work units the agent cannot misinterpret.
Module 09 — Three-Layer Termination: Stop the agent from declaring victory prematurely.
Module 10 — Observability & Clean State: Make the agent's runtime visible and debuggable.
Module 11 — Capstone: Ablation Study: Remove harness components one by one and measure what breaks.
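
To make the WIP=1 idea from Module 07 concrete, here is a minimal sketch of a check over a feature list. The schema (an `id` and a `status` of `todo`, `in_progress`, or `done`), the `next_feature` function, and the `ScopeViolation` exception are all assumptions made for illustration; the course's actual feature_list.json format may differ.

```python
from typing import Optional

# Hypothetical schema: {"id": str, "status": "todo" | "in_progress" | "done"}.
# This is an assumption, not the course's actual feature_list.json format.
Feature = dict


class ScopeViolation(Exception):
    """Raised when more than one feature is in progress at once."""


def next_feature(features: list[Feature]) -> Optional[Feature]:
    """Return the single feature the agent may work on, enforcing WIP=1."""
    in_progress = [f for f in features if f["status"] == "in_progress"]
    if len(in_progress) > 1:
        ids = ", ".join(f["id"] for f in in_progress)
        raise ScopeViolation(f"WIP limit exceeded: {ids}")
    if in_progress:
        return in_progress[0]  # finish what is already started
    for f in features:
        if f["status"] == "todo":
            return f  # first unstarted feature, in list order
    return None  # nothing left to do
```

A harness could run a check like this before each session and refuse to start the agent when it raises, so the limit is enforced by code rather than by a prompt.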

What you will build

By the end of Module 11, you will have a working harness that:

Constrains agent behavior with explicit, version-controlled rules — no guesswork.
Maintains context across long-running, multi-session tasks without losing state.
Stops the agent from declaring victory before the work is actually done.
Verifies every change through a full-pipeline test suite before handoff.
Makes runtime observable — logs, progress files, and rollback paths built in.

The core mechanism

A harness doesn't make the model smarter. It builds a closed-loop working system around it.
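
One way to read "closed loop" is sketched below: the agent acts, the harness verifies, and any failure is fed back into the next round. The page does not spell out the loop, so this is an interpretation under stated assumptions; `run_closed_loop`, `act`, and `verify` are hypothetical names, not part of the course code.

```python
from typing import Callable, Optional


def run_closed_loop(
    act: Callable[[Optional[str]], None],
    verify: Callable[[], Optional[str]],
    max_rounds: int = 5,
) -> bool:
    """Alternate agent work and verification until checks pass or rounds run out.

    act(feedback): hypothetical stand-in for one agent session.
    verify(): hypothetical check; returns None on success, else an error message.
    """
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        act(feedback)        # the agent works, seeing the last failure if any
        feedback = verify()  # the harness, not the agent, decides whether it is done
        if feedback is None:
            return True
    return False  # rounds exhausted without passing verification
```

The design choice worth noting is that completion is decided by `verify`, not by anything the agent reports about itself, and `max_rounds` bounds the loop so a stuck agent cannot run indefinitely.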

Reference material

For engineers who want the theory behind what the course builds:

Lectures: 12 theoretical lectures covering the research and first principles behind each harness pattern.
OpenAI: Harness engineering — leveraging Codex in an agent-first world
Anthropic: Effective harnesses for long-running agents
Anthropic: Harness design for long-running application development
Awesome Harness Engineering
