Skip to content

Pairs with: Lecture 09 — Why Agents Declare Victory Too Early and Lecture 10 — Why End-to-End Testing Changes Results. Time: ~90 min. Difficulty: Advanced. Prerequisites: Module 08 checkpoint.

Module 09. Three-Layer Termination

Why this module

Each feature in feature_list.json has its own verification command. That is necessary but not sufficient: a project ships when every layer is green at the same time, against the real binary, on a cold environment. Lecture 09 calls the failure mode premature completion — agents declare done after a single layer (usually unit tests) passes, while integration is broken. Lecture 10 sharpens the fix: the only verification that counts is the full pipeline running the real entry point.

This module builds verify.sh, the single command that proves the project ships. It runs three layers — lint/typecheck → unit tests → end-to-end against the binary — and exits non-zero if any one fails.

Concepts

  • Three-layer termination check — sequence of checks ordered cheapest-to-most-realistic:
    1. Layer 1: syntactic correctness — typecheck, lint. Fast, cheap, catches typos.
    2. Layer 2: unit/integration testsnode --test. Fast, isolates components.
    3. Layer 3: end-to-end — drives the real binary against real inputs, asserts on real outputs.
  • Component boundary defects — bugs that live between components. Unit tests stub them; only end-to-end exercises them.
  • Test adequacy gradient — unit ≤ integration ≤ end-to-end. Each layer is strictly stronger than the prior.
  • Architecture boundary check — a verification step that asserts structural rules (e.g., "no file under src/store/ imports from src/commands/"). Prevents the harness from rotting silently.
  • Review feedback promotion — every recurring code-review comment becomes a checked rule in verify.sh. The harness gets stronger after each round.

→ Read Lecture 09 for the empirical confidence-calibration data and Lecture 10 for the test-adequacy gradient with case study.

Lab

Step 1 — Add Layer 1: typecheck and lint

Install the toolchain:

sh
pnpm add -D typescript@^5.4 eslint@^9 typescript-eslint@^7

Create eslint.config.js:

js
import tseslint from "typescript-eslint";

export default tseslint.config(
  ...tseslint.configs.recommended,
  {
    rules: {
      "no-console": "off",
      "@typescript-eslint/no-unused-vars": ["error", { "argsIgnorePattern": "^_" }],
    }
  }
);

Add the npm scripts in package.json:

json
"scripts": {
  "typecheck": "tsc --noEmit -p tsconfig.json",
  "lint": "eslint 'src/**/*.ts'",
  "test": "tsx --test src/**/*.test.ts",
  "e2e": "node --test --test-reporter spec test-e2e/**/*.test.mjs",
  "verify": "./verify.sh",
  "wip": "node scripts/wip.mjs"
}

Step 2 — Add an architecture boundary check

scripts/check-boundaries.mjs:

js
import { readFile, readdir, stat } from "node:fs/promises";
import { join } from "node:path";

const RULES = [
  { dir: "src/store", forbid: ["src/commands"], reason: "store must not import commands" },
];

async function* walk(dir) {
  for (const e of await readdir(dir, { withFileTypes: true })) {
    const p = join(dir, e.name);
    if (e.isDirectory()) yield* walk(p);
    else if (p.endsWith(".ts")) yield p;
  }
}

let violations = 0;
for (const rule of RULES) {
  for await (const f of walk(rule.dir)) {
    const src = await readFile(f, "utf8");
    for (const bad of rule.forbid) {
      const re = new RegExp(`from\\s+["'][^"']*${bad.replace("/", "/")}`);
      if (re.test(src)) {
        console.error(`BOUNDARY: ${f}: ${rule.reason}`);
        violations++;
      }
    }
  }
}

if (violations) process.exit(3);
console.log("boundaries OK");

Step 3 — Add Layer 3: end-to-end test against the real binary

test-e2e/ask.test.mjs:

js
import { test } from "node:test";
import assert from "node:assert/strict";
import { execSync } from "node:child_process";
import { mkdirSync, writeFileSync, rmSync } from "node:fs";
import { mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const ROOT = process.cwd();

test("e2e: import then ask returns a citation line", () => {
  const src = mkdtempSync(join(tmpdir(), "noted-e2e-"));
  writeFileSync(join(src, "alpha.md"), "# Alpha\nthe alpha document");
  writeFileSync(join(src, "bravo.md"), "# Bravo\nthe bravo document");

  rmSync(".noted", { recursive: true, force: true });
  execSync(`./bin/noted import ${src}`, { cwd: ROOT, stdio: "pipe" });
  execSync(`./bin/noted index`, { cwd: ROOT, stdio: "pipe" });
  const out = execSync(`./bin/noted ask alpha`, { cwd: ROOT }).toString();

  assert.match(out, /^\[[a-f0-9]{12}\]/m, "expected a [<id>] line");
  assert.match(out, /^\s+cite: /m, "expected a cite: line per result");
  assert.match(out, /Alpha/, "expected the Alpha note to be ranked first");
});

test("e2e: --json shape", () => {
  const out = execSync(`./bin/noted ask alpha --json`, { cwd: ROOT }).toString();
  const parsed = JSON.parse(out);
  assert.ok(Array.isArray(parsed.results));
  assert.ok(parsed.results.length <= 3);
});

Note: this test requires the --json flag from Module 08. If you skipped that feature, finish it now.

Step 4 — Write verify.sh

sh
cat > verify.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail

ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$ROOT_DIR"

step() { printf '\n==> %s\n' "$1"; }

step "Layer 1a: typecheck"
pnpm typecheck

step "Layer 1b: lint"
pnpm lint

step "Layer 1c: architecture boundaries"
node scripts/check-boundaries.mjs

step "Layer 2: unit tests"
pnpm test

step "Layer 3: end-to-end"
pnpm e2e

step "Feature list integrity"
node scripts/wip.mjs status

echo
echo "verify: ALL LAYERS GREEN"
EOF
chmod +x verify.sh

Step 5 — Run it

sh
./verify.sh

If any layer fails, the script prints which layer and exits non-zero. Fix the failure (do not bypass the check) and re-run. The reward for getting it green is the line verify: ALL LAYERS GREEN.

Step 6 — Wire verify.sh into the agent workflow

Update AGENTS.md's workflow:

md
## Workflow

1. `./init.sh` — must exit 0 before any work.
2. Read `PROGRESS.md` and pick the next item from `feature_list.json`.
3. `pnpm wip activate <id>`.
4. Implement.
5. `./verify.sh` — must be green before claiming done.
6. `pnpm wip pass <id>`.
7. Update `PROGRESS.md`, commit.

Update the routing file's hard-constraints block to include:

md
3. **Three layers, every time.** A feature is not done until `./verify.sh` exits 0.

Step 7 — Promote one review comment into a check

This is the key habit Module 09 leaves you with. Pick one recurring nit you have caught yourself making in this course (likely candidates: "ts files use single quotes" or "no console.log in command files") and add it to eslint.config.js as a rule. The next agent that violates it gets stopped at Layer 1, not at code review.

Document it in docs/STYLE.md:

md
## Promoted from review

| Date       | Reviewer comment                          | Lint rule                  |
|------------|-------------------------------------------|----------------------------|
| 2025-MM-DD | "src files use single quotes"             | `quotes: ["error","single"]` |

Step 8 — Update artifacts and commit

PROGRESS.md Current State:

md
- Last verified: Module 09 checkpoint (`./verify.sh` exits 0, all three layers).
sh
git add .
git commit -q -m "module-09: three-layer verify.sh + boundary + e2e + first promoted lint"

Verification

sh
./verify.sh >/tmp/m09.log 2>&1 && \
grep -q "verify: ALL LAYERS GREEN" /tmp/m09.log && \
grep -q "Layer 1a: typecheck" /tmp/m09.log && \
grep -q "Layer 2: unit tests" /tmp/m09.log && \
grep -q "Layer 3: end-to-end" /tmp/m09.log && \
echo "M09 OK"

Expected:

M09 OK

Common pitfalls

  • Skipping Layer 3 because Layer 2 passes. Component-boundary defects only show up when the binary runs against real input. The whole point of Module 09 is to refuse to declare done until Layer 3 is green.
  • Letting verify.sh exit 0 on a partial run. set -euo pipefail at the top is non-negotiable. Without it, an unset variable or piped failure silently passes the script.
  • Mocking the binary in Layer 3. Layer 3 must call ./bin/noted as a subprocess. If you find yourself importing TypeScript modules into the e2e test, you have built another Layer 2.
  • Adding a "skip on CI" flag for the boundary check. Don't. Either fix the boundary violation or rewrite the rule. Skips compound.

Next

Module 10 — Observability and Clean State. The pipeline tells you green or red; it does not yet tell you why a session went the way it did. Module 10 adds runtime logs, a sprint contract, and the five-dimension clean-state exit.