Pairs with: Lecture 09 — Why Agents Declare Victory Too Early and Lecture 10 — Why End-to-End Testing Changes Results. Time: ~90 min. Difficulty: Advanced. Prerequisites: Module 08 checkpoint.
Module 09. Three-Layer Termination
Why this module
Each feature in feature_list.json has its own verification command. That is necessary but not sufficient: a project ships when every layer is green at the same time, against the real binary, on a cold environment. Lecture 09 calls the failure mode premature completion — agents declare done after a single layer (usually unit tests) passes, while integration is broken. Lecture 10 sharpens the fix: the only verification that counts is the full pipeline running the real entry point.
This module builds verify.sh, the single command that proves the project ships. It runs three layers — lint/typecheck → unit tests → end-to-end against the binary — and exits non-zero if any one fails.
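The fail-fast ordering described above can be sketched in a few lines of JavaScript. This illustrates the control flow only and is not the module's verify.sh: `runLayers` and the stand-in check functions are hypothetical names, and a real check would shell out to the corresponding tool.

```javascript
// Sketch only: runLayers and the layer objects are hypothetical, not course code.
function runLayers(layers) {
  for (const layer of layers) {
    let ok;
    try {
      ok = layer.check() === true; // anything but an explicit true counts as a failure
    } catch {
      ok = false; // a crashing check is a failing check
    }
    if (!ok) return { green: false, failedAt: layer.name }; // stop at the first red layer
  }
  return { green: true, failedAt: null };
}

// Stand-in checks, ordered cheapest first.
const result = runLayers([
  { name: "lint/typecheck", check: () => true },
  { name: "unit tests", check: () => true },
  { name: "end-to-end", check: () => false }, // integration is broken
]);
console.log(result); // { green: false, failedAt: 'end-to-end' }
```

Here the first two layers pass, which is exactly the situation in which an agent is tempted to declare done; the overall result is still red.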
Concepts
- Three-layer termination check: a sequence of checks ordered from cheapest to most realistic:
  - Layer 1: syntactic correctness (typecheck, lint). Fast, cheap, catches typos.
  - Layer 2: unit/integration tests (node --test). Fast, isolates components.
  - Layer 3: end-to-end. Drives the real binary against real inputs and asserts on real outputs.
- Component boundary defects: bugs that live between components. Unit tests stub them; only end-to-end exercises them.
- Test adequacy gradient: unit ≤ integration ≤ end-to-end. Each layer is at least as strong as the one before it.
- Architecture boundary check: a verification step that asserts structural rules (e.g., "no file under src/store/ imports from src/commands/"). Prevents the harness from rotting silently.
- Review feedback promotion: every recurring code-review comment becomes a checked rule in verify.sh. The harness gets stronger after each round.
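To make the component boundary defect concrete, here is a minimal sketch with two hypothetical components. The names `realStore` and `askCommand`, and the `noteId`/`id` mismatch, are invented for illustration and do not come from the course code. The unit test passes because its stub encodes the same wrong assumption as the code under test; only wiring the real pieces together shows the bug.

```javascript
// Hypothetical components; none of these names come from the course repository.

// The real store returns notes keyed by `noteId`.
function realStore() {
  return { find: (_query) => [{ noteId: "a1b2c3", title: "Alpha" }] };
}

// The command formats results but reads `id`, the field its author assumed.
function askCommand(store, query) {
  return store.find(query).map((n) => `[${n.id}] ${n.title}`);
}

// Unit level: the stub shares the command's assumption, so the output looks right.
const stubStore = { find: () => [{ id: "a1b2c3", title: "Alpha" }] };
const unitOutput = askCommand(stubStore, "alpha");
console.log(unitOutput[0]); // [a1b2c3] Alpha

// Integrated: the real store exposes the defect at the seam between components.
const realOutput = askCommand(realStore(), "alpha");
console.log(realOutput[0]); // [undefined] Alpha
```

Neither component is wrong in isolation, which is why a check that runs them together is the one that catches it.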
→ Read Lecture 09 for the empirical confidence-calibration data and Lecture 10 for the test-adequacy gradient with case study.
Lab
Step 1 — Add Layer 1: typecheck and lint
Install the toolchain:
sh
pnpm add -D typescript@^5.4 eslint@^9 typescript-eslint@^8
Create eslint.config.js:
js
import tseslint from "typescript-eslint";
export default tseslint.config(
...tseslint.configs.recommended,
{
rules: {
"no-console": "off",
"@typescript-eslint/no-unused-vars": ["error", { "argsIgnorePattern": "^_" }],
}
}
);
Add the npm scripts in package.json:
json
"scripts": {
"typecheck": "tsc --noEmit -p tsconfig.json",
"lint": "eslint 'src/**/*.ts'",
"test": "tsx --test src/**/*.test.ts",
"e2e": "node --test --test-reporter spec test-e2e/**/*.test.mjs",
"verify": "./verify.sh",
"wip": "node scripts/wip.mjs"
}
Step 2 — Add an architecture boundary check
scripts/check-boundaries.mjs:
js
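import { readFile, readdir } from "node:fs/promises";
import { dirname, join, sep } from "node:path";
// Each rule: no file under `dir` may import from a directory listed in `forbid`.
const RULES = [
  { dir: "src/store", forbid: ["src/commands"], reason: "store must not import commands" },
];
// Recursively yield every .ts file under dir.
async function* walk(dir) {
  for (const e of await readdir(dir, { withFileTypes: true })) {
    const p = join(dir, e.name);
    if (e.isDirectory()) yield* walk(p);
    else if (p.endsWith(".ts")) yield p;
  }
}
// Captures the module specifier of every `... from "<specifier>"` clause.
const IMPORT_RE = /from\s+["']([^"']+)["']/g;
let violations = 0;
for (const rule of RULES) {
  for await (const f of walk(rule.dir)) {
    const src = await readFile(f, "utf8");
    for (const [, spec] of src.matchAll(IMPORT_RE)) {
      // Resolve relative specifiers against the importing file, so that
      // "../commands/ask" inside src/store/ is compared as "src/commands/ask".
      const target = spec.startsWith(".") ? join(dirname(f), spec) : spec;
      const normalized = target.split(sep).join("/");
      for (const bad of rule.forbid) {
        if (normalized === bad || normalized.includes(`${bad}/`)) {
          console.error(`BOUNDARY: ${f}: ${rule.reason}`);
          violations++;
        }
      }
    }
  }
}
if (violations) process.exit(3);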
console.log("boundaries OK");
Step 3 — Add Layer 3: end-to-end test against the real binary
test-e2e/ask.test.mjs:
js
import { test } from "node:test";
import assert from "node:assert/strict";
import { execSync } from "node:child_process";
import { mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
const ROOT = process.cwd();
test("e2e: import then ask returns a citation line", () => {
const src = mkdtempSync(join(tmpdir(), "noted-e2e-"));
writeFileSync(join(src, "alpha.md"), "# Alpha\nthe alpha document");
writeFileSync(join(src, "bravo.md"), "# Bravo\nthe bravo document");
rmSync(".noted", { recursive: true, force: true });
execSync(`./bin/noted import ${src}`, { cwd: ROOT, stdio: "pipe" });
execSync(`./bin/noted index`, { cwd: ROOT, stdio: "pipe" });
const out = execSync(`./bin/noted ask alpha`, { cwd: ROOT }).toString();
assert.match(out, /^\[[a-f0-9]{12}\]/m, "expected a [<id>] line");
assert.match(out, /^\s+cite: /m, "expected a cite: line per result");
assert.match(out, /Alpha/, "expected the Alpha note in the results");
});
test("e2e: --json shape", () => {
const out = execSync(`./bin/noted ask alpha --json`, { cwd: ROOT }).toString();
const parsed = JSON.parse(out);
assert.ok(Array.isArray(parsed.results));
assert.ok(parsed.results.length <= 3);
});
Note: this test requires the --json flag from Module 08. If you skipped that feature, finish it now.
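Matching /Alpha/ anywhere in the output shows that the note was returned, not where it ranked. If ranking matters, one option is to parse the text output into ordered results before asserting. The sketch below is a hedged illustration: `parseResults` is a hypothetical helper, and while the `[<12 hex>]` prefix and the indented `cite: ` line come from the assertions in the test, the title following the id is an assumption about the output format.

```javascript
// Assumed output shape (the title after the id is an assumption made for this sketch):
//   [0123456789ab] Alpha
//     cite: alpha.md
function parseResults(out) {
  const results = [];
  for (const line of out.split("\n")) {
    const head = line.match(/^\[([a-f0-9]{12})\]\s*(.*)$/);
    if (head) {
      results.push({ id: head[1], title: head[2], cite: null });
      continue;
    }
    // A cite line belongs to the most recent result header.
    const cite = line.match(/^\s+cite: (.*)$/);
    if (cite && results.length > 0) results[results.length - 1].cite = cite[1];
  }
  return results;
}

// Sample text standing in for the binary's stdout.
const sample = [
  "[0123456789ab] Alpha",
  "  cite: alpha.md",
  "[ba9876543210] Bravo",
  "  cite: bravo.md",
].join("\n");

const parsed = parseResults(sample);
console.log(parsed[0].title); // Alpha
console.log(parsed.every((r) => r.cite !== null)); // true
```

With a helper like this, the e2e test could assert on the first element's title and on every result carrying a citation, rather than on a substring of the whole output.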
Step 4 — Write verify.sh
sh
cat > verify.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$ROOT_DIR"
step() { printf '\n==> %s\n' "$1"; }
step "Layer 1a: typecheck"
pnpm typecheck
step "Layer 1b: lint"
pnpm lint
step "Layer 1c: architecture boundaries"
node scripts/check-boundaries.mjs
step "Layer 2: unit tests"
pnpm test
step "Layer 3: end-to-end"
pnpm e2e
step "Feature list integrity"
node scripts/wip.mjs status
echo
echo "verify: ALL LAYERS GREEN"
EOF
chmod +x verify.sh
Step 5 — Run it
sh
./verify.sh
If any layer fails, the script stops at that step and exits non-zero; the last ==> header printed names the failing layer. Fix the failure (do not bypass the check) and re-run. The reward for getting it green is the line verify: ALL LAYERS GREEN.
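Because set -e aborts the script at the first failing command, the last step header in the output identifies the layer that went red. A small sketch of reading that from a captured log follows; `failedStep` is a hypothetical helper, the "==> " prefix matches the step() function in verify.sh, and the sample log is made up for illustration.

```javascript
// failedStep is a hypothetical helper, not part of the course scripts.
function failedStep(log) {
  if (log.includes("verify: ALL LAYERS GREEN")) return null; // nothing failed
  const headers = log.split("\n").filter((line) => line.startsWith("==> "));
  // The script stops at the first failure, so the last header is the failing step.
  return headers.length > 0 ? headers[headers.length - 1].slice(4) : "before first step";
}

// Illustrative log of a run that stopped during lint.
const redLog = [
  "",
  "==> Layer 1a: typecheck",
  "",
  "==> Layer 1b: lint",
  "(tool output for the failing step)",
].join("\n");

console.log(failedStep(redLog)); // Layer 1b: lint
```

An agent can use the same idea to report which layer blocked a feature instead of a bare "verify failed".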
Step 6 — Wire verify.sh into the agent workflow
Update AGENTS.md's workflow:
md
## Workflow
1. `./init.sh` — must exit 0 before any work.
2. Read `PROGRESS.md` and pick the next item from `feature_list.json`.
3. `pnpm wip activate <id>`.
4. Implement.
5. `./verify.sh` — must be green before claiming done.
6. `pnpm wip pass <id>`.
7. Update `PROGRESS.md`, commit.
Update the routing file's hard-constraints block to include:
md
3. **Three layers, every time.** A feature is not done until `./verify.sh` exits 0.
Step 7 — Promote one review comment into a check
This is the key habit Module 09 leaves you with. Pick one recurring nit you have caught yourself making in this course (likely candidates: "src files use single quotes" or "no console.log in command files") and add it to eslint.config.js as a rule. The next agent that violates it gets stopped at Layer 1, not at code review.
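As a sketch of what a promoted rule might look like for the second candidate, the entry below scopes ESLint's core no-console rule to command files. The variable name `promotedFromReview` is hypothetical, and the glob assumes command files live under src/commands/, as the boundary rule suggests.

```javascript
// Sketch of a promoted rule as an ESLint flat-config entry (names are illustrative).
const promotedFromReview = [
  {
    // Assumption: command files live under src/commands/.
    files: ["src/commands/**/*.ts"],
    rules: {
      // Reviewer comment: "no console.log in command files".
      // `allow` keeps console.warn and console.error legal; other console methods are errors.
      "no-console": ["error", { allow: ["warn", "error"] }],
    },
  },
];

console.log(promotedFromReview[0].rules["no-console"][0]); // error
```

In a flat config, later entries override earlier ones for matching files, so spreading an array like this after the object that sets "no-console" to "off" would tighten the rule for command files only.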
Document it in docs/STYLE.md:
md
## Promoted from review
| Date       | Reviewer comment              | Lint rule                    |
|------------|-------------------------------|------------------------------|
| 2025-MM-DD | "src files use single quotes" | `quotes: ["error","single"]` |
Step 8 — Update artifacts and commit
PROGRESS.md Current State:
md
- Last verified: Module 09 checkpoint (`./verify.sh` exits 0, all three layers).
sh
git add .
git commit -q -m "module-09: three-layer verify.sh + boundary + e2e + first promoted lint"
Verification
sh
./verify.sh >/tmp/m09.log 2>&1 && \
grep -q "verify: ALL LAYERS GREEN" /tmp/m09.log && \
grep -q "Layer 1a: typecheck" /tmp/m09.log && \
grep -q "Layer 2: unit tests" /tmp/m09.log && \
grep -q "Layer 3: end-to-end" /tmp/m09.log && \
echo "M09 OK"
Expected:
M09 OK
Common pitfalls
- Skipping Layer 3 because Layer 2 passes. Component-boundary defects only show up when the binary runs against real input. The whole point of Module 09 is to refuse to declare done until Layer 3 is green.
- Letting verify.sh exit 0 on a partial run. set -euo pipefail at the top is non-negotiable. Without it, a failed step, an unset variable, or a failure inside a pipeline does not stop the script, and the final echo still reports green.
- Mocking the binary in Layer 3. Layer 3 must call ./bin/noted as a subprocess. If you find yourself importing TypeScript modules into the e2e test, you have built another Layer 2.
- Adding a "skip on CI" flag for the boundary check. Don't. Either fix the boundary violation or rewrite the rule. Skips compound.
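For the subprocess pitfall, one possible shape for the call is sketched below. `runBinary` is a hypothetical helper name. It uses spawnSync from node:child_process, which does not throw on a non-zero exit, so a test can assert on exit status, stdout, and stderr separately. The demo targets the Node executable so the sketch runs anywhere.

```javascript
import { spawnSync } from "node:child_process";

// runBinary is a hypothetical helper, not part of the course code.
// Arguments are passed as an array, so paths with spaces need no shell quoting.
function runBinary(command, args, cwd = process.cwd()) {
  const r = spawnSync(command, args, { cwd, encoding: "utf8" });
  if (r.error) throw r.error; // the binary could not be started at all
  return { status: r.status, stdout: r.stdout, stderr: r.stderr };
}

// Demonstrated against the Node executable; in the e2e suite the call
// would look like runBinary("./bin/noted", ["ask", "alpha"]).
const demo = runBinary(process.execPath, ["-e", "console.error('boom'); process.exit(2)"]);
console.log(demo.status); // 2
console.log(demo.stderr.trim()); // boom
```

Nothing in this helper imports project modules, which keeps the test on the Layer 3 side of the line.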
Next
Module 10 — Observability and Clean State. The pipeline tells you green or red; it does not yet tell you why a session went the way it did. Module 10 adds runtime logs, a sprint contract, and the five-dimension clean-state exit.
