Pairs with: Lecture 03 — Why the Repository Must Become the System of Record. Time: ~60 min. Difficulty: Intermediate. Prerequisites: Module 02 checkpoint.

Module 03. Repo as System of Record

Why this module

After Module 02 your code works, but the repository does not explain itself. A new agent (or a teammate, or you in two weeks) opens it and sees src/ plus .noted/. Why does the project exist? What is its contract? Where does new work go? Lecture 03 calls this absence a knowledge visibility gap: the proportion of project knowledge that lives in someone's head instead of on disk. This module starts closing that gap for noted-cli.

The test of success is the cold-start test: open the repo in a fresh terminal with no history and answer five questions using only the files. If you cannot, the agent cannot.

Concepts

  • System of record — the authoritative source of project truth. For a harness, that source must be the repository. Chat logs, Slack, and your own memory are not durable; the file tree is.
  • Cold-start test — open the repo in a brand-new agent session and ask five questions:
    1. What does this project do?
    2. How do I run it?
    3. How do I verify it works?
    4. What is currently in progress?
    5. What is the next thing to do?
  • Knowledge visibility gap — the share of those answers that are not derivable from files in the repo. Target is zero.
  • ACID for state — the same guarantees databases give to data, applied to harness state: atomic writes, consistent across files, isolated from in-memory edits, durable on disk.
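
The "atomic writes" part of ACID for state can be sketched as a write-to-temp-then-rename helper. This is a minimal sketch, not the course's src/store/io.ts: the names writeJsonAtomic and readJson and the temp-file naming are assumptions made for illustration.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: write the full JSON to a sibling temp file, then
// rename it over the target. A reader sees the old file or the new one,
// never a half-written file (rename is atomic on POSIX within one filesystem).
function writeJsonAtomic(path: string, value: unknown): void {
  const tmp = `${path}.${process.pid}.tmp`;
  writeFileSync(tmp, JSON.stringify(value, null, 2) + "\n", "utf8");
  renameSync(tmp, path);
}

// Hypothetical helper: read and parse a JSON state file.
function readJson<T>(path: string): T {
  return JSON.parse(readFileSync(path, "utf8")) as T;
}

// Usage against a throwaway directory, so no real .noted/ state is touched.
const dir = mkdtempSync(join(tmpdir(), "noted-"));
const target = join(dir, "notes.json");
writeJsonAtomic(target, { notes: [] });
writeJsonAtomic(target, { notes: [{ id: "n1" }] });
const roundTrip = readJson<{ notes: { id: string }[] }>(target);
console.log(roundTrip.notes.length);
```

The sketch does not call fsync, so it illustrates atomicity rather than full durability.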

→ Read Lecture 03 for the long-form treatment, including the empirical "rebuild cost" measurement protocol.

Lab

Step 1 — Add the routing entry file

Copy ../../resources/templates/AGENTS.md into your repo root, then make these two edits:

  • Replace the H1 with # AGENTS.md — noted-cli.
  • Replace the Required Artifacts list with these four files. This module creates the first three; feature_list.json arrives in Module 08:
md
## Required Artifacts

- `AGENTS.md` (you are here): routing file for any agent session
- `docs/ARCHITECTURE.md`: code map and data flow
- `docs/PRODUCT.md`: scope, non-goals, definition of done
- `feature_list.json`: source of truth for feature state (added in Module 08)

Leave the rest of the template intact. Module 04 will deliberately over-stuff this file so we can practice splitting it.
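
Because AGENTS.md is a routing file, every path it lists should eventually exist on disk. One way to check this, sketched under the assumption that each bullet starts with a backticked path as in the list above, is to extract those paths from the Required Artifacts section. The function name extractArtifactPaths and the "Unrelated Section" heading in the sample are hypothetical.

```typescript
// Hypothetical helper: collect the backticked path that starts each bullet
// under "## Required Artifacts", ignoring bullets in other sections.
function extractArtifactPaths(agentsMd: string): string[] {
  const paths: string[] = [];
  let inSection = false;
  for (const line of agentsMd.split("\n")) {
    if (line.startsWith("## ")) {
      inSection = line.trim() === "## Required Artifacts";
      continue;
    }
    const match = inSection ? /^- `([^`]+)`/.exec(line) : null;
    if (match) paths.push(match[1]);
  }
  return paths;
}

const sample = [
  "## Required Artifacts",
  "",
  "- `AGENTS.md` (you are here): routing file for any agent session",
  "- `docs/ARCHITECTURE.md`: code map and data flow",
  "- `docs/PRODUCT.md`: scope, non-goals, definition of done",
  "- `feature_list.json`: source of truth for feature state (added in Module 08)",
  "",
  "## Unrelated Section",
  "- `ignored.md`",
].join("\n");

const artifactPaths = extractArtifactPaths(sample);
console.log(artifactPaths);
```

Each returned path could then be passed to existsSync from node:fs to report which artifacts are still missing.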

Step 2 — Write docs/ARCHITECTURE.md

sh
mkdir -p docs
md
# Architecture

`noted-cli` is a single Node 20 + TypeScript binary that ingests markdown
notes, builds a token index, and answers keyword queries.

## Layout

- `bin/noted` — executable shim, resolves `tsx` from local `node_modules`.
- `src/cli.ts` — argv dispatcher; the only file that calls `process.exit`.
- `src/commands/<verb>.ts` — one file per CLI verb, exports `run<Verb>(...)`.
- `src/store/types.ts` — JSON schema types for on-disk state.
- `src/store/io.ts` — read/write helpers for `.noted/*.json`.
- `.noted/` — local state directory; never committed.

## Data flow

markdown files                .noted/notes.json                    .noted/index.json
     │                                │                                    │
     ▼                                ▼                                    ▼
noted import ───────────► notes (id,path,title,body) ──► noted index ──► tokens
                                                                           │
                                                                           ▼
                                                                noted ask (Module 04+)


## Boundaries

- `src/commands/*` may import from `src/store/*`, never the other way around.
- Only `src/cli.ts` is allowed to call `process.exit`. Everything else returns an
  exit code as a number.
- All file I/O goes through `src/store/io.ts`. No `fs` calls scattered through
  command files.

Save that as docs/ARCHITECTURE.md.
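
The exit-code boundary in that file can be illustrated with a small dispatcher. This is a sketch under assumptions, not the course's src/cli.ts: the command bodies, the usage message, and the specific exit codes (1 and 2) are placeholders chosen for the example.

```typescript
// Each verb returns an exit code instead of ending the process itself.
type Command = (args: string[]) => number;

// Hypothetical stand-ins for src/commands/import.ts and src/commands/index.ts.
function runImport(args: string[]): number {
  if (args.length === 0) {
    console.error("usage: noted import <dir>");
    return 2;
  }
  return 0;
}

function runIndex(_args: string[]): number {
  return 0;
}

// A Map avoids accidental hits on inherited object properties such as "toString".
const commands = new Map<string, Command>([
  ["import", runImport],
  ["index", runIndex],
]);

// The dispatcher also returns a number, so tests can call it directly.
function dispatch(argv: string[]): number {
  const [verb, ...rest] = argv;
  const command = verb === undefined ? undefined : commands.get(verb);
  if (!command) {
    console.error(`unknown command: ${verb ?? "(none)"}`);
    return 1;
  }
  return command(rest);
}

// In src/cli.ts, and only there:
//   process.exit(dispatch(process.argv.slice(2)));
```

Keeping process.exit out of the command files means a test can assert on the returned number without the test runner being terminated.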

Step 3 — Write docs/PRODUCT.md

md
# Product

`noted-cli` is a personal research-notes tool. The intended user runs it
locally against a directory of markdown files and asks free-form questions.

## In scope

- Ingest markdown files from a directory tree.
- Maintain a flat token index over titles and bodies.
- Answer keyword queries with the top-k matching note ids and snippets.
- Track a `feature_list.json` and `PROGRESS.md` so any session can resume.

## Out of scope

- Embeddings, vector search, or any external API call.
- A GUI, a server, or any networked endpoint.
- Multi-user state or collaboration.
- Any storage format other than JSON in `.noted/`.

## Definition of Done (project-wide)

A change is done when:

1. `pnpm test` passes locally.
2. `./verify.sh` (added in Module 09) exits 0.
3. The `feature_list.json` entry for the change has `state: "passing"` and
   non-empty `evidence`.
4. `PROGRESS.md` records the change in its session log.

Save that as docs/PRODUCT.md.
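
Item 3 of the Definition of Done can be expressed as a predicate. The real feature_list.json schema is introduced in Module 08, so this is only a sketch: the state and evidence fields come from the text above, while the id field, the evidence type, and the function name are assumptions.

```typescript
// Assumed shape: only `state` and `evidence` are named in the Definition of Done.
interface FeatureEntry {
  id: string; // hypothetical identifier
  state: string;
  evidence: string | string[]; // assumed: a single note or a list of notes
}

// True when the entry is marked passing and carries at least one non-blank
// piece of evidence.
function meetsFeatureListCriterion(entry: FeatureEntry): boolean {
  const evidence = Array.isArray(entry.evidence) ? entry.evidence : [entry.evidence];
  const hasEvidence = evidence.some((item) => item.trim().length > 0);
  return entry.state === "passing" && hasEvidence;
}

const example: FeatureEntry = {
  id: "import-markdown",
  state: "passing",
  evidence: ["pnpm test output attached"],
};
console.log(meetsFeatureListCriterion(example));
```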

Step 4 — Add a one-screen README.md

A new contributor (human or agent) hits the README.md first. Keep it tight.

md
# noted-cli

A tiny markdown-notes CLI. See [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md)
and [`docs/PRODUCT.md`](docs/PRODUCT.md) for design and scope. Agents start
at [`AGENTS.md`](AGENTS.md).

## Run

```sh
pnpm install
./bin/noted --help
```

## Verify

```sh
pnpm test
```

Save that as README.md.

Step 5 — Run the cold-start test on yourself

Close every editor tab. Open a new terminal in the repo root. Without using your memory, answer the five questions out loud, citing the file each answer comes from.

| Question                                  | Source file you cite |
|-------------------------------------------|---------------------|
| What does this project do?                | `README.md` → `docs/PRODUCT.md` |
| How do I run it?                          | `README.md`         |
| How do I verify it works?                 | `README.md` → `docs/PRODUCT.md` (Definition of Done) |
| What is currently in progress?            | `AGENTS.md` (Module 05 will add `PROGRESS.md`) |
| What is the next thing to do?             | `AGENTS.md` (Module 08 will add `feature_list.json`) |

Two answers point at files that are added later (PROGRESS.md in Module 05, feature_list.json in Module 08). That is fine: those modules are when the answers stop being placeholders. Record the two gaps in the cold-start log (next step) so you remember to close them.

Step 6 — Record the test result

Create docs/cold-start-log.md with this content:

md
# Cold-start log

## 2025-MM-DD (after Module 03)

Ran the cold-start test. Q1, Q2, Q3 answered fully from files. Q4 and Q5
have placeholders pending Module 05 (`PROGRESS.md`) and Module 08
(`feature_list.json`).

Knowledge visibility gap (5 questions, 2 placeholders): 40%. Target by end
of Module 08: 0%.
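
The 40% figure in that log is the share of cold-start questions whose answer is still a placeholder. A minimal sketch of the arithmetic follows; the ColdStartAnswer type and the function name are hypothetical, and the data mirrors the Step 5 table.

```typescript
interface ColdStartAnswer {
  question: string;
  source: string;
  placeholder: boolean; // true when the cited file does not hold the answer yet
}

// Percentage of questions that cannot yet be answered from files in the repo.
function knowledgeVisibilityGap(answers: ColdStartAnswer[]): number {
  if (answers.length === 0) return 0;
  const missing = answers.filter((a) => a.placeholder).length;
  return (missing * 100) / answers.length;
}

const afterModule03: ColdStartAnswer[] = [
  { question: "What does this project do?", source: "README.md", placeholder: false },
  { question: "How do I run it?", source: "README.md", placeholder: false },
  { question: "How do I verify it works?", source: "README.md", placeholder: false },
  { question: "What is currently in progress?", source: "AGENTS.md", placeholder: true },
  { question: "What is the next thing to do?", source: "AGENTS.md", placeholder: true },
];

console.log(`${knowledgeVisibilityGap(afterModule03)}%`); // prints "40%"
```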

Step 7 — Commit

sh
git add .
git commit -q -m "module-03: AGENTS.md, ARCHITECTURE.md, PRODUCT.md, cold-start log"

Verification

sh
test -f AGENTS.md && \
test -f README.md && \
test -f docs/ARCHITECTURE.md && \
test -f docs/PRODUCT.md && \
test -f docs/cold-start-log.md && \
grep -q "Definition of Done" docs/PRODUCT.md && \
echo "M03 OK"

Expected:

M03 OK

Common pitfalls

  • Writing a 2,000-word AGENTS.md. Keep it routing-only; details belong in docs/. Module 04 explicitly attacks bloat.
  • Putting code rules in PRODUCT.md. PRODUCT.md is scope and non-goals. Architecture rules go in docs/ARCHITECTURE.md.
  • Cold-start test from memory. If you "remember" the answer, it is not in the repo. Open the file you would cite and verify the words are actually there.
  • Letting README.md and docs/PRODUCT.md drift. README points at docs/. Do not duplicate the content; link to it.

Next

Module 04 — Splitting Instructions. You will deliberately bloat AGENTS.md with everything the agent might need, then refactor it into a routing file plus topic docs.