.claude/skills/ship-it/SKILL.md

---
name: ship-it
description: AFK autopilot. Drives a shell loop that works through every ready issue in the tracker (GitHub via gh, Gitea via tea), implementing each vertical slice end-to-end and committing per issue. Status streams to the terminal so the human can tail progress locally and Ctrl-C anytime. The shell is the loop; each iteration dispatches one fresh headless Claude run to ship one issue. Use when the user invokes /ship-it, says "go AFK on this", "work the backlog", "ralph the issues", or "ship everything".
---

# Ship It — AFK backlog autopilot

**Mode: AFK.** No human in the loop. Does not ask questions mid-run. If a slice is undecidable, the iteration labels the issue `needs-decision` and the loop moves on. The human gets one summary at the end, not chatter during.

## How this works (read before invoking)

The actual loop runs in a shell script: `.claude/skills/ship-it/loop.sh`. **The shell is the loop**, not you. Each iteration shells out to a fresh, headless `claude -p` invocation that processes exactly one issue using `.claude/skills/ship-it/iterate.md` as its prompt. Three reasons this design beats "LLM keeps going inside one session":

1. **Fresh context per issue.** No drift, no accumulated history bloating the window.
2. **Visible in the terminal.** Progress streams to stdout and tees to a log file. The human can tail it from another shell, see commits land, and Ctrl-C cleanly.
3. **Survives session close.** Closing the interactive Claude window doesn't kill the loop. Re-attach by tailing the log.
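
The dispatch of a single iteration could be sketched roughly as follows. This is a minimal sketch, not the contents of the real `loop.sh`: it assumes GNU `timeout` is installed and that the iteration prompt is fed to `claude -p` on stdin. The `dispatch_issue` function, the `AGENT_CMD` override, the `SKILL_DIR` variable, and the `RUN_ID` format are hypothetical names introduced for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: one fresh headless run per issue, output teed to a per-issue log.
# dispatch_issue, AGENT_CMD, SKILL_DIR and the RUN_ID format are hypothetical.

SKILL_DIR="${SKILL_DIR:-.claude/skills/ship-it}"
SHIP_IT_TIMEOUT="${SHIP_IT_TIMEOUT:-30m}"
SHIP_IT_LOG_DIR="${SHIP_IT_LOG_DIR:-.ship-it-logs}"
RUN_ID="${RUN_ID:-$(date +%Y%m%d-%H%M%S)}"

# dispatch_issue ISSUE_NUMBER -> exit status of the agent (124 if it timed out)
dispatch_issue() {
  local issue="$1"
  local log="$SHIP_IT_LOG_DIR/iter-$RUN_ID-$issue.log"
  local -a cmd

  # Set AGENT_CMD (an array) to dry-run the loop without calling Claude.
  if (( ${#AGENT_CMD[@]} > 0 )); then
    cmd=("${AGENT_CMD[@]}")
  else
    # Assumption: iterate.md arrives on stdin, the issue number in the prompt.
    cmd=(claude -p "Ship issue #$issue following the instructions on stdin.")
  fi

  mkdir -p "$SHIP_IT_LOG_DIR"
  # Stream to the terminal and the log at once; no buffering stage in between.
  timeout "$SHIP_IT_TIMEOUT" "${cmd[@]}" < "$SKILL_DIR/iterate.md" 2>&1 | tee "$log"
  return "${PIPESTATUS[0]}"   # status of timeout/agent, not of tee
}
```

Because every call starts a new process, nothing carries over between issues except what is in git, the tracker, and the logs.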

## Files

- `loop.sh` — orchestrator. Tracker detection, preflight, dispatch loop, status output, stop conditions, summary.
- `iterate.md` — the prompt passed to each per-issue headless Claude. Read it; it defines what "shipped" means.
- `SKILL.md` — this file. When the user invokes `/ship-it`, you bootstrap and hand off.

## When the user invokes /ship-it

You (the interactive Claude) do the bootstrap, not the work. Concretely:

1. **Preflight in chat** (catches the obvious failures before the script runs):
   - `git status --porcelain` empty?
   - On `main` (or `$SHIP_IT_TRUNK`)? Up-to-date with origin?
   - `gh auth status` (or tea token) returns 0?
   - `gh issue list --state open --label slice | wc -l` ≥ 1?
2. **Show the plan** in one short block: tracker host, trunk branch, count of ready issues, the first 3 issue titles, the log path. Nothing more.
3. **Ask one question:** "Start? Reply `go`." This is the *only* human-in-the-loop checkpoint: kicking off AFK work is a real commitment and deserves an explicit OK.
4. **On `go`:** run the loop in the foreground so the user sees live output:
   ```bash
   bash .claude/skills/ship-it/loop.sh
   ```
   Do not background it. Do not pipe through anything that buffers. The user can Ctrl-C.
5. **While it runs:** stay silent. Don't interject. Don't "monitor" by re-reading logs in chat — the user has the terminal.
6. **When it exits:** read the final `==== ship-it summary ====` block from the log file, present it once with concrete next steps ("2 issues are `needs-decision` — open them to answer their questions?").
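
The preflight in step 1 can also be scripted so that every failure is reported at once. The following is a sketch under assumptions: it covers only the GitHub (`gh`) path, treats "up to date" as "HEAD equals `origin/<trunk>`", and the helper names (`check`, `preflight`, and the small predicate functions) are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: run every preflight check and report all failures together.
# check, preflight and the predicate helpers are hypothetical names.

SHIP_IT_TRUNK="${SHIP_IT_TRUNK:-main}"
preflight_failures=0

# check DESCRIPTION COMMAND... -> records a failure if COMMAND exits non-zero
check() {
  local description="$1"; shift
  if "$@" > /dev/null 2>&1; then
    echo "ok    $description"
  else
    echo "FAIL  $description"
    preflight_failures=$((preflight_failures + 1))
  fi
}

tree_is_clean() {
  local out
  out="$(git status --porcelain)" || return 1   # not a repo counts as failure
  [ -z "$out" ]
}
on_trunk() { [ "$(git rev-parse --abbrev-ref HEAD)" = "$SHIP_IT_TRUNK" ]; }
trunk_is_current() {
  # Stricter than "not behind": local HEAD must equal the remote trunk.
  git fetch --quiet origin "$SHIP_IT_TRUNK" &&
    [ "$(git rev-parse HEAD)" = "$(git rev-parse "origin/$SHIP_IT_TRUNK")" ]
}
has_slice_issues() { gh issue list --state open --label slice | grep -q .; }

preflight() {
  preflight_failures=0
  check "working tree clean"            tree_is_clean
  check "on $SHIP_IT_TRUNK"             on_trunk
  check "up to date with origin"        trunk_is_current
  check "gh authenticated"              gh auth status
  check "at least one open slice issue" has_slice_issues
  (( preflight_failures == 0 ))
}
```

Calling `preflight` prints one line per check and returns non-zero if any of them failed, which is the point at which the plan should not be shown.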

## Following progress

The script logs to stdout AND tees to `.ship-it-logs/run-<RUN_ID>.log`. Tail from another terminal:

```bash
tail -f .ship-it-logs/run-*.log
```

Per-issue detail (everything the headless Claude did for that one issue) is in `.ship-it-logs/iter-<RUN_ID>-<ISSUE>.log` — useful for debugging a failed iteration.

Commits land in git as the loop runs. Watch with (substitute your trunk for `main` if `SHIP_IT_TRUNK` is set):

```bash
watch -n 5 'git log --oneline -10 origin/main'
```

## Config (env vars, override before invoking)

| Var | Default | Purpose |
|---|---|---|
| `SHIP_IT_MAX` | 50 | Hard cap on iterations per run |
| `SHIP_IT_MAX_FAIL` | 3 | Consecutive failures before stop |
| `SHIP_IT_TRUNK` | `main` | Trunk branch name |
| `SHIP_IT_TIMEOUT` | `30m` | Per-issue timeout; the headless `claude` run is killed when it expires |
| `SHIP_IT_LOG_DIR` | `<repo>/.ship-it-logs` | Where logs go |
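
One common way to implement defaults like those in the table is shell parameter expansion. The snippet below is a sketch of that pattern; how `loop.sh` actually reads its settings is an assumption, and `repo_root` is a hypothetical helper variable.

```bash
#!/usr/bin/env bash
# Sketch only: take each setting from the environment, else use the table default.

SHIP_IT_MAX="${SHIP_IT_MAX:-50}"
SHIP_IT_MAX_FAIL="${SHIP_IT_MAX_FAIL:-3}"
SHIP_IT_TRUNK="${SHIP_IT_TRUNK:-main}"
SHIP_IT_TIMEOUT="${SHIP_IT_TIMEOUT:-30m}"

# Fall back to the current directory when not inside a git repository.
repo_root="$(git rev-parse --show-toplevel 2>/dev/null || pwd)"
SHIP_IT_LOG_DIR="${SHIP_IT_LOG_DIR:-$repo_root/.ship-it-logs}"

printf 'max=%s max_fail=%s trunk=%s timeout=%s logs=%s\n' \
  "$SHIP_IT_MAX" "$SHIP_IT_MAX_FAIL" "$SHIP_IT_TRUNK" \
  "$SHIP_IT_TIMEOUT" "$SHIP_IT_LOG_DIR"
```

With this pattern a one-off override is just a prefix on the command, for example `SHIP_IT_TRUNK=development SHIP_IT_MAX=10 bash .claude/skills/ship-it/loop.sh`.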

## What each iteration does (per `iterate.md`)

For one issue: read it → branch from trunk → write failing e2e test at the outermost layer → implement layer by layer until the test passes → run the full suite → outermost-layer smoke check → commit (one commit, message ends `Closes #N`) → push → open PR with acceptance-criteria checkboxes + smoke evidence → wait for CI → merge if green and branch protection allows, else leave open for review → return to trunk → emit `ITERATION_RESULT:` line for the loop.

**Commit per issue:** yes, exactly one. A single commit per slice, referencing the issue, lands on the branch before the PR opens. The slice scope was kept small in `/prd-to-issues` precisely so that it fits in one tight commit, not a series.
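
The `ITERATION_RESULT:` line is what lets the shell decide what happened without understanding the rest of the output. A minimal sketch of how the loop might read it back from the per-issue log follows. The payload format is defined in `iterate.md`, so the function name `iteration_result` and the status words in the usage example are assumptions.

```bash
#!/usr/bin/env bash
# Sketch only. The text after "ITERATION_RESULT:" is defined by iterate.md;
# this helper just returns it verbatim.

# iteration_result LOGFILE -> prints the payload of the last ITERATION_RESULT: line
iteration_result() {
  local log="$1" line
  line="$(grep '^ITERATION_RESULT:' "$log" 2>/dev/null | tail -n 1)"
  [ -n "$line" ] || return 1            # no result line: caller treats it as a failure
  line="${line#ITERATION_RESULT:}"      # drop the marker
  # Strip leading whitespace left after the colon.
  printf '%s\n' "${line#"${line%%[![:space:]]*}"}"
}
```

Taking the last matching line means an iteration that logs an early result and then a corrected one is judged by the final word, e.g. `result="$(iteration_result "$log")" || result="failed"`.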

## Stop conditions (in priority order)

1. **User Ctrl-C** → trap catches SIGINT, current step finishes cleanly, summary prints, exit 130.
2. **Backlog empty** (no ready issues) → exit 0.
3. **`SHIP_IT_MAX_FAIL` consecutive hard failures** (default 3) → exit 1. A streak like that usually means something systemic (bad dependency, branch protection blocking, flaky env), so the loop stops and surfaces it for human review.
4. **Precondition violated mid-run** → exit non-zero with reason.
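
The stop conditions above can be sketched as a small control loop. This is an illustration under assumptions, not the real `loop.sh`: `next_ready_issue`, `ship_one`, and `preconditions_hold` are hypothetical hooks that the caller must define, exit code 2 for a violated precondition is a placeholder for "non-zero", and returning 0 when the iteration cap is reached is also an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: the stop conditions, checked in the priority order listed above.
# next_ready_issue, ship_one and preconditions_hold are hypothetical hooks.

SHIP_IT_MAX="${SHIP_IT_MAX:-50}"
SHIP_IT_MAX_FAIL="${SHIP_IT_MAX_FAIL:-3}"

interrupted=0
trap 'interrupted=1' INT   # Ctrl-C only sets a flag; it is checked between iterations

run_loop() {
  local iterations=0 consecutive_failures=0 issue
  while (( iterations < SHIP_IT_MAX )); do
    (( interrupted )) && return 130                              # 1. user Ctrl-C
    issue="$(next_ready_issue)"
    [ -n "$issue" ] || return 0                                  # 2. backlog empty
    (( consecutive_failures >= SHIP_IT_MAX_FAIL )) && return 1   # 3. systemic failure
    preconditions_hold || return 2                               # 4. code 2 is an assumption

    if ship_one "$issue"; then
      consecutive_failures=0        # any success resets the streak
    else
      consecutive_failures=$((consecutive_failures + 1))
    fi
    iterations=$((iterations + 1))
  done
  return 0   # iteration cap reached; treating that as success is an assumption
}
```

A caller would print the summary block after `run_loop` returns and then exit with its status.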

## What "ready" means (the loop's filter)

An issue is `ready` iff:
- State is open
- Has label `slice` (filed by `/prd-to-issues`)
- Does NOT have label `blocked`, `needs-decision`, or `ci-failed`
- Is not a spike (spikes deliver decisions, not code — humans handle those)

Issues are processed in number order — walking-skeleton first, as `/prd-to-issues` ordered them.
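
Expressed as code, the filter might look like the sketch below. It assumes each issue is available as a line of the form `NUMBER STATE LABEL...` and that spikes carry a `spike` label. Both are assumptions made for illustration, as are the function names `is_ready` and `ready_numbers`.

```bash
#!/usr/bin/env bash
# Sketch only. Assumes spikes are marked with a "spike" label, which the
# document does not confirm.

# is_ready STATE LABEL... -> exit 0 if the issue passes the filter
is_ready() {
  local state="$1"; shift
  local has_slice=0 label
  case "$state" in open|OPEN) ;; *) return 1 ;; esac
  for label in "$@"; do
    case "$label" in
      slice) has_slice=1 ;;
      blocked|needs-decision|ci-failed|spike) return 1 ;;
    esac
  done
  (( has_slice ))
}

# ready_numbers: reads "NUMBER STATE LABEL..." lines on stdin and prints the
# numbers of ready issues in ascending order.
ready_numbers() {
  local -a fields
  while read -r -a fields; do
    is_ready "${fields[1]}" "${fields[@]:2}" && echo "${fields[0]}"
  done | sort -n
}
```

Sorting numerically rather than lexically matters once issue numbers pass 9: issue 7 has to come before issue 12.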

## Safety boundaries

The headless Claude is launched with a tool allowlist that excludes destructive operations. It cannot:

- Force-push or rewrite shared history
- Bypass branch protection or skip CI hooks (`--no-verify`, `--admin`)
- Auto-merge red or pending PRs (the iterate prompt forbids it, and CI gates back it up)
- Modify CI/CD config or IaC unless the slice's `Slice — layers touched` line explicitly names that layer
- Close issues without the outermost-layer smoke check passing
- Assign people or change milestones/projects

If a slice runs into one of these boundaries in practice (e.g. it "needs" a CI change to pass), the iteration should fail with `needs-decision` so that a human can approve the scope expansion.
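
As a complement to the tool allowlist, a few of these boundaries could also be checked mechanically on a command string. The sketch below is a hypothetical defence-in-depth helper, not how the allowlist itself works. The function name `violates_boundary` is an assumption, and the patterns cover only the flags named above.

```bash
#!/usr/bin/env bash
# Sketch only: a hypothetical guard, separate from the tool allowlist.

# violates_boundary COMMAND_STRING -> exit 0 if the command crosses a boundary
violates_boundary() {
  local cmd=" $1 "   # pad so flags at either end match the spaced patterns
  case "$cmd" in
    *" git push"*" --force"*) return 0 ;;      # force-push, incl. --force-with-lease
    *" git push"*" -f "*)     return 0 ;;      # short form of --force
    *" --no-verify "*)        return 0 ;;      # skips git hooks
    *" gh pr merge"*" --admin "*) return 0 ;;  # bypasses branch protection
  esac
  return 1
}
```

String matching like this is easy to evade, so it is only useful as a tripwire that fails the iteration early. It does not replace the allowlist or branch protection.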

## What not to do

- **Don't drive the loop yourself by reading issues and implementing them inline.** The shell is the loop. If you're tempted to "just do this one in chat," stop and run the script.
- **Don't background the script** so the user can keep chatting with you. The output IS the value. The user wants to watch it work.
- **Don't summarize between iterations.** Chatter belongs in the final summary, not after each commit.
- **Don't tag the user in PR/issue comments** during the run. They're not in the loop until the script exits.
- **Don't restart a failed iteration manually.** The loop's `needs-decision` and `ci-failed` labels are how failures stay in the tracker for human triage. Manual restart skips that.

## How this fits the chain

`/grill-me <feature>` (together) → `/prd` (together) → `/prd-to-issues` (mostly together, file step needs `create`) → `/ship-it` (AFK). The four-skill arc takes a vague feature idea to merged code with one human checkpoint per phase boundary.

Commit (2026-05-21 16:27:15 +02:00):

`chore(skills): add workflow chain — grill-me → prd → prd-to-issues → ship-it`

Four workflow skills that take a feature from fuzzy idea to merged code. Two human-in-the-loop phases (grill-me, prd), one mostly-together (prd-to-issues files only on explicit 'create'), and one AFK (ship-it).

| Skill | Mode | Summary |
|---|---|---|
| grill-me | TOGETHER | pressure-test the idea with hard interview questions |
| prd | TOGETHER | synthesize PRD; gaps stay explicit, not papered over |
| prd-to-issues | MOSTLY | thin vertical-slice issues with coverage matrix + per-issue Slice check; self-audits before showing |
| ship-it | AFK | shell loop ships each slice end-to-end with one commit per issue, status streams to terminal, Ctrl-C-able, survives session close |

Vertical-slice principle throughout: every issue cuts end-to-end through every integration layer (no horizontal "do all the DB work first" issues). The AFK loop only ships against acceptance criteria already locked in by the PRD phase; autonomous code never runs against undefined contracts.

ship-it tracker support: gh (GitHub) and tea (Gitea). For this repo, set `SHIP_IT_TRUNK=development` to override the `main` default.

See `.claude/skills/README.md` for the full how-to and a worked example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>