---
name: ship-it
description: AFK autopilot. Drives a shell loop that works through every ready issue in the tracker (GitHub via gh, Gitea via tea), implementing each vertical slice end-to-end and committing per issue. Status streams to the terminal so the human can tail progress locally and Ctrl-C anytime. The shell is the loop; each iteration dispatches one fresh headless Claude run to ship one issue. Use when the user invokes /ship-it, says "go AFK on this", "work the backlog", "ralph the issues", or "ship everything".
---

# Ship It: AFK backlog autopilot

**Mode: AFK.** No human in the loop. Does not ask questions mid-run. If a slice is undecidable, the iteration labels the issue `needs-decision` and the loop moves on. The human gets one summary at the end, not chatter during.

## How this works (read before invoking)

The actual loop runs in a shell script: `.claude/skills/ship-it/loop.sh`. **The shell is the loop**, not you. Each iteration shells out to a fresh, headless `claude -p` invocation that processes exactly one issue using `.claude/skills/ship-it/iterate.md` as its prompt.

Three reasons this design beats "LLM keeps going inside one session":

1. **Fresh context per issue.** No drift, no accumulated history bloating the window.
2. **Visible in the terminal.** Progress streams to stdout and tees to a log file. The human can tail it from another shell, see commits land, and Ctrl-C cleanly.
3. **Survives session close.** Closing the interactive Claude window doesn't kill the loop. Re-attach by tailing the log.

## Files

- `loop.sh`: orchestrator. Tracker detection, preflight, dispatch loop, status output, stop conditions, summary.
- `iterate.md`: the prompt passed to each per-issue headless Claude. Read it; it defines what "shipped" means.
- `SKILL.md`: this file. When the user invokes `/ship-it`, you bootstrap and hand off.

## When the user invokes /ship-it

You (the interactive Claude) do the bootstrap, not the work. Concretely:

1.
   **Preflight in chat** (catches the obvious failures before the script runs):
   - `git status --porcelain` empty?
   - On `main` (or `$SHIP_IT_TRUNK`)? Up to date with origin?
   - `gh auth status` (or tea token) returns 0?
   - `gh issue list --state open --label slice | wc -l` ≥ 1?
2. **Show the plan** in one short block: tracker host, trunk branch, count of ready issues, the first 3 issue titles, the log path. Nothing more.
3. **Ask one question:** "Start? Reply `go`." This is the *only* human-in-the-loop checkpoint: kicking off AFK work is a real commitment and deserves an explicit ok.
4. **On `go`:** run the loop in the foreground so the user sees live output:
   ```
   bash .claude/skills/ship-it/loop.sh
   ```
   Do not background it. Do not pipe through anything that buffers. The user can Ctrl-C.
5. **While it runs:** stay silent. Don't interject. Don't "monitor" by re-reading logs in chat; the user has the terminal.
6. **When it exits:** read the final `==== ship-it summary ====` block from the log file and present it once with concrete next steps ("2 issues are `needs-decision`. Open them to answer their questions?").

## Following progress

The script logs to stdout AND tees to `.ship-it-logs/run-.log`. Tail from another terminal:

```bash
tail -f .ship-it-logs/run-*.log
```

Per-issue detail (everything the headless Claude did for that one issue) is in `.ship-it-logs/iter--.log`, which is useful for debugging a failed iteration.

Commits land in git as the loop runs.
Watch with:

```bash
watch -n 5 'git log --oneline -10 origin/main'
```

## Config (env vars, override before invoking)

| Var | Default | Purpose |
|---|---|---|
| `SHIP_IT_MAX` | 50 | Hard cap on iterations per run |
| `SHIP_IT_MAX_FAIL` | 3 | Consecutive failures before stop |
| `SHIP_IT_TRUNK` | `main` | Trunk branch name |
| `SHIP_IT_TIMEOUT` | `30m` | Per-issue timeout (kills the headless claude) |
| `SHIP_IT_LOG_DIR` | `/.ship-it-logs` | Where logs go |

## What each iteration does (per `iterate.md`)

For one issue: read it → branch from trunk → write failing e2e test at the outermost layer → implement layer by layer until the test passes → run the full suite → outermost-layer smoke check → commit (one commit, message ends `Closes #N`) → push → open PR with acceptance-criteria checkboxes + smoke evidence → wait for CI → merge if green and branch protection allows, else leave open for review → return to trunk → emit `ITERATION_RESULT:` line for the loop.

**Commit per issue:** yes, exactly one. Each slice gets a single commit that references the issue and lands on the branch before the PR opens. The slice scope was made small in `/prd-to-issues` precisely so this is one tight commit, not a series.

## Stop conditions (in priority order)

1. **User Ctrl-C** → trap catches SIGINT, current step finishes cleanly, summary prints, exit 130.
2. **Backlog empty** (no ready issues) → exit 0.
3. **Three consecutive hard failures** (the `SHIP_IT_MAX_FAIL` default) → exit 1. This signals something systemic: a bad dependency, branch protection blocking, a flaky env. Surfaces for human review.
4. **Precondition violated mid-run** → exit non-zero with reason.

## What "ready" means (the loop's filter)

An issue is `ready` iff:

- State is open
- Has label `slice` (filed by `/prd-to-issues`)
- Does NOT have label `blocked`, `needs-decision`, or `ci-failed`
- Is not a spike (spikes deliver decisions, not code; humans handle those)

Issues are processed in number order: walking-skeleton first, as `/prd-to-issues` ordered them.
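
The readiness filter above can be sketched as a small shell predicate. This is an illustration, not code taken from `loop.sh`: the `is_ready` name and its argument shape are hypothetical, and the sketch assumes spikes are marked with a `spike` label, which this file does not actually specify.

```bash
#!/usr/bin/env bash
# Sketch only: a hypothetical is_ready helper, not code taken from loop.sh.
# Usage: is_ready STATE LABEL...
# Returns 0 if the issue passes the filter described above, 1 otherwise.
is_ready() {
  local state=$1 label has_slice=0
  shift
  case "$state" in
    open|OPEN) ;;            # accept either spelling of the open state
    *) return 1 ;;
  esac
  for label in "$@"; do
    case "$label" in
      slice) has_slice=1 ;;
      blocked|needs-decision|ci-failed) return 1 ;;
      spike) return 1 ;;     # ASSUMPTION: spikes carry a `spike` label
    esac
  done
  [ "$has_slice" -eq 1 ]     # the slice label is required
}

# Demo on made-up issues, one per line: "number state label..."
while read -r number state labels; do
  # $labels is intentionally unquoted so each label becomes its own argument.
  if is_ready "$state" $labels; then
    echo "#$number ready"
  else
    echo "#$number skipped"
  fi
done <<'EOF'
12 open slice
13 open slice blocked
14 closed slice
15 open slice spike
16 open bug
EOF
```

In this demo only issue 12 is reported as ready. A real loop would presumably feed the predicate from the tracker (for example the JSON output of `gh issue list`) and sort the survivors numerically to get the number ordering described above.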

## Safety boundaries

The headless Claude is launched with a tool allowlist that excludes destructive operations. It cannot:

- Force-push or rewrite shared history
- Bypass branch protection or skip CI hooks (`--no-verify`, `--admin`)
- Auto-merge red or pending PRs (the iterate prompt forbids it, and CI gates back it up)
- Modify CI/CD config or IaC unless the slice's `Slice — layers touched` line explicitly names that layer
- Close issues without the outermost-layer smoke check passing
- Assign people or change milestones/projects

If something tries to push past these in practice (e.g. a slice "needs" a CI change to pass), the headless Claude should fail the iteration with `needs-decision` and let a human approve the scope expansion.

## What not to do

- **Don't drive the loop yourself by reading issues and implementing them inline.** The shell is the loop. If you're tempted to "just do this one in chat," stop and run the script.
- **Don't background the script** so the user can keep chatting with you. The output IS the value. The user wants to watch it work.
- **Don't summarize between iterations.** Chatter belongs in the final summary, not after each commit.
- **Don't tag the user in PR/issue comments** during the run. They're not in the loop until the script exits.
- **Don't restart a failed iteration manually.** The loop's `needs-decision` and `ci-failed` labels are how failures stay in the tracker for human triage. Manual restart skips that.

## How this fits the chain

`/grill-me ` (together) → `/prd` (together) → `/prd-to-issues` (mostly together, file step needs `create`) → `/ship-it` (AFK).

The four-skill arc takes a vague feature idea to merged code with one human checkpoint per phase boundary.
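
For readers who want a feel for how the stop conditions and the `SHIP_IT_MAX` / `SHIP_IT_MAX_FAIL` settings described earlier could interact, here is a minimal sketch of a dispatch loop with that shape. It is not the contents of `loop.sh`: `run_loop`, `next_ready_issue`, and `ship_one` are hypothetical names, the summary line format is invented, and returning 0 when the iteration cap is reached is an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: NOT the real loop.sh. All function names here are hypothetical.

: "${SHIP_IT_MAX:=50}"        # hard cap on iterations per run
: "${SHIP_IT_MAX_FAIL:=3}"    # consecutive failures before stop

interrupted=0
trap 'interrupted=1' INT      # record Ctrl-C; the flag is checked between steps

# Hypothetical hook: print the next ready issue number, or nothing if none.
next_ready_issue() { :; }
# Hypothetical hook: ship one issue; a non-zero status means a hard failure.
ship_one() { :; }

run_loop() {
  local shipped=0 fails=0 i issue
  for ((i = 0; i < SHIP_IT_MAX; i++)); do
    if ((interrupted)); then
      echo "==== ship-it summary ==== shipped=$shipped (interrupted)"
      return 130
    fi
    issue=$(next_ready_issue)
    if [ -z "$issue" ]; then
      echo "==== ship-it summary ==== shipped=$shipped (backlog empty)"
      return 0
    fi
    if ship_one "$issue"; then
      shipped=$((shipped + 1))
      fails=0                 # any success resets the failure streak
    else
      fails=$((fails + 1))
      if ((fails >= SHIP_IT_MAX_FAIL)); then
        echo "==== ship-it summary ==== shipped=$shipped ($fails consecutive failures)"
        return 1
      fi
    fi
  done
  # ASSUMPTION: hitting the iteration cap is treated as a normal exit.
  echo "==== ship-it summary ==== shipped=$shipped (hit SHIP_IT_MAX)"
  return 0
}

# With the no-op hooks above there is no backlog, so this returns 0 at once.
run_loop
```

The failure counter resets on every success, which is what makes the limit "consecutive" rather than cumulative. In a real script `ship_one` would presumably wrap the headless `claude -p` call in the per-issue timeout.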