Four workflow skills that take a feature from fuzzy idea to merged code.
Two human-in-the-loop phases (grill-me, prd), one mostly-together (prd-to-issues
files only on explicit 'create'), and one AFK (ship-it).
grill-me TOGETHER pressure-test the idea with hard interview questions
prd TOGETHER synthesize PRD; gaps stay explicit, not papered over
prd-to-issues MOSTLY thin vertical-slice issues with coverage matrix +
per-issue Slice check; self-audits before showing
ship-it AFK shell loop ships each slice end-to-end with one
commit per issue, status streams to terminal,
Ctrl-C-able, survives session close
Vertical-slice principle throughout: every issue cuts end-to-end through every
integration layer (no horizontal "do all the DB work first" issues). The
AFK loop only ships against acceptance criteria already locked in by the PRD
phase — autonomous code never runs against undefined contracts.
ship-it tracker support: gh (GitHub) and tea (Gitea). For this repo, set
SHIP_IT_TRUNK=development to override the main default.
See .claude/skills/README.md for the full how-to and a worked example.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
116 lines
7.1 KiB
Markdown
116 lines
7.1 KiB
Markdown
---
|
|
name: ship-it
|
|
description: AFK autopilot. Drives a shell loop that works through every ready issue in the tracker (GitHub via gh, Gitea via tea), implementing each vertical slice end-to-end and committing per issue. Status streams to the terminal so the human can tail progress locally and Ctrl-C anytime. The shell is the loop; each iteration dispatches one fresh headless Claude run to ship one issue. Use when the user invokes /ship-it, says "go AFK on this", "work the backlog", "ralph the issues", or "ship everything".
|
|
---
|
|
|
|
# Ship It — AFK backlog autopilot
|
|
|
|
**Mode: AFK.** No human in the loop. Does not ask questions mid-run. If a slice is undecidable, the iteration labels the issue `needs-decision` and the loop moves on. The human gets one summary at the end, not chatter during.
|
|
|
|
## How this works (read before invoking)
|
|
|
|
The actual loop runs in a shell script: `.claude/skills/ship-it/loop.sh`. **The shell is the loop**, not you. Each iteration shells out to a fresh, headless `claude -p` invocation that processes exactly one issue using `.claude/skills/ship-it/iterate.md` as its prompt. Three reasons this design beats "LLM keeps going inside one session":
|
|
|
|
1. **Fresh context per issue.** No drift, no accumulated history bloating the window.
|
|
2. **Visible in the terminal.** Progress streams to stdout and tees to a log file. The human can tail it from another shell, see commits land, and Ctrl-C cleanly.
|
|
3. **Survives session close.** Closing the interactive Claude window doesn't kill the loop. Re-attach by tailing the log.
|
|
|
|
## Files
|
|
|
|
- `loop.sh` — orchestrator. Tracker detection, preflight, dispatch loop, status output, stop conditions, summary.
|
|
- `iterate.md` — the prompt passed to each per-issue headless Claude. Read it; it defines what "shipped" means.
|
|
- `SKILL.md` — this file. When the user invokes `/ship-it`, you bootstrap and hand off.
|
|
|
|
## When the user invokes /ship-it
|
|
|
|
You (the interactive Claude) do the bootstrap, not the work. Concretely:
|
|
|
|
1. **Preflight in chat** (catches the obvious failures before the script runs):
|
|
- `git status --porcelain` empty?
|
|
- On `main` (or `$SHIP_IT_TRUNK`)? Up-to-date with origin?
|
|
- `gh auth status` (or tea token) returns 0?
|
|
- `gh issue list --state open --label slice | wc -l` ≥ 1?
|
|
2. **Show the plan** in one short block: tracker host, trunk branch, count of ready issues, the first 3 issue titles, the log path. Nothing more.
|
|
3. **Ask one question:** "Start? Reply `go`." This is the *only* human-in-the-loop checkpoint — kicking off AFK work is a real commitment, deserves an explicit ok.
|
|
4. **On `go`:** run the loop in the foreground so the user sees live output:
|
|
```
|
|
bash .claude/skills/ship-it/loop.sh
|
|
```
|
|
Do not background it. Do not pipe through anything that buffers. The user can Ctrl-C.
|
|
5. **While it runs:** stay silent. Don't interject. Don't "monitor" by re-reading logs in chat — the user has the terminal.
|
|
6. **When it exits:** read the final `==== ship-it summary ====` block from the log file, present it once with concrete next steps ("2 issues are `needs-decision` — open them to answer their questions?").
|
|
|
|
## Following progress
|
|
|
|
The script logs to stdout AND tees to `.ship-it-logs/run-<RUN_ID>.log`. Tail from another terminal:
|
|
|
|
```bash
|
|
tail -f .ship-it-logs/run-*.log
|
|
```
|
|
|
|
Per-issue detail (everything the headless Claude did for that one issue) is in `.ship-it-logs/iter-<RUN_ID>-<ISSUE>.log` — useful for debugging a failed iteration.
|
|
|
|
Commits land in git as the loop runs. Watch with:
|
|
|
|
```bash
|
|
watch -n 5 'git log --oneline -10 origin/main'
|
|
```
|
|
|
|
## Config (env vars, override before invoking)
|
|
|
|
| Var | Default | Purpose |
|
|
|---|---|---|
|
|
| `SHIP_IT_MAX` | 50 | Hard cap on iterations per run |
|
|
| `SHIP_IT_MAX_FAIL` | 3 | Consecutive failures before stop |
|
|
| `SHIP_IT_TRUNK` | `main` | Trunk branch name |
|
|
| `SHIP_IT_TIMEOUT` | `30m` | Per-issue timeout (kills the headless claude) |
|
|
| `SHIP_IT_LOG_DIR` | `<repo>/.ship-it-logs` | Where logs go |
|
|
|
|
## What each iteration does (per `iterate.md`)
|
|
|
|
For one issue: read it → branch from trunk → write failing e2e test at the outermost layer → implement layer by layer until the test passes → run the full suite → outermost-layer smoke check → commit (one commit, message ends `Closes #N`) → push → open PR with acceptance-criteria checkboxes + smoke evidence → wait for CI → merge if green and branch protection allows, else leave open for review → return to trunk → emit `ITERATION_RESULT:` line for the loop.
|
|
|
|
**Commit per issue:** yes, exactly. One commit per slice, referenced to the issue, lands on the branch before the PR opens. The slice scope was made small in `/prd-to-issues` precisely so this is one tight commit, not a series.
|
|
|
|
## Stop conditions (in priority order)
|
|
|
|
1. **User Ctrl-C** → trap catches SIGINT, current step finishes cleanly, summary prints, exit 130.
|
|
2. **Backlog empty** (no ready issues) → exit 0.
|
|
3. **Three consecutive hard failures** → exit 1. Something systemic — bad dependency, branch protection blocking, flaky env. Surfaces for human review.
|
|
4. **Precondition violated mid-run** → exit non-zero with reason.
|
|
|
|
## What "ready" means (the loop's filter)
|
|
|
|
An issue is `ready` iff:
|
|
- State is open
|
|
- Has label `slice` (filed by `/prd-to-issues`)
|
|
- Does NOT have label `blocked`, `needs-decision`, or `ci-failed`
|
|
- Is not a spike (spikes deliver decisions, not code — humans handle those)
|
|
|
|
Issues are processed in number order — walking-skeleton first, as `/prd-to-issues` ordered them.
|
|
|
|
## Safety boundaries
|
|
|
|
The headless Claude is launched with a tool allowlist that excludes destructive operations. It cannot:
|
|
|
|
- Force-push or rewrite shared history
|
|
- Bypass branch protection or skip CI hooks (`--no-verify`, `--admin`)
|
|
- Auto-merge red or pending PRs (the iterate prompt forbids it, and CI gates back it up)
|
|
- Modify CI/CD config or IaC unless the slice's `Slice — layers touched` line explicitly names that layer
|
|
- Close issues without the outermost-layer smoke check passing
|
|
- Assign people or change milestones/projects
|
|
|
|
If something tries to push past these in practice (e.g. a slice "needs" a CI change to pass), it should fail the iteration with `needs-decision` and let a human approve the scope expansion.
|
|
|
|
## What not to do
|
|
|
|
- **Don't drive the loop yourself by reading issues and implementing them inline.** The shell is the loop. If you're tempted to "just do this one in chat," stop and run the script.
|
|
- **Don't background the script** so the user can keep chatting with you. The output IS the value. The user wants to watch it work.
|
|
- **Don't summarize between iterations.** Chatter belongs in the final summary, not after each commit.
|
|
- **Don't tag the user in PR/issue comments** during the run. They're not in the loop until the script exits.
|
|
- **Don't restart a failed iteration manually.** The loop's `needs-decision` and `ci-failed` labels are how failures stay in the tracker for human triage. Manual restart skips that.
|
|
|
|
## How this fits the chain
|
|
|
|
`/grill-me <feature>` (together) → `/prd` (together) → `/prd-to-issues` (mostly together, file step needs `create`) → `/ship-it` (AFK). The four-skill arc takes a vague feature idea to merged code with one human checkpoint per phase boundary.
|