Release v1.30.0 — Workflow Call/Return & Agent Delegation #56

Merged
Mike Bros merged 54 commits from release/1.30.0 into master 2026-03-18 06:23:20 +00:00

Release v1.30.0 — Workflow Call/Return & Agent Delegation

Changes

  • OP#2324: Git branch and worktree cleanup command with age-based pruning
  • OP#2325: Standardize command output summaries with project name and version
  • OP#2331: Design mode field (goto|call|dispatch) for goto schema
  • OP#2332: Design result envelope schema for call/dispatch return payloads
  • OP#2333: Add timeout, on_error, on_success, returns fields to goto schema
  • OP#2334: Update schema validation scripts for call/dispatch fields
  • OP#2335: Test schema parsing with all three goto modes
  • OP#2336: Implement mode: call blocking in conductor routing
  • OP#2337: Implement result payload capture from called phase
  • OP#2338: Implement resume-with-context for calling phase after call returns
  • OP#2339: Implement result branching with on_success and on_error routing
  • OP#2340: Test call/return with mock phases
  • OP#2341: Implement mode: dispatch non-blocking invocation in conductor
  • OP#2342: Design heartbeat protocol and ledger schema
  • OP#2343: Implement heartbeat monitor with configurable interval
  • OP#2344: Implement timeout detection and error event firing
  • OP#2345: Implement on_error handler routing for timeout and crash events
  • OP#2346: Implement completion notification and result retrieval for dispatch
  • OP#2347: Test dispatch with simulated timeout and recovery
  • OP#2348: Design agent registry schema
  • OP#2349: Implement agent invocation via call/dispatch targeting external agents
  • OP#2350: Implement structured result parsing from agent output
  • OP#2351: Test specialist agent delegation
  • OP#2352: Update skills to prefer MCP bulk actions for batch operations
  • OP#2353: ADR: Call/return and async delegation architecture
  • OP#2354: Wiki guide: Creating specialist agents
  • OP#2355: Wiki guide: Choosing between goto, call, and dispatch
  • OP#2356: End-to-end integration tests across all three goto modes
  • OP#2474: Add max_iterations and convergence_condition to goto schema
  • OP#2475: Design and implement handoff envelope schema
  • OP#2476: Implement conductor handoff lifecycle
  • OP#2478: Test loop guards and handoff envelopes
  • OP#2484: Test result envelope writing and conductor consumption
  • OP#2487: Add result envelope writing to orchestration-aware skills

PR Review Fixes

  • OP#2512: Fix invoke template escaping protocol (shell injection prevention)
  • OP#2513: Fix ISO 8601 duration regex (accept P1D, PT1H30M)
  • OP#2514: Align verdict vocabulary (approved/needs_changes)

Checklist

  • All version tasks closed in Gravity PM
  • Version file matches Gravity PM version (manifest.json = 1.30.0)
  • 7 epics in testing
  • PR review complete — Acceptable verdict
  • 3 important findings fixed on release branch

References

Version: v1.30.0 (Gravity PM ID: 154)

Closes OP#2362

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add git-cleanup script that summarizes local branches (with last commit
date, tracking status, merged status) and worktrees, plus offers cleanup
via --older, --merged, and -i interactive modes. Protected branches are
never deleted; dry-run is the default.

Closes OP#2324

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
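
The safety rules in the commit above (protected branches are never deleted, dry-run is the default) could be sketched roughly as follows. This is not the actual git-cleanup code: the function names, the `PROTECTED_BRANCHES` list, and the `apply` argument are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the git-cleanup safety rules; not the real script.
PROTECTED_BRANCHES="master main"   # assumed list of protected branches

# Succeeds when the given branch name is in the protected list.
is_protected() {
  local candidate
  for candidate in $PROTECTED_BRANCHES; do
    if [ "$candidate" = "$1" ]; then
      return 0
    fi
  done
  return 1
}

# Reports what would happen to one branch; deletes only when apply=yes.
cleanup_branch() {
  local branch=$1 apply=${2:-no}
  if is_protected "$branch"; then
    echo "skip $branch (protected)"
  elif [ "$apply" = "yes" ]; then
    git branch -d "$branch"          # -d refuses to delete unmerged branches
  else
    echo "would delete $branch (dry run)"
  fi
}
```

Calling `cleanup_branch feature/x` only prints the planned action; a caller has to pass `yes` explicitly before anything is removed.
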
Extend the goto schema with a mode field that controls return behavior.
Default mode "goto" preserves existing fire-and-forget semantics.
New "call" mode suspends the caller until target returns a result.
New "dispatch" mode lets the caller continue while target runs async.
Schema version bumped to 2 with full backward compatibility.

Closes OP#2331

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Define a structured result envelope that called/dispatched phases return
to the caller. Includes status (success/error/timeout/cancelled), payload,
timing info, error details with standard codes, and consumption patterns
for both call (synchronous) and dispatch (asynchronous) modes.

Closes OP#2332

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add ### {Project Name} — v{version} header to cross-skill guidance
output blocks in gr-refinement, gr-pr-review, and gr-pr-followup.
These three skills were the only ones missing the standard header
that was already present in gr-preflight, gr-implementation,
gr-postflight, and gr-release. Skills invoked outside a version
context (single-ticket refinement, ad-hoc PR review/followup) show
project name only without version.

Closes OP#2325

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extend goto schema with call/dispatch control fields: timeout (ISO 8601
duration), on_error/on_success (routing targets for result-based branching),
and returns (expected payload shape declaration). All fields are optional
and ignored when mode is goto (default). Updated validation rules 15-19.

Closes OP#2333

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Audit and update 5 skill files to use bulk MCP tools (bulk_transition,
bulk_update, bulk_create) instead of sequential individual calls when
operating on 3+ work packages. Add bulk operations preference section
to pm-adapter.md as guidance for future skill authors.

Skills updated:
- gr-preflight: Phase 1 audit fixes, Phase 4a claiming, Phase 5 SP assignment
- gr-postflight: already used bulk ops (no changes needed)
- gr-release: explicit bulk_transition example for version closing
- gr-refinement: bulk_create for batch subtask creation
- gr-pr-followup: bulk_create for triage work package creation
- gr-implementation: inherently sequential per-task loop (no changes needed)

Closes OP#2352

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create validate-goto-schema.sh that checks all goto blocks in
gravity-phases.yaml against schema v2 rules. Validates type, target,
condition, mode, timeout format, on_error/on_success target resolution,
returns type declarations, and warns when call/dispatch fields appear
on goto-mode blocks. Requires yq v4+.

Closes OP#2334

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create 10 test cases covering all three goto modes: backward-compatible
goto (with and without explicit mode), call with full options and minimal
config, dispatch async, and error cases for invalid mode, misplaced
call/dispatch fields, self-transition, and invalid returns types.

Closes OP#2335

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Merged 4 feature branches into version branch:
- feature/2324-branch-cleanup (git-cleanup script)
- feature/2325-command-summaries (standardize output headers)
- feature/2352-mcp-bulk-audit (bulk MCP action preferences)
- feature/2326-call-return-schema (goto mode/call/dispatch schema)

All merges clean, no conflicts.

Closes OP#2363

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Call Mode Routing section to the conductor SKILL.md with instructions
for call_id generation, current_call.json metadata file, call stack tracking
for nested calls (3 levels), GRAVITY_CALL_ID env var propagation, blocking
behavior, cleanup protocol, and error handling table.

Closes OP#2336

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
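
A minimal sketch of the call stack tracking described in the commit above, assuming that "3 levels" is a hard nesting limit and that the stack can be held in a shell array. `push_call`, `pop_call`, and `CALL_STACK` are hypothetical names; only `GRAVITY_CALL_ID` comes from the commit message.

```bash
#!/usr/bin/env bash
MAX_CALL_DEPTH=3      # nesting limit named in the commit (assumed to be a hard cap)
CALL_STACK=()         # hypothetical in-memory stack of active call IDs

# Push a call ID and expose it to the called phase via GRAVITY_CALL_ID.
push_call() {
  if [ "${#CALL_STACK[@]}" -ge "$MAX_CALL_DEPTH" ]; then
    echo "error: call stack overflow (max depth $MAX_CALL_DEPTH)" >&2
    return 1
  fi
  CALL_STACK+=("$1")
  export GRAVITY_CALL_ID=$1
}

# Pop the innermost call and restore the caller's ID, if any.
pop_call() {
  local last=$(( ${#CALL_STACK[@]} - 1 ))
  if [ "$last" -lt 0 ]; then
    return 1
  fi
  unset "CALL_STACK[$last]"
  if [ "$last" -gt 0 ]; then
    export GRAVITY_CALL_ID=${CALL_STACK[$((last - 1))]}
  else
    unset GRAVITY_CALL_ID
  fi
}
```

With this shape, a fourth nested `push_call` fails and leaves the environment pointing at the third call, so a rejected call never clobbers the active ID.
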
Add "Dispatch Mode Routing" section to SKILL.md documenting the full
dispatch invocation sequence: dispatch_id generation, call metadata,
env var export, non-blocking target invocation, heartbeat monitor
registration, and concurrent dispatch tracking.

Closes OP#2341

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add call-result-capture.md reference documenting: result file path convention
(.gravity/results/{call_id}.result.json), called phase write protocol, conductor
read and validation logic, missing file handling (synthetic error envelope),
timeout handling, validation checks table, and cleanup after consumption.

Closes OP#2337

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
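
The missing-file handling in the commit above might look like the sketch below. The path convention `.gravity/results/{call_id}.result.json` is from the commit; the function name, the JSON field names, and the `RESULT_MISSING` code are placeholders, since the real envelope schema is not shown here.

```bash
#!/usr/bin/env bash
RESULTS_DIR=".gravity/results"

# Print the result envelope for a call, or a synthetic error envelope
# when the called phase never wrote one.
read_result_envelope() {
  local call_id=$1 dir=${2:-$RESULTS_DIR}
  local file="$dir/$call_id.result.json"
  if [ -f "$file" ]; then
    cat "$file"
  else
    # Placeholder fields and error code; the real schema may differ.
    printf '{"status":"error","error":{"code":"RESULT_MISSING","message":"no result file for %s"}}\n' "$call_id"
  fi
}
```

Returning an envelope in both cases lets the caller branch on `status` without a separate "file exists" check.
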
Create heartbeat-protocol.md reference documenting the ledger schema,
rolling window (20-entry cap), writing priority (grav-ai > direct file),
file path conventions, multi-dispatch support, and validation rules.

Closes OP#2342

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
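
To illustrate the 20-entry rolling window from the commit above, here is a sketch of the direct-file write path. It assumes, purely for illustration, a ledger that stores one entry per line; the actual ledger schema is defined in heartbeat-protocol.md and may be structured differently. `append_heartbeat` is a hypothetical name.

```bash
#!/usr/bin/env bash
HEARTBEAT_WINDOW=20   # rolling window cap from the commit message

# Append one entry and keep only the newest HEARTBEAT_WINDOW entries.
# Assumes a one-entry-per-line ledger file (an illustrative simplification).
append_heartbeat() {
  local ledger=$1 entry=$2
  local tmp="$ledger.tmp"
  {
    if [ -f "$ledger" ]; then
      cat "$ledger"
    fi
    printf '%s\n' "$entry"
  } | tail -n "$HEARTBEAT_WINDOW" > "$tmp"
  mv "$tmp" "$ledger"   # replace the ledger in one step
}
```

Writing to a temporary file and renaming it means a monitor polling the ledger does not observe a half-written file.
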
Add Resume with Context section to conductor SKILL.md defining how the
conductor injects call results into the resumed phase's prompt as structured
markdown. Covers injection format template, field mapping, rendering rules
for success/error/timeout, payload formatting heuristic, nested call scoping,
and context lifetime semantics.

Closes OP#2338

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add monitor section to heartbeat-protocol.md: configurable poll interval
(default 30s), per-dispatch state tracking, concurrent dispatch support,
staleness evaluation against timeout, and malformed ledger handling.

Closes OP#2343

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add timeout detection section to heartbeat-protocol.md: staleness
comparison logic, structured timeout error events with diagnostic
context, crash detection for missing ledgers, and ledger state update
on timeout.

Closes OP#2344

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
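
The staleness comparison and crash detection in the commit above reduce to a small decision, sketched here under stated assumptions: timestamps are epoch seconds, the timeout has already been converted to seconds, and an empty first argument stands for a missing ledger. The function name and the state labels are hypothetical.

```bash
#!/usr/bin/env bash
# usage: heartbeat_state <last_beat_epoch or ""> <now_epoch> <timeout_seconds>
heartbeat_state() {
  local last=$1 now=$2 timeout=$3
  if [ -z "$last" ]; then
    echo "crashed"                       # no ledger, so no heartbeat was ever seen
  elif [ $(( now - last )) -gt "$timeout" ]; then
    echo "timeout"                       # last heartbeat is older than the timeout
  else
    echo "alive"
  fi
}
```

Whether the boundary case (age exactly equal to the timeout) counts as stale is a design choice; this sketch treats it as still alive.
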
Add Result Branching section to conductor SKILL.md defining the decision
logic for routing based on call result status. Covers success/error/timeout
decision tree, decision table, target resolution for branch targets,
context injection templates for _result and _error keys, and chaining
calls from branch targets.

Closes OP#2339

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add error handler routing section to heartbeat-protocol.md: routing
decision logic, error envelope propagation, recovery/escalation/
degradation/fallback patterns, and user-facing error when no on_error
handler is configured.

Closes OP#2345

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add completion detection section to heartbeat-protocol.md: status-based
detection, result envelope retrieval from .gravity/results/, returns
schema validation, on_success routing, cleanup sequence, and full
agent-monitor-conductor interaction diagram.

Closes OP#2346

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create call-mode-test-cases.md with 6 behavioral test cases: TC1 successful
call with data return, TC2 error routing to on_error, TC3 missing target
validation error, TC4 nested calls A→B→C with stack management, TC5 call_id
propagation across 7 checkpoints, TC6 on_success routing with result payload.

Closes OP#2340

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create dispatch-mode-test-cases.md with 6 test cases: successful
dispatch with result return, heartbeat timeout, agent crash (no
heartbeat), on_success routing, concurrent dispatches from same
phase, and rolling window eviction.

Closes OP#2347

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Merged 2 feature branches into version branch:
- feature/2327-sync-call (call mode routing, result capture, resume, branching, tests)
- feature/2328-async-dispatch (dispatch mode, heartbeat protocol, monitor, timeout, error handling, tests)

All merges clean, no conflicts.

Closes OP#2364

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Define the agents.yaml registry with model, purpose, invoke template,
cost_tier, and input/output schema fields. Include three example agents:
wp-writer (AI drafting), schema-validator (local CLI), and
result-standardizer (intermediary normalizer).

Closes OP#2348

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add loop guard fields to the goto block schema for call/dispatch modes.
max_iterations caps iteration count; convergence_condition exits on a
dot-path expression match against the result envelope. Update manifest
example, schema docs, validation rules, and validate-goto-schema.sh.

Closes OP#2474

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
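
The two loop guards in the commit above can be sketched as a bounded loop with an early exit. This is a simplification: the real convergence_condition is a dot-path expression evaluated against the result envelope, whereas this sketch compares a plain verdict string. `run_phase` is a stand-in for invoking the target and `run_loop` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Stand-in for the called phase: reports "approved" from the third iteration on.
run_phase() {
  if [ "$1" -ge 3 ]; then
    echo "approved"
  else
    echo "needs_changes"
  fi
}

# usage: run_loop <max_iterations> <expected_verdict>
run_loop() {
  local max_iterations=$1 expected=$2 i=0 verdict
  while [ "$i" -lt "$max_iterations" ]; do
    i=$((i + 1))
    verdict=$(run_phase "$i")
    if [ "$verdict" = "$expected" ]; then
      echo "converged after $i iteration(s)"
      return 0
    fi
  done
  echo "stopped at max_iterations=$max_iterations"
  return 1
}
```

`run_loop 5 approved` exits through the convergence check, while `run_loop 2 approved` runs out of iterations first and returns a non-zero status that the caller can route on.
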
Document agent target resolution (agent:{name} syntax), command
construction from invoke templates, input validation against schemas,
headless Claude and local CLI backends, output capture with envelope
wrapping, failure handling, and backend extension point interface.

Closes OP#2349

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Define the handoff envelope schema at .gravity/handoffs/{call_id}.handoff.json
for passing context between call/dispatch loop iterations. Documents field
reference, lifecycle (write/read/cleanup), and relationship to result
envelopes and heartbeat protocol.

Closes OP#2475

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Define two parsing paths (JSON stdout, file-based) with parse failure
handling, output schema validation, raw output preservation, and the
intermediary standardizer pattern. Add Result Parsing section to
conductor SKILL.md with quick reference table.

Closes OP#2350

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Six test cases covering call mode structured result, dispatch mode with
heartbeat, output_schema validation (3 scenarios), invalid output parse
error, invocation failure with non-zero exit, and the intermediary
standardizer recovery pattern.

Closes OP#2351

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Loop and Handoff Lifecycle section to the conductor SKILL.md covering
all 7 steps: write handoff after iteration, read/inject context before next,
track iteration count, evaluate convergence via dot-path accessor, route on
convergence or max_iterations, and continue loop. Includes decision flowchart
and edge case documentation.

Closes OP#2476

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create 7 test cases covering: convergence exit (TC1), max_iterations exit
(TC2), handoff round-trip (TC3), iteration counter (TC4), context injection
prompt structure (TC5), dot-path convergence evaluation (TC6), and missing
path handling (TC7).

Closes OP#2478

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Merged feature/2329-agent-framework into version branch.
Clean merge, no conflicts.

Closes OP#2365

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Merged feature/2472-loop-guards into version branch.
Clean merge, no conflicts.

Closes OP#2485

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Result Envelope sections to gr-pr-review, gr-implementation,
gr-pr-followup, and gr-release documenting detection logic, envelope
structure, payload fields, and error envelopes for call/dispatch mode.

Closes OP#2487

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Five test cases covering orchestration detection, standalone skip, conductor
result consumption with convergence evaluation, missing result file handling,
and per-skill payload field verification for all four orchestration-aware skills.

Closes OP#2484

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Merged feature/2473-skill-result-adapters into version branch.
Clean merge, no conflicts.

Closes OP#2486

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Closes OP#2353

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Closes OP#2354

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Closes OP#2355

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Closes OP#2356

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Merged feature/2330-docs-testing into version branch.
Clean merge, no conflicts.

Closes OP#2366

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
chore(release): bump version to 1.30.0
All checks were successful
PR Validation / validate-branch (pull_request) Successful in 1s
CI / json-check (pull_request) Successful in 4s
PR Validation / validate-release-pr (pull_request) Successful in 4s
CI / security (pull_request) Successful in 14s
CI / manifest (pull_request) Successful in 1m33s
CI / lua-check (pull_request) Successful in 1m45s
CI / sast (pull_request) Successful in 3m8s
8b5fd29a1b
Refs OP#2357

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

PR Review

Verdict: Acceptable

This review was conducted from a fresh context against the current branch state (31 files, ~7900 lines added). First review cycle — no prior reviews or followups exist.

Findings

| Severity  | Count |
| --------- | ----- |
| Blocking  | 0     |
| Important | 3     |
| Minor     | 5     |
| Nitpick   | 2     |

Test Results

  • Tests: No test runner detected (generic ecosystem — neovim config + skill markdown files)
  • Coverage: N/A
  • Smoke Tests: No Docker or test infrastructure available. Graceful degradation applied.

Human Review Items

3 items flagged for human review — see items marked with HUMAN REVIEW below.

Key Findings

  1. [IMPORTANT] Inconsistent request_id format between SKILL.md examples and conductor spec — multiple files
  2. [IMPORTANT] validate-goto-schema.sh regex for ISO 8601 duration rejects valid durations — validate-goto-schema.sh:133
  3. [IMPORTANT] Shell injection surface in agents.yaml invoke templates — agents.yaml / SKILL.md

(Grouped by file below)


configs/bin/git-cleanup

  • [MINOR] Lines 95-130 (show_summary): Function is ~80 lines with mixed formatting, data collection, and sorting logic. Consider extracting the branch-collection loop and sort into a helper function (e.g., collect_and_sort_branches). The worktree display section (lines 140-180) could similarly be extracted.

  • [MINOR] Lines 295-330 (cleanup_interactive): Stale worktree pruning logic is duplicated between cleanup_interactive and prune_stale_worktrees. The detection loop (lines 295-310 and 355-370) is nearly identical. Consider extracting detect_stale_worktrees into a shared function and calling it from both.

  • [NITPICK] The script uses printf '%b\n' in usage() but printf with direct format strings elsewhere. Consistent use of one style would improve readability.


configs/claude/skills/gr-gravity/agents.yaml

  • [IMPORTANT] The invoke templates for AI agents use single-quote wrapping around {input}:

    invoke: "claude --print --model {model} -p '{input}'"
    

    If the serialized JSON input contains single quotes (e.g., in string values passed as params), the shell command will break or — worse — interpret unescaped content. The conductor's command construction section in SKILL.md does not document how single quotes in param values are escaped before substitution. This is a shell injection surface for local-cli agents (the schema-validator invoke template passes {input} directly as a file path argument: npx ajv validate -s {schema_path} -d {input}).

    Suggested fix: Document the escaping protocol in the conductor's Agent Invocation section — e.g., mandate that {input} is always properly shell-escaped before substitution, and that {placeholder} values from params are individually escaped. Alternatively, note that the conductor should write input to a temp file and pass the file path instead of inline JSON for AI agents.
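
Both options in the suggested fix can be sketched in a few lines. The helper names below are hypothetical and this is not the conductor's documented protocol; it only shows the standard technique of rewriting each embedded single quote as `'\''` before substituting into a single-quoted template slot, and the alternative of passing a file path instead of inline JSON.

```bash
#!/usr/bin/env bash
# Escape a value for use inside a single-quoted shell word:
# every ' becomes '\'' (close quote, escaped quote, reopen quote).
# Note: command substitution drops trailing newlines from the value.
escape_single_quotes() {
  printf '%s' "$1" | sed "s/'/'\\\\''/g"
}

# Alternative: write the input to a temp file and pass only its path.
write_input_file() {
  local path
  path=$(mktemp) || return 1
  printf '%s' "$1" > "$path"
  printf '%s\n' "$path"
}
```

For example, substituting `$(escape_single_quotes "$input")` into `-p '{input}'` keeps content such as `it's $(echo hacked)` as literal text in a single argument. The temp-file variant avoids quoting inline JSON altogether, at the cost of a cleanup step.
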


configs/claude/skills/gr-gravity/scripts/validate-goto-schema.sh

  • [IMPORTANT] Line 133 — The ISO 8601 duration regex:

    if ! [[ "$timeout" =~ ^P(T[0-9]+[HMS])+$ || "$timeout" =~ ^P[0-9]+[DWMY](T[0-9]+[HMS])*$ ]]; then
    

    This regex rejects valid multi-segment ISO 8601 durations. The first branch requires a T before every time segment, so PT30S passes but PT1H30M fails. The second branch accepts a single date segment with an optional time part (P7D and P1DT2H pass), but it allows only one date segment (P1Y2M fails) and its time part has the same one-T-per-segment restriction (P1DT2H30M fails).

    Suggested fix:

    if ! [[ "$timeout" =~ ^P([0-9]+[YWMD])*(T([0-9]+[HMS])+)?$ ]]; then
    

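    This covers the common ISO 8601 duration forms: optional date segments followed by an optional time part with one or more segments. It is still permissive: it accepts a bare P and does not enforce designator order.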
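
A quick way to check the suggested pattern is to run it over a few sample durations. The sketch below uses the regex exactly as suggested; the function names and the extra guard against a bare `P` (which the pattern matches because both groups are optional) are additions made for illustration.

```bash
#!/usr/bin/env bash
ISO_DURATION_RE='^P([0-9]+[YWMD])*(T([0-9]+[HMS])+)?$'

# The suggested pattern on its own.
matches_suggested_regex() {
  [[ "$1" =~ $ISO_DURATION_RE ]]
}

# Same pattern plus a guard, since a bare "P" carries no duration.
is_iso_duration() {
  [[ "$1" != "P" ]] && matches_suggested_regex "$1"
}

for sample in P1D P7D PT30S PT1H30M P1DT2H P 1H PT; do
  if matches_suggested_regex "$sample"; then
    echo "$sample accepted"
  else
    echo "$sample rejected"
  fi
done
# Accepted: P1D P7D PT30S PT1H30M P1DT2H P    Rejected: 1H PT
```
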

  • [MINOR] Lines 97-99 — The error() and warn() functions use ((ERRORS++)) and ((WARNINGS++)) which will return exit code 1 when the variable transitions from 0 to 1 (because ((0)) is falsy in bash). Combined with set -e, this could cause the script to exit prematurely on the first error/warning. This is a latent bug — currently mitigated because the functions use || or are called in contexts where the return code is consumed, but it is fragile.

    Suggested fix: Use ERRORS=$((ERRORS + 1)) instead of ((ERRORS++)) to avoid the exit-code-1 issue under set -e.
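
The exit-status difference is easy to reproduce in isolation. This sketch captures the status of each form instead of running under `set -e`, so both lines are reached; the `status_*` variable names are illustrative.

```bash
#!/usr/bin/env bash
ERRORS=0
status_post=0
# Post-increment evaluates to the old value (0), so (( )) exits with status 1.
((ERRORS++)) || status_post=$?

WARNINGS=0
# A plain assignment with arithmetic expansion exits with status 0.
WARNINGS=$((WARNINGS + 1))
status_assign=$?

echo "((ERRORS++)) from 0: exit $status_post, ERRORS=$ERRORS"
echo "WARNINGS=\$((WARNINGS + 1)): exit $status_assign, WARNINGS=$WARNINGS"
```

Both counters end at 1, but only the first form reports failure, which is what would stop a `set -e` script at an unguarded call site.
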


configs/claude/skills/gr-gravity/SKILL.md

  • [MINOR] The Result Branching section documents that cancelled status routes to on_error, and its decision table includes a cancelled row. The result envelope schema (result-envelope.md) also lists cancelled as a valid status. However, the Call Result Capture section (call-result-capture.md) and the conductor's error handling table (line ~570 of the SKILL.md additions) do not mention it: the table covers not found, timeout, crashes, and stack overflow only. Consider adding a row for cancellation or documenting where cancellation originates.

  • [NITPICK] Line 790 contains a rendering artifact: {result*envelope.payload if non-empty, otherwise "*(none)\_"} — the asterisks and underscores appear to be escaped markdown that will render oddly. Should likely be {result_envelope.payload if non-empty, otherwise "_(none)_"}.


configs/claude/skills/gr-pr-review/SKILL.md

  • [MINOR] The result envelope's verdict field uses "pass" | "warn" | "fail" values, while the conductor test cases and convergence conditions throughout the PR use "approved" | "needs_changes" as verdict values. The mapping table clarifies the translation, but having two different verdict vocabularies (one for the envelope, one for the convergence condition) creates a mismatch risk. A caller writing convergence_condition: "result.data.verdict == approved" would never converge because the envelope writes "pass" not "approved".

    Suggested fix: Either align the envelope verdict values with the convergence examples (use "approved" / "needs_changes") or update all test cases and convergence examples to use "pass" / "fail" to match the actual envelope values.


Cross-File Consistency

  • HUMAN REVIEW: The request_id format differs between specifications. The conductor's Call ID Generation section specifies the format {target}_{YYYYMMDD}_{HHMMSS} (e.g., pr-review_20260318_143000), while the result envelope examples in result-envelope.md use a different format: goto-postflight-pr-review-001 and goto-impl-release-042. These are clearly different naming conventions. The test case files consistently use the {target}_{YYYYMMDD}_{HHMMSS} format. Please verify which format is canonical and update result-envelope.md examples to match.

  • HUMAN REVIEW: The PR introduces three new .gravity/ subdirectories (results/, heartbeats/, handoffs/) that should be gitignored. The ADR notes this requirement (line 1710: "add .gravity/results/, .gravity/heartbeats/, and .gravity/handoffs/ to .gitignore") but no .gitignore change is included in the diff. This may be intentional if .gravity/ is already gitignored at a higher level, but please verify.

  • HUMAN REVIEW: The schema-validator agent in agents.yaml uses npx ajv validate which requires ajv-cli to be installed. This project is a generic neovim config ecosystem with no package.json. The agent definition is a reference example, but if it is intended to be functional, the dependency needs to be documented or the agent should note it is illustrative only.
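
For the first item above, a generator and a validator for the `{target}_{YYYYMMDD}_{HHMMSS}` format might look like this sketch. The function names are hypothetical, and two details are assumptions the review does not settle: that timestamps are in UTC, and that target names consist of letters, digits, and hyphens.

```bash
#!/usr/bin/env bash
# Assumed shape: <target>_<YYYYMMDD>_<HHMMSS>, target limited to [A-Za-z0-9-].
CALL_ID_RE='^[A-Za-z0-9-]+_[0-9]{8}_[0-9]{6}$'

# Build a call ID for a target using the current UTC time (UTC is an assumption).
make_call_id() {
  printf '%s_%s\n' "$1" "$(date -u +%Y%m%d_%H%M%S)"
}

# Succeeds when the ID follows the timestamped format.
is_call_id() {
  [[ "$1" =~ $CALL_ID_RE ]]
}
```

Under these assumptions `is_call_id pr-review_20260318_143000` succeeds, while the `goto-postflight-pr-review-001` style from result-envelope.md fails, which makes the mismatch easy to detect mechanically once a canonical format is chosen.
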


Scope Assessment

  • 31 files changed (~7900 additions, ~48 deletions)
  • 1 new shell script (git-cleanup) — well-structured, comprehensive branch/worktree management
  • 1 new validation script (validate-goto-schema.sh) — solid schema validation with minor regex issue
  • 1 new YAML config (agents.yaml) — clean schema with 3 reference agents
  • 16 new reference documents — thorough specification of call/return, dispatch, heartbeat, handoff, and agent protocols
  • 8 modified skill files — consistent addition of result envelope sections and bulk operation preferences
  • 1 manifest version bump (1.29.0 → 1.30.0)
  • Schema version bump (1 → 2) with documented backward compatibility

Architecture Notes

The three-mode goto extension (goto/call/dispatch) is well-designed with clear separation between protocol specification (SKILL.md conductor sections), schema definition (phase-manifest-schema.md), file format schemas (result-envelope.md, handoff-envelope.md, heartbeat-protocol.md), and behavioral test cases (6 test case files). The ADR provides strong rationale for design decisions including rejected alternatives. The protocol is file-based and dependency-free, consistent with the project's design philosophy.

The addition of result envelope writing to orchestration-aware skills (gr-implementation, gr-pr-review, gr-pr-followup, gr-release) follows a consistent pattern with clear detection, timing, and error handling. The bulk operation preference updates across gr-preflight, gr-refinement, gr-pr-followup, and gr-release are well-scoped incremental improvements.


Review by Gravity Bot — 2026-03-18 | Review cycle: 1

## PR Review **Verdict: Acceptable** This review was conducted from a fresh context against the current branch state (31 files, ~7900 lines added). First review cycle — no prior reviews or followups exist. ### Findings | Severity | Count | | -------- | ----- | | Blocking | 0 | | Important | 3 | | Minor | 5 | | Nitpick | 2 | ### Test Results - **Tests:** No test runner detected (generic ecosystem — neovim config + skill markdown files) - **Coverage:** N/A - **Smoke Tests:** No Docker or test infrastructure available. Graceful degradation applied. ### Human Review Items 3 items flagged for human review — see items marked with **HUMAN REVIEW** below. ### Key Findings 1. [IMPORTANT] Inconsistent `request_id` format between SKILL.md examples and conductor spec — multiple files 2. [IMPORTANT] `validate-goto-schema.sh` regex for ISO 8601 duration rejects valid durations — validate-goto-schema.sh:133 3. [IMPORTANT] Shell injection surface in `agents.yaml` invoke templates — agents.yaml / SKILL.md _(Grouped by file below)_ --- ### configs/bin/git-cleanup - **[MINOR]** Lines 95-130 (`show_summary`): Function is ~80 lines with mixed formatting, data collection, and sorting logic. Consider extracting the branch-collection loop and sort into a helper function (e.g., `collect_and_sort_branches`). The worktree display section (lines 140-180) could similarly be extracted. - **[MINOR]** Lines 295-330 (`cleanup_interactive`): Stale worktree pruning logic is duplicated between `cleanup_interactive` and `prune_stale_worktrees`. The detection loop (lines 295-310 and 355-370) is nearly identical. Consider extracting `detect_stale_worktrees` into a shared function and calling it from both. - **[NITPICK]** The script uses `printf '%b\n'` in `usage()` but `printf` with direct format strings elsewhere. Consistent use of one style would improve readability. 
--- ### configs/claude/skills/gr-gravity/agents.yaml - **[IMPORTANT]** The `invoke` templates for AI agents use single-quote wrapping around `{input}`: ```yaml invoke: "claude --print --model {model} -p '{input}'" ``` If the serialized JSON input contains single quotes (e.g., in string values passed as params), the shell command will break or — worse — interpret unescaped content. The conductor's command construction section in SKILL.md does not document how single quotes in param values are escaped before substitution. This is a shell injection surface for local-cli agents (the `schema-validator` invoke template passes `{input}` directly as a file path argument: `npx ajv validate -s {schema_path} -d {input}`). **Suggested fix:** Document the escaping protocol in the conductor's Agent Invocation section — e.g., mandate that `{input}` is always properly shell-escaped before substitution, and that `{placeholder}` values from params are individually escaped. Alternatively, note that the conductor should write input to a temp file and pass the file path instead of inline JSON for AI agents. --- ### configs/claude/skills/gr-gravity/scripts/validate-goto-schema.sh - **[IMPORTANT]** Line 133 — The ISO 8601 duration regex: ```bash if ! [[ "$timeout" =~ ^P(T[0-9]+[HMS])+$ || "$timeout" =~ ^P[0-9]+[DWMY](T[0-9]+[HMS])*$ ]]; then ``` This regex rejects valid ISO 8601 durations like `P1D` (no time component) and `P1DT2H` (combined date and time). The first branch requires `T` at the start (so `P7D` would fail), and the second branch allows `P7D` but uses `[DWMY]` which includes `Y` (years) and `M` (months, but `M` collides with minutes in the time portion). The regex also does not handle multi-segment durations like `PT1H30M`. **Suggested fix:** ```bash if ! [[ "$timeout" =~ ^P([0-9]+[YWMD])*(T([0-9]+[HMS])+)?$ ]]; then ``` This handles the full ISO 8601 duration syntax: optional date segments followed by optional time segments. 
- **[MINOR]** Lines 97-99: The `error()` and `warn()` functions use `((ERRORS++))` and `((WARNINGS++))`, which return exit code 1 when the variable transitions from 0 to 1 (post-increment evaluates to the old value, and `((0))` is falsy in bash). Combined with `set -e`, this could cause the script to exit prematurely on the first error/warning. This is a latent bug, currently mitigated because the functions use `||` or are called in contexts where the return code is consumed, but it is fragile.

  **Suggested fix:** Use `ERRORS=$((ERRORS + 1))` instead of `((ERRORS++))` to avoid the exit-code-1 issue under `set -e`.

---

### configs/claude/skills/gr-gravity/SKILL.md

- **[MINOR]** The Result Branching section documents that `cancelled` status routes to `on_error`, and the decision table includes a `cancelled` row. The result envelope schema (`result-envelope.md`) also lists `cancelled` as a valid status. However, the Call Result Capture section (`call-result-capture.md`) and the error handling table do not mention `cancelled` at all. The conductor's error handling table (line ~570 of the SKILL.md additions) covers `not found`, `timeout`, `crashes`, and `stack overflow` but omits `cancelled`. Consider adding a row for cancellation or documenting where cancellation originates.
- **[NITPICK]** Line 790 contains a rendering artifact: `{result*envelope.payload if non-empty, otherwise "*(none)\_"}`. The asterisks and underscores appear to be escaped markdown that will render oddly. Should likely be `{result_envelope.payload if non-empty, otherwise "_(none)_"}`.

---

### configs/claude/skills/gr-pr-review/SKILL.md

- **[MINOR]** The result envelope's `verdict` field uses `"pass" | "warn" | "fail"` values, while the conductor test cases and convergence conditions throughout the PR use `"approved" | "needs_changes"` as verdict values. The mapping table clarifies the translation, but having two different verdict vocabularies (one for the envelope, one for the convergence condition) creates a mismatch risk. A caller writing `convergence_condition: "result.data.verdict == approved"` would never converge because the envelope writes `"pass"`, not `"approved"`.

  **Suggested fix:** Either align the envelope verdict values with the convergence examples (use `"approved"` / `"needs_changes"`) or update all test cases and convergence examples to use `"pass"` / `"fail"` to match the actual envelope values.

---

### Cross-File Consistency

- **HUMAN REVIEW:** The `request_id` format differs between specifications. The conductor's Call ID Generation section specifies the format `{target}_{YYYYMMDD}_{HHMMSS}` (e.g., `pr-review_20260318_143000`), while the result envelope examples in `result-envelope.md` use a different format: `goto-postflight-pr-review-001` and `goto-impl-release-042`. These are clearly different naming conventions. The test case files consistently use the `{target}_{YYYYMMDD}_{HHMMSS}` format. Please verify which format is canonical and update `result-envelope.md` examples to match.
- **HUMAN REVIEW:** The PR introduces three new `.gravity/` subdirectories (`results/`, `heartbeats/`, `handoffs/`) that should be gitignored. The ADR notes this requirement (line 1710: "add `.gravity/results/`, `.gravity/heartbeats/`, and `.gravity/handoffs/` to `.gitignore`") but no `.gitignore` change is included in the diff. This may be intentional if `.gravity/` is already gitignored at a higher level, but please verify.
- **HUMAN REVIEW:** The `schema-validator` agent in `agents.yaml` uses `npx ajv validate`, which requires `ajv-cli` to be installed. This project is a generic Neovim config ecosystem with no `package.json`. The agent definition is a reference example, but if it is intended to be functional, the dependency needs to be documented or the agent should note it is illustrative only.
---

### Scope Assessment

- 31 files changed (~7900 additions, ~48 deletions)
- 1 new shell script (`git-cleanup`): well-structured, comprehensive branch/worktree management
- 1 new validation script (`validate-goto-schema.sh`): solid schema validation with minor regex issue
- 1 new YAML config (`agents.yaml`): clean schema with 3 reference agents
- 16 new reference documents: thorough specification of call/return, dispatch, heartbeat, handoff, and agent protocols
- 8 modified skill files: consistent addition of result envelope sections and bulk operation preferences
- 1 manifest version bump (1.29.0 → 1.30.0)
- Schema version bump (1 → 2) with documented backward compatibility

### Architecture Notes

The three-mode goto extension (goto/call/dispatch) is well-designed with clear separation between protocol specification (SKILL.md conductor sections), schema definition (phase-manifest-schema.md), file format schemas (result-envelope.md, handoff-envelope.md, heartbeat-protocol.md), and behavioral test cases (6 test case files). The ADR provides strong rationale for design decisions including rejected alternatives. The protocol is file-based and dependency-free, consistent with the project's design philosophy.

The addition of result envelope writing to orchestration-aware skills (gr-implementation, gr-pr-review, gr-pr-followup, gr-release) follows a consistent pattern with clear detection, timing, and error handling.

The bulk operation preference updates across gr-preflight, gr-refinement, gr-pr-followup, and gr-release are well-scoped incremental improvements.

---

*Review by Gravity Bot - 2026-03-18 | Review cycle: 1*
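The duration regex suggested in the review can be exercised against a few sample values. The sketch below is illustrative only: `is_valid_duration` and `duration_re` are hypothetical names, not part of `validate-goto-schema.sh`, and the guard against a bare `P` is an assumption added here because the regex on its own accepts it.

```bash
#!/usr/bin/env bash
# Sketch: check sample timeouts against the duration regex suggested in the review.
# is_valid_duration is a hypothetical helper, not taken from validate-goto-schema.sh.
duration_re='^P([0-9]+[YWMD])*(T([0-9]+[HMS])+)?$'

is_valid_duration() {
  # Assumption for this sketch: reject a bare "P". The regex alone matches it,
  # because both the date group and the time group are optional.
  [[ "$1" != "P" && "$1" =~ $duration_re ]]
}

for d in P1D PT1H30M P1DT2H P PT 1H; do
  if is_valid_duration "$d"; then
    echo "$d: valid"
  else
    echo "$d: invalid"
  fi
done
```

The regex does not enforce the order of date designators and does not cover fractional values, so it is a format check rather than a full ISO 8601 parser.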
fix(gravity): address PR review findings — escaping, regex, verdict vocab
All checks were successful
CI / lua-check (pull_request) Successful in 9s
CI / json-check (pull_request) Successful in 13s
CI / sast (pull_request) Successful in 12s
PR Validation / validate-branch (pull_request) Successful in 19s
CI / security (pull_request) Successful in 22s
CI / manifest (pull_request) Successful in 23s
PR Validation / validate-release-pr (pull_request) Successful in 8s
72dc9b7434
- Document invoke template escaping protocol to prevent shell injection
  in agent invocation (temp file for AI agents, printf %q for CLI agents)
- Fix ISO 8601 duration regex to accept P1D, PT1H30M, P1DT2H
- Align gr-pr-review verdict vocabulary to approved/needs_changes to
  match conductor convergence condition examples

Closes OP#2512, Closes OP#2513, Closes OP#2514

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author
Collaborator

PR Review Followup

Responding to review posted on 2026-03-18 (review comment: https://git.bros.ninja/mike/kickstart.nvim/pulls/56#issuecomment-1070).

Triage Summary

| Severity | Count | Action |
| -------- | ----- | ------ |
| Blocking | 0 | - |
| Important | 3 | Fixed in this branch |
| Minor | 5 | Deferred to v1.31.0 |
| Nitpick | 2 | Deferred to v1.31.0 |
| Human Review | 3 | Flagged (requires human judgment) |

Fixes Applied

All 3 important findings resolved in commit 72dc9b7:

  1. [IMPORTANT] Shell injection surface in invoke templates — documented escaping protocol (temp file for AI agents, printf '%q' for CLI agents)
    • OP#2512: fix: document invoke template escaping protocol for agent registry
  2. [IMPORTANT] ISO 8601 duration regex rejects valid durations — corrected regex to ^P([0-9]+[YWMD])*(T([0-9]+[HMS])+)?$
    • OP#2513: fix: correct ISO 8601 duration regex in validate-goto-schema.sh
  3. [IMPORTANT] Verdict vocabulary mismatch (pass/fail vs approved/needs_changes) — aligned gr-pr-review envelope to use approved/needs_changes matching all conductor convergence examples
    • OP#2514: fix: align verdict vocabulary between result envelopes and convergence examples
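
The two escaping strategies named in fix 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the conductor's actual implementation: the variable names, the sample payload, and the `printf` stand-in for an invoke template are all hypothetical.

```bash
#!/usr/bin/env bash
# Sketch of the two escaping strategies: printf %q for CLI agents, temp file for AI agents.
# All names here are illustrative; real invoke templates live in agents.yaml.
set -euo pipefail

# A payload containing a single quote and a command substitution attempt.
input='{"note": "it'\''s a test; $(whoami)"}'

# CLI agents: escape the value with printf %q before substituting it into a command string.
escaped_input=$(printf '%q' "$input")
cmd="printf '%s\n' $escaped_input"   # stand-in for a template containing {input}
round_trip=$(eval "$cmd")

# AI agents: write the payload to a temp file and pass only the file path.
tmp_input=$(mktemp)
trap 'rm -f "$tmp_input"' EXIT
printf '%s' "$input" > "$tmp_input"
from_file=$(cat "$tmp_input")

[[ "$round_trip" == "$input" ]] && echo "printf %q round-trip: ok"
[[ "$from_file" == "$input" ]] && echo "temp file round-trip: ok"
```

`printf '%q'` is a bash builtin feature and its output is meant to be re-read by bash, so the temp-file route is the more portable of the two when the invoking shell is not guaranteed.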

Deferred to v1.31.0

The following items have been tracked for the next version:

  1. [MINOR] git-cleanup show_summary too long + stale worktree detection duplicated + [NITPICK] printf style inconsistency
    • OP#2515: refactor: git-cleanup function extraction and style consistency
  2. [MINOR] validate-goto-schema.sh ((ERRORS++)) exit code issue under set -e
    • OP#2516: fix: validate-goto-schema.sh arithmetic under set -e
  3. [MINOR] cancelled status undocumented in call-result-capture.md + [NITPICK] SKILL.md rendering artifact + request_id format inconsistency
    • OP#2517: docs: align cancelled status and request_id format across specifications
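
The arithmetic issue tracked as OP#2516 can be reproduced in isolation. The sketch below uses the counter names from the review; the rest is illustrative and not taken from `validate-goto-schema.sh`.

```bash
#!/usr/bin/env bash
# Sketch: why ((ERRORS++)) is fragile under set -e, and the assignment form that avoids it.
set -e

ERRORS=0

# Post-increment evaluates to the old value (0), so (( )) returns status 1 here.
# The "|| echo" consumes that status; without it, set -e would abort the script.
((ERRORS++)) || echo "((ERRORS++)) returned non-zero, ERRORS is now $ERRORS"

# A plain assignment returns status 0 regardless of the resulting value.
WARNINGS=0
WARNINGS=$((WARNINGS + 1))
echo "WARNINGS=$WARNINGS"
```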

Human Review Items (Unchanged)

These items require human judgment and were not auto-triaged:

  1. HUMAN REVIEW: request_id format inconsistency between conductor spec and result-envelope.md examples — verify canonical format
  2. HUMAN REVIEW: Missing .gitignore entries for .gravity/results/, .gravity/heartbeats/, .gravity/handoffs/ — verify if .gravity/ is already gitignored at a higher level
  3. HUMAN REVIEW: schema-validator agent uses npx ajv which requires Node.js — verify if this is illustrative only or intended to be functional
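
For item 1, an identifier in the `{target}_{YYYYMMDD}_{HHMMSS}` format could be produced along these lines. This is a hedged sketch: `make_request_id` is a hypothetical helper, and the use of UTC is an assumption, since the source does not state which time zone the conductor uses.

```bash
#!/usr/bin/env bash
# Sketch: build a request_id in the {target}_{YYYYMMDD}_{HHMMSS} format.
# make_request_id is a hypothetical helper; UTC is an assumption of this sketch.
make_request_id() {
  local target=$1
  printf '%s_%s\n' "$target" "$(date -u +%Y%m%d_%H%M%S)"
}

make_request_id pr-review
```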

Test Results

  • Tests: No test runner detected (generic ecosystem)
  • All fixes verified: Yes (files modified cleanly, no conflicts)

This followup was generated by the gr-pr-followup skill.

fix(gravity): align request_id examples to canonical format
All checks were successful
CI / json-check (pull_request) Successful in 3s
PR Validation / validate-branch (pull_request) Successful in 1s
CI / manifest (pull_request) Successful in 6s
CI / security (pull_request) Successful in 11s
PR Validation / validate-release-pr (pull_request) Successful in 4s
CI / lua-check (pull_request) Successful in 18s
CI / sast (pull_request) Successful in 1m15s
4a49bcf37f
Update result-envelope.md examples to use the conductor's canonical
request_id format ({target}_{YYYYMMDD}_{HHMMSS}) instead of the
ad-hoc goto-prefixed format.

Refs OP#2357

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mike Bros approved these changes 2026-03-18 06:23:05 +00:00
Mike Bros merged commit 4f6ea398ce into master 2026-03-18 06:23:20 +00:00
Reference
mike/kickstart.nvim!56