MissionForge | Measurable refactor missions for IBM Bob

The problem

Bob re-reads the same files 2–3× per session

Read files broadly: explore codebase structure

Plan the refactor: Bob reasons and outlines steps

Re-read during implementation: same files again, token cost ×2 (token waste)

Run grep / find: verify scope

Re-read to verify: same files again, token cost ×3 (token waste)

Re-read to summarize: final confirmation pass (token waste)

Bob measures the code. MissionForge measures the mission.

The CLI never reads source code. It is language-agnostic by construction.

Workflow

Six phases. Most CLI commands cost zero tokens.

01 · Init · /mf-init · Token use: Zero

Create mission workspace, write goal and forbidden paths.

02 · Decompose · /mf-decompose · Token use: High

Bob reads codebase once and writes sub-mission files + dependency plan.

03 · Baseline · /mf-baseline · Token use: Focused

Bob measures before-state metrics for each sub-mission's scoped files.

04 · Implement · Bob works · Token use: Focused

Bob implements within allowed paths. Forbidden paths enforced by the CLI.

05 · Validate · /mf-validate · Token use: Focused

CLI runs git diff + tests. Bob fills final metric values.

06 · Report · /mf-report · Token use: Zero

CLI templates the PR-ready evidence report. No Bob needed.

Decomposition

Without MissionForge, Bob is flying blind.

With it, every refactor has a contract, a dependency graph, and a validation gate. Here's the difference — and exactly how Bob and MissionForge work together to get there.

Without MissionForge

Bob free-exploring

↻

Files read 2–3× per session

Bob explores, implements, then re-reads the same files to verify — no stable place to store what it learned.

⊘

No scope boundary

Nothing stops Bob from touching files it shouldn't. You find out in code review — or you don't.

⟂

No before/after evidence

Did the refactor actually work? You have Bob's word for it. No baseline, no metric, no proof.

⤫

All or nothing execution

One large session that fails halfway means starting over. No sub-tasks, no recovery, no independent audit trail.

With MissionForge

Bob with a contract

▣

Bob reads each file once

MissionForge records findings to disk. Subsequent steps reference the mission file — no re-exploration.

⛃

Forbidden paths enforced by git diff

The CLI checks every changed file against the allowed list. Violations are flagged before validation can pass.

▤

Immutable baseline + final metrics

Bob measures once. The CLI commits the numbers. Validation compares final state against the baseline — no ambiguity.

⋈

Scoped sub-missions, dependency order

Bob decomposes the work. Each sub-mission passes its own gate. A failure retries without restarting the parent.
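
The forbidden-path check described above can be pictured as a glob match over the list of changed files. The following is a minimal sketch under stated assumptions, not MissionForge's actual implementation: the function name `find_scope_violations` is hypothetical, and the changed-file list is hard-coded here where a real run might take it from `git diff --name-only`.

```python
from fnmatch import fnmatchcase


def find_scope_violations(changed_files, forbidden_patterns):
    """Return the changed files that match any forbidden glob pattern.

    Hypothetical helper for illustration only. fnmatchcase lets `*` match
    across `/`, so a pattern such as `src/payments/**` covers every file
    below that directory.
    """
    return [
        path
        for path in changed_files
        if any(fnmatchcase(path, pattern) for pattern in forbidden_patterns)
    ]


# Assumed input: in practice this list could come from `git diff --name-only`.
changed = ["src/auth/token.ts", "src/payments/refund.ts"]
forbidden = ["src/payments/**", "src/user-profile/**", "src/admin/**"]

violations = find_scope_violations(changed, forbidden)
print(violations)  # ['src/payments/refund.ts']
```

An empty result would mean the change set stayed inside the safety boundary; any entry would block validation.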

You

Bob

MissionForge CLI

Bob + MissionForge

You

Write the mission goal and safety boundary

Define what needs to change, what must never be touched, and what success looks like — in one YAML file. One command creates the workspace.

$ missionforge init MF-001

Bob + MissionForge

Bob reads the codebase once. MissionForge scopes the questions.

MissionForge generates focused prompts from the mission contract. Bob reads only what's in scope — storing findings to disk so they're never re-read.

$ missionforge decompose MF-001

Bob

Proposes the sub-mission breakdown

Bob writes sub-mission YAML files — each with its own scope, metrics, and dependencies. You review and approve before any code changes begin.

✓ MF-001-A Replace token validation
✓ MF-001-B Update session middleware
◎ MF-001-C Integration test (gates on A+B)

MissionForge CLI

Validates the graph. Resolves execution order.

Checks for cycles, overlapping paths, and invalid references. Computes topological order and tells you exactly what's ready to run.

→ MF-001-A and MF-001-B are ready. MF-001-C waits on both to pass.
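
The graph checks in this step can be sketched with Python's standard `graphlib` module. This is an illustrative sketch, not the CLI's real code: the function name `execution_order` and the dictionary shape are assumptions, and the overlapping-path check is left out for brevity.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(dependencies):
    """Validate a sub-mission graph and return one valid execution order.

    `dependencies` maps each sub-mission id to the set of ids it gates on.
    (Hypothetical data shape, chosen for illustration.)
    """
    # Invalid references: a dependency that is not itself a sub-mission.
    referenced = {dep for deps in dependencies.values() for dep in deps}
    unknown = referenced - set(dependencies)
    if unknown:
        raise ValueError(f"unknown sub-mission references: {sorted(unknown)}")

    # Cycles: graphlib raises CycleError while producing the order.
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc


graph = {
    "MF-001-A": set(),
    "MF-001-B": set(),
    "MF-001-C": {"MF-001-A", "MF-001-B"},
}
order = execution_order(graph)
print(order)  # MF-001-C always comes after both A and B
```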

Bob + MissionForge

Bob implements. MissionForge enforces scope + captures evidence.

After each sub-mission, the CLI runs git diff against forbidden paths, executes the test command, and records pass/fail with full evidence. Bob never re-reads to verify — MissionForge does it deterministically.

$ missionforge validate MF-001-A --capture
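
To make "records pass/fail with full evidence" concrete, here is a rough sketch of how a test command could be run and its outcome captured. The record fields and the name `capture_test_evidence` are assumptions made for illustration; the page does not specify the format MissionForge stores.

```python
import shlex
import subprocess
import sys


def capture_test_evidence(test_command):
    """Run a test command and return a pass/fail evidence record.

    Hypothetical record layout; a zero exit code is treated as a pass.
    """
    completed = subprocess.run(
        shlex.split(test_command), capture_output=True, text=True
    )
    return {
        "command": test_command,
        "exit_code": completed.returncode,
        "passed": completed.returncode == 0,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


# Stand-in command so the sketch runs anywhere; a real mission would pass
# its own test_command, for example ./run-tests.sh --suite auth.
demo_command = f"{shlex.quote(sys.executable)} -c \"print('ok')\""
evidence = capture_test_evidence(demo_command)
print(evidence["passed"], evidence["stdout"].strip())  # True ok
```

Because the record depends only on the command's exit code and output, re-running it on an unchanged tree should give the same verdict.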

MissionForge CLI

Generates the PR-ready evidence report

Baselines, final metrics, test results, git diff, and scope audit — all templated automatically. Zero Bob tokens spent on reporting.

$ missionforge report MF-001

Live dependency state

Parent mission MF-001

Modernize the authentication layer

⊘ src/legacy-auth/** (forbidden)

◎ 2 aggregate metrics

⚑ ./run-tests.sh

latent parallel

MF-001-A Ready

Replace token validation

src/auth/token.ts

errors: 14→0

MF-001-B Ready

Update session middleware

src/middleware/session.ts

calls: 8→0

MF-001-C Waiting

End-to-end auth integration test

Gates on: MF-001-A + MF-001-B

auth_e2e_passes: false→true

$ missionforge next MF-001

→ A and B are ready. Execute either.

C is waiting on both to pass.
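
The "what's ready" answer above follows from a simple rule: a sub-mission is ready when it has not run yet and everything it gates on has passed. A small sketch of that rule follows; the status strings `"pending"` and `"passed"` and the function name are assumptions for illustration.

```python
def ready_submissions(dependencies, status):
    """Return pending sub-missions whose dependencies have all passed.

    `dependencies` maps id -> set of ids it gates on; `status` maps
    id -> "pending" or "passed" (hypothetical state names).
    """
    return sorted(
        mission
        for mission, deps in dependencies.items()
        if status.get(mission) == "pending"
        and all(status.get(dep) == "passed" for dep in deps)
    )


dependencies = {
    "MF-001-A": set(),
    "MF-001-B": set(),
    "MF-001-C": {"MF-001-A", "MF-001-B"},
}
status = {"MF-001-A": "pending", "MF-001-B": "pending", "MF-001-C": "pending"}

first_wave = ready_submissions(dependencies, status)
print(first_wave)  # ['MF-001-A', 'MF-001-B']

# Once A and B pass their gates, C becomes ready.
status.update({"MF-001-A": "passed", "MF-001-B": "passed"})
second_wave = ready_submissions(dependencies, status)
print(second_wave)  # ['MF-001-C']
```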

Mission contract

Goal, forbidden paths, metrics, and validation — in one file

id: MF-001
goal: |
  Modernize the authentication layer to remove
  all legacy token handling while preserving
  existing session behaviour end-to-end.

forbidden_paths:
  - src/payments/**
  - src/user-profile/**
  - src/admin/**

aggregate_metrics:
  - id: legacy_token_calls
    baseline_target: 22
    final_target: 0

  - id: auth_e2e_test_passes
    baseline_target: false
    final_target: true

test_command: ./run-tests.sh --suite auth

Goal prose

Plain English mission statement. Bob writes this with you before decomposition begins.

Forbidden paths

Global safety boundary. No sub-mission can touch these files — the CLI enforces it via git diff.

Aggregate metrics

Before/after numbers. Bob measures; the CLI records them and validates against targets.

Test command

Shell command the CLI runs for deterministic test evidence.
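
As a rough illustration of validating metrics against targets, the sketch below compares measured final values with each metric's `final_target` from the contract. Exact equality is an assumption made here; the page does not say whether the CLI supports thresholds or ranges, and `validate_metrics` is a hypothetical name.

```python
def validate_metrics(metrics, final_values):
    """Compare measured final values against each metric's final_target.

    Hypothetical helper. A missing measurement counts as a failure, and
    the type check stops the integer 0 from passing for the boolean False.
    """
    results = {}
    for metric in metrics:
        target = metric["final_target"]
        measured = final_values.get(metric["id"])
        results[metric["id"]] = {
            "target": target,
            "measured": measured,
            "passed": type(measured) is type(target) and measured == target,
        }
    return results


# Metrics copied from the contract above, written as plain dictionaries.
metrics = [
    {"id": "legacy_token_calls", "baseline_target": 22, "final_target": 0},
    {"id": "auth_e2e_test_passes", "baseline_target": False, "final_target": True},
]

results = validate_metrics(
    metrics, {"legacy_token_calls": 0, "auth_e2e_test_passes": True}
)
print(all(r["passed"] for r in results.values()))  # True
```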

Mission Board

Kanban view, dependency diamond, stakeholder translation

The Mission Board visualizes CLI state in real time. Engineers see scope and metrics. PMs get a business-friendly translation of mission progress.

Parent + sub-mission cards

Expand any parent mission to reveal the dependency diamond with ready, in-progress, and blocked states visible at a glance.

Real-time polling

Board updates every 3–5 seconds as Bob and the CLI change mission state. No WebSockets required.

Stakeholder view toggle

One click translates technical evidence into a plain-English business summary, generated by watsonx.ai.
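
The polling behaviour mentioned above (a refresh every few seconds, no WebSockets) amounts to a loop that re-reads state and reacts only when it changes. The sketch below shows the idea in Python for consistency with the other examples; the board itself may well run in a browser, and every name here is hypothetical.

```python
import time


def poll_for_changes(read_state, on_change, interval_seconds=3.0,
                     max_polls=None, sleep=time.sleep):
    """Call on_change(state) whenever read_state() returns a new value.

    `max_polls` bounds the loop (None means run forever) and `sleep` is
    injectable so the loop can be exercised without real waiting.
    """
    last_state = object()  # sentinel that never equals a real state
    polls = 0
    while max_polls is None or polls < max_polls:
        state = read_state()
        if state != last_state:
            on_change(state)
            last_state = state
        polls += 1
        sleep(interval_seconds)


# Simulated state source: the mission stays "ready" for two polls, then passes.
states = iter(["ready", "ready", "passed"])
seen = []
poll_for_changes(lambda: next(states), seen.append,
                 max_polls=3, sleep=lambda seconds: None)
print(seen)  # ['ready', 'passed']
```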

Benchmark

A simple, falsifiable comparison

Metric	Without MissionForge	With MissionForge
Bobcoin cost on same refactor	✗ Measured (baseline)	✓ Measured — target: lower
Forbidden files touched	✗ Inspected manually	✓ Enforced by CLI — zero violations
PR-attachable evidence report	✗ Not produced	✓ Generated automatically
Baseline metrics recorded	✗ None	✓ Structured JSON, immutable after commit
Per-metric pass/fail validation	✗ None	✓ Every sub-mission, every metric
Visible to non-engineers	✗ No	✓ Mission Board + Stakeholder Translation
Re-runnable validation	✗ No	✓ Yes — CLI is idempotent

One mission down. The next is already planned.

Phase 1: CLI Harness · Complete

Phase 2: Mission Board · Building

Phase 3: True Parallel Execution · Roadmap

Bob measures the code. MissionForge measures the mission. The Board makes it visible. One mission at a time, until the system is done.