
feat: add pipeline cost forecasting and budget approval gate #196

Closed
RyanD66 wants to merge 1 commit into sethdford:main from RyanD66:fix/issue-169

Conversation


RyanD66 commented Mar 2, 2026

Summary

  • add pre-start cost forecast generation using complexity, historical completed runs, pipeline composition, iteration limits, and model routing
  • display forecast in CLI and persist to .claude/pipeline-artifacts/cost-forecast.json
  • add configurable pre-start approval gate (default threshold: $10) with new --skip-cost-approval flag for daemon/autonomous operation
  • emit pipeline.cost_forecast and blocked approval events; feed forecasted cost into validation loop for prediction accuracy tracking
  • add test coverage ensuring the new CLI flag is documented

Fixes #169
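
The approval gate described above reduces to a floating-point comparison plus the skip flag. A minimal sketch, assuming a hypothetical helper named `needs_cost_approval` (not taken from this PR) and the default $10 threshold from the summary:

```bash
# Hypothetical helper: prints "true" when a forecast should trigger the
# pre-start approval prompt, "false" otherwise.
needs_cost_approval() {
    local predicted="$1"        # forecast cost in USD, e.g. 12.40
    local threshold="${2:-10}"  # default threshold stated in the PR summary
    local skip="${3:-false}"    # "true" when --skip-cost-approval was passed

    if [[ "$skip" == "true" ]]; then
        echo "false"
        return 0
    fi
    # Bash has no float arithmetic, so the comparison is delegated to awk.
    awk -v p="$predicted" -v t="$threshold" \
        'BEGIN { print ((p + 0 > t + 0) ? "true" : "false") }'
}
```

In this sketch a forecast exactly equal to the threshold does not require approval, since the summary only describes forecasts that exceed it.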

Summary by CodeRabbit

  • New Features

    • Cost forecasting integrated into pipeline startup to predict expenses before execution
    • Cost approval gate validates pipeline costs against configured threshold before proceeding
    • Added --skip-cost-approval CLI flag to bypass cost approval gates
  • Tests

    • Added test validation for --skip-cost-approval CLI help documentation
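
For illustration, a flag like `--skip-cost-approval` is typically wired up with a `case` branch in the argument loop. The sketch below is an assumption about the shape of that parsing, not the PR's code; only the flag name and the `SKIP_COST_APPROVAL` variable come from the review, and `parse_pipeline_args` is a hypothetical name:

```bash
# Default matches the documented behavior: the gate is active unless skipped.
SKIP_COST_APPROVAL=false

# Hypothetical argument loop; unrelated arguments are ignored in this sketch.
parse_pipeline_args() {
    while [[ $# -gt 0 ]]; do
        case "$1" in
            --skip-cost-approval)
                SKIP_COST_APPROVAL=true
                ;;
            *)
                ;;
        esac
        shift
    done
}
```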


coderabbitai bot commented Mar 2, 2026

Walkthrough

Added cost forecasting and budget approval functionality to pipeline startup. The system computes predicted costs before pipeline execution, writes forecasts to artifacts, displays summaries, and enforces user approval if costs exceed a configurable threshold. Includes a corresponding CLI flag and test coverage.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Cost Forecasting Implementation: scripts/sw-pipeline.sh | Added three functions (write_cost_forecast, forecast_pipeline_cost, require_cost_approval_if_needed) that compute cost predictions, persist forecasts to artifacts, and enforce pre-start approval gates. Integrated into pipeline_start flow. Added SKIP_COST_APPROVAL flag and COST_APPROVAL_THRESHOLD_USD threshold to defaults. Extended CLI parsing to accept --skip-cost-approval option and updated help text. |
| Test Coverage: scripts/sw-pipeline-test.sh | Added test_help_includes_skip_cost_approval function to verify CLI help output includes the new skip-cost-approval option and relevant descriptions. Registered test in main test suite. |
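
The body of write_cost_forecast is not shown on this page. As a rough sketch of how a forecast could be persisted safely, the hypothetical function below writes to a temporary file and renames it, so a reader never sees a half-written artifact. The default directory comes from the PR summary; the function name and the two-argument signature are assumptions:

```bash
# Hypothetical sketch: persist a forecast JSON string atomically.
write_cost_forecast_sketch() {
    local forecast_json="$1"
    local artifacts_dir="${2:-.claude/pipeline-artifacts}"
    local target="${artifacts_dir}/cost-forecast.json"
    local tmp

    mkdir -p "$artifacts_dir" || return 1
    # Create the temp file next to the target so the final mv is a rename
    # within one filesystem.
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if ! printf '%s\n' "$forecast_json" > "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    mv "$tmp" "$target"
}
```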

Sequence Diagram

sequenceDiagram
    actor User
    participant CLI as CLI Parser
    participant Pipeline as Pipeline Start
    participant Forecaster as Cost Forecaster
    participant Artifacts as Artifacts Store
    participant Approver as Approval Gate
    
    User->>CLI: Execute pipeline with args
    CLI->>Pipeline: Invoke pipeline_start()
    Pipeline->>Forecaster: forecast_pipeline_cost()
    Note over Forecaster: Compute cost using<br/>stage config, complexity,<br/>historical data
    Forecaster-->>Pipeline: Return forecast JSON<br/>(predicted_cost, margin, duration)
    Pipeline->>Artifacts: write_cost_forecast(forecast_json)
    Artifacts-->>Pipeline: Forecast saved
    Pipeline->>Approver: require_cost_approval_if_needed(forecast)
    alt Cost exceeds threshold AND not skip flag
        Approver->>User: Display forecast summary
        Approver->>User: Prompt for approval
        User-->>Approver: Approve or Deny
        alt User denies
            Approver-->>Pipeline: Block pipeline start
        else User approves
            Approver-->>Pipeline: Proceed
        end
    else Threshold OK OR skip flag set
        Approver-->>Pipeline: Proceed without prompt
    end
    Pipeline->>User: Pipeline starts (or blocked)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Whiskers twitching with delight,
A forecaster brings clarity to the night!
Costs predicted, budgets blessed,
Before the pipeline's grand quest.
Approval gates now stand on guard,
No surprise bills in our yard! 💰

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 63.64% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The PR description covers key objectives and includes a linked issue reference, but the description template's Test Plan and Shipwright Standards Checklist sections are not filled out. | Complete the Test Plan section with specific test details and verify all Shipwright Standards Checklist items are addressed. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and accurately describes the main change: adding cost forecasting and budget approval functionality to the pipeline. |
| Linked Issues check | ✅ Passed | The PR implements core cost forecasting requirements including forecast generation, display, approval gate, --skip-cost-approval flag, and test coverage for the CLI flag. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing the cost forecasting feature: new shell functions, configuration flags, CLI options, test coverage, and artifact persistence. |


coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
scripts/sw-pipeline-test.sh (1)

531-535: Strengthen the help assertion to validate semantics, not just token presence.

At Line 534, the check only verifies the flag string exists. This can still pass if the help text loses the “bypass/skip approval gate” meaning.

Proposed test hardening
 test_help_includes_skip_cost_approval() {
     invoke_pipeline --help
     assert_exit_code 0 "help should succeed" &&
-    assert_output_contains "skip-cost-approval" "help documents cost approval bypass"
+    assert_output_contains "--skip-cost-approval" "help includes skip-cost-approval option" &&
+    assert_output_contains "Skip pre-start cost approval gate|cost approval bypass" "help explains approval bypass behavior"
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/sw-pipeline-test.sh` around lines 531 - 535, The
test_help_includes_skip_cost_approval function currently only asserts the token
"skip-cost-approval" exists; update it to assert the flag's help line conveys
the intended meaning by checking the help output for both the flag token and a
semantic phrase (e.g., words like "skip", "bypass", or "ignore" together with
"approval" or "cost approval"). Locate the call to invoke_pipeline --help and
replace the single assert_output_contains for "skip-cost-approval" with an
assertion that matches the full help line (or matches a regex) containing the
flag plus a semantic description, ensuring the test fails if the explanatory
text is removed or altered.
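
One caveat when matching a semantic phrase: the flag name itself contains "skip" and "approval", so a loose regex can pass on the flag token alone. Whether assert_output_contains accepts a regex is also not shown here. The sketch below is a standalone, hypothetical helper (not part of the test suite) that strips the flag from each matching help line before checking the description:

```bash
# Hypothetical helper: succeeds only if some help line contains the flag
# AND, apart from the flag itself, text matching the given regex.
help_line_explains_flag() {
    local help_text="$1" flag="$2" meaning_regex="$3" line rest
    while IFS= read -r line; do
        [[ "$line" == *"$flag"* ]] || continue
        # Drop the flag so words inside its name cannot satisfy the check.
        rest="${line//"$flag"/}"
        if printf '%s\n' "$rest" | grep -qiE -- "$meaning_regex"; then
            return 0
        fi
    done <<< "$help_text"
    return 1
}
```

A call such as `help_line_explains_flag "$output" "--skip-cost-approval" '(skip|bypass).*approval'` would then fail if the explanatory text were removed while the flag stayed listed.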

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d9f1cb8 and e5f37d5.

📒 Files selected for processing (2)
  • scripts/sw-pipeline-test.sh
  • scripts/sw-pipeline.sh

Comment on lines +286 to +295
local model_plan model model_key input_rate output_rate
model_plan=$(jq -c '[.stages[] | select(.enabled==true) | {id: .id, model: (.config.model // .model // empty)}]' "$PIPELINE_CONFIG" 2>/dev/null || echo '[]')
model="${MODEL:-$(jq -r '.defaults.model // "sonnet"' "$PIPELINE_CONFIG" 2>/dev/null || echo sonnet)}"
model_key=$(echo "$model" | tr '[:upper:]' '[:lower:]')
input_rate=$(echo "$COST_MODEL_RATES" | jq -r ".${model_key}.input // 3" 2>/dev/null || echo "3")
output_rate=$(echo "$COST_MODEL_RATES" | jq -r ".${model_key}.output // 15" 2>/dev/null || echo "15")

local base_cost complexity_multiplier iteration_multiplier predicted_cost margin
base_cost=$(awk -v it="$input_tokens" -v ot="$output_tokens" -v ir="$input_rate" -v or="$output_rate" 'BEGIN{printf "%.4f", ((it/1000000)*ir)+((ot/1000000)*or)}')
complexity_multiplier=$(awk -v c="$complexity_score" 'BEGIN{printf "%.3f", 0.85 + (c/10)*0.5}')
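
To make the arithmetic in the excerpt above concrete, here is a small sketch that wraps the same two formulas in functions. The rates are treated as USD per million tokens, as the division by 1000000 implies; the token counts in the usage note are made up, and the 0 to 10 range for the complexity score is an assumption suggested by the `c/10` term. The awk variables are renamed because gawk treats `or` as a built-in function name, so `-v or=...` may be rejected there:

```bash
# Base cost: tokens are priced per million, input and output separately.
base_cost() {
    local input_tokens="$1" output_tokens="$2" input_rate="$3" output_rate="$4"
    awk -v it="$input_tokens" -v ot="$output_tokens" \
        -v in_rate="$input_rate" -v out_rate="$output_rate" \
        'BEGIN { printf "%.4f", ((it / 1000000) * in_rate) + ((ot / 1000000) * out_rate) }'
}

# Complexity multiplier: linear from 0.85 (score 0) to 1.35 (score 10).
complexity_multiplier() {
    awk -v c="$1" 'BEGIN { printf "%.3f", 0.85 + (c / 10) * 0.5 }'
}
```

With the fallback rates of 3 and 15, `base_cost 2000000 500000 3 15` prints 13.5000, and `complexity_multiplier 5` prints 1.100.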

⚠️ Potential issue | 🟠 Major

Forecast cost calculation currently ignores per-stage model routing.

At Line 287 you build model_plan, but at Lines 288-295 pricing is computed from a single model. For mixed-model stage configs, this can significantly skew the predicted cost.

Proposed fix (routing-aware blended rates)
-    local model_plan model model_key input_rate output_rate
+    local model_plan model model_key input_rate output_rate
     model_plan=$(jq -c '[.stages[] | select(.enabled==true) | {id: .id, model: (.config.model // .model // empty)}]' "$PIPELINE_CONFIG" 2>/dev/null || echo '[]')
     model="${MODEL:-$(jq -r '.defaults.model // "sonnet"' "$PIPELINE_CONFIG" 2>/dev/null || echo sonnet)}"
     model_key=$(echo "$model" | tr '[:upper:]' '[:lower:]')
-    input_rate=$(echo "$COST_MODEL_RATES" | jq -r ".${model_key}.input // 3" 2>/dev/null || echo "3")
-    output_rate=$(echo "$COST_MODEL_RATES" | jq -r ".${model_key}.output // 15" 2>/dev/null || echo "15")
+    # Blend rates using per-stage model routing; fallback to selected/default model.
+    input_rate=$(jq -n \
+        --argjson mp "$model_plan" \
+        --argjson rates "$COST_MODEL_RATES" \
+        --arg fallback "$model_key" '
+        if ($mp|length) == 0 then
+          ($rates[$fallback].input // 3)
+        else
+          (($mp | map((.model // $fallback | ascii_downcase) as $m | ($rates[$m].input // 3)) | add) / ($mp|length))
+        end
+    ' 2>/dev/null || echo "3")
+    output_rate=$(jq -n \
+        --argjson mp "$model_plan" \
+        --argjson rates "$COST_MODEL_RATES" \
+        --arg fallback "$model_key" '
+        if ($mp|length) == 0 then
+          ($rates[$fallback].output // 15)
+        else
+          (($mp | map((.model // $fallback | ascii_downcase) as $m | ($rates[$m].output // 15)) | add) / ($mp|length))
+        end
+    ' 2>/dev/null || echo "15")

Also applies to: 324-346
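
A worked example shows the size of the skew. The sketch below computes the same unweighted mean as the proposed jq expression, using a hypothetical `mean_rate` helper. The sonnet rates (3 and 15) are the fallbacks from the quoted code; the opus rates (15 and 75) and the three-stage routing are assumptions chosen only for illustration:

```bash
# Unweighted mean of per-stage rates, one rate per enabled stage.
# Returns non-zero when no rates are given.
mean_rate() {
    if [[ $# -eq 0 ]]; then
        return 1
    fi
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.4f", sum / NR }'
}
```

For stages routed to sonnet, sonnet, and opus, `mean_rate 3 3 15` gives 7.0000 and `mean_rate 15 15 75` gives 35.0000. At 2M input and 0.5M output tokens, that is 2 × 7 + 0.5 × 35 = 31.50 USD, against 13.50 USD when every stage is priced at the sonnet rate. An unweighted mean still assumes each stage consumes the same share of tokens; weighting by per-stage token estimates would change the result.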

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/sw-pipeline.sh` around lines 286 - 295, The cost forecast currently
ignores per-stage routing by using a single MODEL-derived rate; instead iterate
the already-built model_plan to compute blended input/output rates from
COST_MODEL_RATES (fallbacks kept) weighted by stage usage (e.g., count or
per-stage token proportions), then replace the single input_rate/output_rate
used in base_cost and the later block (the code computing base_cost,
complexity_multiplier and the duplicate logic around lines 324-346) with the
blended rates; ensure you reference model_plan, COST_MODEL_RATES, input_tokens,
output_tokens, base_cost, input_rate, output_rate and complexity_multiplier so
the script sums per-stage contributions and computes the final predicted_cost
consistently.

Comment on lines +367 to +377
if [[ "$needs_approval" == "true" ]]; then
echo -e " ${YELLOW}Approval required:${RESET} forecast exceeds threshold (\$$threshold)"
local answer=""
read -rp " Proceed with pipeline start? [y/N] " answer || true
if ! echo "$answer" | grep -qiE '^(y|yes)$'; then
warn "Pipeline start canceled by user (cost approval gate)"
emit_event "pipeline.cost_approval_blocked" "predicted_cost=${predicted}" "threshold=${threshold}" "issue=${ISSUE_NUMBER:-0}"
return 1
fi
fi
return 0

⚠️ Potential issue | 🟠 Major

Emit an explicit “approved” event when the user accepts the cost gate.

At Line 373, blocked decisions are emitted, but accepted decisions are not. This leaves approval telemetry incomplete.

Proposed fix
     if [[ "$needs_approval" == "true" ]]; then
         echo -e "  ${YELLOW}Approval required:${RESET} forecast exceeds threshold (\$$threshold)"
         local answer=""
         read -rp "  Proceed with pipeline start? [y/N] " answer || true
         if ! echo "$answer" | grep -qiE '^(y|yes)$'; then
             warn "Pipeline start canceled by user (cost approval gate)"
             emit_event "pipeline.cost_approval_blocked" "predicted_cost=${predicted}" "threshold=${threshold}" "issue=${ISSUE_NUMBER:-0}"
             return 1
         fi
+        emit_event "pipeline.cost_approval_approved" "predicted_cost=${predicted}" "threshold=${threshold}" "issue=${ISSUE_NUMBER:-0}" "mode=manual"
     fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/sw-pipeline.sh` around lines 367 - 377, When the cost gate is
accepted (the branch where read -rp sets answer and the grep check passes), emit
a complementary approval event—call emit_event with a new event name like
"pipeline.cost_approval_approved" and include the same metadata used for blocked
(predicted_cost=${predicted}, threshold=${threshold}, issue=${ISSUE_NUMBER:-0})
so telemetry is complete; add this emit_event call in the success path right
after the user confirms and before returning 0, referencing the existing
needs_approval check, the answer variable, and the emit_event function.
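
The symmetric telemetry can be exercised without a terminal by separating the decision from the prompt. The sketch below is hypothetical: `emit_event` here is a stand-in that prints its arguments (the real implementation is not shown on this page), and `record_cost_gate_decision` takes the user's answer as a parameter instead of calling `read`:

```bash
# Stand-in for the pipeline's emit_event; prints the event name followed
# by its key=value pairs on one line.
emit_event() {
    local name="$1"
    shift
    printf 'event=%s' "$name"
    printf ' %s' "$@"
    printf '\n'
}

# Hypothetical non-interactive gate: emits exactly one event per decision.
record_cost_gate_decision() {
    local answer="$1" predicted="$2" threshold="$3"
    if printf '%s\n' "$answer" | grep -qiE '^(y|yes)$'; then
        emit_event "pipeline.cost_approval_approved" \
            "predicted_cost=${predicted}" "threshold=${threshold}" "mode=manual"
        return 0
    fi
    emit_event "pipeline.cost_approval_blocked" \
        "predicted_cost=${predicted}" "threshold=${threshold}"
    return 1
}
```

An empty answer falls through to the blocked branch, matching the `[y/N]` default in the prompt.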

@RyanD66 RyanD66 closed this by deleting the head repository Apr 5, 2026

Development

Successfully merging this pull request may close these issues.

Pipeline Cost Forecasting and Budget Approval Gate
