Report shows incorrect 'passed' status and omits final objective result #56

@surishubham

Bug Report

Summary

The Kane CLI shareable report on Test Manager shows a passed status even though the actual test objective was never completed. The report only reflects that an intermediate step (login) succeeded, not the final goal of the prompt.

Steps to Reproduce

  1. Run kane-cli run "<objective with multiple steps>" --agent, where the objective requires a login followed by a subsequent action
  2. The agent successfully logs in but hits a blocker before completing the final objective (e.g., a paywall)
  3. Open the generated test_url report in Test Manager

Actual Behavior

  • The report shows the test as passed
  • The final objective/assertion (e.g., "click Upgrade Now and assert billing URL contains 'product_group_key=kane-ai-group'") is not reflected in the report
  • The report appears to mark the run as passed based on partial step completion (login succeeded) without evaluating whether the full prompt goal was achieved

Expected Behavior

  • The report should accurately reflect the final result of the complete prompt objective
  • If the agent could not complete the main goal (even if intermediate steps passed), the report status should be failed or incomplete
  • The full prompt/objective text should be visible in the report so reviewers can understand what was being tested
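
The expected behavior above can be sketched as a small status rule. This is a hypothetical illustration only, not Kane CLI's or Test Manager's actual implementation: the names `Status`, `StepResult`, `is_final_objective`, and `report_status` are assumptions introduced here. The idea is that the run status is derived from the final objective, not from whether intermediate steps such as login succeeded.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PASSED = "passed"
    FAILED = "failed"
    INCOMPLETE = "incomplete"


@dataclass
class StepResult:
    # Hypothetical record of one agent step; not a real Kane CLI type.
    name: str
    succeeded: bool
    is_final_objective: bool = False


def report_status(steps: list[StepResult]) -> Status:
    """Derive the run status from the final objective, not intermediate steps."""
    final = [s for s in steps if s.is_final_objective]
    if not final:
        # The agent never reached the final objective (e.g. blocked by a paywall).
        return Status.INCOMPLETE
    # The run passes only if every final-objective step succeeded.
    return Status.PASSED if all(s.succeeded for s in final) else Status.FAILED


# Login succeeded, but the final assertion was never attempted.
blocked_run = [StepResult("login", succeeded=True)]
print(report_status(blocked_run).value)
```

Under this rule, the scenario in this report (login succeeds, the agent is blocked before the billing URL assertion) would be reported as incomplete instead of passed.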

Run Details

Environment

  • kane-cli version: 0.3.4
  • Auth: OAuth, prod environment
  • OS: macOS Darwin 25.3.0

Impact

This is a trust/correctness issue: developers and reviewers who use the shareable report URL to verify a PR (e.g., as a CI gate) will see a misleading "passed" result even though the actual test goal was never verified. The report cannot be relied upon as evidence that the change works correctly.
