Bug_203_EVALUATE: Test Case ORCH-QA-002 — delegate_to_payments appends payment_confirmation next_step even on failed payment #460

@steadhac

Component: finbot/agents/orchestrator.py → OrchestratorAgent.delegate_to_payments (line 533)

Root cause:

# orchestrator.py lines 533-537
result["next_step"] = (
    "IMPORTANT: You MUST now delegate_to_communication to notify the vendor "
    "about this payment outcome. Use notification_type 'payment_confirmation'. "
    "Do NOT call complete_task until the vendor has been notified."
)
return result

next_step is appended unconditionally, regardless of result["task_status"]. When the
payments agent returns task_status="failed", the orchestrator LLM still receives an
instruction to send a payment_confirmation notification, which misleads it into
treating a failed payment as a success.

Steps to reproduce:

  1. Mock run_payments_agent to return {"task_status": "failed", "task_summary": "Payment declined — insufficient funds."}.
  2. Call agent.delegate_to_payments(invoice_id=1, task_description="Pay").
  3. Inspect the returned result dict.

Expected: "next_step" is not in result. A failed payment must not instruct the LLM
to send a payment_confirmation notification.
Actual: "next_step" is in result, and its text tells the LLM to use notification_type
'payment_confirmation' even though task_status is "failed".
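
The reproduction steps above can be sketched without the real agent. This is a minimal stand-in, not the code in orchestrator.py: delegate_to_payments_unguarded is a hypothetical free function that copies only the unconditional assignment from the root-cause excerpt, and run_payments_agent is passed in as a unittest.mock.Mock. How the real method obtains and calls the payments agent is an assumption here.

```python
from unittest.mock import Mock

# Instruction text copied from the root-cause excerpt.
NEXT_STEP = (
    "IMPORTANT: You MUST now delegate_to_communication to notify the vendor "
    "about this payment outcome. Use notification_type 'payment_confirmation'. "
    "Do NOT call complete_task until the vendor has been notified."
)


def delegate_to_payments_unguarded(run_payments_agent, invoice_id, task_description):
    """Hypothetical stand-in that mirrors only the unconditional assignment."""
    result = run_payments_agent(invoice_id=invoice_id, task_description=task_description)
    result["next_step"] = NEXT_STEP  # set regardless of task_status
    return result


# Step 1: mock the payments agent so that it reports a failed payment.
failed_agent = Mock(return_value={
    "task_status": "failed",
    "task_summary": "Payment declined — insufficient funds.",
})

# Step 2: delegate with the arguments from the reproduction steps.
result = delegate_to_payments_unguarded(failed_agent, invoice_id=1, task_description="Pay")

# Step 3: the payment failed, yet the confirmation instruction is still attached.
assert result["task_status"] == "failed"
assert "next_step" in result
assert "payment_confirmation" in result["next_step"]
```

The three assertions pass against the stand-in, which is the "Actual" behaviour described above. The real test in test_orchestrator.py asserts the opposite ("next_step" not in result) and therefore fails until the guard is added.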

How to execute:

pytest tests/unit/agents/test_orchestrator.py::TestQAFindings::test_orch_qa_002_next_step_on_failed_payment_misleads_llm -v

Proposed fix:

# Before (buggy — unconditional):
result["next_step"] = (
    "IMPORTANT: You MUST now delegate_to_communication to notify the vendor ..."
)

# After (correct — guard on status):
if result.get("task_status") == "completed":
    result["next_step"] = (
        "IMPORTANT: You MUST now delegate_to_communication to notify the vendor "
        "about this payment outcome. Use notification_type 'payment_confirmation'. "
        "Do NOT call complete_task until the vendor has been notified."
    )

Impact: When a payment fails, the LLM is actively instructed to send a
payment_confirmation to the vendor. This can cause the vendor to be notified of a
successful payment that never occurred — a data integrity and compliance risk in a financial
workflow. The instruction uses "IMPORTANT: You MUST" phrasing, making the LLM likely to
comply even if other signals suggest the payment failed.

Acceptance criteria:

  • test_orch_qa_002_next_step_on_failed_payment_misleads_llm passes
  • delegate_to_payments with task_status="failed" returns result without next_step
  • delegate_to_payments with task_status="completed" still returns next_step with payment_confirmation
  • All other delegate_to_payments tests continue to pass
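
The first three criteria can be checked in isolation with a sketch like the one below. attach_next_step is a hypothetical helper that wraps the guard from the proposed fix. It is not a function in orchestrator.py, and the "completed" task_summary is made-up sample data.

```python
def attach_next_step(result):
    """Hypothetical helper: add next_step only when the payment completed."""
    if result.get("task_status") == "completed":
        result["next_step"] = (
            "IMPORTANT: You MUST now delegate_to_communication to notify the vendor "
            "about this payment outcome. Use notification_type 'payment_confirmation'. "
            "Do NOT call complete_task until the vendor has been notified."
        )
    return result


# Failed payment: no next_step is attached.
failed = attach_next_step({
    "task_status": "failed",
    "task_summary": "Payment declined — insufficient funds.",
})
assert "next_step" not in failed

# Completed payment: next_step is attached and names payment_confirmation.
completed = attach_next_step({
    "task_status": "completed",
    "task_summary": "Payment sent.",  # illustrative value only
})
assert "payment_confirmation" in completed["next_step"]

# Missing task_status: .get() returns None, so the guard also skips next_step.
missing = attach_next_step({})
assert "next_step" not in missing
```

Because the guard uses result.get("task_status") rather than indexing, a result without a status is treated the same way as a failed one. Whether that is the desired behaviour for malformed agent responses is a design choice this issue does not settle.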
