TodoWrite consumes 13.7% of tool call budget with no measurable benefit #128

@greynewell

Problem

In our SWE-bench Verified evaluation, the agent spends 13.7% of all tool calls on TodoWrite (Claude Code's built-in task tracking tool), averaging 3.6 calls per task. This overhead provides no measurable benefit: tasks with heavy TodoWrite usage do not resolve at higher rates.

Data

  • 3.6 TodoWrite calls per task on average (across all MCP tasks)
  • 13.7% of total tool budget consumed by TodoWrite
  • No correlation between TodoWrite usage and task resolution
  • With a 30-iteration limit, each wasted call is ~3.3% of the total budget
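
The figures above can be recomputed from per-task tool-call logs. Below is a minimal sketch, assuming a hypothetical record schema (`tool_calls` as a list of tool names, `resolved` as a bool). The function names and the sample records are illustrative only and are not the evaluation data.

```python
from statistics import mean


def todowrite_stats(tasks):
    """Summarise TodoWrite usage over per-task records (hypothetical schema)."""
    todo_counts = [t["tool_calls"].count("TodoWrite") for t in tasks]
    total_calls = sum(len(t["tool_calls"]) for t in tasks)
    return {
        "mean_todowrite_per_task": mean(todo_counts),
        "todowrite_share": sum(todo_counts) / total_calls,
    }


def resolution_rate_by_usage(tasks, threshold):
    """Compare resolution rates for tasks at or above `threshold` TodoWrite calls vs. below."""
    heavy, light = [], []
    for t in tasks:
        bucket = heavy if t["tool_calls"].count("TodoWrite") >= threshold else light
        bucket.append(t["resolved"])

    def rate(flags):
        # None when a bucket is empty, so callers can tell "no data" from 0%.
        return sum(flags) / len(flags) if flags else None

    return {"heavy": rate(heavy), "light": rate(light)}


# Made-up records, only to show the expected input shape.
sample_tasks = [
    {"tool_calls": ["Read", "TodoWrite", "Edit", "TodoWrite"], "resolved": True},
    {"tool_calls": ["Grep", "Read", "Edit", "Bash"], "resolved": True},
    {"tool_calls": ["TodoWrite", "TodoWrite", "TodoWrite", "Read"], "resolved": False},
    {"tool_calls": ["Read", "Edit", "Bash", "Bash"], "resolved": False},
]

stats = todowrite_stats(sample_tasks)
rates = resolution_rate_by_usage(sample_tasks, threshold=2)
```

Splitting at a threshold is only one way to check for a relationship. A rank correlation between TodoWrite count and resolution would be a reasonable alternative if the logs are large enough.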

Root Cause

Claude Code's default behavior includes proactive task list management. When the agent receives a complex problem statement, it creates a todo list, updates it as it works, and marks items complete. Each of these steps consumes a tool call turn that could be spent on actual exploration and coding.

Impact

Recovering even half of these wasted calls (1.8 of the 3.6 per task) would give the agent ~2 additional exploration or editing turns per task. Over 500 tasks, that adds up to roughly 900 tool calls.
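
The arithmetic behind this estimate can be written out as a short sketch. The constants come from the numbers in this issue; the variable names are illustrative, and the last figure is expressed relative to the iteration limit rather than to the calls actually made.

```python
ITERATION_LIMIT = 30            # per-task iteration limit from this issue
TODOWRITE_CALLS_PER_TASK = 3.6  # average from the Data section
TASKS = 500

# One call out of 30 is about 3.3% of a task's budget.
cost_per_call = 1 / ITERATION_LIMIT

# Recovering half of the TodoWrite calls frees about 1.8 turns per task.
recovered_per_task = TODOWRITE_CALLS_PER_TASK / 2

# Across the full run this is about 900 turns.
recovered_total = recovered_per_task * TASKS

# Freed turns as a fraction of the per-task iteration limit.
recovered_budget_share = recovered_per_task / ITERATION_LIMIT

print(f"cost per call: {cost_per_call:.1%}")
print(f"recovered per task: {recovered_per_task:.1f}")
print(f"recovered over {TASKS} tasks: {recovered_total:.0f}")
print(f"recovered share of limit: {recovered_budget_share:.1%}")
```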

Recommended Fixes

  1. Add instruction to MCP server: "Do not use TodoWrite or task management tools — focus all tool calls on exploration and code editing"
  2. Include in agent_prompt: Add a line discouraging TodoWrite usage (though this must be balanced against the prompt length findings from #123, "Long agent_prompt suppresses parallel tool calling in Claude Code harness")
  3. Investigate: Whether this can be suppressed via Claude Code configuration rather than instructions
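
As a starting point for fixes 2 and 3, here is a minimal sketch of how a harness might apply them. It assumes the harness launches the `claude` CLI as a subprocess and that the installed version accepts `-p` and `--disallowedTools`; this should be checked against `claude --help` before relying on it. `build_claude_argv`, `with_todo_suppression`, and the instruction wording are hypothetical and not part of any existing harness.

```python
# Short on purpose: long agent prompts have their own cost (see #123).
NO_TODO_LINE = "Do not use TodoWrite; spend tool calls on exploration and code editing."


def with_todo_suppression(agent_prompt):
    """Fix 2: append a single discouraging line to the existing agent prompt."""
    return f"{agent_prompt.rstrip()}\n\n{NO_TODO_LINE}"


def build_claude_argv(prompt, disallowed_tools=("TodoWrite",)):
    """Fix 3: build a CLI invocation that denies the listed tools.

    Assumption: the CLI supports `--disallowedTools`. The flag is placed
    last so its values cannot be confused with the prompt argument.
    """
    argv = ["claude", "-p", prompt]
    if disallowed_tools:
        argv += ["--disallowedTools", *disallowed_tools]
    return argv


# Example: only builds the argument list, nothing is executed here.
argv = build_claude_argv(with_todo_suppression("Fix the failing test."))
```

If the configuration route works, it is likely preferable to the prompt route, since it adds no prompt length and does not depend on the model following the instruction.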

Labels

performance, swe-bench
