
chore(pm): perf trace #2571

Draft

elrrrrrrr wants to merge 2 commits into next from chore-pm-perf-trace

Conversation

@elrrrrrrr
Contributor

Summary

Test Plan

@gemini-code-assist
Contributor

Summary of Changes

Hello @elrrrrrrr, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the observability and diagnostic capabilities of the utoo-pm package manager. By integrating tracing-chrome, it enables the generation of detailed performance traces that can be visualized and analyzed using standard tools. Complementing this, a new Python script automates the analysis of these traces, producing structured reports that highlight potential bottlenecks. A comprehensive protocol document guides the use of these tools for systematic performance optimization, providing a robust framework for understanding and improving the performance characteristics of utoo-pm's core operations.

Highlights

  • Performance Tracing Integration: Integrated the tracing-chrome crate into utoo-pm as an optional dependency, enabling the generation of detailed Chrome Trace format data for performance analysis. This allows developers to visualize execution flows and identify bottlenecks using tools like chrome://tracing.
  • Automated Trace Analysis Script: Introduced a new Python script (analyze_pm_trace.py) designed to parse the generated Chrome Trace data. This script calculates key performance metrics related to network, file I/O, and decompression, and generates comprehensive Markdown reports, including a summary mode for comparing multiple traces.
  • Performance Analysis Protocol: Added a detailed Markdown document (utoopm-performance-agent.md) that outlines a structured protocol for diagnosing and optimizing utoo-pm performance. It defines a tiered diagnostic matrix (P0-P3) covering network, file I/O, decompression, and concurrency, along with an actionable workflow and optimization playbook.
  • Granular Span Instrumentation: Applied #[instrument] macros and .instrument(tracing::trace_span!()) to critical asynchronous functions and retry blocks across various pm crates. This provides fine-grained visibility into the execution times of operations such as dependency building, package installation, cloning, downloading, and unpacking.
  • Unified Tracing Guard: Implemented a TracingGuard enum in the logger utility to manage different tracing backend guards (standard file logging and Chrome trace). This ensures proper flushing and resource management, especially when switching between standard logging and performance tracing modes via environment variables.
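For readers unfamiliar with the format: tracing-chrome emits standard Chrome Trace Event JSON, which a script like analyze_pm_trace.py can aggregate per span name. The snippet below is a minimal, hypothetical sketch of that kind of aggregation — the span names and logic are illustrative assumptions, not the actual script — using "complete" events (`"ph": "X"`) whose `ts`/`dur` fields are in microseconds:

```python
import json
from collections import defaultdict

def aggregate_durations(trace_events):
    """Sum duration (microseconds) of Chrome Trace 'complete' events per span name.

    Assumes events shaped like {"ph": "X", "name": ..., "ts": ..., "dur": ...}.
    """
    totals = defaultdict(float)
    for ev in trace_events:
        if ev.get("ph") == "X":
            totals[ev["name"]] += ev.get("dur", 0.0)
    return dict(totals)

# Hypothetical trace with two download spans and one tar_extract span.
events = [
    {"ph": "X", "name": "download", "ts": 0, "dur": 1500},
    {"ph": "X", "name": "download", "ts": 2000, "dur": 500},
    {"ph": "X", "name": "tar_extract", "ts": 2500, "dur": 3000},
]
print(aggregate_durations(events))
# → {'download': 2000.0, 'tar_extract': 3000.0}
```

The same file can also be loaded directly in chrome://tracing or Perfetto for visual inspection.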
Changelog
  • Cargo.lock
    • Added tracing-chrome to the dependency lock file.
  • agents/tools/analyze_pm_trace.py
    • Added new Python script for analyzing Chrome Trace data from utoo-pm.
    • Implemented single trace analysis mode to generate detailed performance reports.
    • Implemented summary mode to compare metrics across multiple trace files.
    • Calculates and reports on network, file I/O, and decompression performance metrics.
    • Generates markdown reports with executive summaries, workload distributions, and recommendations.
  • agents/utoopm-performance-agent.md
    • Added new documentation outlining the 'utoo-pm Performance Analysis Agent Protocol'.
    • Defined a tiered diagnostic matrix (P0-P3) for network, file I/O, decompression, and concurrency.
    • Provided an actionable diagnostic workflow, cache analysis, and optimization playbook.
    • Mapped operations to relevant files and tracing span names.
  • crates/pm/Cargo.toml
    • Added tracing-chrome as an optional dependency.
    • Introduced a tracing-chrome feature flag to conditionally enable the dependency.
  • crates/pm/src/cmd/deps.rs
    • Added #[instrument] macro to the build_deps function for tracing.
  • crates/pm/src/helper/lock.rs
    • Added #[instrument] macro to ensure_package_lock function for tracing.
    • Added #[instrument] macro to build_deps_with_download function for tracing.
  • crates/pm/src/service/install.rs
    • Added #[instrument] macro to clean_deps function for tracing.
    • Added #[instrument] macro to install_packages function, including depth_levels field.
    • Added #[instrument] macro to InstallService::install function for tracing.
  • crates/pm/src/service/rebuild.rs
    • Added #[instrument] macro to RebuildService::rebuild function for tracing.
  • crates/pm/src/util/cloner.rs
    • Added #[instrument] macro to the clone_dir function for tracing.
    • Instrumented platform-specific clone operations (clonefile, linux_clone, windows_copy) with dedicated trace spans.
    • Added #[instrument] macro to clone_package function, including pkg field for tracing.
  • crates/pm/src/util/downloader.rs
    • Added #[instrument] macro to the download function, including url field for tracing.
    • Instrumented HTTP requests within download with a dedicated trace span.
    • Added #[instrument] macro to try_unpack_stream_direct function for tracing.
    • Instrumented tar_extract and file_write_batch tasks with dedicated trace spans, including file count and total bytes.
  • crates/pm/src/util/linker.rs
    • Added #[instrument] macro to the link function for tracing.
  • crates/pm/src/util/logger.rs
    • Introduced TracingGuard enum to unify different tracing backend guards.
    • Modified init_tracing to support Chrome Trace output when TRACING_CHROME environment variable is set.
    • Added init_chrome_tracing function (conditional on tracing-chrome feature) for Chrome trace setup.
    • Refactored original tracing initialization into init_standard_tracing.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/pm-perf.yml
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces performance tracing capabilities using tracing and tracing-chrome, which is a great addition for performance analysis and optimization. The changes include adding #[instrument] macros across the codebase, a new Python script for trace analysis, and documentation for the performance analysis protocol. The implementation is solid, but I've found a couple of issues: one correctness bug in the Python analysis script that leads to inaccurate I/O metrics, and another bug in the downloader's batching logic. My review includes suggestions to fix these issues.

Comment on lines +181 to +182

```python
io_time_ms = clone_stats['duration'] / 1000.0
io_pct = (clone_stats['duration'] * 100) / max_thread_work
```

Severity: high

The clone_stats['duration'] metric is likely incorrect. The logic for its calculation (lines 91-99) double-counts durations from nested spans (e.g., clone_package contains clone_dir), leading to an inflated total I/O time.

To get an accurate total I/O time, you should use the duration from cat_stats['P1: File I/O'], which is calculated correctly without double-counting. This will give you the true total time for all file I/O operations.

Suggested change

```diff
-io_time_ms = clone_stats['duration'] / 1000.0
-io_pct = (clone_stats['duration'] * 100) / max_thread_work
+io_stats = cat_stats.get('P1: File I/O', {'duration': 0.0})
+io_time_ms = io_stats['duration'] / 1000.0
+io_pct = (io_stats['duration'] * 100) / max_thread_work
```
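To illustrate the double-counting concern: when a span such as clone_package fully contains clone_dir, summing both raw durations counts the inner interval twice. One standard way to avoid this is to merge the spans' (start, end) intervals before summing — the sketch below is illustrative only, not the script's actual implementation:

```python
def merged_duration(spans):
    """Total covered time of possibly-overlapping/nested (start, end) intervals."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(spans):
        if current_start is None or start > current_end:
            # Disjoint interval: close out the previous run.
            if current_start is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or nested interval: extend the current run.
            current_end = max(current_end, end)
    if current_start is not None:
        total += current_end - current_start
    return total

# Hypothetical timings: clone_package (0..100) fully contains clone_dir (10..90).
naive = (100 - 0) + (90 - 10)                    # 180: inner time counted twice
merged = merged_duration([(0, 100), (10, 90)])   # 100: counted once
```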

```rust
for task in write_tasks.drain(..) {
    task.await??;
}
batch_size = 0;
```

Severity: high

The total_bytes variable is not reset to 0 after a batch is processed. This will cause the batching condition total_bytes >= MAX_BATCH_BYTES to behave incorrectly for subsequent batches, potentially leading to smaller-than-intended batches and affecting performance.

```rust
batch_size = 0;
total_bytes = 0;
```
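The effect of the missing reset is easy to simulate. The model below is illustrative, not the downloader's actual code; the threshold constants and names are invented. Without the reset, the stale byte count stays above the cap forever, so every file after the first flush triggers its own flush:

```python
MAX_BATCH_FILES = 4    # illustrative thresholds, not the real constants
MAX_BATCH_BYTES = 1000

def count_flushes(file_sizes, reset_bytes):
    """Count how many times the batch is flushed for a stream of file sizes."""
    flushes = 0
    batch_size = 0
    total_bytes = 0
    for size in file_sizes:
        batch_size += 1
        total_bytes += size
        if batch_size >= MAX_BATCH_FILES or total_bytes >= MAX_BATCH_BYTES:
            flushes += 1            # await all queued write tasks
            batch_size = 0
            if reset_bytes:
                total_bytes = 0     # the fix: reset both counters
    return flushes

sizes = [400] * 8
print(count_flushes(sizes, reset_bytes=True))   # 2 flushes: byte cap trips every 3 files
print(count_flushes(sizes, reset_bytes=False))  # 6 flushes: stale byte count over-triggers
```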

@github-actions

github-actions bot commented Feb 3, 2026

utoo-pm Performance Report (ubuntu-latest)


utoo-pm Performance Summary

Generated: 2026-02-03 15:12:59
Trace Count: 8


Benchmark Results Comparison

| Scenario | Wall Time | Network | I/O | Decompress | Parallelism |
| --- | --- | --- | --- | --- | --- |
| ant-design-x_npmjs_cold_ubuntu-latest | 25,621ms | 21.9% | 3.1% | 30.2% | 2.0x |
| ant-design-x_npmjs_warm_ubuntu-latest | 9,310ms | 19.0% | 31.1% | 0.0% | 1.2x |
| ant-design-x_npmmirror_cold_ubuntu-latest | 134,164ms | 11.3% | 6.1% | 60.1% | 0.2x |
| ant-design-x_npmmirror_warm_ubuntu-latest | 38,492ms | 15.1% | 45.2% | 0.0% | 0.3x |
| ant-design_npmjs_cold_ubuntu-latest | 21,121ms | 22.2% | 4.0% | 28.9% | 2.4x |
| ant-design_npmjs_warm_ubuntu-latest | 8,022ms | 18.9% | 30.3% | 0.4% | 1.2x |
| ant-design_npmmirror_cold_ubuntu-latest | 144,550ms | 11.0% | 5.3% | 61.8% | 0.1x |
| ant-design_npmmirror_warm_ubuntu-latest | 9,868ms | 15.1% | 45.9% | 0.0% | 0.8x |

By Project

ant-design

| Type | Registry | Wall Time | Network % | I/O % |
| --- | --- | --- | --- | --- |
| cold | npmjs | 21,121ms | 22.2% | 4.0% |
| warm | npmjs | 8,022ms | 18.9% | 30.3% |
| cold | npmmirror | 144,550ms | 11.0% | 5.3% |
| warm | npmmirror | 9,868ms | 15.1% | 45.9% |

ant-design-x

| Type | Registry | Wall Time | Network % | I/O % |
| --- | --- | --- | --- | --- |
| cold | npmjs | 25,621ms | 21.9% | 3.1% |
| warm | npmjs | 9,310ms | 19.0% | 31.1% |
| cold | npmmirror | 134,164ms | 11.3% | 6.1% |
| warm | npmmirror | 38,492ms | 15.1% | 45.2% |

Registry Comparison

| Registry | Avg Wall Time | Avg Network % | Scenarios |
| --- | --- | --- | --- |
| npmjs | 16,019ms | 20.5% | 4 |
| npmmirror | 81,769ms | 13.1% | 4 |

Cold vs Warm Install

| Type | Avg Wall Time | Avg Network % | Avg I/O % | Scenarios |
| --- | --- | --- | --- | --- |
| cold | 81,364ms | 16.6% | 4.6% | 4 |
| warm | 16,423ms | 17.0% | 38.1% | 4 |

Cache Speedup: 5.0x faster with warm cache
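For context, the speedup figure appears to be the ratio of the cold and warm average wall times from the Cold vs Warm Install table above:

```python
# Averages taken from the ubuntu-latest report above.
cold_avg_ms = 81_364
warm_avg_ms = 16_423
speedup = cold_avg_ms / warm_avg_ms
print(f"{speedup:.1f}x faster with warm cache")
```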


Summary generated by utoo-pm Performance Analysis Agent

@github-actions

github-actions bot commented Feb 3, 2026

utoo-pm Performance Report (macos-latest)


utoo-pm Performance Summary

Generated: 2026-02-03 15:18:30
Trace Count: 8


Benchmark Results Comparison

| Scenario | Wall Time | Network | I/O | Decompress | Parallelism |
| --- | --- | --- | --- | --- | --- |
| ant-design-x_npmjs_cold_macos-latest | 49,406ms | 17.0% | 37.0% | 13.4% | 2.1x |
| ant-design-x_npmjs_warm_macos-latest | 29,044ms | 1.4% | 95.2% | 0.0% | 6.0x |
| ant-design-x_npmmirror_cold_macos-latest | 73,935ms | 10.3% | 49.8% | 25.3% | 0.7x |
| ant-design-x_npmmirror_warm_macos-latest | 30,176ms | 2.6% | 91.1% | 0.0% | 1.8x |
| ant-design_npmjs_cold_macos-latest | 49,987ms | 19.5% | 31.6% | 11.9% | 2.0x |
| ant-design_npmjs_warm_macos-latest | 18,732ms | 3.5% | 88.2% | 0.0% | 4.1x |
| ant-design_npmmirror_cold_macos-latest | 103,073ms | 12.2% | 44.0% | 27.9% | 0.5x |
| ant-design_npmmirror_warm_macos-latest | 15,044ms | 2.3% | 91.8% | 0.0% | 2.9x |

By Project

ant-design

| Type | Registry | Wall Time | Network % | I/O % |
| --- | --- | --- | --- | --- |
| cold | npmjs | 49,987ms | 19.5% | 31.6% |
| warm | npmjs | 18,732ms | 3.5% | 88.2% |
| cold | npmmirror | 103,073ms | 12.2% | 44.0% |
| warm | npmmirror | 15,044ms | 2.3% | 91.8% |

ant-design-x

| Type | Registry | Wall Time | Network % | I/O % |
| --- | --- | --- | --- | --- |
| cold | npmjs | 49,406ms | 17.0% | 37.0% |
| warm | npmjs | 29,044ms | 1.4% | 95.2% |
| cold | npmmirror | 73,935ms | 10.3% | 49.8% |
| warm | npmmirror | 30,176ms | 2.6% | 91.1% |

Registry Comparison

| Registry | Avg Wall Time | Avg Network % | Scenarios |
| --- | --- | --- | --- |
| npmjs | 36,792ms | 10.3% | 4 |
| npmmirror | 55,557ms | 6.8% | 4 |

Cold vs Warm Install

| Type | Avg Wall Time | Avg Network % | Avg I/O % | Scenarios |
| --- | --- | --- | --- | --- |
| cold | 69,100ms | 14.7% | 40.6% | 4 |
| warm | 23,249ms | 2.5% | 91.6% | 4 |

Cache Speedup: 3.0x faster with warm cache


Summary generated by utoo-pm Performance Analysis Agent
