Add hybrid profile support, resolution matching, and system metrics#55

Merged
izzet merged 12 commits into llnl:develop from izzet:feature/profile-support-resolution-match on Mar 21, 2026

izzet commented Mar 20, 2026

Summary

  • Hybrid profile support: Parse DFTracer ph="C" counter events into a canonical profile schema, enabling analysis of selectively-aggregated traces alongside normal duration traces
  • Resolution matching: When the analysis granularity is finer than the profile bucket width (e.g., 1 s analysis vs. 5 s profiles), expand each profile row into sub-buckets using a uniform or weighted distribution, then reconcile the result with the trace HLM using a trace-wins deduplication policy
  • Per-layer reconciliation: Reconcile trace HLM and profile HLM independently per analysis layer, so layer-specific filter predicates are applied before merging
  • System metrics: Parse DFTracer service cat="sys" counter events (CPU utilization, memory stats) into per-time_range system metrics and join them onto flat views for I/O bottleneck correlation
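The resolution-matching step can be illustrated with a minimal sketch. This is not the code from `analyzer.py`: the function names `expand_bucket` and `reconcile_trace_wins`, the keying of rows by a hashable tuple, and the requirement that the bucket width be a whole multiple of the analysis granularity are all assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Optional, Tuple


def expand_bucket(
    start: float,
    width: float,
    total: float,
    granularity: float,
    weights: Optional[List[float]] = None,
) -> List[Tuple[float, float]]:
    """Split one profile bucket into sub-buckets of size `granularity`.

    Returns (sub_bucket_start, share_of_total) pairs. Without weights the
    total is divided evenly (uniform); with weights it is divided
    proportionally to them (weighted).
    """
    n = round(width / granularity)
    # Assumption: the bucket width is a whole multiple of the granularity.
    if n < 1 or abs(n * granularity - width) > 1e-9:
        raise ValueError("bucket width must be a whole multiple of granularity")
    if weights is None:
        weights = [1.0] * n
    if len(weights) != n or sum(weights) <= 0:
        raise ValueError("need one weight per sub-bucket with a positive sum")
    scale = total / sum(weights)
    return [(start + i * granularity, w * scale) for i, w in enumerate(weights)]


def reconcile_trace_wins(
    trace: Dict[Hashable, float], profile: Dict[Hashable, float]
) -> Dict[Hashable, float]:
    """Merge two keyed row sets; on a key collision the trace row is kept."""
    merged = dict(profile)
    merged.update(trace)  # trace rows overwrite profile rows with the same key
    return merged
```

For example, `expand_bucket(0.0, 5.0, 100.0, 1.0)` spreads a 5 s bucket holding a total of 100 evenly over five 1 s sub-buckets, and passing `weights` skews that split. The expanded profile rows can then be merged with trace rows via `reconcile_trace_wins`, so any key present in both sources takes its value from the trace.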

Key changes

  • dftracer.py: New TYPE_PROFILE (6) and TYPE_SYSTEM (7) event types with _standardize_profile_partition and _standardize_system_partition for canonical schema mapping
  • analyzer.py: _expand_profile_buckets, _validate_and_expand_profiles, _reconcile_hlm, _compute_hybrid_hlm for the full resolution matching pipeline; system metrics joined in _process_flat_view
  • types.py: ReadTraceResult extended with profiles, profile_time_granularity, and system_metrics
  • config.py: New config options profile_distribution (uniform/weighted) and profile_time_granularity
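As a rough sketch of how counter events might be routed to the two new event types and how system metrics could be bucketed per time range: the `TYPE_PROFILE` and `TYPE_SYSTEM` values come from the description above, but everything else here is assumed, including the helper names, the rule that any `ph="C"` event outside `cat="sys"` is a profile event, the microsecond `ts` unit, averaging as the aggregation, and the `cpu_pct` key used in the usage note.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Optional

TYPE_PROFILE = 6  # values quoted from the PR description
TYPE_SYSTEM = 7


def classify_counter_event(event: Dict[str, Any]) -> Optional[int]:
    """Return the event type for a counter event, or None for other events."""
    if event.get("ph") != "C":
        return None  # not a counter event (e.g. a normal duration event)
    # Assumption: cat == "sys" marks system metrics, anything else a profile.
    return TYPE_SYSTEM if event.get("cat") == "sys" else TYPE_PROFILE


def system_metrics_by_time_range(
    events: Iterable[Dict[str, Any]], granularity_us: int
) -> Dict[int, Dict[str, float]]:
    """Average each system metric within every time range.

    Assumption: `ts` is in microseconds and time_range = ts // granularity_us.
    """
    sums: Dict[int, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    counts: Dict[int, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for event in events:
        if classify_counter_event(event) != TYPE_SYSTEM:
            continue
        time_range = int(event["ts"]) // granularity_us
        for key, value in event.get("args", {}).items():
            sums[time_range][key] += float(value)
            counts[time_range][key] += 1
    return {
        tr: {key: sums[tr][key] / counts[tr][key] for key in sums[tr]}
        for tr in sums
    }
```

Keying the result by `time_range` means it can be joined onto any flat view that carries the same index, which is the kind of correlation the system metrics are meant to support. Feeding in two `cat="sys"` events with a hypothetical `cpu_pct` argument that fall in the same one-second window produces a single averaged row for that window.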

Test plan

  • tests/test_hybrid_profiles.py: 21 tests covering profile parsing, coalescing, expansion (uniform + weighted), per-layer reconciliation, stat column preservation, edge cases
  • tests/test_system_metrics.py: 16 tests covering system event parsing, standardization, mixed traces, variable time granularity

izzet added 10 commits March 11, 2026 12:33
- Add comm and device layers to AI Logging preset with derived_metrics
  and layer_deps; broaden compute and data_loader conditions
- Retain min/max stat columns (time_min/max, size_min/max, offset_min/max)
  through profile standardization and coalescing
- Implement profile bucket expansion in Analyzer for finer-than-bucket
  analysis granularity with uniform and weighted distribution strategies
- Move granularity validation and expansion from DFTracer reader to
  Analyzer so resolution matching is format-agnostic
- Add profile_distribution and profile_time_granularity to AnalyzerConfig
- Add 15 new tests covering edge cases from production data: dur without
  dur_sum, missing fhash, non-POSIX categories, stat column preservation,
  coalescing min/max semantics, expansion, profile-only traces, and
  custom profile_time_granularity
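The "coalescing min/max semantics" mentioned above could look roughly like the following sketch. The min/max column names are taken from the commit message; the additive columns (`count`, `time`, `size`), the helper name `coalesce_rows`, and the handling of missing values are assumptions rather than the behavior of the merged code.

```python
from typing import Any, Dict, Iterable, Optional

SUM_COLS = ("count", "time", "size")  # hypothetical additive columns
MIN_COLS = ("time_min", "size_min", "offset_min")
MAX_COLS = ("time_max", "size_max", "offset_max")


def coalesce_rows(rows: Iterable[Dict[str, Any]]) -> Dict[str, Optional[float]]:
    """Merge profile rows that share a key into one row.

    Additive columns are summed, *_min columns keep the smallest value and
    *_max columns keep the largest. Missing or None values are skipped.
    """
    rows = list(rows)
    if not rows:
        raise ValueError("nothing to coalesce")
    merged: Dict[str, Optional[float]] = {}
    for col in SUM_COLS:
        merged[col] = sum(row.get(col) or 0 for row in rows)
    for col in MIN_COLS:
        values = [row[col] for row in rows if row.get(col) is not None]
        merged[col] = min(values) if values else None
    for col in MAX_COLS:
        values = [row[col] for row in rows if row.get(col) is not None]
        merged[col] = max(values) if values else None
    return merged
```

Summing a min or max column would produce a value that never occurred in the data, so the stat columns need their own aggregation when rows are merged.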
izzet self-assigned this Mar 20, 2026
izzet added the enhancement (New feature or request) label Mar 20, 2026

codecov bot commented Mar 20, 2026

Codecov Report

❌ Patch coverage is 38.31169% with 285 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.45%. Comparing base (6d90ae0) to head (214f8ac).
⚠️ Report is 54 commits behind head on develop.

Files with missing lines                        Patch %   Lines
python/dftracer/analyzer/dftracer.py            34.86%    142 Missing ⚠️
python/dftracer/analyzer/analyzer.py            32.46%    129 Missing ⚠️
python/dftracer/analyzer/types.py               64.00%    9 Missing ⚠️
python/dftracer/analyzer/output.py              84.61%    2 Missing ⚠️
python/dftracer/analyzer/analysis_utils.py      85.71%    1 Missing ⚠️
python/dftracer/analyzer/darshan.py             50.00%    1 Missing ⚠️
python/dftracer/analyzer/recorder.py            50.00%    1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop      #55      +/-   ##
===========================================
- Coverage    62.52%   58.45%   -4.08%     
===========================================
  Files           26       27       +1     
  Lines         2231     2903     +672     
===========================================
+ Hits          1395     1697     +302     
- Misses         836     1206     +370     

izzet merged commit a2cf358 into llnl:develop Mar 21, 2026
4 checks passed
izzet deleted the feature/profile-support-resolution-match branch March 21, 2026 01:01