fix: bound diagnostic subprocess timeout behavior#267

Merged
rlaope merged 1 commit into master from codex/performance-risk-pass-clean on Jun 8, 2026

rlaope commented Jun 8, 2026


Summary

  • Bound AsProfExecutor to a single timeout budget using System.nanoTime() and remaining-time waits.
  • Drain jcmd diagnostic subprocess output on a daemon reader thread so no-output children cannot bypass the 10s timeout.
  • Route compilerqueue through the shared jcmd timeout path while preserving its in-process pid and error-return behavior.
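The single-budget idea in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the actual `AsProfExecutor` code: the class name `DeadlineBudget` and its methods are hypothetical. It assumes the deadline is computed once from `System.nanoTime()` and that every later wait asks only for the time still remaining.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not from the Argus codebase): one deadline is fixed
 * up front, and every wait draws from what is left of it.
 */
final class DeadlineBudget {
    private final long deadlineNanos;

    private DeadlineBudget(long timeoutNanos) {
        // nanoTime is monotonic, so wall-clock adjustments cannot stretch the budget.
        this.deadlineNanos = System.nanoTime() + timeoutNanos;
    }

    static DeadlineBudget start(long timeout, TimeUnit unit) {
        return new DeadlineBudget(unit.toNanos(timeout));
    }

    /** Nanoseconds left in the budget; never negative. */
    long remainingNanos() {
        return Math.max(0L, deadlineNanos - System.nanoTime());
    }

    /**
     * Waits for the child using only the remaining budget.
     * Returns false if the budget ran out before the child exited.
     */
    boolean await(Process process) throws InterruptedException {
        return process.waitFor(remainingNanos(), TimeUnit.NANOSECONDS);
    }
}
```

Under this sketch, a progress loop and the final wait would both call `remainingNanos()` or `await(process)` on the same instance, so time spent in the loop is subtracted from the final wait instead of being granted a second time.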

Performance/Reliability Impact

  • Prevents async-profiler CLI captures from waiting nearly 2x the configured timeout when the child process outlives the progress loop.
  • Prevents web-console/server jcmd diagnostics from hanging on readAllBytes() before timeout enforcement.
  • Adds regression guards that fail on the old timeout-overrun/no-output-child patterns.
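The no-output hang can be illustrated with a small sketch. A blocking `readAllBytes()` on the caller thread only returns once the child closes its output, so a silent child keeps the caller stuck before any timeout check runs. The version below is an assumption-laden illustration rather than the actual `DiagnosticUtil` code: `BoundedProcessRunner` and `run` are hypothetical names, and returning an empty `Optional` on timeout is a choice made for the example. It moves the drain onto a daemon thread and lets the caller thread enforce the deadline.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: output draining and timeout enforcement run independently. */
final class BoundedProcessRunner {
    private BoundedProcessRunner() {}

    /** Returns the child's combined output, or empty if it exceeded the timeout. */
    static Optional<String> run(List<String> command, long timeoutMillis)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        Process process = new ProcessBuilder(command).redirectErrorStream(true).start();

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        Thread reader = new Thread(() -> {
            try (InputStream in = process.getInputStream()) {
                in.transferTo(buffer); // blocks here, not on the caller thread
            } catch (IOException ignored) {
                // The stream may fail once the child is destroyed; nothing to recover.
            }
        }, "diagnostic-output-reader");
        reader.setDaemon(true); // a stuck reader must not keep the JVM alive
        reader.start();

        // The timeout is enforced on the process itself, regardless of output.
        if (!process.waitFor(remaining(deadline), TimeUnit.NANOSECONDS)) {
            process.destroyForcibly();
            return Optional.empty();
        }
        // Give the reader a bounded chance to finish; join(0) would wait forever.
        reader.join(Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remaining(deadline))));
        return Optional.of(buffer.toString(StandardCharsets.UTF_8));
    }

    private static long remaining(long deadline) {
        return Math.max(0L, deadline - System.nanoTime());
    }
}
```

A call such as `BoundedProcessRunner.run(List.of("jcmd", pid, "Thread.print"), 10_000)` would then return within roughly the configured budget even if the child never writes a byte. The sketch uses only Java 11 standard-library calls, in line with the stated compatibility constraint.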

Validation

  • ./gradlew :argus-cli:test --tests io.argus.cli.provider.jdk.AsProfExecutorTest --quiet
  • ./gradlew :argus-server:test --tests io.argus.server.command.impl.DiagnosticUtilTest --tests io.argus.server.command.ServerCommandExecutorTest --quiet
  • ./gradlew :argus-cli:test :argus-diagnostics:test :argus-server:test :argus-agent:test :argus-aggregator:test :argus-spring-boot-starter:test :argus-instrument:test :argus-cli:fatJar --quiet

Notes

  • Work was done in clean worktree /Users/rlaope/Desktop/khope/Argus-performance-clean; the original worktree has pre-existing untracked duplicate Java files that block Gradle baseline evaluation there.

Async-profiler captures and jcmd-backed diagnostics sit on user-facing command paths, so their timeout contracts need to bound total wall time even when child processes keep running or produce no output.

Constraint: Argus must keep Java 11 compatibility and avoid new runtime dependencies for process supervision.

Rejected: Replacing the diagnostic command layer with async futures | too broad for a targeted performance-risk pass.

Confidence: high

Scope-risk: narrow

Directive: Keep stdout/stderr draining independent from timeout enforcement for long-running diagnostic subprocesses.

Tested: ./gradlew :argus-cli:test --tests io.argus.cli.provider.jdk.AsProfExecutorTest --quiet; ./gradlew :argus-server:test --tests io.argus.server.command.impl.DiagnosticUtilTest --tests io.argus.server.command.ServerCommandExecutorTest --quiet; ./gradlew :argus-cli:test :argus-diagnostics:test :argus-server:test :argus-agent:test :argus-aggregator:test :argus-spring-boot-starter:test :argus-instrument:test :argus-cli:fatJar --quiet

Not-tested: Live async-profiler/JFR capture against a production JVM under load.
Signed-off-by: rlaope <piyrw9754@gmail.com>
rlaope merged commit bfe0337 into master Jun 8, 2026
18 of 19 checks passed
rlaope deleted the codex/performance-risk-pass-clean branch June 8, 2026 07:46