From 917f3ce440b30b7a0d51f911f5e4b7fc0381ecb9 Mon Sep 17 00:00:00 2001
From: DemchaAV
Date: Sun, 14 Jun 2026 19:02:44 +0100
Subject: [PATCH 01/10] chore(benchmarks): remove three redundant benchmark
mains

FullCvBenchmark duplicated the JMH TemplateCvJmhBenchmark (CV through
ModernProfessional) with a hand-rolled, JIT-noisier loop and no report.
GraphComposeBenchmark was an early-engine relic measuring the same
title+body+divider doc as CurrentSpeedBenchmark's engine-simple scenario.
ScalabilityBenchmark's thread-scaling sweep is folded into
CurrentSpeedBenchmark's full-profile throughput run (thread counts now
1,2,4,8,16).

Drop the matching run-benchmarks.ps1 steps and the benchmarks.md /
benchmarks/README.md entries. ComparativeBenchmark, the JMH benches, the
deterministic probes, and the soak/stress runners stay. Benchmark module
compiles; its 28 tests pass.
---
CHANGELOG.md | 7 ++
benchmarks/README.md | 6 +-
.../demcha/compose/CurrentSpeedBenchmark.java | 4 +-
.../com/demcha/compose/FullCvBenchmark.java | 84 ------------------
.../demcha/compose/GraphComposeBenchmark.java | 79 -----------------
.../demcha/compose/ScalabilityBenchmark.java | 88 -------------------
docs/operations/benchmarks.md | 9 +-
scripts/run-benchmarks.ps1 | 7 +-
8 files changed, 15 insertions(+), 269 deletions(-)
delete mode 100644 benchmarks/src/main/java/com/demcha/compose/FullCvBenchmark.java
delete mode 100644 benchmarks/src/main/java/com/demcha/compose/GraphComposeBenchmark.java
delete mode 100644 benchmarks/src/main/java/com/demcha/compose/ScalabilityBenchmark.java
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 19c44ff5f..e9f7124c2 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -337,6 +337,13 @@ Entries land here as they merge.
### Internal
+- **Benchmark suite cleanup (not shipped).** Removed three redundant
+ benchmark mains: `FullCvBenchmark` (superseded by the JMH
+ `TemplateCvJmhBenchmark`), `GraphComposeBenchmark` (early-engine relic
+ duplicating `CurrentSpeedBenchmark`'s `engine-simple` scenario), and
+ `ScalabilityBenchmark` (its thread-scaling sweep folded into
+ `CurrentSpeedBenchmark`'s full-profile throughput run, now `1,2,4,8,16`).
+ Dropped the matching `run-benchmarks.ps1` steps and doc entries.
- **Removed the `java.awt.*` / `java.util.*` co-wildcard in four files.**
`InvoiceTemplateComposer`, `ProposalTemplateComposer`,
`WeeklyScheduleTemplateComposer`, and the engine `PdfRenderingSystemECS`
diff --git a/benchmarks/README.md b/benchmarks/README.md
index f6041365c..e232c6e21 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -62,15 +62,11 @@
| File | Role |
|---|---|
| `CurrentSpeedBenchmark` | Default scenario runner — what CI's `perf-smoke` job exercises. Takes a `-Dgraphcompose.benchmark.profile=smoke\|full\|stress` switch. |
-| `ComparativeBenchmark` | Renders the same fixtures through GraphCompose, iText, openHTMLToPDF, JasperReports. **Rough local comparison only** — see "When not to use" above. |
-| `FullCvBenchmark`, `ScalabilityBenchmark` | Fixture-specific runners for CV and table-heavy scenarios. |
-| `CanonicalBenchmarkSupport`, `BenchmarkSupport` | Shared fixture builders + measurement helpers. |
+| `ComparativeBenchmark` | Renders the same fixtures through GraphCompose, iText, openHTMLToPDF, JasperReports. **Rough local comparison only** — see "When not to use" above. |
+| `CanonicalBenchmarkSupport`, `BenchmarkSupport` | Shared fixture builders + measurement helpers. |
| `BenchmarkReportWriter` | Writes JSON / CSV / text reports under `benchmarks/target/benchmarks/`. |
| `BenchmarkDiffTool` | Compares two JSON reports and prints a delta table. Useful for pre/post comparisons. |
| `BenchmarkMedianTool` | Median + dispersion across N runs of the same scenario. |
| `GraphComposeStressTest`, `EnduranceTest` | Long-running stress / endurance harnesses. |
-| `GraphComposeBenchmark` | Legacy entry point preserved for one downstream caller. New work should target `CurrentSpeedBenchmark`. |
-
## Running
From the repo root:
diff --git a/benchmarks/src/main/java/com/demcha/compose/CurrentSpeedBenchmark.java b/benchmarks/src/main/java/com/demcha/compose/CurrentSpeedBenchmark.java
index 2858d64a6..bbda30b8f 100644
--- a/benchmarks/src/main/java/com/demcha/compose/CurrentSpeedBenchmark.java
+++ b/benchmarks/src/main/java/com/demcha/compose/CurrentSpeedBenchmark.java
@@ -55,7 +55,9 @@ public final class CurrentSpeedBenchmark {
private static final int DEFAULT_FULL_WARMUP_ITERATIONS = 12;
private static final int DEFAULT_FULL_MEASUREMENT_ITERATIONS = 40;
private static final int DEFAULT_FULL_DOCS_PER_THREAD = 12;
- private static final String DEFAULT_FULL_THREAD_COUNTS = "1,2,4,8";
+ // The 16-thread tier is absorbed from the removed ScalabilityBenchmark so the
+ // full profile keeps a thread-scaling data point (smoke runs no throughput).
+ private static final String DEFAULT_FULL_THREAD_COUNTS = "1,2,4,8,16";
// Bumped from 2/5 to 30/100 so smoke runs reach a steady JIT state and the
// p95 calculation actually has enough samples to interpolate rather than
// collapsing to the maximum observed time. The smoke profile remains the
diff --git a/benchmarks/src/main/java/com/demcha/compose/FullCvBenchmark.java b/benchmarks/src/main/java/com/demcha/compose/FullCvBenchmark.java
deleted file mode 100644
index c035f96e3..000000000
--- a/benchmarks/src/main/java/com/demcha/compose/FullCvBenchmark.java
+++ /dev/null
@@ -1,84 +0,0 @@
-package com.demcha.compose;
-
-import com.demcha.compose.document.api.DocumentSession;
-import com.demcha.compose.document.templates.api.DocumentTemplate;
-import com.demcha.compose.document.templates.cv.presets.ModernProfessional;
-import com.demcha.compose.document.templates.cv.spec.CvSpec;
-import com.demcha.compose.document.theme.BusinessTheme;
-import org.apache.pdfbox.pdmodel.common.PDRectangle;
-
-import java.util.Arrays;
-
-public class FullCvBenchmark {
-
- private static final int WARMUP_ITERATIONS = Integer.getInteger("graphcompose.benchmark.fullCv.warmup", 100);
- private static final int MEASUREMENT_ITERATIONS = Integer.getInteger("graphcompose.benchmark.fullCv.iterations", 500);
-
- public static void main(String[] args) {
- BenchmarkSupport.configureQuietLogging();
- System.out.println("Starting FullCvBenchmark...");
-
- CvSpec cv = CanonicalBenchmarkSupport.canonicalCv();
- DocumentTemplate template = ModernProfessional.create(BusinessTheme.modern());
-
- System.out.println("Warming up JVM (JIT compilation, font cache warmup)...");
- for (int i = 0; i < WARMUP_ITERATIONS; i++) {
- generateCvInMemory(template, cv);
- }
-
- System.out.println("Measuring performance (" + MEASUREMENT_ITERATIONS + " iterations)...");
- long[] durationsNs = new long[MEASUREMENT_ITERATIONS];
-
- for (int i = 0; i < MEASUREMENT_ITERATIONS; i++) {
- long start = System.nanoTime();
- generateCvInMemory(template, cv);
- long end = System.nanoTime();
- durationsNs[i] = end - start;
- }
-
- printStatistics(durationsNs);
- }
-
- private static void generateCvInMemory(DocumentTemplate template, CvSpec cv) {
- try (DocumentSession document = GraphCompose.document()
- .pageSize(com.demcha.compose.document.api.DocumentPageSize.A4)
- .margin(15, 10, 15, 15)
- .create()) {
- template.compose(document, cv);
- document.toPdfBytes();
- } catch (Exception e) {
- throw new RuntimeException("Failed to generate PDF", e);
- }
- }
-
- private static void printStatistics(long[] durationsNs) {
- Arrays.sort(durationsNs);
-
- double[] durationsMs = Arrays.stream(durationsNs).mapToDouble(ns -> ns / 1_000_000.0).toArray();
-
- double min = durationsMs[0];
- double max = durationsMs[durationsMs.length - 1];
- double avg = Arrays.stream(durationsMs).average().orElse(0.0);
- double median = durationsMs[(int) (durationsMs.length * 0.5)];
- double p95 = durationsMs[(int) (durationsMs.length * 0.95)];
- double p99 = durationsMs[(int) (durationsMs.length * 0.99)];
-
- System.out.println("\nBenchmark results (milliseconds):");
- System.out.println("------------------------------------------------");
- System.out.printf("Min time: %.2f ms%n", min);
- System.out.printf("Average time: %.2f ms%n", avg);
- System.out.printf("Median (50%%): %.2f ms (typical response time)%n", median);
- System.out.printf("95th percentile: %.2f ms (95%% of runs finish within this)%n", p95);
- System.out.printf("99th percentile: %.2f ms (rare spikes or GC pressure)%n", p99);
- System.out.printf("Max time: %.2f ms%n", max);
- System.out.println("------------------------------------------------");
-
- if (median < 200) {
- System.out.println("Verdict: Excellent. The engine is very fast for this scenario.");
- } else if (median < 1000) {
- System.out.println("Verdict: Good. This is a healthy speed for complex generation.");
- } else {
- System.out.println("Verdict: Slow enough to investigate with a profiler.");
- }
- }
-}
diff --git a/benchmarks/src/main/java/com/demcha/compose/GraphComposeBenchmark.java b/benchmarks/src/main/java/com/demcha/compose/GraphComposeBenchmark.java
deleted file mode 100644
index f4717e66c..000000000
--- a/benchmarks/src/main/java/com/demcha/compose/GraphComposeBenchmark.java
+++ /dev/null
@@ -1,79 +0,0 @@
-package com.demcha.compose;
-
-import com.demcha.compose.engine.components.style.Margin;
-import org.apache.pdfbox.pdmodel.common.PDRectangle;
-
-import java.util.Arrays;
-
-public class GraphComposeBenchmark {
-
- private static final int WARMUP_ITERATIONS = Integer.getInteger("graphcompose.benchmark.coreEngine.warmup", 100);
- private static final int MEASUREMENT_ITERATIONS = Integer.getInteger("graphcompose.benchmark.coreEngine.iterations", 500);
-
- public static void main(String[] args) {
- BenchmarkSupport.configureQuietLogging();
- System.out.println("Starting GraphComposeBenchmark...");
-
- System.out.println("Warming up JVM (JIT compilation, font cache warmup)...");
- for (int i = 0; i < WARMUP_ITERATIONS; i++) {
- generateCvInMemory();
- }
-
- System.out.println("Measuring performance (" + MEASUREMENT_ITERATIONS + " iterations)...");
- long[] durationsNs = new long[MEASUREMENT_ITERATIONS];
-
- for (int i = 0; i < MEASUREMENT_ITERATIONS; i++) {
- long start = System.nanoTime();
- generateCvInMemory();
- long end = System.nanoTime();
- durationsNs[i] = end - start;
- }
-
- printStatistics(durationsNs);
- }
-
- private static void generateCvInMemory() {
- try {
- CanonicalBenchmarkSupport.renderSimpleBenchmarkDocument(
- PDRectangle.A4,
- Margin.of(24),
- "CoreEngineRoot",
- "GraphCompose Core Benchmark",
- "Analytical engineer focused on reliable platform design. "
- + "Testing paragraph breaking and layout calculation engine.");
- } catch (Exception e) {
- throw new RuntimeException("Failed to generate PDF", e);
- }
- }
-
- private static void printStatistics(long[] durationsNs) {
- Arrays.sort(durationsNs);
-
- double[] durationsMs = Arrays.stream(durationsNs).mapToDouble(ns -> ns / 1_000_000.0).toArray();
-
- double min = durationsMs[0];
- double max = durationsMs[durationsMs.length - 1];
- double avg = Arrays.stream(durationsMs).average().orElse(0.0);
- double median = durationsMs[(int) (durationsMs.length * 0.5)];
- double p95 = durationsMs[(int) (durationsMs.length * 0.95)];
- double p99 = durationsMs[(int) (durationsMs.length * 0.99)];
-
- System.out.println("\nBenchmark results (milliseconds):");
- System.out.println("------------------------------------------------");
- System.out.printf("Min time: %.2f ms%n", min);
- System.out.printf("Average time: %.2f ms%n", avg);
- System.out.printf("Median (50%%): %.2f ms (typical response time)%n", median);
- System.out.printf("95th percentile: %.2f ms (95%% of runs finish within this)%n", p95);
- System.out.printf("99th percentile: %.2f ms (rare spikes or GC pressure)%n", p99);
- System.out.printf("Max time: %.2f ms%n", max);
- System.out.println("------------------------------------------------");
-
- if (median < 100) {
- System.out.println("Verdict: Excellent. The engine is very fast for this scenario.");
- } else if (median < 500) {
- System.out.println("Verdict: Good. This is a healthy speed for a synchronous REST API.");
- } else {
- System.out.println("Verdict: Slow enough to investigate with a profiler.");
- }
- }
-}
diff --git a/benchmarks/src/main/java/com/demcha/compose/ScalabilityBenchmark.java b/benchmarks/src/main/java/com/demcha/compose/ScalabilityBenchmark.java
deleted file mode 100644
index b8e945ef6..000000000
--- a/benchmarks/src/main/java/com/demcha/compose/ScalabilityBenchmark.java
+++ /dev/null
@@ -1,88 +0,0 @@
-package com.demcha.compose;
-
-import com.demcha.compose.engine.components.style.Margin;
-import org.apache.pdfbox.pdmodel.common.PDRectangle;
-
-import java.util.ArrayList;
-import java.util.Arrays;
-import java.util.List;
-import java.util.concurrent.*;
-
-/**
- * Linear Scalability Test
- * Measures throughput (documents per second) as thread count increases.
- */
-public class ScalabilityBenchmark {
-
- private static final int DOCUMENTS_PER_THREAD = Integer.getInteger("graphcompose.scalability.documentsPerThread", 100);
- private static final int WARMUP_DOCS = Integer.getInteger("graphcompose.scalability.warmupDocs", 100);
- private static final String THREAD_COUNTS = System.getProperty("graphcompose.scalability.threads", "1,2,4,8,16");
-
- public static void main(String[] args) throws Exception {
- BenchmarkSupport.configureQuietLogging();
- System.out.println("Starting Scalability Benchmark: Linear Scalability");
- System.out.println("------------------------------------------------------------");
-
- // Warmup
- for (int i = 0; i < WARMUP_DOCS; i++) {
- generateOne();
- }
-
- int[] threadCounts = parseThreadCounts(THREAD_COUNTS);
- System.out.println(String.format("%-10s | %-15s | %-12s", "Threads", "Total Docs", "Throughput (docs/sec)"));
- System.out.println("------------------------------------------------------------");
-
- for (int threads : threadCounts) {
- runScalabilityTest(threads);
- }
- }
-
- private static void runScalabilityTest(int threads) throws Exception {
- int totalDocs = threads * DOCUMENTS_PER_THREAD;
- ExecutorService executor = Executors.newFixedThreadPool(threads);
-
- long startTime = System.nanoTime();
-
- List<Future<?>> futures = new ArrayList<>();
- for (int i = 0; i < totalDocs; i++) {
- futures.add(executor.submit(() -> {
- try {
- generateOne();
- } catch (Exception e) {
- e.printStackTrace();
- }
- }));
- }
-
- for (Future<?> future : futures) {
- future.get();
- }
-
- long endTime = System.nanoTime();
- executor.shutdown();
- executor.awaitTermination(1, TimeUnit.MINUTES);
-
- double durationSec = (endTime - startTime) / 1_000_000_000.0;
- double throughput = totalDocs / durationSec;
-
- System.out.println(String.format("%-10d | %-15d | %12.2f", threads, totalDocs, throughput));
- }
-
- private static void generateOne() throws Exception {
- CanonicalBenchmarkSupport.renderSimpleBenchmarkDocument(
- PDRectangle.A4,
- Margin.of(24),
- "ScalabilityRoot",
- "Scalability",
- "Scalability test message.");
- }
-
- private static int[] parseThreadCounts(String raw) {
- return Arrays.stream(raw.split(","))
- .map(String::trim)
- .filter(value -> !value.isEmpty())
- .mapToInt(Integer::parseInt)
- .filter(value -> value > 0)
- .toArray();
- }
-}
diff --git a/docs/operations/benchmarks.md b/docs/operations/benchmarks.md
index 315f4d523..775483384 100644
--- a/docs/operations/benchmarks.md
+++ b/docs/operations/benchmarks.md
@@ -36,15 +36,10 @@ The script prints numbered sections so you can map console output to the pipelin
1. `01-build-classpath`
Builds the test classpath once and writes `target/benchmark.classpath`.
2. `02-current-speed`
- Runs `CurrentSpeedBenchmark` in the selected profile.
+ Runs `CurrentSpeedBenchmark` in the selected profile. The full profile also
+ runs the thread-scaling throughput sweep (1 → 16 threads).
3. `03-comparative`
Runs the GraphCompose canonical vs iText 5 vs JasperReports comparison.
-4. `04-core-engine`
- Runs `GraphComposeBenchmark`.
-5. `05-full-cv`
- Runs `FullCvBenchmark`.
-6. `06-scalability`
- Runs the thread-scaling throughput benchmark.
7. `07-stress`
Runs the concurrent stability stress test.
8. `08-endurance`
diff --git a/scripts/run-benchmarks.ps1 b/scripts/run-benchmarks.ps1
index dbe162c08..e3d3947b6 100644
--- a/scripts/run-benchmarks.ps1
+++ b/scripts/run-benchmarks.ps1
@@ -5,8 +5,8 @@ Runs the local GraphCompose benchmark pipeline and stores timestamped logs and r
.DESCRIPTION
The wrapper performs a staged local run:
-01 build classpath, 02 current-speed, 03 comparative, 04 core engine, 05 full CV, 06 scalability,
-07 stress, optional 08 endurance, then 09/10 diff steps.
+01 build classpath, 02 current-speed, 03 comparative, 07 stress,
+optional 08 endurance, then 09/10 diff steps.
Current-speed diffs are profile-aware. The wrapper only compares reports
from the same current-speed profile (`smoke` or `full`) and skips the
@@ -368,9 +368,6 @@ try {
-InputPaths $comparativeRuns | Out-Null
}
- Invoke-JavaMain -Name "04-core-engine" -Classpath $javaClasspath -MainClass "com.demcha.compose.GraphComposeBenchmark"
- Invoke-JavaMain -Name "05-full-cv" -Classpath $javaClasspath -MainClass "com.demcha.compose.FullCvBenchmark"
- Invoke-JavaMain -Name "06-scalability" -Classpath $javaClasspath -MainClass "com.demcha.compose.ScalabilityBenchmark"
Invoke-JavaMain -Name "07-stress" -Classpath $javaClasspath -MainClass "com.demcha.compose.GraphComposeStressTest"
if ($IncludeEndurance) {
From 019f64b32cd23aa44a0694cd43604e11d2c88818 Mon Sep 17 00:00:00 2001
From: DemchaAV
Date: Sun, 14 Jun 2026 19:26:31 +0100
Subject: [PATCH 02/10] perf(benchmarks): persist compose/layout/render stages
+ a run summary.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The stage breakdown (per-template compose / layout / render medians) was
printed to the console and discarded. Promote it into the report:
runStageBreakdown returns a StageRow, CurrentSpeedReport carries a stages[]
array, and a stages CSV is written — so a diff can attribute a regression to
an engine stage, not just the blended total. Also write a per-run summary.md
(latency + stages + throughput tables) so a reviewer reads one file instead
of the JSON plus several CSVs.

Additive output only: diff/verdict/median read the report by field and ignore
the new array. Benchmark module compiles; 28 tests pass; verified on a smoke
run (stages[] present, summary.md readable, perf gate passes).
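
For reviewers, a minimal sketch of what the persisted stage data amounts to:
a median per stage in milliseconds plus one markdown table row per scenario.
The class name StageSummarySketch, the stagesTable helper, and the medianMs
body are assumptions made for illustration; only the StageRow field names
come from this patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: names echo the patch, bodies are assumptions.
final class StageSummarySketch {

    // Same field shape as the StageRow record added by this patch.
    record StageRow(String scenario, double composeMillis, double layoutMillis,
                    double renderMillis, double totalMillis) {}

    // Median of nanosecond samples, converted to milliseconds.
    // Assumed behaviour; the real medianMs body is not part of this patch.
    static double medianMs(long[] samplesNs) {
        if (samplesNs.length == 0) {
            return 0.0;
        }
        long[] sorted = samplesNs.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        double medianNs = (sorted.length % 2 == 1)
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
        return medianNs / 1_000_000.0;
    }

    // Renders the stage rows as a markdown table with right-aligned numbers.
    static String stagesTable(List<StageRow> rows) {
        StringBuilder md = new StringBuilder();
        md.append("| Scenario | Compose | Layout | Render | Total |\n");
        md.append("|---|---:|---:|---:|---:|\n");
        for (StageRow row : rows) {
            md.append(String.format(Locale.ROOT, "| %s | %.3f | %.3f | %.3f | %.3f |\n",
                    row.scenario(), row.composeMillis(), row.layoutMillis(),
                    row.renderMillis(), row.totalMillis()));
        }
        return md.toString();
    }

    public static void main(String[] args) {
        // Made-up samples, not measured data.
        long[] composeNs = {3_000_000L, 1_000_000L, 2_000_000L};
        StageRow row = new StageRow("cv-template", medianMs(composeNs), 2.5, 3.25, 6.75);
        System.out.print(stagesTable(List.of(row)));
    }
}
```

Formatting with Locale.ROOT keeps the decimal separator stable across
machines, which matters if the summary is ever diffed as text.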
---
.../demcha/compose/BenchmarkReportWriter.java | 8 +
.../demcha/compose/CurrentSpeedBenchmark.java | 144 +++++++++++++++---
2 files changed, 131 insertions(+), 21 deletions(-)
diff --git a/benchmarks/src/main/java/com/demcha/compose/BenchmarkReportWriter.java b/benchmarks/src/main/java/com/demcha/compose/BenchmarkReportWriter.java
index 73e061d3d..51d2b2e42 100644
--- a/benchmarks/src/main/java/com/demcha/compose/BenchmarkReportWriter.java
+++ b/benchmarks/src/main/java/com/demcha/compose/BenchmarkReportWriter.java
@@ -60,6 +60,14 @@ Path writeCsv(String tableName, List<String> headers, List<List<String>> rows) t
return archived;
}
+ Path writeMarkdown(String name, String content) throws IOException {
+ Path latest = directory.resolve("latest-" + name + ".md");
+ Path archived = directory.resolve(name + "-" + timestamp + ".md");
+ Files.writeString(latest, content, StandardCharsets.UTF_8);
+ Files.writeString(archived, content, StandardCharsets.UTF_8);
+ return archived;
+ }
+
Path directory() {
return directory;
}
diff --git a/benchmarks/src/main/java/com/demcha/compose/CurrentSpeedBenchmark.java b/benchmarks/src/main/java/com/demcha/compose/CurrentSpeedBenchmark.java
index bbda30b8f..e3d877943 100644
--- a/benchmarks/src/main/java/com/demcha/compose/CurrentSpeedBenchmark.java
+++ b/benchmarks/src/main/java/com/demcha/compose/CurrentSpeedBenchmark.java
@@ -143,20 +143,21 @@ private void run() throws Exception {
// Stage breakdown: for each template scenario we time compose / layout
// / render separately so consumers can attribute regressions to the
- // engine vs. PDFBox. Engine-simple and feature-rich scenarios also
- // use the canonical pipeline and benefit from the same probe.
+ // engine vs. PDFBox. Only the template scenarios are probed here; the
+ // latency table above still covers every scenario.
+ List<StageRow> stageRows = new ArrayList<>();
if (profile != BenchmarkProfile.SMOKE || config.measurementIterations() >= 20) {
System.out.println();
System.out.println("Stage breakdown (median ms per stage)");
System.out.printf("%-18s | %12s | %12s | %12s | %12s%n",
"Scenario", "Compose", "Layout", "Render", "Total");
System.out.println("-".repeat(78));
- runStageBreakdown("invoice-template", () -> openInvoiceSession(),
- s -> invoiceTemplate.compose(s, invoice), config.measurementIterations());
- runStageBreakdown("cv-template", () -> openCvSession(),
- s -> cvTemplate.compose(s, cv), config.measurementIterations());
- runStageBreakdown("proposal-template", () -> openProposalSession(),
- s -> proposalTemplate.compose(s, proposal), config.measurementIterations());
+ stageRows.add(runStageBreakdown("invoice-template", () -> openInvoiceSession(),
+ s -> invoiceTemplate.compose(s, invoice), config.measurementIterations()));
+ stageRows.add(runStageBreakdown("cv-template", () -> openCvSession(),
+ s -> cvTemplate.compose(s, cv), config.measurementIterations()));
+ stageRows.add(runStageBreakdown("proposal-template", () -> openProposalSession(),
+ s -> proposalTemplate.compose(s, proposal), config.measurementIterations()));
}
List<ThroughputRow> throughputRows = new ArrayList<>();
@@ -201,10 +202,13 @@ private void run() throws Exception {
config.docsPerThread(),
config.threadCounts(),
latencyRows,
+ stageRows,
throughputRows,
totalBenchmarkBytes);
System.out.println("Saved JSON benchmark report to " + summary.jsonPath());
- System.out.println("Saved CSV benchmark reports to " + summary.latencyCsvPath() + " and " + summary.throughputCsvPath());
+ System.out.println("Saved CSV benchmark reports to " + summary.latencyCsvPath() + ", "
+ + summary.stagesCsvPath() + ", and " + summary.throughputCsvPath());
+ System.out.println("Saved markdown summary to " + summary.summaryMarkdownPath());
if (enforceGate) {
PerformanceGateResult gateResult = evaluatePerformanceGate(profile, latencyRows);
@@ -363,10 +367,10 @@ private interface SessionComposer {
* median-ms-per-stage row so callers can attribute regressions to
* compose / layout / render independently.
*/
- private void runStageBreakdown(String scenario,
- SessionFactory factory,
- SessionComposer composer,
- int iterations) throws Exception {
+ private StageRow runStageBreakdown(String scenario,
+ SessionFactory factory,
+ SessionComposer composer,
+ int iterations) throws Exception {
int warmup = Math.max(2, Math.min(20, iterations / 5));
for (int i = 0; i < warmup; i++) {
try (DocumentSession session = factory.open()) {
@@ -398,12 +402,13 @@ private void runStageBreakdown(String scenario,
throw new AssertionError();
}
}
+ double composeMs = medianMs(composeNs);
+ double layoutMs = medianMs(layoutNs);
+ double renderMs = medianMs(renderNs);
+ double totalMs = medianMs(totalNs);
System.out.printf("%-18s | %12.3f | %12.3f | %12.3f | %12.3f%n",
- scenario,
- medianMs(composeNs),
- medianMs(layoutNs),
- medianMs(renderNs),
- medianMs(totalNs));
+ scenario, composeMs, layoutMs, renderMs, totalMs);
+ return new StageRow(scenario, round(composeMs), round(layoutMs), round(renderMs), round(totalMs));
}
private static double medianMs(long[] arr) {
@@ -677,16 +682,19 @@ private PathSummary writeReports(BenchmarkReportWriter.BenchmarkArtifacts artifa
int docsPerThread,
int[] threadCounts,
List<LatencyRow> latencyRows,
+ List<StageRow> stageRows,
List<ThroughputRow> throughputRows,
long totalBenchmarkBytes) throws Exception {
+ String timestamp = LocalDateTime.now().format(TIMESTAMP_FORMAT);
CurrentSpeedReport report = new CurrentSpeedReport(
- LocalDateTime.now().format(TIMESTAMP_FORMAT),
+ timestamp,
profileId,
warmupIterations,
measurementIterations,
docsPerThread,
Arrays.stream(threadCounts).boxed().toList(),
latencyRows,
+ stageRows,
throughputRows,
totalBenchmarkBytes);
@@ -717,8 +725,88 @@ private PathSummary writeReports(BenchmarkReportWriter.BenchmarkArtifacts artifa
format(row.docsPerSecond()),
format(row.avgMillisPerDoc())))
.toList());
+ var stagesCsvPath = artifacts.writeCsv(
+ "stages",
+ List.of("scenario", "compose_ms", "layout_ms", "render_ms", "total_ms"),
+ stageRows.stream()
+ .map(row -> List.of(
+ row.scenario(),
+ format(row.composeMillis()),
+ format(row.layoutMillis()),
+ format(row.renderMillis()),
+ format(row.totalMillis())))
+ .toList());
+ var summaryMarkdownPath = artifacts.writeMarkdown(
+ "summary",
+ buildSummaryMarkdown(timestamp, profileId, latencyRows, stageRows,
+ throughputRows, totalBenchmarkBytes));
+
+ return new PathSummary(jsonPath.toString(), latencyCsvPath.toString(),
+ stagesCsvPath.toString(), throughputCsvPath.toString(),
+ summaryMarkdownPath.toString());
+ }
+
+ /**
+ * Renders a single human-readable summary of the run — the latency table,
+ * the per-stage compose/layout/render split (the only place the suite
+ * attributes time to engine stages vs. PDFBox), and the throughput table
+ * when present — so a reviewer reads one file instead of stitching the JSON
+ * and several CSVs together.
+ */
+ private static String buildSummaryMarkdown(String timestamp,
+ String profileId,
+ List<LatencyRow> latencyRows,
+ List<StageRow> stageRows,
+ List<ThroughputRow> throughputRows,
+ long totalBenchmarkBytes) {
+ StringBuilder md = new StringBuilder();
+ md.append("# Current-speed benchmark — ").append(profileId).append(" profile\n\n");
+ md.append('`').append(timestamp).append("`\n\n");
+
+ md.append("## Latency (ms)\n\n");
+ md.append("| Scenario | Avg | p50 | p95 | Max | Docs/s | Avg KB | Peak MB |\n");
+ md.append("|---|---:|---:|---:|---:|---:|---:|---:|\n");
+ for (LatencyRow row : latencyRows) {
+ md.append("| ").append(row.scenario())
+ .append(" | ").append(format(row.avgMillis()))
+ .append(" | ").append(format(row.p50Millis()))
+ .append(" | ").append(format(row.p95Millis()))
+ .append(" | ").append(format(row.maxMillis()))
+ .append(" | ").append(format(row.docsPerSecond()))
+ .append(" | ").append(format(row.avgKilobytes()))
+ .append(" | ").append(format(row.peakHeapMb()))
+ .append(" |\n");
+ }
- return new PathSummary(jsonPath.toString(), latencyCsvPath.toString(), throughputCsvPath.toString());
+ if (!stageRows.isEmpty()) {
+ md.append("\n## Stages — template scenarios (median ms — compose / layout / render)\n\n");
+ md.append("| Scenario | Compose | Layout | Render | Total |\n");
+ md.append("|---|---:|---:|---:|---:|\n");
+ for (StageRow row : stageRows) {
+ md.append("| ").append(row.scenario())
+ .append(" | ").append(format(row.composeMillis()))
+ .append(" | ").append(format(row.layoutMillis()))
+ .append(" | ").append(format(row.renderMillis()))
+ .append(" | ").append(format(row.totalMillis()))
+ .append(" |\n");
+ }
+ }
+
+ if (!throughputRows.isEmpty()) {
+ md.append("\n## Throughput\n\n");
+ md.append("| Threads | Total docs | Docs/s | Avg doc ms |\n");
+ md.append("|---:|---:|---:|---:|\n");
+ for (ThroughputRow row : throughputRows) {
+ md.append("| ").append(row.threads())
+ .append(" | ").append(row.totalDocs())
+ .append(" | ").append(format(row.docsPerSecond()))
+ .append(" | ").append(format(row.avgMillisPerDoc()))
+ .append(" |\n");
+ }
+ }
+
+ md.append("\nByte guard: ").append(totalBenchmarkBytes).append('\n');
+ return md.toString();
}
private static double round(double value) {
@@ -772,6 +860,18 @@ private record ThroughputRow(String scenario,
double avgMillisPerDoc) {
}
+ /**
+ * Per-scenario compose / layout / render split (median ms). Persisted so a
+ * diff can attribute a regression to an engine stage rather than only the
+ * blended total — previously this was printed to the console and discarded.
+ */
+ private record StageRow(String scenario,
+ double composeMillis,
+ double layoutMillis,
+ double renderMillis,
+ double totalMillis) {
+ }
+
private record CurrentSpeedReport(String timestamp,
String profile,
int warmupIterations,
@@ -779,11 +879,13 @@ private record CurrentSpeedReport(String timestamp,
int docsPerThread,
List<Integer> threadCounts,
List<LatencyRow> latency,
+ List<StageRow> stages,
List<ThroughputRow> throughput,
long totalBytes) {
}
- private record PathSummary(String jsonPath, String latencyCsvPath, String throughputCsvPath) {
+ private record PathSummary(String jsonPath, String latencyCsvPath, String stagesCsvPath,
+ String throughputCsvPath, String summaryMarkdownPath) {
}
private record BenchmarkConfig(int warmupIterations,
From 2d2785208a73d5fd4a3337cf63d72b4a869be487 Mon Sep 17 00:00:00 2001
From: DemchaAV
Date: Sun, 14 Jun 2026 19:37:13 +0100
Subject: [PATCH 03/10] perf(benchmarks): diff consumes stages[] and reports
added/removed scenarios
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

BenchmarkDiffTool now (1) surfaces scenario set changes (addedScenarios /
removedScenarios) instead of silently intersecting, so a newly added (or
dropped) scenario can no longer vanish from a diff unnoticed; and (2) diffs
the stages[] array, emitting per-scenario compose/layout/render/total percent
deltas (console block + stages-diff CSV) so a regression can be attributed to
an engine stage.

Backward-compatible: a report without stages[] yields an empty stage diff
(MissingNode iterates empty); latency/throughput delta rows stay
intersection-only; the diff report is terminal (median/verdict read producer
reports, not diffs). Adds a DiffToolTest case; 29 bench tests pass.
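
A minimal sketch of the scenario-set logic described above, for reviewers who
want the idea without reading the full tool. ScenarioSetDiffSketch, SetChange,
and compare are hypothetical names, and the zero-baseline handling in
percentDelta is an assumption; the real percentDelta body is not part of this
patch.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of "report set changes, diff only the intersection".
final class ScenarioSetDiffSketch {

    record SetChange(List<String> added, List<String> removed, List<String> common) {}

    // Splits scenario names into added, removed, and shared sets (sorted).
    static SetChange compare(Set<String> baseline, Set<String> candidate) {
        TreeSet<String> added = new TreeSet<>(candidate);
        added.removeAll(baseline);      // only in the candidate report
        TreeSet<String> removed = new TreeSet<>(baseline);
        removed.removeAll(candidate);   // only in the baseline report
        TreeSet<String> common = new TreeSet<>(baseline);
        common.retainAll(candidate);    // delta rows are computed for these
        return new SetChange(List.copyOf(added), List.copyOf(removed), List.copyOf(common));
    }

    // Percent change from before to after.
    // Assumption: a zero baseline yields 0.0 instead of dividing by zero.
    static double percentDelta(double before, double after) {
        if (before == 0.0) {
            return 0.0;
        }
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Made-up scenario names and timings, not measured data.
        SetChange change = compare(
                Set.of("cv-template", "invoice-template"),
                Set.of("cv-template", "proposal-template"));
        System.out.println("Added in candidate: " + change.added());
        System.out.println("Removed from baseline: " + change.removed());
        System.out.println("Compose delta pct: " + percentDelta(10.0, 12.5));
    }
}
```

Keeping the delta rows intersection-only while listing the added and removed
names separately means a percent delta is never computed against a missing
baseline value.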
---
.../com/demcha/compose/BenchmarkDiffTool.java | 100 +++++++++++++++++-
.../demcha/compose/BenchmarkDiffToolTest.java | 61 +++++++++++
2 files changed, 160 insertions(+), 1 deletion(-)
diff --git a/benchmarks/src/main/java/com/demcha/compose/BenchmarkDiffTool.java b/benchmarks/src/main/java/com/demcha/compose/BenchmarkDiffTool.java
index 9b99d272f..0fb058bf8 100644
--- a/benchmarks/src/main/java/com/demcha/compose/BenchmarkDiffTool.java
+++ b/benchmarks/src/main/java/com/demcha/compose/BenchmarkDiffTool.java
@@ -93,6 +93,31 @@ private void diffCurrentSpeed(DiffInput input,
signedPercent(row.peakHeapMbDeltaPct()));
}
+ if (!report.addedScenarios().isEmpty() || !report.removedScenarios().isEmpty()) {
+ System.out.println();
+ System.out.println("Scenario set changes");
+ System.out.println(" Added in candidate: "
+ + (report.addedScenarios().isEmpty() ? "(none)" : String.join(", ", report.addedScenarios())));
+ System.out.println(" Removed from baseline: "
+ + (report.removedScenarios().isEmpty() ? "(none)" : String.join(", ", report.removedScenarios())));
+ }
+
+ if (!report.stages().isEmpty()) {
+ System.out.println();
+ System.out.println("Stage diff (pct delta per stage)");
+ System.out.printf("%-18s | %12s | %12s | %12s | %12s%n",
+ "Scenario", "Compose pct", "Layout pct", "Render pct", "Total pct");
+ System.out.println("-".repeat(78));
+ for (StageDiff row : report.stages()) {
+ System.out.printf("%-18s | %12s | %12s | %12s | %12s%n",
+ row.scenario(),
+ signedPercent(row.composeDeltaPct()),
+ signedPercent(row.layoutDeltaPct()),
+ signedPercent(row.renderDeltaPct()),
+ signedPercent(row.totalDeltaPct()));
+ }
+ }
+
System.out.println();
System.out.println("Throughput diff");
System.out.printf("%-18s | %8s | %12s | %14s%n",
@@ -143,10 +168,29 @@ private void diffCurrentSpeed(DiffInput input,
format(row.candidateAvgMillisPerDoc()),
format(row.avgMillisPerDocDeltaPct())))
.toList());
+ Path stagesCsv = artifacts.writeCsv(
+ "stages-diff",
+ List.of("scenario", "baseline_compose_ms", "candidate_compose_ms", "compose_delta_pct", "baseline_layout_ms", "candidate_layout_ms", "layout_delta_pct", "baseline_render_ms", "candidate_render_ms", "render_delta_pct", "baseline_total_ms", "candidate_total_ms", "total_delta_pct"),
+ report.stages().stream()
+ .map(row -> List.of(
+ row.scenario(),
+ format(row.baselineComposeMillis()),
+ format(row.candidateComposeMillis()),
+ format(row.composeDeltaPct()),
+ format(row.baselineLayoutMillis()),
+ format(row.candidateLayoutMillis()),
+ format(row.layoutDeltaPct()),
+ format(row.baselineRenderMillis()),
+ format(row.candidateRenderMillis()),
+ format(row.renderDeltaPct()),
+ format(row.baselineTotalMillis()),
+ format(row.candidateTotalMillis()),
+ format(row.totalDeltaPct())))
+ .toList());
System.out.println();
System.out.println("Saved JSON diff report to " + jsonPath);
- System.out.println("Saved CSV diff reports to " + latencyCsv + " and " + throughputCsv);
+ System.out.println("Saved CSV diff reports to " + latencyCsv + ", " + throughputCsv + ", and " + stagesCsv);
}
private void diffComparative(DiffInput input,
@@ -214,6 +258,29 @@ private CurrentSpeedDiffReport buildCurrentSpeedDiff(DiffInput input, JsonNode b
})
.toList();
+ Map<String, JsonNode> baselineStages = indexBy(baseline.path("stages"), "scenario");
+ Map<String, JsonNode> candidateStages = indexBy(candidate.path("stages"), "scenario");
+ List<StageDiff> stageDiffs = intersectKeys(baselineStages, candidateStages).stream()
+ .map(key -> {
+ JsonNode before = baselineStages.get(key);
+ JsonNode after = candidateStages.get(key);
+ return new StageDiff(
+ key,
+ before.path("composeMillis").asDouble(),
+ after.path("composeMillis").asDouble(),
+ percentDelta(before.path("composeMillis").asDouble(), after.path("composeMillis").asDouble()),
+ before.path("layoutMillis").asDouble(),
+ after.path("layoutMillis").asDouble(),
+ percentDelta(before.path("layoutMillis").asDouble(), after.path("layoutMillis").asDouble()),
+ before.path("renderMillis").asDouble(),
+ after.path("renderMillis").asDouble(),
+ percentDelta(before.path("renderMillis").asDouble(), after.path("renderMillis").asDouble()),
+ before.path("totalMillis").asDouble(),
+ after.path("totalMillis").asDouble(),
+ percentDelta(before.path("totalMillis").asDouble(), after.path("totalMillis").asDouble()));
+ })
+ .toList();
+
Map baselineThroughput = indexThroughput(baseline.path("throughput"));
Map candidateThroughput = indexThroughput(candidate.path("throughput"));
List throughputDiffs = intersectKeys(baselineThroughput, candidateThroughput).stream()
@@ -237,7 +304,10 @@ private CurrentSpeedDiffReport buildCurrentSpeedDiff(DiffInput input, JsonNode b
input.candidatePath().toString(),
baseline.path("timestamp").asText(),
candidate.path("timestamp").asText(),
+ addedKeys(baselineLatency, candidateLatency),
+ removedKeys(baselineLatency, candidateLatency),
latencyDiffs,
+ stageDiffs,
throughputDiffs
);
}
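@@ -294,6 +364,16 @@ private static List intersectKeys(Map left, Map right) {
+ /** Keys present in {@code candidate} but not {@code baseline} (added scenarios). */
+ private static List<String> addedKeys(Map<String, ?> baseline, Map<String, ?> candidate) {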
+ return candidate.keySet().stream().filter(key -> !baseline.containsKey(key)).sorted().toList();
+ }
+
+ /** Keys present in {@code baseline} but not {@code candidate} (dropped scenarios). */
+ private static List<String> removedKeys(Map<String, ?> baseline, Map<String, ?> candidate) {
+ return baseline.keySet().stream().filter(key -> !candidate.containsKey(key)).sorted().toList();
+ }
+
private static Iterable<JsonNode> iterable(JsonNode array) {
return () -> new Iterator<>() {
private final Iterator<JsonNode> delegate = array.iterator();
@@ -477,11 +557,29 @@ private record CurrentSpeedThroughputDiff(String scenario,
double avgMillisPerDocDeltaPct) {
}
+ private record StageDiff(String scenario,
+ double baselineComposeMillis,
+ double candidateComposeMillis,
+ double composeDeltaPct,
+ double baselineLayoutMillis,
+ double candidateLayoutMillis,
+ double layoutDeltaPct,
+ double baselineRenderMillis,
+ double candidateRenderMillis,
+ double renderDeltaPct,
+ double baselineTotalMillis,
+ double candidateTotalMillis,
+ double totalDeltaPct) {
+ }
+
private record CurrentSpeedDiffReport(String baselinePath,
String candidatePath,
String baselineTimestamp,
String candidateTimestamp,
+ List<String> addedScenarios,
+ List<String> removedScenarios,
List latency,
+ List<StageDiff> stages,
List throughput) {
}
diff --git a/benchmarks/src/test/java/com/demcha/compose/BenchmarkDiffToolTest.java b/benchmarks/src/test/java/com/demcha/compose/BenchmarkDiffToolTest.java
index 783ad2479..d3319131c 100644
--- a/benchmarks/src/test/java/com/demcha/compose/BenchmarkDiffToolTest.java
+++ b/benchmarks/src/test/java/com/demcha/compose/BenchmarkDiffToolTest.java
@@ -93,6 +93,35 @@ void currentSpeedDiffKeepsOnlyScenariosPresentInBothRuns() throws Exception {
assertThat(diff.path("throughput").get(0).path("scenario").asText()).isEqualTo("shared");
}
+ @Test
+ void currentSpeedDiffSurfacesAddedRemovedScenariosAndStageDeltas() throws Exception {
+ System.setProperty("graphcompose.benchmark.root", tempDir.toString());
+ Path baseline = write("baseline.json", currentSpeedWithStages("full",
+ latency("shared", 10.0, 10.0, 100.0, 1.0, 100.0) + ","
+ + latency("only-in-baseline", 10.0, 10.0, 100.0, 1.0, 100.0),
+ stage("shared", 1.0, 2.0, 4.0, 7.0),
+ throughput("shared", 1, 50.0, 20.0)));
+ Path candidate = write("candidate.json", currentSpeedWithStages("full",
+ latency("shared", 10.0, 10.0, 100.0, 1.0, 100.0) + ","
+ + latency("only-in-candidate", 5.0, 5.0, 200.0, 0.5, 90.0),
+ stage("shared", 1.0, 2.0, 8.0, 11.0),
+ throughput("shared", 1, 50.0, 20.0)));
+
+ BenchmarkDiffTool.main(new String[]{baseline.toString(), candidate.toString()});
+
+ JsonNode diff = readDiff("current-speed");
+ // Loud set-changes: one-sided scenarios are surfaced, not silently dropped.
+ assertThat(toStrings(diff.path("addedScenarios"))).containsExactly("only-in-candidate");
+ assertThat(toStrings(diff.path("removedScenarios"))).containsExactly("only-in-baseline");
+ // The shared scenario is still the only intersected latency delta row.
+ assertThat(diff.path("latency").size()).isEqualTo(1);
+ // Stage diff: render 4 -> 8 = +100%, compose unchanged.
+ JsonNode stageDiff = diff.path("stages").get(0);
+ assertThat(stageDiff.path("scenario").asText()).isEqualTo("shared");
+ assertThat(stageDiff.path("renderDeltaPct").asDouble()).isCloseTo(100.0, within(EPS));
+ assertThat(stageDiff.path("composeDeltaPct").asDouble()).isCloseTo(0.0, within(EPS));
+ }
+
@Test
void currentSpeedDiffTreatsZeroBaselineAsHundredPercentAndZeroToZeroAsZero() throws Exception {
System.setProperty("graphcompose.benchmark.root", tempDir.toString());
@@ -228,6 +257,38 @@ private static String latency(String scenario,
""".formatted(scenario, scenario, avgMillis, p95Millis, docsPerSecond, avgKilobytes, peakHeapMb);
}
+ private static String currentSpeedWithStages(String profile, String latencyItems,
+ String stageItems, String throughputItems) {
+ return """
+ {
+ "timestamp": "2026-04-14 21:00:00",
+ "profile": "%s",
+ "latency": [%s],
+ "stages": [%s],
+ "throughput": [%s]
+ }
+ """.formatted(profile, latencyItems, stageItems, throughputItems);
+ }
+
+ private static String stage(String scenario, double composeMs, double layoutMs,
+ double renderMs, double totalMs) {
+ return """
+ {
+ "scenario": "%s",
+ "composeMillis": %s,
+ "layoutMillis": %s,
+ "renderMillis": %s,
+ "totalMillis": %s
+ }
+ """.formatted(scenario, composeMs, layoutMs, renderMs, totalMs);
+ }
+
+ private static java.util.List<String> toStrings(JsonNode array) {
+ java.util.List<String> values = new java.util.ArrayList<>();
+ array.forEach(node -> values.add(node.asText()));
+ return values;
+ }
+
private static String throughput(String scenario, int threads, double docsPerSecond, double avgMillisPerDoc) {
return """
{
From faec9e3f23c02eb54e2fa5fa5d6ab9fc94d1ae9c Mon Sep 17 00:00:00 2001
From: DemchaAV
Date: Sun, 14 Jun 2026 19:55:24 +0100
Subject: [PATCH 04/10] perf(benchmarks): add SVG-import feature benches (parse
/ read / node)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
First feature-object benchmarks for the v1.8 vector surface (the rest of the
suite is text/table only):
- SvgJmhBenchmark (forked JMH): SvgPath.parse of a real Material heart d,
SvgIcon.parse of a multi-layer icon, SvgIcon.node on a pre-parsed icon.
- SvgParseAllocProbe (deterministic ThreadMXBean alloc, median of 11): KB/op
for the same three operations.
- SvgBenchmarkFixtures: the heart d (vendored — the benchmark module can't
reach the test/example copies) and a synthetic multi-layer icon (gradient
bg + transformed groups + stroked curves) within the reader's supported
subset, so it always parses.
Run on demand, not per-PR: java -jar benchmarks/target/benchmarks.jar Svg.
Verified: compiles; both benches run — path parse ~3.6 us/op, icon read
~308 us/op (DOM-parse dominated, 114 KB/op), node build ~0.4 us/op / 2 KB/op.
---
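A minimal sketch of the per-thread allocation measurement the message describes (ThreadMXBean allocation counter, median of 11 samples), assuming a HotSpot JVM where `com.sun.management.ThreadMXBean` is available. `AllocProbeSketch`, `medianAllocatedBytes`, `sink`, and the warm-up count of 2,000 are hypothetical names and values chosen for illustration; this is not the `SvgParseAllocProbe` source.

```java
import com.sun.management.ThreadMXBean;
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.function.Supplier;

/** Hypothetical sketch of a warm, per-operation allocation probe. */
final class AllocProbeSketch {

    /** Keeps every result reachable so the JIT cannot elide the allocation. */
    static volatile Object sink;

    /** Median bytes allocated on the calling thread per call of {@code op}. */
    static long medianAllocatedBytes(Supplier<?> op, int samples) {
        // Assumption: the platform bean implements the com.sun.management extension (HotSpot does).
        ThreadMXBean threads = (ThreadMXBean) ManagementFactory.getThreadMXBean();
        long tid = Thread.currentThread().getId();

        // Warm up so the sampled calls run JIT-compiled code (count is arbitrary).
        for (int i = 0; i < 2_000; i++) {
            sink = op.get();
        }

        long[] deltas = new long[samples];
        for (int i = 0; i < samples; i++) {
            long before = threads.getThreadAllocatedBytes(tid);
            sink = op.get();
            deltas[i] = threads.getThreadAllocatedBytes(tid) - before;
        }

        // The median discards the occasional sample inflated by unrelated allocation.
        Arrays.sort(deltas);
        return deltas[samples / 2];
    }

    public static void main(String[] args) {
        long bytes = medianAllocatedBytes(() -> new byte[1024], 11);
        System.out.printf("byte[1024]: %d B/op%n", bytes);
    }
}
```

Swapping the lambda for a call such as `SvgPath.parse(...)` on the shared fixture would give the KB/op figure after dividing by 1024; that call's exact signature is not shown in this patch, so it is left out of the sketch.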
.../demcha/compose/SvgBenchmarkFixtures.java | 55 +++++++++++
.../demcha/compose/SvgParseAllocProbe.java | 93 ++++++++++++++++++
.../demcha/compose/jmh/SvgJmhBenchmark.java | 97 +++++++++++++++++++
3 files changed, 245 insertions(+)
create mode 100644 benchmarks/src/main/java/com/demcha/compose/SvgBenchmarkFixtures.java
create mode 100644 benchmarks/src/main/java/com/demcha/compose/SvgParseAllocProbe.java
create mode 100644 benchmarks/src/main/java/com/demcha/compose/jmh/SvgJmhBenchmark.java
diff --git a/benchmarks/src/main/java/com/demcha/compose/SvgBenchmarkFixtures.java b/benchmarks/src/main/java/com/demcha/compose/SvgBenchmarkFixtures.java
new file mode 100644
index 000000000..120741433
--- /dev/null
+++ b/benchmarks/src/main/java/com/demcha/compose/SvgBenchmarkFixtures.java
@@ -0,0 +1,55 @@
+package com.demcha.compose;
+
+/**
+ * Shared SVG fixtures for the v1.8 vector-import benchmarks (path parse, whole
+ * icon read, icon → node build).
+ *
+ * <p>Self-contained on purpose: the benchmarks module cannot reach the
+ * main-module test constants or the examples module, so the heart path is
+ * vendored here (it also lives in {@code SvgPathTest} / {@code VectorPathExample}
+ * in their own modules). The icon is a synthetic but realistic multi-layer
+ * document — a gradient-filled background, a {@code translate}+{@code scale}
+ * group of filled paths and a stroked circle, and a {@code rotate} group with a
+ * polygon and a quadratic-curve stroke — so it exercises XML parse, {@code <g>}
+ * transform accumulation, gradient resolution and per-layer path lowering the
+ * way a real exporter file would, while staying entirely within the reader's
+ * supported subset (so it never throws).
+ *
+ * @author Artem Demchyshyn
+ */
+public final class SvgBenchmarkFixtures {
+
+ /** Material "favorite" heart — the same {@code d} used in the SVG tests/examples. */
+ public static final String MATERIAL_HEART_D =
+ "M12 21.35l-1.45-1.32C5.4 15.36 2 12.28 2 8.5 2 5.42 4.42 3 7.5 3"
+ + "c1.74 0 3.41.81 4.5 2.09C13.09 3.81 14.76 3 16.5 3 19.58 3 22 5.42 22 8.5"
+ + "c0 3.78-3.4 6.86-8.55 11.54L12 21.35z";
+
+ /** Heart viewBox edge (square 24×24), passed to {@code SvgPath.parse}. */
+ public static final double HEART_VIEWBOX = 24.0;
+
+ /** A realistic multi-layer icon: gradient bg + transformed groups + stroked curves. */
+ public static final String MULTI_LAYER_ICON_SVG = """
+
+ """;
+
+ private SvgBenchmarkFixtures() {
+ }
+}
diff --git a/benchmarks/src/main/java/com/demcha/compose/SvgParseAllocProbe.java b/benchmarks/src/main/java/com/demcha/compose/SvgParseAllocProbe.java
new file mode 100644
index 000000000..b8df62a2b
--- /dev/null
+++ b/benchmarks/src/main/java/com/demcha/compose/SvgParseAllocProbe.java
@@ -0,0 +1,93 @@
+package com.demcha.compose;
+
+import com.demcha.compose.document.svg.SvgIcon;
+import com.demcha.compose.document.svg.SvgPath;
+
+import java.lang.management.ManagementFactory;
+import java.util.Arrays;
+import java.util.function.Supplier;
+
+/**
+ * Deterministic allocation probe for the v1.8 SVG-import path: warm
+ * (JIT-steady) bytes allocated per {@link SvgPath#parse}, per
+ * {@link SvgIcon#parse}, and per {@link SvgIcon#node} — the three operations
+ * with no analogue in the rest of the suite (which is text / table only).
+ *
+ * <p>Allocation counts are noise-free (unlike wall-clock or {@code peakHeapMb}),
+ * so this is the signal the "optimize the engine, not benchmarks" rule wants:
+ * a develop-vs-branch A/B shows a parse/read/node allocation change directly.
+ * No {@code src/main} changes.
+ * <p>The current-speed per-stage breakdown ({@code stages[]}) is not
+ * carried into the median aggregate — only latency and throughput are medianed.
+ * A median-vs-median diff therefore shows no compose/layout/render stage deltas;
+ * diff a single-run pair when you need stage attribution.
*/
public final class BenchmarkMedianTool {
diff --git a/benchmarks/src/main/java/com/demcha/compose/jmh/SvgJmhBenchmark.java b/benchmarks/src/main/java/com/demcha/compose/jmh/SvgJmhBenchmark.java
index f7a63b30c..58ed3f99f 100644
--- a/benchmarks/src/main/java/com/demcha/compose/jmh/SvgJmhBenchmark.java
+++ b/benchmarks/src/main/java/com/demcha/compose/jmh/SvgJmhBenchmark.java
@@ -24,7 +24,8 @@
* {@code DocumentSession}, no PDF render):
*
*
{@code parseSvgPath} — {@link SvgPath#parse} of a real Material icon
- * {@code d} string (arc→cubic conversion, normalization).
{@code readSvgIcon} — {@link SvgIcon#parse} of a multi-layer icon (XML
* parse, {@code <g>} transform accumulation, gradient resolution, one
* {@link SvgPath} per layer).
diff --git a/docs/operations/benchmarks.md b/docs/operations/benchmarks.md
index 775483384..3611d877e 100644
--- a/docs/operations/benchmarks.md
+++ b/docs/operations/benchmarks.md
@@ -40,6 +40,10 @@ The script prints numbered sections so you can map console output to the pipelin
runs the thread-scaling throughput sweep (1 → 16 threads).
3. `03-comparative`
Runs the GraphCompose canonical vs iText 5 vs JasperReports comparison.
+
+ _Steps 04–06 (`core-engine`, `full-cv`, `scalability`) were retired. The
+ surviving steps keep their original `NN-` console prefixes, so the labels
+ jump from `03-` to `07-`._
7. `07-stress`
Runs the concurrent stability stress test.
8. `08-endurance`
diff --git a/docs/operations/performance.md b/docs/operations/performance.md
index ecf02c5b7..7fc02d480 100644
--- a/docs/operations/performance.md
+++ b/docs/operations/performance.md
@@ -1,7 +1,13 @@
# Performance — v1.4 numbers
-All numbers below come from `scripts/run-benchmarks.ps1` — the full local
-benchmark workflow that builds the test classpath once and runs
+> **Historical snapshot (v1.4).** The numbers and suite list below are frozen
+> as captured for v1.4 and are kept for reference. The pipeline has since
+> changed: the `core-engine`, `full-cv`, and `scalability` suites were retired,
+> and current numbers come from the `current-speed` / `comparative` / `stress`
+> pipeline plus the JMH suite. See [docs/operations/benchmarks.md](./benchmarks.md).
+
+All numbers below were captured from `scripts/run-benchmarks.ps1` — the full
+local benchmark workflow that built the test classpath once and ran
`current-speed`, `comparative`, `core-engine`, `full-cv`, `scalability`,
and `stress` suites in sequence. They were captured on a developer
laptop; CI machines are typically 1.5–2× slower. The benchmark
@@ -93,5 +99,9 @@ snapshots.
## Engine-only timings
+_The `GraphComposeBenchmark` and `FullCvBenchmark` mains below were retired
+after v1.4. Equivalent timings now come from the `CurrentSpeedBenchmark`
+`engine-simple` scenario and the JMH `TemplateCvJmhBenchmark`._
+
- `GraphComposeBenchmark` (engine-only, no PDF render): avg **1.04 ms**, p50 **0.97 ms**, p95 **1.64 ms**.
- `FullCvBenchmark` (full CV template, including render): avg **4.14 ms**, p50 **3.80 ms**, p95 **6.37 ms**.
diff --git a/scripts/ab-bench.ps1 b/scripts/ab-bench.ps1
index 5a3e4eb42..a237ec203 100644
--- a/scripts/ab-bench.ps1
+++ b/scripts/ab-bench.ps1
@@ -110,21 +110,10 @@ function Parse-Comparative($jsonPath) {
}
function Parse-Logs($logsDir) {
$o = @{}
- $scal = Join-Path $logsDir "06-scalability.log"
- if (Test-Path $scal) {
- foreach ($line in (Get-Content $scal)) {
- if ($line -match '^\s*(\d+)\s*\|\s*\d+\s*\|\s*([\d.]+)\s*$') {
- $o["scalability | $($matches[1])t | docs/s"] = [double]$matches[2]
- }
- }
- }
- foreach ($pair in @(@("04-core-engine.log", "core-engine"), @("05-full-cv.log", "full-cv"))) {
- $p = Join-Path $logsDir $pair[0]
- if (Test-Path $p) {
- $txt = Get-Content $p -Raw
- if ($txt -match 'Median[^\r\n]*?:\s*([\d.]+)\s*ms') { $o["$($pair[1]) | median ms"] = [double]$matches[1] }
- }
- }
+ # Steps 04-06 (core-engine, full-cv, scalability) were retired, so their logs
+ # are no longer produced. Current-speed throughput — including the
+ # thread-scaling series — is read from the JSON report by Parse-CurrentSpeed;
+ # only the surviving stress log is parsed here.
$stress = Join-Path $logsDir "07-stress.log"
if (Test-Path $stress) {
$txt = Get-Content $stress -Raw
diff --git a/scripts/run-benchmarks.ps1 b/scripts/run-benchmarks.ps1
index e3d3947b6..a0dd2c777 100644
--- a/scripts/run-benchmarks.ps1
+++ b/scripts/run-benchmarks.ps1
@@ -6,7 +6,9 @@ Runs the local GraphCompose benchmark pipeline and stores timestamped logs and r
.DESCRIPTION
The wrapper performs a staged local run:
01 build classpath, 02 current-speed, 03 comparative, 07 stress,
-optional 08 endurance, then 09/10 diff steps.
+optional 08 endurance, then 09/10 diff and 11 verdict steps. Steps 04-06
+(core-engine, full-cv, scalability) were retired; the surviving steps keep
+their original numeric prefixes, so the numbering jumps from 03 to 07.
Current-speed diffs are profile-aware. The wrapper only compares reports
from the same current-speed profile (`smoke` or `full`) and skips the
From b93c44ec62ce1a386889302cec4383f3b3f31405 Mon Sep 17 00:00:00 2001
From: DemchaAV
Date: Sun, 14 Jun 2026 23:15:51 +0100
Subject: [PATCH 10/10] docs(changelog): note the v1.8 feature-object benches,
stage output, and gate coverage
---
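A minimal sketch of the two computations behind the stage-diff entry in this changelog patch: the added/removed scenario sets and the per-stage percent delta. `StageDiffSketch` is a hypothetical class name. The set helpers mirror the `addedKeys` / `removedKeys` bodies from the diff-tool patch. `percentDelta` is an assumption: the tool's real implementation is not shown in this series, and the zero-baseline rule here is inferred only from the test name `currentSpeedDiffTreatsZeroBaselineAsHundredPercentAndZeroToZeroAsZero`.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the scenario-set and stage-delta logic. */
final class StageDiffSketch {

    /** Scenario keys present in the candidate run only, sorted. */
    static List<String> addedKeys(Map<String, ?> baseline, Map<String, ?> candidate) {
        return candidate.keySet().stream()
                .filter(key -> !baseline.containsKey(key))
                .sorted()
                .toList();
    }

    /** Scenario keys present in the baseline run only, sorted. */
    static List<String> removedKeys(Map<String, ?> baseline, Map<String, ?> candidate) {
        return baseline.keySet().stream()
                .filter(key -> !candidate.containsKey(key))
                .sorted()
                .toList();
    }

    /** Assumed rule: 0 -> 0 is 0%, 0 -> non-zero is 100%, otherwise the relative change. */
    static double percentDelta(double before, double after) {
        if (before == 0.0) {
            return after == 0.0 ? 0.0 : 100.0;
        }
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        Map<String, Double> baseline = Map.of("shared", 4.0, "only-in-baseline", 1.0);
        Map<String, Double> candidate = Map.of("shared", 8.0, "only-in-candidate", 1.0);

        System.out.println("added:   " + addedKeys(baseline, candidate));
        System.out.println("removed: " + removedKeys(baseline, candidate));
        System.out.println("render delta pct: "
                + percentDelta(baseline.get("shared"), candidate.get("shared")));
    }
}
```

The inputs follow the fixture in `currentSpeedDiffSurfacesAddedRemovedScenariosAndStageDeltas`, where the shared scenario's render time goes from 4 ms to 8 ms.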
CHANGELOG.md | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/CHANGELOG.md b/CHANGELOG.md
index e9f7124c2..6cb0e7074 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -344,6 +344,28 @@ Entries land here as they merge.
`ScalabilityBenchmark` (its thread-scaling sweep folded into
`CurrentSpeedBenchmark`'s full-profile throughput run, now `1,2,4,8,16`).
Dropped the matching `run-benchmarks.ps1` steps and doc entries.
+- **Feature-object benchmarks for the v1.8 vector surface (not shipped).**
+ The suite previously exercised only text/table primitives. Added JMH render
+ benches and deterministic probes over the new vector features:
+ `SvgJmhBenchmark` (path parse / whole-file icon read / icon→node) plus a
+ `SvgParseAllocProbe`; `ChartJmhBenchmark` (bar + line + pie render) plus a
+ `ChartAllocProbe` (layout-compile allocation); `VectorRenderOperatorProbe`
+ (the same paths drawn flat vs. gradient vs. translucent, counted as PDF
+ content-stream operators); `IconRampJmhBenchmark` (icon-placement scaling,
+ `@Param` 8/32/128); and `MixedShowcaseJmhBenchmark` (one document combining
+ prose, inline sparklines, bar + pie charts, SVG icons and a gradient path).
+ Shared `SvgBenchmarkFixtures` / `ChartBenchmarkFixtures` hold the inputs so
+ each bench and its probe measure identical data.
+- **Current-speed report carries a stage breakdown and a run summary (not
+ shipped).** `CurrentSpeedBenchmark` persists a per-scenario compose / layout /
+ render split (`stages[]`, median ms) to the JSON and a `stages` CSV, and
+ writes a readable `summary.md`. `BenchmarkDiffTool` consumes `stages[]`,
+ prints a per-stage delta table, and reports the scenarios added/removed
+ between two runs.
+- **Every current-speed scenario is now covered by the smoke perf gate (not
+ shipped).** The `long-token` scenario previously had no SMOKE threshold and
+ silently escaped the gate; it now has one, and `CurrentSpeedScenarioGateTest`
+ fails the build if any scenario lacks a threshold.
- **Removed the `java.awt.*` / `java.util.*` co-wildcard in four files.**
`InvoiceTemplateComposer`, `ProposalTemplateComposer`,
`WeeklyScheduleTemplateComposer`, and the engine `PdfRenderingSystemECS`