Monitor and plot RSS memory and CPU usage during qlever index #277

Open
tanmay-9 wants to merge 33 commits into qlever-dev:main from tanmay-9:compute-index-mem-usage
tanmay-9 (Collaborator) commented Apr 6, 2026

So far, the qlever index command gave no insight into how much memory an index build actually needs or which index phase is responsible for the peak.

With this change, every index build records RSS memory and CPU usage over time, writes <name>.resource-usage-log.tsv, and renders <name>.resource-usage-plot.png once the index build finishes. The plot shades each index build phase (parsing, vocabulary merge, conversion, each permutation, and the text index) as a separate band and annotates the memory peak, so resource usage can be attributed to a specific phase. For comparison across runs and settings, the plot is captioned with the git hash of the index binary, the STXXL_MEMORY setting, and the batch size. This works whether the index is built natively or in a container (docker / podman).
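As a rough illustration of how a memory peak can be attributed to a phase, here is a minimal sketch. It is not the code in this PR: the function name `phase_of_peak` and the data layout (samples as `(elapsed_s, rss_bytes)` pairs, phases as `(start_s, name)` pairs sorted by start time) are assumptions made for the example.

```python
from bisect import bisect_right


def phase_of_peak(samples, phases):
    """Return (phase_name, peak_time_s, peak_rss_bytes), or None.

    samples: list of (elapsed_s, rss_bytes) pairs (hypothetical layout).
    phases:  list of (start_s, name) pairs, sorted by start_s (hypothetical).
    """
    if not samples or not phases:
        return None
    # The peak is the sample with the largest RSS (first one wins on ties).
    peak_time, peak_rss = max(samples, key=lambda sample: sample[1])
    # Find the last phase that started at or before the peak.
    starts = [start for start, _ in phases]
    idx = bisect_right(starts, peak_time) - 1
    if idx < 0:
        return None  # The peak occurred before the first phase started.
    return phases[idx][1], peak_time, peak_rss


samples = [(0, 100), (10, 500), (20, 900), (30, 400)]
phases = [(0, "parsing"), (15, "vocabulary merge"), (25, "permutations")]
result = phase_of_peak(samples, phases)
```

Each phase is treated as lasting until the next one starts, which matches the idea of contiguous shaded bands in the plot.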

The sampling rate can be set with --resource-usage-interval and the plot density on long builds with --resource-usage-plot-max-points (the sampling itself is unaffected, only how many points are drawn). There is also a --resource-usage-plot-only option that renders the plot from an existing <name>.resource-usage-log.tsv without re-running the index build, which is useful for tweaking --resource-usage-plot-max-points.
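To make the distinction between sampling and plot density concrete, here is one possible way to thin out samples for drawing. This is a sketch under assumptions, not the downsampling used in this PR: the function name `downsample` is hypothetical, and it uses plain evenly spaced selection in pure Python.

```python
def downsample(points, max_points):
    """Return at most max_points evenly spaced items, keeping first and last."""
    if max_points < 2:
        raise ValueError("max_points must be at least 2")
    n = len(points)
    if n <= max_points:
        return list(points)
    # Spread max_points indices evenly over [0, n - 1].
    step = (n - 1) / (max_points - 1)
    return [points[round(i * step)] for i in range(max_points)]


all_samples = list(range(1000))
drawn = downsample(all_samples, 500)
```

Evenly spaced selection can drop the exact peak sample, so a peak annotation would have to be computed from the full log rather than from the drawn points.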

numpy and matplotlib are used only to render the resource-usage plot, not to log resource usage during an index build. Because that is a narrow use case and matplotlib is heavy, pulling in several transitive dependencies, they are an optional plot extra (pip install "qlever[plot]") rather than core dependencies. This keeps the base install small. Without these libraries the index build still succeeds and writes the resource-usage log; only the plot is skipped, with a hint to install qlever[plot].
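The "skip the plot with a hint" behavior can be sketched as follows. The helper names `missing_modules` and `plot_hint` are made up for this example and are not taken from the PR. Only `importlib.util.find_spec` from the standard library is used, so the check itself never imports the heavy packages.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that are not installed."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def plot_hint(names=("numpy", "matplotlib")):
    """Return a hint if plotting is unavailable, or None if it can proceed."""
    missing = missing_modules(names)
    if not missing:
        return None
    return (
        f"Skipping plot, missing: {', '.join(missing)}. "
        'Install with: pip install "qlever[plot]"'
    )
```

A caller would write the log unconditionally and print the hint instead of rendering when `plot_hint()` returns a string.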

tanmay-9 changed the title from "Compute the physical memory usage used by the qlever index command" to "Compute the physical memory used by the qlever index command" Apr 7, 2026
tanmay-9 changed the title from "Compute the physical memory used by the qlever index command" to "Track memory usage during qlever index" Apr 8, 2026
tanmay-9 and others added 19 commits April 9, 2026 10:06
…u cores used and add downsampling (max_points=500) for plot
tanmay-9 changed the title from "Track memory usage during qlever index" to "Monitor and plot RSS memory and CPU usage during qlever index" May 28, 2026

hannahbast (Collaborator) left a comment

1-1 with Tanmay

Comment thread src/qlever/resource_monitor.py Outdated
return f"{bytes_val / GB:.2f} GB"


def parse_memory_to_bytes(memory_string: str) -> int:
I think any function that is, in principle, general-purpose and needs little or no context should be in utils.py.

And now that we are talking about it, it probably makes sense to have a util directory with different .py files for the groups of utils. That should be a separate PR, which comes before or after this one.
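For reference, a context-free helper of this kind might look like the sketch below. Only the signature comes from the diff above; the body is not the PR's implementation. The accepted suffixes and the choice of decimal units (1 GB = 10^9 bytes) are assumptions made for the example.

```python
import re

# ASSUMPTION: decimal units; the PR may define these differently.
_UNIT_FACTORS = {"": 1, "K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}
_MEMORY_RE = re.compile(r"\s*(\d+(?:\.\d+)?)\s*([KMGT]?)B?\s*", re.IGNORECASE)


def parse_memory_to_bytes(memory_string: str) -> int:
    """Parse strings such as '512', '10G' or '1.5 GB' into a byte count."""
    match = _MEMORY_RE.fullmatch(memory_string)
    if match is None:
        raise ValueError(f"Cannot parse memory size: {memory_string!r}")
    number, unit = match.groups()
    return int(float(number) * _UNIT_FACTORS[unit.upper()])
```

Because it depends only on its argument and the standard library, a function like this can move to a utils module without touching its callers' logic.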

Comment thread src/qlever/usage_plot.py
)


def compute_phase_boundaries(
There should be one function that analyzes a given log file (potentially partial) and the output of which can then be used both here and for the index-stats command. It's fine if the output of the function contains more information than is used by the respective caller (as long as it's not outrageously more or outrageously costly to compute, which I don't think will be the case here).
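One shape such a shared function could take is sketched here. Everything in it is an assumption for illustration: the name `analyze_usage_log`, the `UsageSummary` fields, and the TSV column names `elapsed_s`, `rss_bytes` and `cpu_percent` are hypothetical and not taken from the PR. The point is that incomplete rows, such as the last line of a log that is still being written, are skipped rather than treated as errors.

```python
from __future__ import annotations

import csv
import io
from dataclasses import dataclass


@dataclass
class UsageSummary:
    num_samples: int
    duration_s: float
    peak_rss_bytes: int
    peak_time_s: float
    mean_cpu_percent: float


def analyze_usage_log(tsv_text: str) -> UsageSummary | None:
    """Summarize a (possibly partial) usage log; column names are hypothetical."""
    samples = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        try:
            samples.append(
                (float(row["elapsed_s"]), int(row["rss_bytes"]), float(row["cpu_percent"]))
            )
        except (KeyError, TypeError, ValueError):
            continue  # Skip rows with missing or malformed fields.
    if not samples:
        return None
    peak_time, peak_rss, _ = max(samples, key=lambda sample: sample[1])
    return UsageSummary(
        num_samples=len(samples),
        duration_s=samples[-1][0] - samples[0][0],
        peak_rss_bytes=peak_rss,
        peak_time_s=peak_time,
        mean_cpu_percent=sum(s[2] for s in samples) / len(samples),
    )


partial_log = "elapsed_s\trss_bytes\tcpu_percent\n0.0\t100\t50.0\n1.0\t300\t150.0\n2.0\t20"
summary = analyze_usage_log(partial_log)
```

A plotting caller and a statistics caller could both consume the same summary object and ignore the fields they do not need.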

Comment thread src/qlever/usage_plot.py Outdated
return match.group(1) if match else None


def parse_qleverfile(qleverfile_path: Path) -> dict[str, str]:
Don't we have functionality for that already? What if the respective variable in the Qleverfile is overridden by a command-line argument?
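The concern can be illustrated with a small precedence sketch. It is not the existing qlever functionality: the function `effective_setting` is hypothetical, and reading the Qleverfile with `configparser` as an INI-style file with an `[index]` section is an assumption made for the example.

```python
import configparser


def effective_setting(qleverfile_text, section, key, cli_value=None, default=None):
    """Return the CLI value if given, else the Qleverfile value, else default."""
    if cli_value is not None:
        return cli_value  # A command-line argument takes precedence.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # Keep keys case-sensitive (e.g. STXXL_MEMORY).
    parser.read_string(qleverfile_text)
    return parser.get(section, key, fallback=default)


qleverfile = "[index]\nSTXXL_MEMORY = 10G\n"
from_file = effective_setting(qleverfile, "index", "STXXL_MEMORY")
overridden = effective_setting(qleverfile, "index", "STXXL_MEMORY", cli_value="20G")
```

A parser that reads only the file would report `10G` in the plot caption even when the build actually ran with `20G`.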

Comment thread src/qlever/usage_plot.py Outdated
return phases


def parse_git_hash(log_path: Path) -> str | None:
Looks like a util function

Comment thread src/qlever/usage_plot.py Outdated
return " | ".join(parts) if parts else None


def draw_usage_plot(
What's the difference between "draw" and "render"? Maybe it's just the naming ...
