Skip to content
This repository was archived by the owner on Jan 14, 2026. It is now read-only.

ADBDEV-8953: Fix shared memory allocation and usage in diskquota#55

Merged
RekGRpth merged 37 commits intogpdbfrom
ADBDEV-8953
Dec 16, 2025
Merged

ADBDEV-8953: Fix shared memory allocation and usage in diskquota#55
RekGRpth merged 37 commits intogpdbfrom
ADBDEV-8953

Conversation

@RekGRpth
Copy link
Member

@RekGRpth RekGRpth commented Dec 5, 2025

Fix shared memory allocation and usage in diskquota

When initializing the postmaster, the init_disk_quota_shmem function requested
shared memory from the PostgreSQL core (RequestAddinShmemSpace) with the size
calculated in the DiskQuotaShmemSize function. This function

  1. added diskquota_launcher_shmem_size to the shared memory size only on the
    coordinator, while the init_launcher_shmem function also over-initialized
    diskquota_launcher_shmem_size on segments.

  2. for the active_tables_map hashmap size, incorrectly used the size of the
    DiskQuotaActiveTableEntry structure, instead of using the size of the
    DiskQuotaActiveTableFileEntry structure.

  3. for table_size_map incorrectly used (MAX_NUM_TABLE_SIZE_ENTRIES /
    diskquota_max_monitored_databases + 100) * diskquota_max_monitored_databases
    for the number of hashmap elements. However, in each worker, the
    init_disk_quota_model function then allocates this hashmap for
    MAX_NUM_TABLE_SIZE_ENTRIES elements.

This all needs to be fixed, and some checks should be added to ensure that the
shared memory size does not exceed the requested size. To avoid confusion about
which size is being requested for what purpose, additional defines have been
added.

Note: This patch changes the diskquota.max_table_segments GUC value from
cluster to per-database, which increases shared memory consumption. This only
fixes the current behavior before the patch and will be fixed in the next
patch.

Tests are not provided because it is quite difficult to exhaust shared memory
without exceeding the per-database limit.

Ticket: ADBDEV-8953

@RekGRpth RekGRpth marked this pull request as ready for review December 5, 2025 06:29
@dkovalev1
Copy link

Tests are not provided because it is quite difficult to exhaust shared memory without exceeding the per-database limit.

diskquota_shmem_size calculation looks a bit tricky, and memory allocation depends on GUC values diskquota.max_table_segments, diskquota.max_active_tables, diskquota.max_workers. IMHO it's a bit unwise to have such a piece of unchecked code.

@RekGRpth
Copy link
Member Author

Tests are not provided because it is quite difficult to exhaust shared memory without exceeding the per-database limit.

diskquota_shmem_size calculation looks a bit tricky, and memory allocation depends on GUC values diskquota.max_table_segments, diskquota.max_active_tables, diskquota.max_workers. IMHO it's a bit unwise to have such a piece of unchecked code.

But it helps to track whether the requested allocation has been forgotten.

@dkovalev1
Copy link

I do agree with this check, and I agree that it checks if the program designed and built correctly, not for a user actions. And for that its place in Assert.

But again, this assert won't cover most of cases in real usage.

@RekGRpth
Copy link
Member Author

I do agree with this check, and I agree that it checks if the program designed and built correctly, not for a user actions. And for that its place in Assert.

But again, this assert won't cover most of cases in real usage.

Provide examples of real cases that are not covered by the new assertions.

@dkovalev1
Copy link

Provide examples of real cases that are not covered by the new assertions.

1000-node user cluster

@RekGRpth
Copy link
Member Author

instead of UB

There was no UB! The user received warning and error:

2025-11-13 12:23:03.206126,,,p260906,th-1211049536,,,,0,con31,,seg-1,,dx15,,sx1,"WARNING","53200","out of shared memory",,,,,,,0,,"shmem.c",229,
2025-11-13 12:23:03.206170,,,p260906,th-1211049536,,,,0,con31,,seg-1,,dx15,,sx1,"ERROR","53200","out of shared memory",,,,,,,0,,"dynahash.c",964,

@RekGRpth
Copy link
Member Author

Provide examples of real cases that are not covered by the new assertions.

1000-node user cluster

And what exactly do you think could go wrong with such a cluster?

src/quotamodel.c Outdated
#ifdef USE_ASSERT_CHECKING
Assert(DiskquotaLauncherShmem);

if (!DiskquotaLauncherShmem->isDynamicWorker) Assert(pg_atomic_read_u64(diskquota_shmem_size) >= 0);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsigned value is always >= 0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reworked

@RekGRpth RekGRpth merged commit 6ebff01 into gpdb Dec 16, 2025
2 checks passed
@RekGRpth RekGRpth deleted the ADBDEV-8953 branch December 16, 2025 06:24
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants