Preload nested RBAC associations to reduce N+1 queries#1322
Conversation
| next if attr_base.blank? | ||
| next if virtual_attribute_accessor(type, attr_name) | ||
| next if attr_base_uses_rbac?(attr_base) | ||
| # Removed RBAC check to allow preloading of RBAC-participating associations. |
There was a problem hiding this comment.
I realize this PR is still WIP, but I'm not a fan of these types of comments because they don't make sense to future readers and read more like a commit message. If we do need a comment here it should probably read something like
# NOTE: We intentionally do not do an RBAC check here as it prevents preloading of RBAC-participating
# associations leading to N+1 queries. Core RBAC may still re-filter these associations when accessed
This way it shows to future readers that we don't want them to include RBAC here, or if they choose to they have to take the N+1 into consideration.
There was a problem hiding this comment.
Oh yeah, this is absolutely going away.. it's explaining the why for when I come back to it to review it.
|
If there isn't already one, please also add a spec somewhere that shows that RBAC is actually being applied on these associations in the end. |
| else | ||
| next if attr_base.blank? | ||
| next if virtual_attribute_accessor(type, attr_name) | ||
| next if attr_base_uses_rbac?(attr_base) |
There was a problem hiding this comment.
(obviously, this comment will just go way though and I'll update the new comment that will land from the other PR to mention the specific exceptions that will remain)
27533a5 to
25a12fd
Compare
|
ok, this should be ready for re-review since #1324 is now merged and introduced some new comments explaining the |
25a12fd to
2ad49d3
Compare
Removed RBAC check in 'determine_include_for_find' to allow eager loading of nested RBAC associations (hardware.host.name, ext_management_system.name, storage.name, ems_cluster.name). Previously these were skipped, causing a query for each VM. Now they're eager-loaded and RBAC re-filters in bulk. For example, in a specific database setup with 108 vms returned, and the following API request: http://localhost:3000/api/vms?expand=resources&attributes=name,hardware.host.name,ext_management_system.name,storage.name The query count looks like this: Before: Completed 200 OK in 4530ms (Views: 0.3ms | ActiveRecord: 1401.9ms (2207 queries, 1152 cached) | GC: 479.2ms) After: Completed 200 OK in 3168ms (Views: 0.3ms | ActiveRecord: 541.5ms (787 queries, 763 cached) | GC: 461.6ms) Result: Dramatically reduced queries from 2207 to 787 (64% reduction). Fixes ManageIQ#1321
2ad49d3 to
55868a2
Compare
|
Checked commit jrafanie@55868a2 with ruby 3.3.10, rubocop 1.86.0, haml-lint 0.73.0, and yamllint 1.37.1 |
|
is there a spec that shows RBAC post-filtering the VMs? If not, I think we need that and show that it's not incurring more database queries. |
| # After fix: 9 queries (3 base + 6 RBAC re-filtering for 6 VMs) | ||
| # Before fix: was 21 queries (N+1 for each host access without preloading) |
There was a problem hiding this comment.
This wording is weird to me - I don't like comments that talk about a fix - that can really just live in the git commit body and shown via the spec itself.
| # After fix: 9 queries (3 base + 6 RBAC re-filtering for 6 VMs) | |
| # Before fix: was 21 queries (N+1 for each host access without preloading) | |
| # Query count expectations: 9 queries (3 base + 6 RBAC re-filtering for 6 VMs) |
| # Before fix: was 21 queries (N+1 for each host access without preloading) | ||
| # Note: RBAC associations are preloaded (visible in LEFT OUTER JOIN) but | ||
| # ManageIQ core RBAC re-filters them when accessed, causing additional queries. | ||
| # This is still better than the original N+1 (21 queries). |
There was a problem hiding this comment.
And this line should go away too.
| # This is still better than the original N+1 (21 queries). |
| # - RBAC associations (host, ems, cluster, storage) are preloaded via LEFT OUTER JOINs | ||
| # - ManageIQ core RBAC re-filters associations when accessed, causing additional queries | ||
| # - Query count: ~15 queries (base queries + RBAC re-filtering) | ||
| # - Without preloading: Would be ~20+ queries per VM (N+1 pattern) |
There was a problem hiding this comment.
Again, we don't need this "before this fix" type comments - I don't find them helpful to future readers (at least not in the code - fine in the git commit message)
There was a problem hiding this comment.
Thanks, yes. I cleaned up some of the comments but should make this more concise too.
Removed RBAC check in
determine_include_for_findto allow eager loadingof nested RBAC associations (hardware.host.name, ext_management_system.name,
storage.name, ems_cluster.name). Previously these were skipped, causing a
query for each VM. Now they're eager-loaded and RBAC re-filters in bulk.
For example, in a specific database setup with 108 vms returned, and the
following API request:
http://localhost:3000/api/vms?expand=resources&attributes=name,hardware.host.name,ext_management_system.name,storage.name
The query count looks like this:
Before:
Completed 200 OK in 4530ms (Views: 0.3ms | ActiveRecord: 1401.9ms (2207 queries, 1152 cached) | GC: 479.2ms)
After:
Completed 200 OK in 3168ms (Views: 0.3ms | ActiveRecord: 541.5ms (787 queries, 763 cached) | GC: 461.6ms)
Reduction of queries from 2207 to 787 (64%)
Fixes #1321