- Added logging for VM names during optimization analysis to improve traceability.
- Updated CPU-only analysis logic to use average CPU as the primary metric when memory metrics are unavailable, refining underutilization detection.
- Enhanced reporting for underutilized VMs, providing clearer recommendations based on average and peak CPU metrics.
- Improved documentation within the script to clarify the analysis approach and thresholds used for identifying underutilization.
…nalysis
- Added comprehensive analysis for server errors, throttling, user errors, and storage capacity in the service bus metrics script, providing detailed metrics and recommendations for each issue.
- Improved queue and topic health scripts to include analysis of disabled queues/topics and message backlog, enhancing visibility into operational issues.
- Introduced context, investigation steps, and recommendations for each identified issue, aiding in troubleshooting and resolution.
- Updated runbook to reflect new timeout settings for improved reliability during execution of cost health analysis tasks.
Bug: Undefined variables used in subscription backlog analysis
The MESSAGE BACKLOG ANALYSIS section references $status and $max_delivery_count variables that are only defined inside the dead-letter check block (lines 272-273). When a subscription has high active message count but NOT high dead-letter count, these variables will be unset. Since the script uses set -u, this causes a script failure. The service_bus_queue_health.sh file correctly fetches these variables within the active message count block, but this pattern was not followed in the topic health script.
codebundles/azure-servicebus-health/service_bus_topic_health.sh#L332-L334
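A minimal sketch of the fix pattern described above, assuming the lookup can be repeated inside each block. `fetch_subscription_field`, `analyze_subscription`, and the threshold of 100 are hypothetical stand-ins for illustration, not the names used in `service_bus_topic_health.sh`:

```shell
#!/usr/bin/env bash
set -u  # unset variables are errors, as in the scripts under review

# Hypothetical stub standing in for the real subscription lookup.
fetch_subscription_field() {
  case "$1" in
    status) echo "Active" ;;
    maxDeliveryCount) echo "10" ;;
    *) echo "unknown" ;;
  esac
}

analyze_subscription() {
  local active_count="$1" dead_letter_count="$2"
  local threshold=100  # assumed threshold, for illustration only
  local status max_delivery_count

  if (( dead_letter_count > threshold )); then
    status="$(fetch_subscription_field status)"
    max_delivery_count="$(fetch_subscription_field maxDeliveryCount)"
    echo "dead-letter: status=$status max_delivery_count=$max_delivery_count"
  fi

  if (( active_count > threshold )); then
    # Fetch again here so this block never depends on the
    # dead-letter block having run first.
    status="$(fetch_subscription_field status)"
    max_delivery_count="$(fetch_subscription_field maxDeliveryCount)"
    echo "backlog: status=$status max_delivery_count=$max_delivery_count"
  fi
}
```

Calling `analyze_subscription 500 0` exercises the path from the bug report: a high active count with no dead-letter problem. Because the backlog block assigns both variables itself, `set -u` has nothing to trip on.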
- Updated the message imbalance calculation to use `bc` for float-safe arithmetic, enhancing accuracy in metrics analysis.
- Improved comments for clarity on the calculation process, ensuring better understanding of the script's functionality.
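For reference, a rough sketch of a float-safe comparison with `bc`. The imbalance formula here (incoming exceeding outgoing by more than a percentage) and the function name `imbalance_exceeds` are assumptions for illustration; the script's actual metric may differ. The `awk` fallback is also an addition, included only so the sketch runs where `bc` is not installed:

```shell
#!/usr/bin/env bash
set -u

# Succeeds (exit 0) when incoming exceeds outgoing by more than
# threshold_pct percent. Bash arithmetic is integer-only, so the
# comparison is delegated to bc, which handles decimals.
imbalance_exceeds() {
  local incoming="$1" outgoing="$2" threshold_pct="$3"
  local expr="($incoming - $outgoing) * 100 > $threshold_pct * $incoming"
  local result

  if command -v bc >/dev/null 2>&1; then
    # bc -l prints 1 for a true comparison and 0 for a false one.
    result="$(echo "$expr" | bc -l)"
  else
    # Fallback (not from the PR): awk evaluates the same comparison.
    result="$(awk "BEGIN { print (($expr) ? 1 : 0) }")"
  fi

  [[ "$result" == "1" ]]
}
```

Multiplying both sides by the denominator avoids a division, so a zero `incoming` value cannot cause a divide-by-zero error.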
local savings_note="N/A"

# Calculate potential savings if we have capacity data
if [[ "$access_tier" == "Hot" ]] && (( $(echo "$capacity_gb > 0" | bc -l) )); then
Bug: Hot tier accounts miscounted when capacity data unavailable
The hot_tier_accounts counter is only incremented when both the access tier is "Hot" AND capacity_gb > 0. However, if there are Hot tier storage accounts whose capacity metrics are unavailable (returning 0), they won't be counted. Later, at lines 601-608, when hot_tier_accounts is 0, the message incorrectly states "No Hot tier accounts found" even though Hot tier accounts may exist - they just lack capacity data. This causes misleading output and incorrect severity assignment.
- Updated the service bus metrics script to ensure that calculations for total errors, throttled requests, incoming messages, outgoing messages, and user errors default to zero when no data is available, improving reliability and preventing potential errors during execution.
- Enhanced the analysis of storage metrics to include similar default handling, ensuring consistent behavior across the script and better handling of edge cases.
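The default-to-zero behavior can be illustrated in plain bash. This is a sketch under assumptions: `sum_or_zero` and `error_rate_pct` are hypothetical helpers, and the metric values are assumed to arrive as a whitespace-separated string that may be empty or contain `null` entries:

```shell
#!/usr/bin/env bash
set -u

# Sums whitespace-separated integers; prints 0 when the input is empty
# or contains no numeric entries (for example only "null").
sum_or_zero() {
  local values="${1:-}" total=0 v
  for v in $values; do
    # Skip non-numeric entries so arithmetic never sees an unset name.
    [[ "$v" =~ ^-?[0-9]+$ ]] || continue
    total=$((total + v))
  done
  echo "$total"
}

# Integer error rate in percent; prints 0 when there were no requests,
# avoiding a division by zero.
error_rate_pct() {
  local errors="${1:-0}" total="${2:-0}"
  if (( total == 0 )); then
    echo 0
  else
    echo $(( errors * 100 / total ))
  fi
}
```

For example, `sum_or_zero ""` prints `0` and `error_rate_pct 5 0` prints `0`, so downstream comparisons always receive a number.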
Note
Enhances Service Bus diagnostics, introduces data-driven storage savings (tiering/redundancy) and improved VM underutilization logic, and raises runbook timeouts for cost analyses.
- `service_bus_metrics.sh`: Adds detailed calculations and rich context for `ServerErrors`, `ThrottledRequests`, `UserErrors`, and `Size` (totals/max/averages, error rates, message imbalance) with actionable recommendations and updated issue titles.
- `service_bus_queue_health.sh`: Adds disabled-queue detection with counts/timestamps; expands backlog and size analyses with additional metrics (scheduled/transfer counts, config context) and structured guidance.
- `service_bus_topic_health.sh`: Adds disabled-topic analysis; augments topic size checks (subscriptions, partitioning, status) and subscription checks (dead-letter, backlogs, disabled state) with detailed remediation steps.
- `analyze_storage_optimization.sh`
- `analyze_vm_optimization.sh`
- `runbook.robot`

Written by Cursor Bugbot for commit 63566f7. This will update automatically on new commits.