fix(05,12): raise o3-deep-research TPM capacity from 10 to 200 by corticalstack · Pull Request #13 · corticalstack/awesome-foundry-nextgen

corticalstack · 2026-05-27T12:28:45Z

Summary

Raise the o3-deep-research model deployment SKU capacity from 10 (10K TPM) to 200 (200K TPM) in both Bicep files that define it:

05-foundry-project-pattern-setup/05-02-deploy-foundry-core-gateway/main.bicep (the hub deploy)
12-foundry-iq-deep-research/main.bicep (the optional standalone deploy)

The original 10K TPM cap throttles multi-step deep-research runs with HTTP 429 (Too Many Requests) errors before they complete. 200K TPM gives realistic headroom for the agentic deep-research loop while staying well under the Norway East o3-DeepResearch subscription quota (limit 3000, confirmed via az cognitiveservices usage list -l norwayeast).

GlobalStandard billing is pay-per-token, so raising the capacity ceiling does not change baseline cost; it only raises the rate-limit cap.
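The capacity arithmetic above can be sketched in a few lines of Python. This is an illustrative check only: it assumes, as the description states, that one SKU capacity unit corresponds to 1K TPM, and it assumes the quota limit of 3000 is expressed in the same capacity units. The function names are hypothetical and not part of the repository.

```python
# Assumption taken from the PR text: capacity 10 = 10K TPM, so 1 unit = 1,000 TPM.
TOKENS_PER_CAPACITY_UNIT = 1_000


def capacity_to_tpm(capacity: int) -> int:
    """Convert a deployment SKU capacity value to tokens per minute."""
    return capacity * TOKENS_PER_CAPACITY_UNIT


def quota_share(capacity: int, quota_limit: int) -> float:
    """Fraction of the subscription quota that one deployment would consume."""
    return capacity / quota_limit


old_capacity, new_capacity, quota_limit = 10, 200, 3000

print(capacity_to_tpm(old_capacity))  # 10000
print(capacity_to_tpm(new_capacity))  # 200000
print(f"{quota_share(new_capacity, quota_limit):.1%}")  # 6.7%
```

Under these assumptions the new deployment uses roughly one fifteenth of the regional quota, which matches the "well under" claim.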

Existing deployments

Bumping the Bicep value does not update already-deployed resources. To update a live deployment in place without redeploying the whole template:

az cognitiveservices account deployment update \
  -g rg-foundry-core-{suffix} \
  -n aif-research-{suffix} \
  --deployment-name o3-deep-research \
  --sku-capacity 200
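To confirm that an in-place update took effect, one option is to read the deployment back as JSON (for example with az cognitiveservices account deployment show and -o json) and inspect the capacity. The sketch below only parses such output; it assumes the capacity appears under sku.capacity, the same field the test plan checks. The helper name and the sample payload are hypothetical.

```python
import json


def deployed_capacity(show_output: str) -> int:
    """Extract sku.capacity from the JSON text of a deployment 'show' call.

    Assumes the payload has the shape {"sku": {"capacity": <number>, ...}, ...}.
    """
    deployment = json.loads(show_output)
    return int(deployment["sku"]["capacity"])


# Hypothetical, trimmed-down payload used only to illustrate the check.
sample = json.dumps(
    {"name": "o3-deep-research", "sku": {"name": "GlobalStandard", "capacity": 200}}
)

print(deployed_capacity(sample) == 200)  # True
```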

Patch release 0.8.5.

Stacking note

This PR is stacked on fix/deep-research-apim-key-lookup (PR #12). Base set to that branch so the diff shows only the Bicep + version-bump changes. Merge #12 first, then this PR.

Test plan

Fresh az deployment group create of either Bicep file produces an o3-deep-research deployment with sku.capacity = 200
12-02-deep-research-loop.ipynb completes a multi-step research run without 429s
az cognitiveservices usage list -l norwayeast still shows the OpenAI.GlobalStandard.o3-DeepResearch total well under its limit
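The quota check in the last item can be scripted against the JSON output of az cognitiveservices usage list. The following is a minimal sketch, assuming each entry exposes name.value, currentValue and limit; verify the field names against real output before relying on it. The function name and the sample data are hypothetical.

```python
import json

# Quota name as it appears in the test plan.
QUOTA_NAME = "OpenAI.GlobalStandard.o3-DeepResearch"


def quota_headroom(usage_json: str, quota_name: str = QUOTA_NAME) -> float:
    """Return limit minus currentValue for the named quota entry.

    Assumes usage_json is a JSON array of objects shaped like
    {"name": {"value": ...}, "currentValue": ..., "limit": ...}.
    """
    for entry in json.loads(usage_json):
        if entry.get("name", {}).get("value") == quota_name:
            return float(entry["limit"]) - float(entry["currentValue"])
    raise KeyError(f"quota {quota_name!r} not found in usage list")


# Hypothetical sample: one deployment at capacity 200 against a limit of 3000.
sample = json.dumps(
    [{"name": {"value": QUOTA_NAME}, "currentValue": 200.0, "limit": 3000.0}]
)

print(quota_headroom(sample))  # 2800.0
```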

The original SKU capacity of 10 (= 10K TPM) throttled multi-step deep-research runs with 429 errors before completion. Raised to 200 (= 200K TPM) in both Bicep files that define the deployment:

- 05-foundry-project-pattern-setup/05-02-deploy-foundry-core-gateway/main.bicep
- 12-foundry-iq-deep-research/main.bicep

The new value stays well under the Norway East o3-DeepResearch subscription quota (limit 3000). Existing live deployments must be updated separately, either by a fresh Bicep apply or via:

az cognitiveservices account deployment update \
  -g rg-foundry-core-{suffix} -n aif-research-{suffix} \
  --deployment-name o3-deep-research --sku-capacity 200

corticalstack added 2 commits May 27, 2026 14:28

chore: bump version to 0.8.5 and add release notes

df62cb3

corticalstack merged commit ae26671 into fix/deep-research-apim-key-lookup May 27, 2026
1 check passed

corticalstack merged 2 commits into fix/deep-research-apim-key-lookup from fix/o3-deep-research-tpm-capacity
