Skip to content

fix(05,12): raise o3-deep-research TPM capacity from 10 to 200 #13

Merged: corticalstack merged 2 commits into fix/deep-research-apim-key-lookup from fix/o3-deep-research-tpm-capacity on May 27, 2026

corticalstack (Owner) commented May 27, 2026

Summary

Raise the o3-deep-research model deployment SKU capacity from 10 (10K TPM) to 200 (200K TPM) in both Bicep files that define it:

  • 05-foundry-project-pattern-setup/05-02-deploy-foundry-core-gateway/main.bicep (the hub deploy)
  • 12-foundry-iq-deep-research/main.bicep (the optional standalone deploy)

The original 10K TPM cap throttled multi-step deep-research runs with 429 errors before they could complete. 200K TPM gives realistic headroom for the agentic deep-research loop while staying well under the Norway East o3-DeepResearch subscription quota (limit 3000, confirmed via az cognitiveservices usage list -l norwayeast).
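The quota check can be scripted. This is a minimal sketch, not part of the PR: it assumes the usage entry is named OpenAI.GlobalStandard.o3-DeepResearch (the name used in the test plan) and that each entry returned by az cognitiveservices usage list exposes name.value, currentValue and limit. The function name quota_headroom is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report the unused o3-DeepResearch quota in a region.
# Assumption: usage entries expose name.value, currentValue and limit.

QUOTA_NAME="OpenAI.GlobalStandard.o3-DeepResearch"

# Prints limit minus current usage for the quota entry, as an integer.
# Prints nothing if the region has no matching entry.
quota_headroom() {
  local region="$1"
  az cognitiveservices usage list -l "$region" \
    --query "[?name.value=='${QUOTA_NAME}'].[currentValue, limit]" \
    -o tsv |
    awk '{ printf "%d\n", $2 - $1 }'
}
```

Running quota_headroom norwayeast before and after a deploy shows how much of the limit is still unallocated, in the same units the CLI reports.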

GlobalStandard billing is pay-per-token, so raising the capacity ceiling does not change baseline cost; it only raises the rate-limit cap.

Existing deployments

Bumping the Bicep value does not update already-deployed resources. To update a live deployment in place without redeploying the whole template:

az cognitiveservices account deployment update \
  -g rg-foundry-core-{suffix} \
  -n aif-research-{suffix} \
  --deployment-name o3-deep-research \
  --sku-capacity 200
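
To confirm the in-place update took effect, the deployment can be read back. This is a hedged sketch rather than part of the PR: capacity_is is a hypothetical helper, and it assumes az cognitiveservices account deployment show reports the value under sku.capacity. The resource group and account names keep the {suffix} placeholder used above.

```shell
#!/usr/bin/env bash
# Sketch: check the live o3-deep-research deployment's SKU capacity.
# Assumption: the deployment JSON carries the value at sku.capacity.

# Usage: capacity_is <resource-group> <account-name> <expected-capacity>
# Prints the observed capacity; exits 0 only if it matches the expected value.
capacity_is() {
  local rg="$1" account="$2" expected="$3" actual
  actual="$(az cognitiveservices account deployment show \
    -g "$rg" -n "$account" \
    --deployment-name o3-deep-research \
    --query sku.capacity -o tsv)"
  echo "sku.capacity=${actual}"
  [ "$actual" = "$expected" ]
}
```

For example, capacity_is rg-foundry-core-{suffix} aif-research-{suffix} 200 should succeed once the update has been applied.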

Patch release 0.8.5.

Stacking note

This PR is stacked on fix/deep-research-apim-key-lookup (PR #12). The base is set to that branch so the diff shows only the Bicep and version-bump changes. Merge #12 first, then this PR.

Test plan

  • A fresh az deployment group create of either Bicep file produces an o3-deep-research deployment with sku.capacity = 200
  • 12-02-deep-research-loop.ipynb completes a multi-step research run without 429s
  • az cognitiveservices usage list -l norwayeast still shows the OpenAI.GlobalStandard.o3-DeepResearch total well under its limit

The original SKU capacity of 10 (= 10K TPM) throttled multi-step
deep-research runs with 429 errors before completion. Raised to
200 (= 200K TPM) in both Bicep files that define the deployment:

- 05-foundry-project-pattern-setup/05-02-deploy-foundry-core-gateway/main.bicep
- 12-foundry-iq-deep-research/main.bicep

The new value stays well under the Norway East o3-DeepResearch
subscription quota (limit 3000). Existing live deployments must be
updated separately, either by a fresh bicep apply or via:

  az cognitiveservices account deployment update \
    -g rg-foundry-core-{suffix} -n aif-research-{suffix} \
    --deployment-name o3-deep-research --sku-capacity 200

corticalstack merged commit ae26671 into fix/deep-research-apim-key-lookup on May 27, 2026
1 check passed