Fix z.ai 5-hour token quota discarded when API returns multiple TOKENS_LIMIT entries #632

Closed
takumi3488 wants to merge 4 commits into steipete:main from takumi3488:feat/zai-three-quota-display
Conversation

@takumi3488
Contributor

@takumi3488 takumi3488 commented Apr 1, 2026

Summary

The z.ai API returns up to three quota entries — a 5-hour TOKENS_LIMIT, a weekly TOKENS_LIMIT, and a TIME_LIMIT (MCP). The parse loop used a single tokenLimit variable, so the second entry silently overwrote the first and the 5-hour quota was lost.

  • Add ZaiLimitUnit.weeks (rawValue 6) so the weekly entry gets a correct windowMinutes
  • Collect all TOKENS_LIMIT entries and sort by window size: shorter window → sessionTokenLimit (tertiary slot), longer → tokenLimit (primary slot)
  • Map sessionTokenLimit to UsageSnapshot.tertiary and enable supportsOpus/opusLabel: "5-hour" in the descriptor so MenuCardView, MenuDescriptor, and CLI all render the new row
  • Wire zaiSessionDetail text into the tertiary card metric
  • Backwards-compatible: when the API returns only one TOKENS_LIMIT + TIME_LIMIT, sessionTokenLimit is nil and the tertiary row is suppressed
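
For reviewers, the assignment rule described in the bullets above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this branch: `LimitUnit`, `LimitEntry`, `ParsedLimits`, and `assignLimits` are hypothetical stand-ins, and only unit 3 (hours) and unit 6 (weeks) are modeled.

```swift
// Hypothetical, simplified stand-ins for the types named in this PR.
enum LimitUnit: Int {
    case unknown = 0
    case hours = 3   // unit 3: the 5-hour window uses number 5
    case weeks = 6   // unit 6: the case this PR adds

    /// Window length in minutes, or nil when the unit is not recognized.
    func windowMinutes(number: Int) -> Int? {
        switch self {
        case .hours: return number * 60
        case .weeks: return number * 7 * 24 * 60
        case .unknown: return nil
        }
    }
}

struct LimitEntry {
    let type: String
    let unit: LimitUnit
    let number: Int
    let percentage: Int

    var windowMinutes: Int? { unit.windowMinutes(number: number) }
}

struct ParsedLimits {
    var tokenLimit: LimitEntry?         // primary slot (longer window)
    var sessionTokenLimit: LimitEntry?  // tertiary slot (shorter window)
    var timeLimit: LimitEntry?          // MCP / TIME_LIMIT
}

func assignLimits(_ entries: [LimitEntry]) -> ParsedLimits {
    var result = ParsedLimits()
    var tokenEntries: [LimitEntry] = []

    for entry in entries {
        switch entry.type {
        case "TOKENS_LIMIT": tokenEntries.append(entry)  // collect, do not overwrite
        case "TIME_LIMIT": result.timeLimit = entry
        default: break
        }
    }

    // Shortest window first; entries with an unknown window sort last.
    let sorted = tokenEntries.sorted {
        ($0.windowMinutes ?? Int.max) < ($1.windowMinutes ?? Int.max)
    }

    if sorted.count >= 2 {
        result.sessionTokenLimit = sorted.first
        result.tokenLimit = sorted.last
    } else {
        // Single TOKENS_LIMIT: keep the existing primary slot, no tertiary row.
        result.tokenLimit = sorted.first
    }
    return result
}
```

With one TOKENS_LIMIT entry the result has `sessionTokenLimit == nil`, which is the backward-compatible path described in the last bullet.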

Validation

  • swift build passes
  • Added ZaiThreeLimitTests: 3-entry parse, unit:6 enum, and 2-entry backward-compat fallback
  • Added ZaiMenuCardTests, a new test in CodexPresentationCharacterizationTests, and a new test in CLISnapshotTests covering the tertiary row at all three rendering layers

@takumi3488 takumi3488 changed the title from "feat(zai): display 5-hour token quota as tertiary row" to "Fix z.ai 5-hour token quota discarded when API returns multiple TOKENS_LIMIT entries" Apr 1, 2026
@takumi3488 takumi3488 marked this pull request as draft April 1, 2026 05:50
@takumi3488 takumi3488 marked this pull request as ready for review April 1, 2026 06:00
@ratulsarna
Collaborator

I validated this against real Z.ai data from my local CodexBar setup, using the same configured token/account the app is currently using.

What I found:

  • My live Z.ai payload does not reproduce the 3-limit shape described in this PR.
  • It consistently returns only:
    • TIME_LIMIT (unit=5, number=1)
    • TOKENS_LIMIT (unit=3, number=5)
  • I verified this from both the app logs and a direct fetch using CodexBar’s current config.

I also compared main vs this branch against that real account, and both behave the same:

  • primary = 5-hour token window
  • secondary = MCP/time window
  • no tertiary row

So from the real data I have, I can’t confirm an actual bug that this PR fixes.

The approach seems reasonable if Z.ai really does return two TOKENS_LIMIT entries for some accounts, but I wasn’t able to reproduce that. If you have a redacted real response showing the extra weekly TOKENS_LIMIT entry, that would help validate both the problem and the fix.

@ratulsarna ratulsarna added the question Further information is requested label Apr 8, 2026
@takumi3488
Contributor Author

@ratulsarna

Thanks for validating! Apologies for not mentioning this upfront — I was testing exclusively on a Coding Plan account and didn't realize the 3-limit response is plan-specific. Should have included a sample payload in the PR description from the start.

Here's the real response from a Coding Plan account (level: "max"):

{
    "code": 200,
    "msg": "Operation successful",
    "data": {
        "limits": [
            {
                "type": "TOKENS_LIMIT",
                "unit": 3,
                "number": 5,
                "percentage": 6,
                "nextResetTime": 1775640704859
            },
            {
                "type": "TOKENS_LIMIT",
                "unit": 6,
                "number": 1,
                "percentage": 12,
                "nextResetTime": 1775843407998
            },
            {
                "type": "TIME_LIMIT",
                "unit": 5,
                "number": 1,
                "usage": 4000,
                "currentValue": 0,
                "remaining": 4000,
                "percentage": 0,
                "nextResetTime": 1777830607981,
                "usageDetails": [
                    {
                        "modelCode": "search-prime",
                        "usage": 0
                    },
                    {
                        "modelCode": "web-reader",
                        "usage": 0
                    },
                    {
                        "modelCode": "zread",
                        "usage": 0
                    }
                ]
            }
        ],
        "level": "max"
    },
    "success": true
}
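
To check this payload shape locally, a small `Decodable` mirror is enough. The sketch below is an assumption-laden illustration, not CodexBar's model: `QuotaResponse` and `decodeQuota` are hypothetical names, only fields visible in the sample are modeled, and `nextResetTime` is treated as epoch milliseconds because the sample values land in April and May 2026 under that reading.

```swift
import Foundation

// Hypothetical Decodable mirror of the sample response; not the app's real types.
struct QuotaResponse: Decodable {
    struct Limit: Decodable {
        let type: String
        let unit: Int
        let number: Int
        let percentage: Int
        let nextResetTime: Int64  // assumed to be epoch milliseconds
        let remaining: Int?       // only present on TIME_LIMIT in the sample

        var resetDate: Date {
            Date(timeIntervalSince1970: Double(nextResetTime) / 1000)
        }
    }

    struct Payload: Decodable {
        let limits: [Limit]
        let level: String
    }

    let code: Int
    let success: Bool
    let data: Payload
}

// Trimmed copy of the response above (usageDetails omitted).
let sampleJSON = """
{
  "code": 200,
  "msg": "Operation successful",
  "data": {
    "limits": [
      {"type": "TOKENS_LIMIT", "unit": 3, "number": 5, "percentage": 6, "nextResetTime": 1775640704859},
      {"type": "TOKENS_LIMIT", "unit": 6, "number": 1, "percentage": 12, "nextResetTime": 1775843407998},
      {"type": "TIME_LIMIT", "unit": 5, "number": 1, "usage": 4000, "currentValue": 0, "remaining": 4000, "percentage": 0, "nextResetTime": 1777830607981}
    ],
    "level": "max"
  },
  "success": true
}
"""

func decodeQuota(_ json: String) throws -> QuotaResponse {
    try JSONDecoder().decode(QuotaResponse.self, from: Data(json.utf8))
}
```

Counting the entries whose `type` is `"TOKENS_LIMIT"` in the decoded `limits` array is a quick way to tell which response shape an account returns.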

Key observations:

  • Two TOKENS_LIMIT entries: 5-hour window (unit: 3) and 1-week window (unit: 6)
  • unit: 6 (weeks) doesn't exist in main's ZaiLimitUnit, so it falls through to .unknown
  • On main, the loop overwrites the first TOKENS_LIMIT with the second, losing the 5-hour quota
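
The last two observations can be reproduced in isolation. The following is a hypothetical reduction of the behavior described for main, not the actual source: `OldUnit`, `RawLimit`, and `lastTokenLimit` are made-up names, and the enum lists only the cases needed for the demonstration.

```swift
// Hypothetical reduction of the pre-fix behavior; not the actual CodexBar source.
enum OldUnit: Int {
    case unknown = 0
    case hours = 3
    // No case with rawValue 6, so a weekly entry cannot be mapped.
}

struct RawLimit {
    let type: String
    let unit: Int
    let number: Int
}

/// Single-slot loop: every TOKENS_LIMIT replaces the one seen before it.
func lastTokenLimit(_ limits: [RawLimit]) -> RawLimit? {
    var tokenLimit: RawLimit?
    for limit in limits where limit.type == "TOKENS_LIMIT" {
        tokenLimit = limit
    }
    return tokenLimit
}

// Same order as the response above: 5-hour, weekly, then TIME_LIMIT.
let payloadLimits = [
    RawLimit(type: "TOKENS_LIMIT", unit: 3, number: 5),
    RawLimit(type: "TOKENS_LIMIT", unit: 6, number: 1),
    RawLimit(type: "TIME_LIMIT", unit: 5, number: 1),
]

let kept = lastTokenLimit(payloadLimits)
// init?(rawValue:) returns nil for 6, so the unit falls back to .unknown.
let keptUnit = kept.flatMap { OldUnit(rawValue: $0.unit) } ?? .unknown
```

Here `kept` is the weekly entry and `keptUnit` is `.unknown`, so the 5-hour entry is gone and the surviving entry has no usable window length.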

Here's the visual difference on a Coding Plan account:

main: only shows Tokens (weekly) + MCP; the 5-hour quota is lost.
[screenshot: main]

This PR: correctly shows all three rows, Tokens (weekly) + MCP + 5-hour.
[screenshot: pr-632]

I also verified that non-Coding Plan accounts (single TOKENS_LIMIT + TIME_LIMIT) are unaffected by this change. When only one TOKENS_LIMIT is returned, sessionTokenLimit stays nil, and all rendering paths (MenuDescriptor, MenuCardView, CLIRenderer) guard on supportsOpus && tertiary != nil, so the tertiary row is simply suppressed — same behavior as main.
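
The guard mentioned above can be pictured as follows. This is a simplified sketch, assuming much smaller types than the real ones: `Descriptor`, `Window`, `Snapshot`, and `rowLabels` are hypothetical, and the row names are placeholders borrowed from the screenshots.

```swift
// Hypothetical, simplified models; the real descriptor and snapshot carry more fields.
struct Descriptor {
    let supportsOpus: Bool
    let opusLabel: String?
}

struct Window {
    let usedPercent: Double
}

struct Snapshot {
    let primary: Window?
    let secondary: Window?
    let tertiary: Window?
}

/// Returns the labels of the rows that would be rendered for a snapshot.
func rowLabels(for snapshot: Snapshot, descriptor: Descriptor) -> [String] {
    var rows: [String] = []
    if snapshot.primary != nil { rows.append("Tokens") }
    if snapshot.secondary != nil { rows.append("MCP") }
    // The tertiary row needs both the descriptor flag and actual data.
    if descriptor.supportsOpus, snapshot.tertiary != nil {
        rows.append(descriptor.opusLabel ?? "Tertiary")
    }
    return rows
}
```

With `tertiary` set to nil the function returns only the first two rows even though `supportsOpus` is true, which matches the behavior described for accounts that return a single TOKENS_LIMIT.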

Your account likely returns only one TOKENS_LIMIT because it's on a different plan tier, which is exactly this backward-compat path.

The z.ai API can return two TOKENS_LIMIT entries — one for the 5-hour
window (unit:3/number:5) and one for the weekly window (unit:6/number:1).
Previously the second entry silently overwrote the first, discarding the
5-hour quota entirely.

Changes:
- Add ZaiLimitUnit.weeks (rawValue 6) with windowMinutes = n×7×24×60
- Add ZaiUsageSnapshot.sessionTokenLimit for the shorter-window entry
- Rewrite parseUsageSnapshot to collect all TOKENS_LIMIT entries and
  sort by windowMinutes: shorter → sessionTokenLimit (tertiary), longer
  → tokenLimit (primary), preserving the existing display unchanged for
  APIs that return only one TOKENS_LIMIT
- Map sessionTokenLimit to UsageSnapshot.tertiary in toUsageSnapshot()
- Enable supportsOpus + opusLabel "5-hour" in ZaiProviderDescriptor so
  MenuCardView/MenuDescriptor/CLIRenderer render the new tertiary row
- Wire zaiSessionDetail text into the tertiary card metric
- Add ZaiThreeLimitTests covering 3-entry parsing, unit:6 enum, and
  backward-compatible 2-entry fallback
@takumi3488 takumi3488 force-pushed the feat/zai-three-quota-display branch from 3daf900 to 9a39d0b April 8, 2026 08:22
@takumi3488
Contributor Author

@ratulsarna

Correction on my previous comment — I said the 3-limit response was "plan-specific" (Coding Plan vs other plans), but that's not accurate.

Per the Z.ai DevPack FAQ:

Users who subscribe and enable auto-renewal before February 12 will enjoy unlimited weekly usage for the duration of their subscription.

So the actual difference is subscription start date, not plan tier. Your account is likely also on Coding Plan, but since you subscribed before Feb 12, you have unlimited weekly usage — meaning Z.ai doesn't return the weekly TOKENS_LIMIT (unit: 6) entry at all.

Accounts that subscribed after Feb 12 receive a weekly quota, which adds the second TOKENS_LIMIT entry to the response, resulting in the 3-limit shape this PR handles.

Sorry for the confusion — the backward-compat analysis still holds (accounts without the weekly limit are unaffected), but the root cause of the difference is subscription date, not plan tier.

@ratulsarna
Collaborator

ratulsarna commented Apr 8, 2026

Opened a superseding PR here: #662

Thanks again, @takumi3488, for the original fix and for the extra payload/context that helped validate the edge case and carry this forward.

@ratulsarna ratulsarna closed this Apr 8, 2026