docs(blog): Anthropic /v1/messages streaming performance improvements#245
Closed
oss-agent-shin wants to merge 1 commit into
Contributor
Author
@greptileai please review
Contributor
Author
Manual review requested from @yassin-kortam (ticket owner / source-PR author #28289) and @mubashir1osmani (approver on prior shin docs PR #234). Greptile & Veria are not installed on this repo, so the standard automated review gate cannot run; manual approval is the established pattern for
Collaborator
duplicate of #223
What this PR does
Adds a performance blog post for `litellm-docs` covering the Anthropic `/v1/messages` streaming hot-path optimizations that shipped in BerriAI/litellm#28289. The post walks through the four buckets of overhead the optimization removed (no-op hooks, double work, O(tokens) end-of-stream reconstruction, hot-path debug logging), the parity guarantees and tests, the headline benchmark numbers, and how to reproduce the benchmark with the new `scripts/benchmark_anthropic_messages_perf.py` harness.
- `blog/anthropic_messages_streaming_perf/index.md`
- `anthropic-messages-streaming-perf`
- `performance`, `anthropic`, `streaming`, `proxy`
- `componentized_deployment` post (frontmatter shape, `{/* truncate */}` cut, key-takeaways + conclusion sections) per the Linear spec
- `/img/blog/anthropic_messages_streaming_perf/...` paths so the assets can be dropped in later without churning the prose
Linear ticket
Resolves LIT-3333
Why pushed via Contents API
The current agent `GITHUB_TOKEN` lacks `repo` + `workflow` scopes, so `git push` is not available. The single new file was uploaded through `PUT /repos/{owner}/{repo}/contents/{path}` (one commit, one file). The merge-base diff is exactly the blog directory addition.
Evidence
This change is pure documentation: one new markdown file under `blog/`, with no executable code, JS/TS, Python, config, or schema touched. There is no runtime surface to capture before/after on. Stating that explicitly per the Step 3 rule (no silent omissions).
Source content verified against the source PR body:
The four optimization sections in the post each map 1:1 to a bullet under "What this PR does" in source PR #28289:
Pre-Submission checklist
- `componentized_deployment`): no new template needed
- `yassin`) already exists in `blog/authors.yml`: no auth change needed
Session: https://litellm-agent-platform.onrender.com/sessions/eeb578f7-75b8-4877-a237-33efdf1b158c
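The template check in this checklist can be approximated locally before opening a PR. The sketch below is not the docusaurus parser. It scans a leading `---` block for top-level keys and compares them with the field list named in this PR's verification notes (slug / title / date / authors / description / tags / hide_table_of_contents). The function names and every value in the sample post are hypothetical placeholders.

```python
import re

# Field names taken from the template shape listed in this PR's verification notes.
REQUIRED_KEYS = [
    "slug", "title", "date", "authors",
    "description", "tags", "hide_table_of_contents",
]

TRUNCATE_MARKER = "{/* truncate */}"


def frontmatter_keys(markdown_text):
    """Return the top-level keys of a leading '---' delimited frontmatter block."""
    match = re.match(r"^---\n(.*?)\n---\n", markdown_text, flags=re.DOTALL)
    if match is None:
        return []
    keys = []
    for line in match.group(1).splitlines():
        # Top-level keys start in column 0; indented lines belong to nested values.
        key_match = re.match(r"^([A-Za-z_][\w-]*):", line)
        if key_match:
            keys.append(key_match.group(1))
    return keys


def missing_keys(markdown_text):
    """List the required frontmatter keys that the post does not define."""
    present = set(frontmatter_keys(markdown_text))
    return [key for key in REQUIRED_KEYS if key not in present]


def has_truncate_marker(markdown_text):
    """Check that the excerpt cut marker appears somewhere in the post."""
    return TRUNCATE_MARKER in markdown_text


# Hypothetical sample post: every value below is a placeholder, not the real post.
SAMPLE_POST = """---
slug: anthropic-messages-streaming-perf
title: Placeholder title
date: 2000-01-01
authors:
  - yassin
description: Placeholder description
tags: [performance, anthropic, streaming, proxy]
hide_table_of_contents: false
---

Intro paragraph.

{/* truncate */}

Body.
"""

print("missing keys:", missing_keys(SAMPLE_POST))
print("truncate marker present:", has_truncate_marker(SAMPLE_POST))
```

A line-based scan like this only catches missing keys. It does not validate YAML syntax or resolve author handles, so the Vercel preview build remains the real gate.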
Verification (ship-pr)
Manual run because `BerriAI/litellm-docs` has neither Greptile nor Veria installed (verified by inspecting reviews on 5 prior merged PRs incl. shin PR #234; `vercel[bot]` is the only bot author on the repo's recent PRs).
- `grep -i -E "(rocket money|tempus|barracuda|cornell|verizon|nvidia|netapp|adobe|playtika|kraken)"` returns no matches. PASS.
- `main` (the correct base for `litellm-docs`). The `litellm_oss_agent_shin_daily_branch` rule applies to `BerriAI/litellm`, not `litellm-docs`; the docs repo has no `.github/workflows/` and no "Verify PR source branch" check, and recently merged shin PR "docs(blog): incident report for Prisma reconnect freezing event loop (LIT-2614)" #234 also used `base=main`. PASS.
- `GITHUB_TOKEN` lacks `repo` + `workflow` scopes, so a single `PUT /contents/{path}` was used to commit the blog post. One file, one commit. Noted up-thread under "Why pushed via Contents API". PASS.
- `4847349364` reports `state=success` for commit `78133e27`. That confirms the new blog file passed the docusaurus blog plugin (frontmatter + MDX-in-md), the YAML frontmatter parses, `{/* truncate */}` is recognized, and the author handle resolves against `blog/authors.yml`. PASS.
- `Vercel Preview Comments` is the recorded CI check. This PR matches that baseline. PASS.
- `componentized_deployment` template shape (slug / title / date / authors / description / tags / hide_table_of_contents). PASS.
- `<!-- TODO(yassin): replace with ... -->`), so docusaurus does not try to resolve any missing PNGs and there are no broken-image warnings. Real assets can be dropped into `static/img/blog/anthropic_messages_streaming_perf/` later by uncommenting the placeholders. PASS.
- `scripts/benchmark_anthropic_messages_perf.py` in the source diff. PASS.
- `Resolves LIT-3333` in PR body; PR link posted back to the Linear ticket via `commentCreate`. PASS.
- `blog/authors.yml` as `yassin`; no auth-config change needed. PASS.
Greptile / Veria gate
Greptile and Veria are not installed on `BerriAI/litellm-docs`. `@greptileai please review` was posted as a no-op for parity with the BerriAI/litellm flow; no Greptile review will arrive. Past doc PRs (#234, #239, etc.) shipped through direct human approval, and this PR takes the same path. Filed for reviewer transparency, not as a gate-bypass.
Slack post
Slack MCP (`mcp__lap-slack__*`) is not exposed to this agent session, a known platform constraint tracked across many previously filed issues. The `#eng-pr-reviews` Step-5 post cannot be made from this session; a reviewer is being requested via this PR's GitHub review-request mechanism instead.
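For reviewers unfamiliar with the upload path described under "Why pushed via Contents API", here is a minimal sketch of how a single-file commit through `PUT /repos/{owner}/{repo}/contents/{path}` can be assembled. It only builds the request and does not send it. The helper name, branch name, commit message, file body, and token are hypothetical placeholders, not the values the agent used.

```python
import base64
import json
import urllib.request


def build_contents_put(owner, repo, path, text, message, branch, token):
    """Build (but do not send) a GitHub 'create or update file contents' request."""
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/{path}"
    payload = {
        "message": message,
        # The Contents API expects the file body Base64-encoded.
        "content": base64.b64encode(text.encode("utf-8")).decode("ascii"),
        "branch": branch,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


# All argument values except owner, repo, and path are placeholders.
request = build_contents_put(
    owner="BerriAI",
    repo="litellm-docs",
    path="blog/anthropic_messages_streaming_perf/index.md",
    text="placeholder body\n",
    message="placeholder commit message",
    branch="hypothetical-branch-name",
    token="<token>",
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

Each call creates one commit touching one file, which matches the "one commit, one file" shape reported in this PR. Updating an existing file would additionally require that file's blob `sha` in the payload.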