Jk/optimizations#5
Conversation
Replace substring check `"azure.com" in url` with `urlparse`-based hostname validation to prevent bypass via crafted URLs (CWE-20). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- get_child(): pass state directly to constructor instead of calling set_handlers/add_tags/add_metadata sequentially (-22%) - add_tags(): use set-based dedup instead of O(n) list scans per tag, early return when tags list is empty (-70% on duplicate tags) - handle_event/ahandle_event: early return when handlers list is empty - _configure(): skip throwaway CallbackManager construction on the common path where inheritable_callbacks is provided These are the top bottlenecks identified by profiling langgraph's react_agent benchmark, where langchain_core callbacks account for ~35% of runtime. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…et config keys - Add copy-on-write (COW) to BaseCallbackManager.copy() — defer shallow copies of handlers/tags/metadata until first mutation - Cache inspect.signature() results for BaseChatModel._generate and _agenerate to avoid repeated introspection per invoke - Cache signature(self._run) and _get_runnable_config_param in ChildTool.__init__ to avoid per-invocation introspection - Convert CONFIG_KEYS and COPIABLE_KEYS from lists to frozensets for O(1) membership checks - Fast path in _format_for_tracing when no messages have list content Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Policy Check Failed✗ 3/3 policy checks failed: • Need 2 more approval(s) (0/2) — comment LGTM or approve via review To merge this PR:
|
PR SummaryWhat Changed
Key Changes by AreaCallbacks: Replaced eager copying with lazy copy-on-write in Chat Models: Added module-level cache for run manager parameter checks and early return in message formatting when no list content exists. Tools: Added per-instance caching for Security: Fixed Files Changed
Review Focus Areas
ArchitectureDesign Decisions: Copy-on-write trades memory efficiency for slight code complexity. Caching introspection results accepts memory overhead for reduced CPU usage in hot paths. The Azure fix prioritizes security over backward compatibility with malformed URLs. Risks: Copy-on-write relies on correct mutation detection. Missing a mutation path would cause shared state bugs. Cache keys for introspection must remain stable across Python versions. Merge StatusMERGEABLE — PR Score 65/100, above threshold (50). All gates passed. |
| clone = self.__class__.__new__(self.__class__) | ||
| clone.handlers = self.handlers | ||
| clone.inheritable_handlers = self.inheritable_handlers | ||
| clone.parent_run_id = self.parent_run_id | ||
| clone.tags = self.tags | ||
| clone.inheritable_tags = self.inheritable_tags | ||
| clone.metadata = self.metadata | ||
| clone.inheritable_metadata = self.inheritable_metadata | ||
| clone._cow = True # noqa: SLF001 |
There was a problem hiding this comment.
The copy-on-write clone bypasses __init__, so subclass-specific state is silently dropped; use the normal constructor or a subclass hook to preserve required attributes.
Suggested fix
return self.__class__(
handlers=self.handlers.copy(),
inheritable_handlers=self.inheritable_handlers.copy(),
parent_run_id=self.parent_run_id,
tags=self.tags.copy(),
inheritable_tags=self.inheritable_tags.copy(),
metadata=self.metadata.copy(),
inheritable_metadata=self.inheritable_metadata.copy(),
)Prompt for AI assistance
Copy the prompt below and paste it into ChatGPT, Claude, or any LLM:
You are an expert python developer with deep knowledge of security, performance, and best practices.
### Context
File: libs/core/langchain_core/callbacks/base.py
Lines: 992-1000
Issue Type: functional-high
Severity: high
Issue Description:
The copy-on-write clone bypasses `__init__`, so subclass-specific state is silently dropped; use the normal constructor or a subclass hook to preserve required attributes.
Current Code:
clone = self.__class__.__new__(self.__class__)
clone.handlers = self.handlers
clone.inheritable_handlers = self.inheritable_handlers
clone.parent_run_id = self.parent_run_id
clone.tags = self.tags
clone.inheritable_tags = self.inheritable_tags
clone.metadata = self.metadata
clone.inheritable_metadata = self.inheritable_metadata
clone._cow = True # noqa: SLF001
---
### Instructions
1. Fix the issue described above
2. Maintain the exact indentation and code style from the original
3. Follow python best practices and language-specific idioms
4. Ensure the fix addresses the root cause, not just the symptoms
5. Add brief inline comments explaining the fix if needed
### Constraints
- Do not change functionality beyond fixing the identified issue
- Preserve existing variable names and function signatures unless they are part of the problem
- Ensure the fix is production-ready
---
Security Scan Summary
No critical security issues detected Scan completed in 24.7sSecurity scan powered by Codity.ai |
License Compliance Scan
All licenses are low-risk and compliant Powered by Codity.ai · Docs |
Greptile SummaryThis PR applies a set of micro-optimizations across
Confidence Score: 3/5The COW implementation in base.py has a real defect that can silently corrupt a cloned callback manager's tags and metadata when set_handlers is called on the original after copying; the remaining changes are straightforward and correct. The set_handlers method clears the COW flag without first materializing copies of the shared tags, metadata, inheritable_tags, and inheritable_metadata. Any code that copies a manager, calls set_handlers on the original, and then calls add_tags or add_metadata will unknowingly mutate state that the clone still holds a reference to. This is a subtle bug that won't surface in unit tests that never exercise this exact ordering, but could produce hard-to-diagnose tracing corruption in real workloads. libs/core/langchain_core/callbacks/base.py — the set_handlers method and the new-based copy() implementation deserve a close second look before merging. Important Files Changed
Sequence DiagramsequenceDiagram
participant Caller
participant Original as BaseCallbackManager (original)
participant Clone as BaseCallbackManager (clone)
Caller->>Original: copy()
Note over Original: _cow = True
Original-->>Clone: "new clone (_cow = True, shared lists)"
alt Mutation on Clone
Caller->>Clone: add_tags(["x"])
Clone->>Clone: _cow_copy() → copies all shared lists
Note over Clone: _cow = False, now owns its own lists
Clone->>Clone: tags.extend(["x"]) ✅ safe
else set_handlers then add_tags on Original
Caller->>Original: set_handlers([h])
Note over Original: _cow = False ← BUG: skips _cow_copy()
Note over Original,Clone: tags/metadata still SHARED
Caller->>Original: add_tags(["y"])
Original->>Original: _cow_copy() → no-op (_cow is False)
Original->>Original: tags.extend(["y"]) ← mutates SHARED list
Note over Clone: clone.tags now unexpectedly contains "y" 🐛
end
|
| def copy(self) -> Self: | ||
| """Return a copy of the callback manager.""" | ||
| return self.__class__( | ||
| handlers=self.handlers.copy(), | ||
| inheritable_handlers=self.inheritable_handlers.copy(), | ||
| parent_run_id=self.parent_run_id, | ||
| tags=self.tags.copy(), | ||
| inheritable_tags=self.inheritable_tags.copy(), | ||
| metadata=self.metadata.copy(), | ||
| inheritable_metadata=self.inheritable_metadata.copy(), | ||
| ) | ||
| """Return a copy of the callback manager. | ||
|
|
||
| Uses copy-on-write: the copy shares underlying lists/dicts until | ||
| either the original or the copy is mutated. | ||
| """ | ||
| self._cow = True | ||
| clone = self.__class__.__new__(self.__class__) | ||
| clone.handlers = self.handlers | ||
| clone.inheritable_handlers = self.inheritable_handlers | ||
| clone.parent_run_id = self.parent_run_id | ||
| clone.tags = self.tags | ||
| clone.inheritable_tags = self.inheritable_tags | ||
| clone.metadata = self.metadata | ||
| clone.inheritable_metadata = self.inheritable_metadata | ||
| clone._cow = True # noqa: SLF001 | ||
| return clone |
There was a problem hiding this comment.
__new__-based clone skips subclass __init__
copy() uses self.__class__.__new__(self.__class__) and manually copies only the seven attributes declared in BaseCallbackManager.__init__. Any subclass that extends __init__ with additional instance attributes (e.g., self.ended, self.parent_run_manager in CallbackManagerForChainGroup) will produce a clone that raises AttributeError on first access.
The two current subclasses that add extra attributes (CallbackManagerForChainGroup, AsyncCallbackManagerForChainGroup) both override copy() and are safe today, but this is an invisible footgun for any future subclass that forgets to do the same. A docstring note or a __init_subclass__ guard would prevent silent breakage.
Code Quality Report — test-org-codity/langchain · PR #5Scanned: 2026-05-19 19:31 UTC | Score: 51/100 | Provider: github Executive Summary
Top Findings[CQ-LLM-001]
|
| File | Critical | High | Medium | Low | Total |
|---|---|---|---|---|---|
libs/core/langchain_core/callbacks/base.py |
0 | 2 | 4 | 1 | 7 |
libs/core/langchain_core/tools/base.py |
0 | 0 | 0 | 1 | 1 |
Recommendations
- Resolve High severity issues, especially error handling gaps and performance bottlenecks.
- Run automated tests after applying fixes to verify no regressions.
Fixes #
Read the full contributing guidelines: https://docs.langchain.com/oss/python/contributing/overview
If you paste a large clearly AI generated description here your PR may be IGNORED or CLOSED!
Thank you for contributing to LangChain! Follow these steps to have your pull request considered as ready for review.
Fixes #xxline at the top is required for external contributions — update the issue number and keep the keyword. This links your PR to the approved issue and auto-closes it on merge.make format,make lintandmake testfrom the root of the package(s) you've modified.Additional guidelines:
uv.lockfiles or add dependencies topyproject.tomlfiles (even optional ones) unless you have explicit permission to do so by a maintainer.Social handles (optional)
Twitter: @
LinkedIn: https://linkedin.com/in/