## Problem

When synchronizing Testing as Code (TAC) test requirements to GitHub issues, command outputs and parsed data are directly embedded in the issue body via the Jinja2 template (`tac_issues_body.j2`). When these outputs are extremely large, they can exceed GitHub's issue body character limit (~65,536 characters), causing issue creation to fail with a `422 Unprocessable Entity` error.
**Affected Code Locations**:

- Template: `github_ops_manager/templates/tac_issues_body.j2:10,19,29,38,58`
- Schema: `github_ops_manager/schemas/tac.py:12-13` (`command_output` and `parsed_output` fields)
- Issue Creation: `github_ops_manager/github/adapter.py:129-152`
- Issue Sync: `github_ops_manager/synchronize/issues.py:101-107`
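The failure is easy to reproduce in miniature: any rendered body longer than the cap is rejected before the issue exists. A minimal sketch of a pre-flight check (the constant name and helper are hypothetical; only the ~65,536-character figure comes from the description above):

```python
# Hypothetical pre-flight check; the limit mirrors GitHub's ~65,536-character
# cap on issue bodies described above.
GITHUB_ISSUE_BODY_LIMIT = 65_536


def body_exceeds_limit(body: str) -> bool:
    """True when a rendered issue body would be rejected with a 422."""
    return len(body) > GITHUB_ISSUE_BODY_LIMIT


huge_output = "x" * 70_000  # one oversized command output is enough
assert body_exceeds_limit(f"```cli\n{huge_output}\n```")
assert not body_exceeds_limit("short body")
```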
## Current Implementation

The current workflow:

1. **Data Collection**: Test requirements collect `command_output` and `parsed_output` into `TestingAsCodeCommand` model fields
2. **Template Rendering**: The `tac_issues_body.j2` template renders these outputs directly into fenced `cli` code blocks via `{{ command_data.command_output }}` and `{{ command_data.parsed_output }}`
3. **Issue Creation**: The rendered body is passed to `github_adapter.create_issue()` with no size validation

**Result**: Large outputs cause the entire issue body to exceed GitHub's limit, and issue creation fails.
## Proposed Solution

Implement a simplified file attachment system using GitHub's native issue attachment functionality that:

- Detects large content during issue body rendering (using a constant threshold)
- Uploads large content as native GitHub issue attachments
- Omits large content from the template entirely (attachments are visible in GitHub's attachment section)

**Key Design Principle**: Keep it simple. If content is small enough, show it inline; if too large, upload it as an attachment and show nothing in the body (GitHub displays attachments separately).
### Data Flow

1. **Input** (YAML test requirement):

   ```yaml
   commands:
     - command: "show version"
       command_output: "very long output..."  # Could exceed 10KB
       parsed_output: "very long parsed data..."  # Could exceed 10KB
   ```

2. **During synchronization** (`render_issue_bodies()`):
   - Read `command_output` and `parsed_output` from the test requirement
   - If content > 10KB: upload to GitHub as an attachment and set the field to `None`
   - If content <= 10KB: keep the content as-is for inline display

3. **Template rendering**:
   - The template receives the modified data:
     - Small content: `command_output` contains data → show inline
     - Large content: `command_output` is `None` → show nothing (attachment visible in GitHub UI)
   - A simple `{% if %}` check renders content only if it exists
### GitHub Issue Attachments

GitHub supports native file attachments on issues. When files are attached to an issue, they:

- Appear in a dedicated "Attachments" section in the GitHub UI
- Are automatically visible to anyone viewing the issue
- Have permanent, stable URLs on GitHub's CDN
- Don't need to be linked in the issue body

**Workflow**:

- Upload attachments before or after issue creation
- Attachments are automatically associated with the issue
- No need to reference them in the issue body; GitHub handles display
## Implementation Plan

### Phase 1: Core Infrastructure

#### Task 1.1: Add Constants Module

**File**: `github_ops_manager/utils/constants.py`

Add a constant for the attachment threshold:

```python
"""Application-wide constants."""

# Testing as Code attachment thresholds
TAC_MAX_INLINE_OUTPUT_SIZE = 10_000  # 10KB - upload anything larger as attachment
```

#### Task 1.2: Add GitHub Issue Attachment Upload to Adapter
**File**: `github_ops_manager/github/adapter.py`

Add an issue attachment method to `GitHubKitAdapter`:

```python
async def upload_issue_attachment(
    self,
    issue_number: int,
    content: str,
    filename: str,
) -> str:
    """Upload content as a GitHub issue attachment.

    Args:
        issue_number: The issue number to attach the file to
        content: The file content to upload
        filename: Filename for the attachment

    Returns:
        URL to the uploaded attachment on GitHub's CDN
    """
    # Implementation depends on GitHub API research
    # See "Implementation Notes" section below
    raise NotImplementedError
```

**Abstract Method**: Add to `github_ops_manager/github/abc.py:168`:
```python
@abstractmethod
async def upload_issue_attachment(
    self,
    issue_number: int,
    content: str,
    filename: str,
) -> str:
    """Upload content as a GitHub issue attachment."""
    pass
```

### Phase 2: Content Processing Logic
#### Task 2.1: Create Attachment Upload Utility

**New File**: `github_ops_manager/utils/attachments.py`

```python
"""Utilities for handling large content attachments in GitHub issues."""

import structlog

from github_ops_manager.github.adapter import GitHubKitAdapter
from github_ops_manager.utils.constants import TAC_MAX_INLINE_OUTPUT_SIZE

logger = structlog.get_logger(__name__)


async def process_large_content_for_attachment(
    content: str | None,
    filename: str,
    github_adapter: GitHubKitAdapter,
    issue_number: int,
    max_inline_size: int = TAC_MAX_INLINE_OUTPUT_SIZE,
) -> str | None:
    """Process content and upload it as an attachment if too large.

    Args:
        content: The content to process
        filename: Filename for the attachment if uploaded
        github_adapter: GitHub client for uploading
        issue_number: Issue number to attach to
        max_inline_size: Max size (chars) before uploading as attachment

    Returns:
        - If content is None: None
        - If content <= threshold: content (for inline display)
        - If content > threshold: None (uploaded as attachment, omitted from body)
    """
    if content is None:
        return None
    if len(content) <= max_inline_size:
        return content
    logger.info(
        "Content exceeds inline threshold, uploading as attachment",
        filename=filename,
        content_size=len(content),
        threshold=max_inline_size,
        issue_number=issue_number,
    )
    # Upload full content as an issue attachment
    await github_adapter.upload_issue_attachment(
        issue_number=issue_number,
        content=content,
        filename=filename,
    )
    # Return None to omit from template (attachment visible in GitHub UI)
    return None
```

#### Task 2.2: Modify Issue Synchronization Flow
**File**: `github_ops_manager/synchronize/issues.py`

The synchronization flow needs to be updated to:

1. Create the issue first (to get the issue number)
2. Process attachments using the issue number
3. Update the issue body if needed (after attachments are uploaded)

New approach:

```python
async def sync_github_issues(desired_issues: list[IssueModel], github_adapter: GitHubKitAdapter) -> AllIssueSynchronizationResults:
    """For each YAML issue, decide whether to create, update, or no-op, and call the API accordingly."""
    # ... existing code to fetch issues and decide actions ...
    for desired_issue in desired_issues:
        github_issue = github_issue_by_title.get(desired_issue.title)
        decision = await decide_github_issue_sync_action(desired_issue, github_issue)
        if decision == SyncDecision.CREATE:
            # Create issue first to get issue number
            github_issue = await github_adapter.create_issue(
                title=desired_issue.title,
                body=desired_issue.body,
                labels=desired_issue.labels,
                assignees=desired_issue.assignees,
                milestone=desired_issue.milestone,
            )
            # NEW: Process attachments after issue creation
            if desired_issue.data and 'commands' in desired_issue.data:
                await process_tac_attachments(desired_issue, github_issue.number, github_adapter)
            number_of_created_github_issues += 1
        results.append(IssueSynchronizationResult(desired_issue, github_issue, decision))
        # ... rest of update/noop logic ...
```

New function:
```python
async def process_tac_attachments(
    issue: IssueModel,
    issue_number: int,
    github_adapter: GitHubKitAdapter,
) -> None:
    """Process and upload large TAC outputs as attachments.

    Args:
        issue: The issue model with TAC data
        issue_number: GitHub issue number
        github_adapter: GitHub adapter for uploads
    """
    from github_ops_manager.utils.attachments import process_large_content_for_attachment

    for command_data in issue.data.get('commands', []):
        # Process command_output
        if command_data.get('command_output'):
            await process_large_content_for_attachment(
                command_data['command_output'],
                f"{command_data['command']}_output.txt",
                github_adapter,
                issue_number,
            )
        # Process parsed_output
        if command_data.get('parsed_output'):
            await process_large_content_for_attachment(
                command_data['parsed_output'],
                f"{command_data['command']}_parsed.json",
                github_adapter,
                issue_number,
            )
```

#### Task 2.3: Update Issue Body Rendering
**File**: `github_ops_manager/synchronize/issues.py:131-154`

Modify `render_issue_bodies()` to process content size before rendering:

```python
async def render_issue_bodies(
    issues_yaml_model: IssuesYAMLModel,
) -> IssuesYAMLModel:
    """Render issue bodies using a provided Jinja2 template.

    This coroutine mutates the input object and returns it.
    """
    logger.info("Rendering issue bodies using template", template_path=issues_yaml_model.issue_template)
    try:
        template = construct_jinja2_template_from_file(issues_yaml_model.issue_template)
    except jinja2.TemplateSyntaxError as exc:
        logger.error("Encountered a syntax error with the provided issue template", issue_template=issues_yaml_model.issue_template, error=str(exc))
        raise
    for issue in issues_yaml_model.issues:
        if issue.data is not None:
            # NEW: Remove large content before template rendering
            if 'commands' in issue.data:
                from github_ops_manager.utils.constants import TAC_MAX_INLINE_OUTPUT_SIZE

                for command_data in issue.data['commands']:
                    # Remove command_output if too large
                    if command_data.get('command_output') and len(command_data['command_output']) > TAC_MAX_INLINE_OUTPUT_SIZE:
                        command_data['command_output'] = None
                    # Remove parsed_output if too large
                    if command_data.get('parsed_output') and len(command_data['parsed_output']) > TAC_MAX_INLINE_OUTPUT_SIZE:
                        command_data['parsed_output'] = None
            # Render with modified data (large content removed)
            render_context = issue.model_dump()
            try:
                issue.body = template.render(**render_context)
            except jinja2.UndefinedError as exc:
                logger.error("Failed to render issue body with template", issue_title=issue.title, error=str(exc))
                raise
    return issues_yaml_model
```

#### Task 2.4: Update Template with Simple Conditionals
**File**: `github_ops_manager/templates/tac_issues_body.j2`

Update the template to only show content if it exists (simplified):

````jinja
{% for command_data in commands %}
Sample output of `{{ command_data.command }}`:
{% if command_data.parser_used != "YamlPathParse" %}
{% if command_data.command_output %}
```cli
{{ command_data.command_output }}
```
{% endif %}
{% endif %}
{% if command_data.parser_used == "Genie" %}
A Genie Parser exists for this show command, and results in data like so:
You MUST use a Genie Parser for this {{ command_data.command }} command. Pay attention to the Parsing Requirements.
{% if command_data.parsed_output %}
{{ command_data.parsed_output }}
{% endif %}
{% endif %}
{% if command_data.parser_used == "YamlPathParse" %}
The data for the command or API call {{ command_data.command }} is already in a structured and valid YAML or JSON format, which means we can use Robot's "YamlPath Parse" keyword. The data can be accessed using the following schema (which is the same as the raw output):
You MUST use the YamlPath Parse keyword for this {{ command_data.command }} command or API call. Pay attention to the Parsing Requirements.
{% if command_data.parsed_output %}
{{ command_data.parsed_output }}
{% endif %}
{% endif %}
{% if command_data.parser_used == "NXOSJSON" %}
Run the command as | json-pretty native (for example: show ip interface brief | json-pretty native), with a resulting JSON body like so:
{% if command_data.parsed_output %}
{{ command_data.parsed_output }}
{% endif %}
{% endif %}
{% if command_data.parser_used in [None, '', 'Regex'] %}
A RegEx Pattern exists for this show command, and results in data like so:
You MUST use a RegEx Pattern (and Robot's Get Regexp Matches keyword) for this {{ command_data.command }} command. Pay attention to the Parsing Requirements.
{% if command_data.genai_regex_pattern %}
{{ command_data.genai_regex_pattern }}
{% endif %}
Mocked Regex Data:
{% if command_data.parsed_output %}
{{ command_data.parsed_output }}
{% endif %}
{% endif %}
{% endfor %}
````
**Note**: Template uses simple `{% if command_data.command_output %}` checks. If content is `None` (because it was too large), nothing is rendered. The attachment will be visible in GitHub's attachment section automatically.
### Phase 3: Configuration & Integration
#### Task 3.1: Update Synchronization Driver
**File**: `github_ops_manager/synchronize/driver.py`
No changes needed: the `render_issue_bodies()` signature stays the same (it does not need the `github_adapter`).
### Phase 4: Testing & Documentation
#### Task 4.1: Unit Tests
**New File**: `tests/unit/test_attachments.py`
- Test `process_large_content_for_attachment()` with various content sizes
- Test that content <= threshold returns content unchanged
- Test that content > threshold returns None and uploads
- Mock GitHub adapter attachment upload calls
#### Task 4.2: Integration Tests
**New File**: `tests/integration/test_tac_large_outputs.py`
- Test end-to-end TAC issue creation with large outputs
- Verify attachments are uploaded to GitHub
- Verify issue body does NOT contain large content
- Verify small content remains in issue body
- Test with different parser types (Genie, YamlPath, Regex, etc.)
#### Task 4.3: Update Documentation
**File**: `README.md`
Add section explaining:
- Attachment handling for large outputs (>10KB)
- Uses native GitHub issue attachments
- Large content is omitted from body (visible in GitHub's attachment section)
- Small content (<=10KB) remains inline
## Design Decisions
1. **Upload Method**: Native GitHub Issue Attachments
- Uses GitHub's user-attachments CDN
- Permanent, stable URLs
- Native GitHub integration
- Attachments visible in dedicated section
2. **Size Threshold**: 10KB (10,000 characters)
- Defined in `github_ops_manager/utils/constants.py`
- Configurable via constant
3. **No Links in Body**: Keep it simple
- Content <= 10KB: Show inline in body
- Content > 10KB: Upload as attachment, show nothing in body
- GitHub displays attachments automatically
4. **Two-Phase Flow**: Create issue, then attach
- Create issue first to get issue number
- Upload attachments using issue number
- Large content removed during template rendering
- Attachments processed after issue creation
5. **Backward Compatibility**: Graceful degradation
- Existing test requirements continue to work
- If attachment upload fails, log warning but don't fail
- Template conditionals handle None content gracefully
## Success Criteria
- [ ] TAC issues with command outputs >10KB are created successfully
- [ ] Large outputs are uploaded as native GitHub issue attachments
- [ ] Issue bodies do NOT contain large content (omitted, not linked)
- [ ] Small outputs (<= 10KB) remain inline in issue body
- [ ] Constant `TAC_MAX_INLINE_OUTPUT_SIZE` controls threshold
- [ ] Attachments visible in GitHub's attachment section automatically
- [ ] Tests verify end-to-end attachment flow
- [ ] No breaking changes to existing TAC workflows
- [ ] Clear error messages if attachment upload fails
## Related Files
- `github_ops_manager/utils/constants.py` - Size threshold constant (NEW)
- `github_ops_manager/utils/attachments.py` - Attachment processing logic (NEW)
- `github_ops_manager/templates/tac_issues_body.j2` - Template rendering (MODIFIED - add simple {% if %} checks)
- `github_ops_manager/schemas/tac.py` - Data models (NO CHANGES NEEDED)
- `github_ops_manager/synchronize/issues.py` - Issue synchronization (MODIFIED - add attachment processing)
- `github_ops_manager/github/adapter.py` - GitHub API client (MODIFIED - add upload_issue_attachment)
- `github_ops_manager/github/abc.py` - Abstract base class (MODIFIED - add abstract method)
## Implementation Notes
The exact implementation of `upload_issue_attachment()` requires research into GitHub's API for uploading issue attachments. Possible approaches:
1. **Issue comments with attachments**: Create a comment with the attachment, GitHub handles storage
2. **GraphQL API**: Use GitHub's GraphQL API for attachment upload
3. **Asset upload endpoint**: Use asset upload mechanism
Research needed to determine the best approach for uploading attachments to existing issues.
## Next Steps
1. Research GitHub API for issue attachment upload mechanisms
2. Review and approve this implementation plan
3. Create subtasks for each phase
4. Begin implementation with Phase 1