-
Notifications
You must be signed in to change notification settings - Fork 5
feat: Add section validation warnings for recommended manuscript sections #275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ions - New SectionValidator checks for standard sections required by most journals - Warns (doesn't error) if sections are missing: • Data Availability • Code Availability • Author Contributions • Acknowledgements • Funding • Competing Interests - Helps authors ensure manuscripts are complete before submission - Integrated into 'rxiv validate' command as first validator in chain - Flexible pattern matching supports multiple section naming variations
Code Review: Section Validation WarningsThank you for this contribution! This is a valuable feature that aligns well with journal requirements. Strengths
CRITICAL: Missing Test Coverage (Severity: High)The PR adds a new validator but includes no unit tests. According to CLAUDE.md, the project aims for 80% test coverage. Required: tests/unit/test_section_validator.py should include:
See tests/unit/test_doi_validator.py for validator test patterns. Enhancement: Consider Using ErrorCode Enum (Severity: Medium)Other validators use ErrorCode from core.error_codes. Your code uses the deprecated _create_error() method. Consider defining section-specific error codes for consistency (e.g., MISSING_SECTION). See figure_validator.py:77 and citation_validator.py:74 for examples. Pattern Matching: Edge Cases (Severity: Low)Question: Is it intentional to only match h2 headers and not h3? This might be worth documenting. Your regex handles multiple spaces and case variations correctly. Security Review: PASS
Before Merging (Required)
Overall Assessment: Conditional ApprovalThe implementation is solid and well-designed, but missing test coverage is a blocker. Add tests, then re-request review. Great work on the feature design! The validator pattern is well-executed and will be genuinely useful for authors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds a new SectionValidator that checks manuscripts for six standard sections commonly required by journals (Data Availability, Code Availability, Author Contributions, Acknowledgements, Funding, and Competing Interests). The validator issues non-blocking warnings when recommended sections are missing, helping authors ensure manuscript completeness before journal submission.
Key Changes:
- New section validation system with flexible pattern matching supporting multiple section name variations
- Integration as the first validator in the validation workflow ("Structure" validation)
- Smart handling of combined sections (e.g., "Data and Code Availability")
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
src/rxiv_maker/validators/section_validator.py |
New validator class that checks for six recommended manuscript sections using regex patterns and generates warnings for missing sections |
src/rxiv_maker/engines/operations/validate.py |
Integrates SectionValidator into the validation workflow as the first validator, with proper imports in both relative and absolute import blocks |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "author contributions": { | ||
| "patterns": [ | ||
| r"##\s+Author\s+Contributions?", | ||
| r"##\s+Contributions?", |
Copilot
AI
Dec 24, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pattern r"##\s+Contributions?" is too broad and could match any section titled "## Contributions" or "## Contribution" without the word "Author". Consider making this pattern more specific to avoid false positives, or remove it if only "Author Contributions" should be matched.
| r"##\s+Contributions?", |
| class SectionValidator(BaseValidator): | ||
| """Validator for checking recommended manuscript sections.""" | ||
|
|
||
| # Required sections that should generate warnings if missing | ||
| RECOMMENDED_SECTIONS = { | ||
| "data availability": { | ||
| "patterns": [ | ||
| r"##\s+Data\s+Availability", | ||
| r"##\s+Data\s+and\s+Code\s+Availability", | ||
| ], | ||
| "suggestion": "Add a '## Data Availability' section describing where your data can be accessed", | ||
| }, | ||
| "code availability": { | ||
| "patterns": [ | ||
| r"##\s+Code\s+Availability", | ||
| r"##\s+Data\s+and\s+Code\s+Availability", | ||
| r"##\s+Software\s+Availability", | ||
| ], | ||
| "suggestion": "Add a '## Code Availability' section with links to code repositories", | ||
| }, | ||
| "author contributions": { | ||
| "patterns": [ | ||
| r"##\s+Author\s+Contributions?", | ||
| r"##\s+Contributions?", | ||
| ], | ||
| "suggestion": "Add an '## Author Contributions' section describing each author's role", | ||
| }, | ||
| "acknowledgements": { | ||
| "patterns": [ | ||
| r"##\s+Acknowledgements?", | ||
| r"##\s+Acknowledgments?", | ||
| ], | ||
| "suggestion": "Add an '## Acknowledgements' section to thank contributors", | ||
| }, | ||
| "funding": { | ||
| "patterns": [ | ||
| r"##\s+Funding", | ||
| r"##\s+Financial\s+Support", | ||
| r"##\s+Grant\s+Information", | ||
| ], | ||
| "suggestion": "Add a '## Funding' section declaring funding sources or stating 'This research received no external funding.'", | ||
| }, | ||
| "competing interests": { | ||
| "patterns": [ | ||
| r"##\s+Competing\s+Interests?", | ||
| r"##\s+Conflicts?\s+of\s+Interest", | ||
| r"##\s+Disclosure", | ||
| ], | ||
| "suggestion": "Add a '## Competing Interests' section, even if just to state 'The authors declare no competing interests.'", | ||
| }, | ||
| } | ||
|
|
||
| def validate(self) -> ValidationResult: | ||
| """Check for recommended manuscript sections. | ||
|
|
||
| Returns: | ||
| ValidationResult with warnings for missing sections | ||
| """ | ||
| errors: list[ValidationError] = [] | ||
| metadata = { | ||
| "missing_sections": [], | ||
| "found_sections": [], | ||
| } | ||
|
|
||
| # Find manuscript file | ||
| try: | ||
| manuscript_file = find_manuscript_md(self.manuscript_path) | ||
| except (FileNotFoundError, ValueError) as e: | ||
| return ValidationResult( | ||
| validator_name=self.name, | ||
| errors=[ | ||
| self._create_error( | ||
| level=ValidationLevel.ERROR, | ||
| message=f"Could not find manuscript file: {e}", | ||
| ) | ||
| ], | ||
| metadata=metadata, | ||
| ) | ||
|
|
||
| # Read manuscript content | ||
| content = self._read_file_safely(manuscript_file) | ||
| if content is None: | ||
| return ValidationResult( | ||
| validator_name=self.name, | ||
| errors=[ | ||
| self._create_error( | ||
| level=ValidationLevel.ERROR, | ||
| message=f"Could not read manuscript file: {manuscript_file}", | ||
| ) | ||
| ], | ||
| metadata=metadata, | ||
| ) | ||
|
|
||
| # Check for each recommended section | ||
| for section_name, section_info in self.RECOMMENDED_SECTIONS.items(): | ||
| found = False | ||
| for pattern in section_info["patterns"]: | ||
| if re.search(pattern, content, re.IGNORECASE): | ||
| found = True | ||
| metadata["found_sections"].append(section_name) | ||
| break | ||
|
|
||
| if not found: | ||
| metadata["missing_sections"].append(section_name) | ||
| errors.append( | ||
| self._create_error( | ||
| level=ValidationLevel.WARNING, | ||
| message=f"Recommended section '{section_name.title()}' not found", | ||
| file_path=manuscript_file, | ||
| suggestion=section_info["suggestion"], | ||
| ) | ||
| ) | ||
|
|
||
| return ValidationResult( | ||
| validator_name=self.name, | ||
| errors=errors, | ||
| metadata=metadata, | ||
| ) |
Copilot
AI
Dec 24, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new SectionValidator lacks test coverage. The repository has comprehensive test coverage for other validators (see tests/unit/test_validators.py), and this new validator should have corresponding tests to verify pattern matching, warning generation, and edge cases like missing files or combined sections.
|
|
||
| # Find manuscript file | ||
| try: | ||
| manuscript_file = find_manuscript_md(self.manuscript_path) |
Copilot
AI
Dec 24, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Path object returned by find_manuscript_md should be converted to a string before passing to file_path parameter. The ValidationError.file_path is typed as str | None, but find_manuscript_md returns a Path object. For consistency with other validators and proper type handling, convert the Path to string using str().
| self._create_error( | ||
| level=ValidationLevel.WARNING, | ||
| message=f"Recommended section '{section_name.title()}' not found", | ||
| file_path=manuscript_file, |
Copilot
AI
Dec 24, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Path object returned by find_manuscript_md should be converted to a string before passing to file_path parameter. The ValidationError.file_path is typed as str | None, and should match the expected type. Convert using str(manuscript_file).
| file_path=manuscript_file, | |
| file_path=str(manuscript_file), |
| class SectionValidator(BaseValidator): | ||
| """Validator for checking recommended manuscript sections.""" | ||
|
|
||
| # Required sections that should generate warnings if missing |
Copilot
AI
Dec 24, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment states "Required sections" but the constant is named "RECOMMENDED_SECTIONS". The comment should say "Recommended sections" to match the variable name and the actual behavior (warnings, not errors).
| # Required sections that should generate warnings if missing | |
| # Recommended sections that should generate warnings if missing |
Summary
This PR adds a new that checks manuscripts for standard sections commonly required by journals and issues warnings (not errors) when sections are missing.
Changes
New Validator
Integration
Copyright: (C) 1999 ImageMagick Studio LLC
License: https://imagemagick.org/script/license.php
Features: Cipher DPC HDRI Modules OpenMP
Delegates (built-in): bzlib cairo fontconfig freetype heic jng jp2 jpeg jxl lcms lqr ltdl lzma openexr png raw rsvg tiff uhdr webp xml zip zlib zstd
Compiler: clang (17.0.0)
Usage: import [options ...] [ file ]
Image Settings:
-adjoin join images into a single multi-image file
-border include window border in the output image
-channel type apply option to select image channels
-colorspace type alternate image colorspace
-comment string annotate image with comment
-compress type type of pixel compression when writing the image
-define format:option
define one or more image format options
-density geometry horizontal and vertical density of the image
-depth value image depth
-descend obtain image by descending window hierarchy
-display server X server to contact
-dispose method layer disposal method
-dither method apply error diffusion to image
-delay value display the next image after pausing
-encipher filename convert plain pixels to cipher pixels
-endian type endianness (MSB or LSB) of the image
-encoding type text encoding type
-filter type use this filter when resizing an image
-format "string" output formatted image characteristics
-frame include window manager frame
-gravity direction which direction to gravitate towards
-identify identify the format and characteristics of the image
-interlace type None, Line, Plane, or Partition
-interpolate method pixel color interpolation method
-label string assign a label to an image
-limit type value Area, Disk, Map, or Memory resource limit
-monitor monitor progress
-page geometry size and location of an image canvas
-pause seconds seconds delay between snapshots
-pointsize value font point size
-quality value JPEG/MIFF/PNG compression level
-quiet suppress all warning messages
-regard-warnings pay attention to warning messages
-repage geometry size and location of an image canvas
-respect-parentheses settings remain in effect until parenthesis boundary
-sampling-factor geometry
horizontal and vertical sampling factor
-scene value image scene number
-screen select image from root window
-seed value seed a new sequence of pseudo-random numbers
-set property value set an image property
-silent operate silently, i.e. don't ring any bells
-snaps value number of screen snapshots
-support factor resize support: > 1.0 is blurry, < 1.0 is sharp
-synchronize synchronize image to storage device
-taint declare the image as modified
-transparent-color color
transparent color
-treedepth value color tree depth
-verbose print detailed information about the image
-virtual-pixel method
Constant, Edge, Mirror, or Tile
-window id select window with this id or name
root selects whole screen
Image Operators:
-annotate geometry text
annotate the image with text
-colors value preferred number of colors in the image
-crop geometry preferred size and location of the cropped image
-encipher filename convert plain pixels to cipher pixels
-extent geometry set the image size
-geometry geometry preferred size or location of the image
-help print program options
-monochrome transform image to black and white
-negate replace every pixel with its complementary color
-quantize colorspace reduce colors in this colorspace
-resize geometry resize the image
-rotate degrees apply Paeth rotation to the image
-strip strip image of all profiles and comments
-thumbnail geometry create a thumbnail of the image
-transparent color make this color transparent within the image
-trim trim image edges
-type type image type
Miscellaneous Options:
-debug events display copious debugging information
-help print program options
-list type print a list of supported option arguments
-log format format of debugging information
-version print version information
By default, 'file' is written in the MIFF image format. To
specify a particular image format, precede the filename with an image
format name and a colon (i.e. ps:image) or specify the image type as
the filename suffix (i.e. image.ps). Specify 'file' as '-' for
standard input or output.
Behavior
Testing
Tested on both manuscripts:
Example output:
Benefits
Related