fix(csv): correct to_csv quoting documentation and expose QUOTE_NONE #21517
a-hirota wants to merge 4 commits into
Pull request overview
This pull request fixes incorrect CSV documentation and exposes the quoting parameter to allow users to control field quoting behavior in CSV output. The changes enable pandas-compatible CSV quoting with support for csv.QUOTE_MINIMAL (default) and csv.QUOTE_NONE.
Changes:
- Corrected documentation stating the default quoting behavior is `csv.QUOTE_MINIMAL` (not `csv.QUOTE_NONNUMERIC`)
- Added `quoting` parameter to `DataFrame.to_csv()` and `cudf.io.csv.to_csv()`
- Exposed libcudf's existing `quote_style::NONE` support through pylibcudf's `CsvWriterOptionsBuilder`
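To make the difference between the two supported styles concrete, here is a minimal sketch using only Python's standard `csv` module, not cudf. The helper name `render` and the sample rows are made up for illustration, and the `escapechar` is a requirement of the stdlib writer; how cudf treats an embedded delimiter under `csv.QUOTE_NONE` is not stated in this PR.

```python
import csv
import io


def render(rows, quoting, **kwargs):
    """Write rows with the stdlib csv writer and return the text (illustrative helper)."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=quoting, lineterminator="\n", **kwargs)
    writer.writerows(rows)
    return buf.getvalue()


rows = [["id", "note"], [1, "plain"], [2, "has,comma"]]

# QUOTE_MINIMAL quotes only the field that contains the delimiter.
minimal = render(rows, csv.QUOTE_MINIMAL)

# QUOTE_NONE never quotes; the stdlib writer then needs an escapechar
# to represent the embedded comma.
unquoted = render(rows, csv.QUOTE_NONE, escapechar="\\")

print(minimal)
print(unquoted)
```

With the parameter this PR adds, the corresponding cudf call would presumably look like `df.to_csv(path, quoting=csv.QUOTE_NONE)`, mirroring the pandas signature.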
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| python/pylibcudf/pylibcudf/libcudf/io/csv.pxd | Added C++ binding declaration for quoting() method in csv_writer_options_builder |
| python/pylibcudf/pylibcudf/io/csv.pyx | Implemented quoting() method in CsvWriterOptionsBuilder with documentation |
| python/pylibcudf/pylibcudf/io/csv.pyi | Added type stub for quoting() method with QuoteStyle parameter |
| python/pylibcudf/pylibcudf/io/csv.pxd | Added Cython declaration for quoting() method |
| python/cudf/cudf/utils/ioutils.py | Updated docstring to document the quoting parameter and correct default behavior |
| python/cudf/cudf/tests/input_output/test_csv.py | Added comprehensive tests for quoting functionality including pandas compatibility, special characters, unsupported styles, and edge cases |
| python/cudf/cudf/io/csv.py | Added quoting parameter with validation and mapping to QuoteStyle enum |
| python/cudf/cudf/core/dataframe.py | Added quoting parameter to DataFrame.to_csv() with default handling |
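The "validation and mapping to QuoteStyle enum" step listed for `python/cudf/cudf/io/csv.py` could be sketched roughly as follows. This is not the PR's code: the `QuoteStyle` class here is a local stand-in for the pylibcudf enum, `resolve_quoting` is a hypothetical name, the error text is invented, and treating `None` as the default is an assumption about the "default handling" mentioned for `DataFrame.to_csv()`.

```python
import csv
from enum import Enum


class QuoteStyle(Enum):
    """Local stand-in for the writer's quote-style enum (values are hypothetical)."""

    MINIMAL = "minimal"
    NONE = "none"


# Only the two styles named in this PR are mapped.
_SUPPORTED = {
    csv.QUOTE_MINIMAL: QuoteStyle.MINIMAL,
    csv.QUOTE_NONE: QuoteStyle.NONE,
}


def resolve_quoting(quoting=None):
    """Map a csv.QUOTE_* constant to QuoteStyle, rejecting unsupported styles."""
    if quoting is None:  # assumption: None falls back to the documented default
        quoting = csv.QUOTE_MINIMAL
    try:
        return _SUPPORTED[quoting]
    except KeyError:
        # Hypothetical message; the real wording in cudf may differ.
        raise NotImplementedError(
            f"quoting={quoting} is not supported "
            "(only csv.QUOTE_MINIMAL and csv.QUOTE_NONE)"
        ) from None
```

Keeping the mapping in one dictionary means supporting an additional style later is a one-line change, and anything missing from it fails loudly instead of silently falling back to the default.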
/ok to test 79c68a0
…option

The existing documentation incorrectly stated that to_csv follows "Pandas csv.QUOTE_NONNUMERIC", but the actual default behavior is csv.QUOTE_MINIMAL. This PR corrects the documentation.

Additionally, libcudf already supports quote_style::NONE in the CSV writer (cpp/src/io/csv/writer_impl.cu), but this was not exposed in the Python API. This PR adds the `quoting` parameter to DataFrame.to_csv() to allow users to use csv.QUOTE_NONE, matching the pandas API.

Changes:
- Fix incorrect docstring (QUOTE_NONNUMERIC → QUOTE_MINIMAL)
- Add quoting parameter to to_csv() exposing libcudf's existing functionality
- Add quoting method to CsvWriterOptionsBuilder in pylibcudf
- Add tests for quoting functionality (pandas compatibility verified)

Supported quoting styles:
- csv.QUOTE_MINIMAL (default): Quote only fields with special characters
- csv.QUOTE_NONE: Never quote fields
/ok to test dc27c6a
/ok to test eda5888
📝 Walkthrough
CSV writing now accepts a quoting option from
Changes
CSV quoting support
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@python/cudf/cudf/tests/input_output/test_csv.py`:
- Around line 2310-2312: The pytest.raises match pattern in the CSV test uses a
regex with metacharacters but is written as a normal string, which triggers Ruff
RUF043. Update the relevant pytest.raises call in test_csv.py to use a raw
string for the match argument, keeping the existing NotImplementedError
assertion and the same regex text. Use the pytest.raises block around the
quoting-related test as the locator.
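As an illustration of the raw-string fix requested above: `pytest.raises(..., match=...)` applies its pattern with `re.search`, so the sketch below uses `re.search` directly to stay dependency-free. The function `reject_unsupported` and its error text are hypothetical stand-ins, not the message cudf actually raises.

```python
import csv
import re


def reject_unsupported(quoting):
    """Hypothetical stand-in for a writer that rejects unsupported quoting styles."""
    if quoting not in (csv.QUOTE_MINIMAL, csv.QUOTE_NONE):
        raise NotImplementedError(
            f"quoting={quoting} is not supported (only QUOTE_MINIMAL and QUOTE_NONE)"
        )


# Raw string: \d and \( reach the regex engine unchanged, which is the form
# Ruff's RUF043 asks for when a match pattern contains metacharacters.
PATTERN = r"quoting=\d+ is not supported \(only"


def error_matches(quoting, pattern=PATTERN):
    """Return True if the call raises NotImplementedError whose text matches pattern."""
    try:
        reject_unsupported(quoting)
    except NotImplementedError as exc:
        return re.search(pattern, str(exc)) is not None
    return False
```

In a pytest test the equivalent form would be `with pytest.raises(NotImplementedError, match=PATTERN):` around the call, with the pattern text itself unchanged.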
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: c70d63e8-f7fa-4f13-a845-6005a5e0af4f
📒 Files selected for processing (8)
- python/cudf/cudf/core/dataframe.py
- python/cudf/cudf/io/csv.py
- python/cudf/cudf/tests/input_output/test_csv.py
- python/cudf/cudf/utils/ioutils.py
- python/pylibcudf/pylibcudf/io/csv.pxd
- python/pylibcudf/pylibcudf/io/csv.pyi
- python/pylibcudf/pylibcudf/io/csv.pyx
- python/pylibcudf/pylibcudf/libcudf/io/csv.pxd
/ok to test fccf1f8
Description
The existing documentation incorrectly stated that `to_csv` follows "Pandas csv.QUOTE_NONNUMERIC", but the actual default behavior is `csv.QUOTE_MINIMAL`. This PR corrects the documentation.

Additionally, libcudf already supports `quote_style::NONE` in the CSV writer (cpp/src/io/csv/writer_impl.cu), but this was not exposed in the Python API. This PR adds the `quoting` parameter to `DataFrame.to_csv()` to allow users to use `csv.QUOTE_NONE`, matching the pandas API.

Changes
- Add `quoting` parameter to `to_csv()` exposing libcudf's existing functionality
- Add `quoting` method to `CsvWriterOptionsBuilder` in pylibcudf

Supported quoting styles
- `csv.QUOTE_MINIMAL` (default)
- `csv.QUOTE_NONE`

Checklist