
fix(mcp): relax column name regex, improve generate_chart validation errors and examples #39866

Draft
aminghadersohi wants to merge 4 commits into apache:master from aminghadersohi:amin/sc-105412-fix-generate-chart-schema-rigidity

Conversation

@aminghadersohi
Contributor

Summary

Addresses validation rigidity in the generate_chart MCP tool that caused unnecessary failures when using valid but unconventionally-named columns.

Changes:

  1. Relax column name regex — Remove the pattern=r"^[a-zA-Z0-9_][a-zA-Z0-9_\s\-\.]*$" constraint from ColumnRef.name, FilterConfig.column, and BigNumberChartConfig.temporal_column. Many real-world column names (digit-prefixed like 1Q_revenue, hyphenated like order-date) were rejected with cryptic pydantic errors. The existing sanitize_name() / sanitize_column() validators already block XSS and SQL injection, so the regex added no security value and only hurt usability.

  2. Extend docstring examples — Add generate_chart usage examples for all supported chart types: pie, big_number (with and without trendline), pivot_table, mixed_timeseries, handlebars. Previously only xy and table had examples.

  3. Improve validation error messages — Extract _format_single_error helper from _enhance_validation_error and make the fallback produce type-specific, actionable messages for string_pattern_mismatch, literal_error, missing, and value_error pydantic error types instead of raw internal strings.

  4. Tests — New TestColumnRefNameRelaxedPattern and TestFilterConfigColumnRelaxedPattern classes verify digit-prefixed and hyphenated column names now pass, and that XSS/SQL injection is still blocked by sanitize_name().
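
To see which names the constraint in item 1 admits, the pattern can be run on its own. This is a minimal sketch using only the standard `re` module and the pattern exactly as quoted above; every sample name other than `1Q_revenue` and `order-date` is hypothetical. Note that the pattern as quoted matches both of those names, so the rejections described in item 1 may stem from a constraint that differs from the string shown here.

```python
import re

# Pattern copied verbatim from item 1 of the summary.
QUOTED_PATTERN = re.compile(r"^[a-zA-Z0-9_][a-zA-Z0-9_\s\-\.]*$")


def matches_quoted_pattern(name: str) -> bool:
    """Return True if `name` satisfies the pattern quoted in the summary."""
    return QUOTED_PATTERN.match(name) is not None


# The first two names come from the summary; the rest are hypothetical.
samples = [
    "1Q_revenue",
    "order-date",
    "total sales",
    "revenue ($)",
    "amount/total",
    "%growth",
]
results = {name: matches_quoted_pattern(name) for name in samples}

for name, ok in results.items():
    print(f"{name!r}: {'matches' if ok else 'rejected'}")
```

Names containing characters outside the class (parentheses, slashes, a leading `%`) are the ones this pattern turns away; those are the cases where a sanitizer-only approach changes behavior.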

Testing

  • Unit tests: pytest tests/unit_tests/mcp_service/chart/test_chart_schemas.py -x (including new test classes)
  • Manual: generate_chart with a column named 1Q_revenue or order-date succeeds instead of returning a pattern mismatch error

…errors and examples

- Remove overly strict regex pattern from ColumnRef.name, FilterConfig.column,
  and BigNumberChartConfig.temporal_column — sanitize_name/sanitize_column
  already handle XSS/SQL injection; the pattern rejected valid column names
  like "1Q_revenue" (digit-prefixed) or "order-date" (hyphenated)
- Extend generate_chart docstring with usage examples for all supported chart
  types: pie, big_number (with/without trendline), pivot_table,
  mixed_timeseries, handlebars
- Improve _enhance_validation_error fallback in SchemaValidator to produce
  type-specific, actionable messages instead of raw pydantic error strings
  (extract _format_single_error helper to reduce cyclomatic complexity)
- Add tests verifying digit-prefixed/hyphenated column names now pass,
  and that XSS/SQL injection is still blocked by sanitize_name()
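
The per-error formatting described above might look roughly like the following. This is a hedged sketch, not the PR's code: `format_single_error` is a hypothetical stand-in for `_format_single_error`, it works on plain dicts shaped like the entries pydantic v2 returns from `ValidationError.errors()` (`type`, `loc`, `msg`, `input`, `ctx`), and the message wording is an assumption.

```python
from typing import Any


def format_single_error(error: dict[str, Any]) -> str:
    """Turn one pydantic-style error dict into an actionable message.

    Hypothetical stand-in for the PR's _format_single_error; the wording
    of each message is an assumption, not Superset's actual output.
    """
    # pydantic reports the field path as a tuple of keys and indexes.
    field = ".".join(str(part) for part in error.get("loc", ())) or "<root>"
    error_type = error.get("type", "")
    msg = error.get("msg", "")
    ctx = error.get("ctx") or {}

    if error_type == "missing":
        return f"{field}: required field is missing; add it to the request."
    if error_type == "string_pattern_mismatch":
        pattern = ctx.get("pattern", "<unknown>")
        return f"{field}: value {error.get('input')!r} does not match pattern {pattern}."
    if error_type in ("literal_error", "value_error"):
        # Keep pydantic's own text (e.g. "Input should be ...") unchanged.
        return f"{field}: {msg}"
    # Fallback: surface the raw message together with its error type.
    return f"{field}: {msg} (type={error_type})"


example = format_single_error(
    {"type": "missing", "loc": ("config", "x"), "msg": "Field required"}
)
print(example)
```

Dispatching on the `type` key keeps each branch small, which is one plausible way to get the reduction in cyclomatic complexity the commit message mentions.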
@codecov

codecov Bot commented May 4, 2026

Codecov Report

❌ Patch coverage is 10.34483% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.31%. Comparing base (673634f) to head (a66b8d6).
⚠️ Report is 18 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...t/mcp_service/chart/validation/schema_validator.py | 0.00% | 22 Missing ⚠️ |
| superset/mcp_service/chart/schemas.py | 42.85% | 4 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #39866      +/-   ##
==========================================
- Coverage   64.37%   63.31%   -1.06%     
==========================================
  Files        2569     2582      +13     
  Lines      134745   136495    +1750     
  Branches    31278    31468     +190     
==========================================
- Hits        86739    86421     -318     
- Misses      46508    48561    +2053     
- Partials     1498     1513      +15     

| Flag | Coverage Δ |
|---|---|
| hive | 39.28% <10.34%> (-0.39%) ⬇️ |
| mysql | 58.99% <10.34%> (-0.95%) ⬇️ |
| postgres | 59.07% <10.34%> (-0.95%) ⬇️ |
| presto | 40.99% <10.34%> (-0.44%) ⬇️ |
| python | 59.31% <10.34%> (-2.24%) ⬇️ |
| sqlite | 58.70% <10.34%> (-0.94%) ⬇️ |
| unit | ? |


Contributor

Copilot AI left a comment


Pull request overview

This PR improves the MCP generate_chart experience by loosening overly-restrictive column-name validation, expanding tool docstring examples to cover more chart types, and making schema validation errors more actionable for callers.

Changes:

  • Removed strict regex constraints on several column-name fields to allow real-world names (digit-prefixed, hyphenated, etc.) while relying on sanitizers/validators.
  • Added generate_chart docstring JSON examples for additional chart types (pie, big_number, pivot_table, mixed_timeseries, handlebars).
  • Refactored and enhanced Pydantic validation error formatting to produce more actionable messages and deduplicated suggestions.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| superset/mcp_service/chart/schemas.py | Removes regex constraints for column-related fields in schema models. |
| superset/mcp_service/chart/validation/schema_validator.py | Extracts per-error formatting helper and improves default validation error details/suggestions. |
| superset/mcp_service/chart/tool/generate_chart.py | Extends tool docstring with usage examples for more supported chart types. |
| tests/unit_tests/mcp_service/chart/test_chart_schemas.py | Adds unit tests confirming relaxed ColumnRef/FilterConfig column name acceptance and basic sanitizer behavior. |

Comment thread superset/mcp_service/chart/schemas.py
Comment thread superset/mcp_service/chart/schemas.py
Comment thread superset/mcp_service/chart/tool/generate_chart.py
- FilterConfig.column: add check_sql_keywords=True to sanitize_column
  (Copilot review: sanitize_column was missing SQL keyword checking)
- BigNumberChartConfig.temporal_column: add sanitize_temporal_column
  field_validator using sanitize_user_input with check_sql_keywords=True
  (Copilot review: no validator after regex removal left field unprotected)
- generate_chart docstring IMPORTANT: list all chart types, not just xy/table
  (Copilot review: IMPORTANT section was misleading after adding more examples)
- Fix test_xss_attempt_blocked: nh3 strips HTML tags instead of rejecting,
  so rename to test_xss_tags_are_stripped (asserts tag is removed) and add
  test_event_handler_injection_blocked (on...= patterns ARE rejected)
- Fix _format_single_error literal_error: preserve pydantic 'Input should be'
  message instead of replacing with custom format (broke existing test
  test_non_value_error_pydantic_body_is_surfaced)
- Add test_sql_injection_in_filter_column_blocked to verify FilterConfig
  now rejects SQL injection column names
@netlify

netlify Bot commented May 5, 2026

Deploy Preview for superset-docs-preview ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 38972f0 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/superset-docs-preview/deploys/69f98c847e6c7a0008ace23f |
| 😎 Deploy Preview | https://deploy-preview-39866--superset-docs-preview.netlify.app |

- Remove unused 'type: ignore[return-value]' from sanitize_temporal_column
  (mypy correctly infers the return type; comment was unnecessary)
- Fix test_xss_tags_are_stripped → test_script_tag_blocked: nh3 strips the
  entire script element including its content, leaving an empty string that
  the allow_empty=False guard then rejects with ValidationError
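
The strip-then-reject behavior described in these commits can be illustrated with a small stand-in. This is only a sketch under stated assumptions: the real validators use nh3, whereas the code below swaps in crude regular expressions so it runs on the standard library alone; `sanitize_name_sketch`, its error messages, and the order of the checks are all hypothetical.

```python
import re

# Crude regex stand-ins for an HTML sanitizer; NOT equivalent to nh3.
_SCRIPT_ELEMENT = re.compile(
    r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL
)
_ANY_TAG = re.compile(r"<[^>]+>")
_EVENT_HANDLER = re.compile(r"\bon\w+\s*=", re.IGNORECASE)


def sanitize_name_sketch(value: str, allow_empty: bool = False) -> str:
    """Strip markup from a column name, rejecting unsafe or empty results."""
    # Event-handler patterns (on...=) are rejected outright, not stripped.
    if _EVENT_HANDLER.search(value):
        raise ValueError("event handler patterns are not allowed")
    # A script element is removed together with its content.
    stripped = _SCRIPT_ELEMENT.sub("", value)
    # Remaining tags are removed but their inner text is kept.
    stripped = _ANY_TAG.sub("", stripped).strip()
    # If nothing is left, the empty-value guard turns stripping into rejection.
    if not stripped and not allow_empty:
        raise ValueError("value is empty after sanitization")
    return stripped


print(sanitize_name_sketch("1Q_revenue"))
print(sanitize_name_sketch("<b>revenue</b>"))
```

Under these assumptions a bare script element ends up rejected rather than silently stripped, which matches the reasoning behind renaming the test to test_script_tag_blocked.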