fix(recommandation): Fix chart recommandation#39886
fix(recommandation): Fix chart recommandation#39886alexandrusoare wants to merge 3 commits intomasterfrom
Conversation
Code Review Agent Run #04a47bActionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
✅ Deploy Preview for superset-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #39886 +/- ##
==========================================
- Coverage 64.35% 63.87% -0.49%
==========================================
Files 2569 2583 +14
Lines 134680 136664 +1984
Branches 31254 31506 +252
==========================================
+ Hits 86679 87289 +610
- Misses 46505 47859 +1354
- Partials 1496 1516 +20
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Code Review Agent Run #b70632Actionable Suggestions - 0Additional Suggestions - 1
Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
| temporal = [c for c in columns if c.data_type == "temporal"] | ||
| numeric = [c for c in columns if c.data_type == "numeric"] | ||
| categorical = [c for c in columns if c.data_type in ("string", "boolean")] |
There was a problem hiding this comment.
This can probably be one loop right? Would that be better for the performance overhead?
There was a problem hiding this comment.
These iterate over the chart's columns — typically 2-10 items. Three passes on <10 elements is negligible; a single loop with conditional appends would be harder to read for no meaningful gain. Let me know your thoughts
There was a problem hiding this comment.
Yeah, my concern was about charts that have much more than that, but not sure if that is something that happens often
Code Review Agent Run #d3843dActionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
SUMMARY
recommended_visualizationsinget_chart_dataresponses are based purely on column count and column name string matching. A table chart with 2 columns gets "scatter plot" recommended; any column with "time" in the name triggers "line chart" even if it's not actually temporal.Root Cause
Two problems:
DataColumn.data_typeinference uses a Pythonisinstanceheuristic that checks forint/float/bool. Datetime values from SQL arrive as strings, so temporal columns are always classified as"string". Meanwhile, the query result already containscoltypes— a list ofGenericDataTypeenum values (NUMERIC, STRING, TEMPORAL, BOOLEAN) derived from actual SQL types viaextract_dataframe_dtypes()— but this field was never read.chart.viz_type(so it recommends the same type you already have), actual data types, or column cardinality (so high-cardinality ID columns trigger "scatter plot").Fix
coltypesfrom the query result and use it to populateDataColumn.data_typewith accurate SQL-derived types (falling back to the isinstance heuristic whencoltypesis unavailable)_recommend_visualizations(viz_type, columns, row_count)which:BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
ADDITIONAL INFORMATION