- #139: Added `by` argument to `plot_bar`.
- #148: Address CRAN removal due to vignette build failure.
- #111: Continuous distributions can now be plotted with different scales, i.e., histogram, density, boxplot, scatterplot.
- #126: Cleaned up labels in legend guide.
- #127 (PR): Added option to plot columns with missing values only in `plot_missing`.
- Cleaned up code for `create_report`.
- #109: Fixed a bug causing unordered bar charts.
- #114: Removed redundant message in `dummify`.
- #116: Fixed pandoc document conversion error 99.
- #120: Fixed type `logical` being parsed as `symbol` in `configure_report`.
- #121: Fixed missing value bug when `split_columns(..., binary_as_factor = TRUE)`.
- #130 (PR): `plot_prcomp` now drops columns with zero variance.
- #92: Added `update_columns` to transform any selected columns.
- #87: Added `configure_report` function to customize report content.
- #89: Added option to customize `geom_text` and `geom_label` arguments.
- #91: `create_report` now displays full report directory after completion.
- #95: Added better exception handling for `plot_bar`.
- #98: Added band customization to `plot_missing`.
- #100: Switched `geom_text` to `geom_label`.
- #103: Report title can now be customized in `create_report`.
- #108: Added option to treat binary features as discrete in `plot_bar`, `plot_histogram`, `plot_density` and `plot_boxplot`.
- Updated d3.min.js to v5.9.2.
- #88: Added `plot_intro` to report config.
- #90: Added first plot in `plot_prcomp` to output and `page_0`.
- #94: Fixed typo for PCA.
- #86: Replaced `gridExtra::grid.arrange` with facets.
- Added seeds to vignette and README for reproducible examples.
- Hid all internal functions.
- #42: Applied S3 methods for plotting functions.
- #77: `dummify` now works on selected columns.
- #78: All ggplot objects from `plot_*` are now invisibly returned. As a result, extracted `profile_missing` from `plot_missing` for missing value profiles.
- #83: Removed all deprecated functions.
- #85: Users can now specify number of rows/columns for plot page layout.
- `plot_prcomp` now passes `scale. = TRUE` to `prcomp` by default.
- Added `sampled_rows` argument to `plot_scatterplot`.
- Added option to parallelize plot object construction.
- Updated default config for `create_report`.
- #74: Fixed a bug causing `create_report` failure due to zero complete rows.
- #75: Fixed a bug in `plot_str` when plotting data.frame with more than 100 columns.
- #82: Removed hard-coded scales from all plot functions.
- Fixed a bug causing wrong column indices in `split_columns`.
- Fixed a bug using standard deviation instead of variance in `plot_prcomp`.
- Updated vignette for better clarity.
- #71: Added better error handler for `plot_prcomp`.
- #69: Fixed bug causing `create_report` failure (specifically from `plot_prcomp`) when `y` is specified.
- Added more unit tests for `create_report` and `plot_prcomp`.
- #15: Added `plot_prcomp` to visualize principal component analysis.
- #54: Extracted `dummify` from `plot_correlation` as a new function.
- #59: Added `introduce` for basic metadata.
- #41: `create_report` can now be customized.
- #53: Added page number for plots that span multiple pages.
- #56: Added support for theme and customization for individual components.
- #62: `plot_bar` now supports optional measures (in addition to categorical frequency) using argument `with`.
- #66: Feature engineering functions now work on other classes in addition to data.table.
- `plot_missing`:
  - Percentage text labels in the output plot now have 2 decimals to prevent small percentages from being truncated to 0%.
  - Added example to quickly drop columns with too many missing values.
- Added `.ignoreCat` and `.getAllMissing` to helper.
- #55: Fixed bugs and updated vignette with latest functions.
- #57: Fixed `plot_str` bug for not supporting S4 objects.
- #63: Fixed `plot_histogram` and `plot_density` not working with column names containing spaces.
- #48: Added `plot_scatterplot` to visualize relationship of one feature against all others.
- #50: Added `plot_boxplot` to visualize continuous distributions broken down by another feature.
- #44: Added option to exclude categories in `group_category`.
- #45: Added title option for all plots.
- #46: Added option to exclude columns in `set_missing`.
- #49 [Breaking Change]: Switched package to tidyverse style. All old functions are in `.Deprecated` mode. List of name changes in alphabetical order:
  - `BarDiscrete` -> `plot_bar`
  - `CollapseCategory` -> `group_category`
  - `CorrelationContinuous` -> `plot_correlation(..., type = "continuous")`
  - `CorrelationDiscrete` -> `plot_correlation(..., type = "discrete")`
  - `DensityContinuous` -> `plot_density`
  - `DropVar` -> `drop_columns`
  - `GenerateReport` -> `create_report`
  - `HistogramContinuous` -> `plot_histogram`
  - `PlotMissing` -> `plot_missing`
  - `PlotStr` -> `plot_str`
  - `SetNaTo` -> `set_missing`
  - `SplitColType` -> `split_columns`
- #52: Combined `CorrelationContinuous` and `CorrelationDiscrete` into one function, and added option to view correlation of all features at once.
- Optimized layout for multiple plots.
- #47: Fixed color scale for correlation heatmap.
- #32: Fixed pandoc requirement error in unit test on CRAN.
- #34: Fixed error message when `quiet` is not supplied. In addition, the report directory is now printed through `message()` instead of `cat()`.
- #35: Fixed rprojroot not found error.
- #12: Added vignette: dataexplorer-intro.
- #36: Fixed warnings from data.table in `DropVar`.
- #37: Changed all `cat()` to `message()`.
- #38: Added option to order bars in `BarDiscrete`.
- #39: Extended `SetNaTo` to discrete features.
- Added more examples to README.md.
- #25: Added `SetNaTo` to quickly reset missing numerical values.
- #29: Added `DropVar` to quickly drop variables by either name or column position.
- #24: `CorrelationDiscrete` now displays all factor levels instead of full rank matrix from `model.matrix`.
- #11: Functions with return values will now match the input class and set it back.
- #22: Added documentation for `num_all_missing` in `SplitColType`.
- #23: Added additional measures (in addition to frequency) to `CollapseCategory`.
- #26: Removed density estimation section from report template.
- #31: Added flexibility to name the new category in `CollapseCategory`.
- #30: In `CollapseCategory`, `update = TRUE` will only work with input data as `data.table`. However, it is still possible to view the frequency distribution with any input data class, as long as `update = FALSE`.
- #20: Fixed permission denied bug due to intermediates_dir argument in `knitr::render`.
- #16: Improved handling of missing values.
- #18: `GenerateReport` now handles data without discrete or continuous features.
- #14: Updated rmarkdown template for `GenerateReport`.
- #1: Features with all `NA` values will be ignored in `BarDiscrete`.
- Fixed a major bug in `GenerateReport` function due to package renaming.
- `GenerateReport` will now print the directory of the report to console.
- Added function `CollapseCategory` to collapse sparse categories for discrete features.
- Added correlation heatmap for both continuous and discrete features.
- Added density plot for continuous features.
- Fixed a bug in `BarDiscrete` and `CorrelationDiscrete` for not plotting non-factor class.
- Minor changes for CRAN re-submission.
- Changed grid layout for `BarDiscrete` and `HistogramContinuous`.
- Features with all missing values will be ignored.
- Switched position between continuous and discrete features in report template.
- Renamed package name to DataExplorer.
- Added NEWS.md.
- Removed `BoxplotContinuous`.