MLBenchmarks.jl

This repo provides Julia-based benchmarks for ML algorithms on tabular data. It was developed to support both the NeuroTabModels.jl and EvoTrees.jl projects.

Methodology

For each dataset and algorithm, the following methodology is applied:

  • Data is split into three parts: train, eval, and test
  • A random grid of 16 hyper-parameter configurations is generated
  • For each configuration, a model is trained on the train data until the evaluation metric tracked on the eval data stops improving (early stopping)
  • The trained model is then evaluated on the test data
  • The metrics presented below are those obtained on the test data for the model that achieved the best eval metric
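The selection loop described above can be sketched as follows. Note that `train_with_early_stopping`, `evaluate_test`, and the grid ranges are hypothetical stand-ins so the structure is runnable on its own; they are not the repo's actual API:

```julia
using Random
Random.seed!(42)

# A random grid of 16 hyper-parameter configurations (illustrative ranges).
grid = [(eta = 0.3 * rand(), max_depth = rand(3:9)) for _ in 1:16]

# Stand-in for a library-specific fit: trains on the train split, stops early
# when the eval metric stops improving, and returns the model plus its best
# eval metric. Both are mocked here.
function train_with_early_stopping(config)
    model = config          # placeholder for the fitted model
    eval_metric = rand()    # placeholder eval metric (lower is better)
    return model, eval_metric
end

evaluate_test(model) = rand()   # placeholder for the test-set metric

# Keep the configuration whose model scored best on the eval split.
function select_best(grid)
    best_model, best_eval = nothing, Inf
    for config in grid
        model, m = train_with_early_stopping(config)
        if m < best_eval
            best_model, best_eval = model, m
        end
    end
    return best_model, best_eval
end

best_model, best_eval = select_best(grid)
test_metric = evaluate_test(best_model)   # the number reported in the tables
```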

Datasets

Datasets are now sourced from OpenML, using OpenML.jl:

```julia
data_map = Dict(
    :titanic => 40945,
    :higgs_11M => 45570,
    :higgs_1M => 42769,
    :boston => 531,
    :year => 44027,
    :microsoft => 45579,
    :sberbank => 46898, #TODO
    :allstate_claims => 45046, #TODO
    :creditcard => 1597 #TODO
)
```
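As a minimal sketch of how these ids can be used (assuming OpenML.jl's `OpenML.load`, which returns a Tables.jl-compatible table; the `dataset_id` helper is ours, not part of the repo):

```julia
data_map = Dict(
    :titanic  => 40945,
    :higgs_1M => 42769,
    :boston   => 531,
)

# Hypothetical helper: resolve a dataset symbol to its OpenML id.
dataset_id(name::Symbol) = data_map[name]

# Loading is then a one-liner (commented out here since it needs network access):
# using OpenML, DataFrames
# df = DataFrame(OpenML.load(dataset_id(:boston)))
```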

Legacy datasets from older releases

The following selection of common tabular datasets is covered:

  • Year: mean squared error regression
  • MSRank: ranking problem treated as mean squared error regression
  • YahooRank: ranking problem treated as mean squared error regression
  • Higgs: binary classification with logistic regression
  • Boston Housing: mean squared error regression
  • Titanic: binary classification with logistic regression

Algorithms

Comparisons are performed against the following algorithms, considered state of the art on tabular data tasks: CatBoost, EvoTrees, LightGBM, NeuroTrees, TabM, and XGBoost.

Boston

| model_type | train_time | test_mse | test_gini |
|------------|-----------:|---------:|----------:|
| catboost   | 0.175      | 0.194    | 0.945     |
| evotrees   | 0.198      | 0.254    | 0.935     |
| lightgbm   | 0.314      | 0.326    | 0.934     |
| neurotrees | 4.58       | 0.269    | 0.925     |
| tabm       | 5.34       | 0.224    | 0.934     |
| xgboost    | 0.0846     | 0.265    | 0.93      |

Titanic

| model_type | train_time | test_logloss | test_gini |
|------------|-----------:|-------------:|----------:|
| catboost   | 0.0759     | 0.375        | 0.802     |
| evotrees   | 0.0399     | 0.362        | 0.806     |
| lightgbm   | 0.209      | 0.363        | 0.809     |
| neurotrees | 3.15       | 0.373        | 0.815     |
| tabm       | 6.45       | 0.383        | 0.774     |
| xgboost    | 0.0195     | 0.37         | 0.795     |

Year

| model_type | train_time | test_mse | test_gini |
|------------|-----------:|---------:|----------:|
| catboost   | 65.7       | 0.621    | 0.664     |
| evotrees   | 79.9       | 0.613    | 0.666     |
| lightgbm   | 104.0      | 0.607    | 0.67      |
| neurotrees | 519.0      | 0.594    | 0.68      |
| tabm       | 27.9       | 0.616    | 0.669     |
| xgboost    | 42.6       | 0.614    | 0.666     |

Microsoft

| model_type | train_time | test_mse | test_gini |
|------------|-----------:|---------:|----------:|
| catboost   | 186.0      | 0.73     | 0.561     |
| evotrees   | 97.8       | 0.722    | 0.567     |
| lightgbm   | 38.7       | 0.717    | 0.571     |
| neurotrees | 1110.0     | 0.76     | 0.529     |
| tabm       | 345.0      | 0.773    | 0.515     |
| xgboost    | 42.1       | 0.719    | 0.57      |

Higgs

| model_type | train_time | test_logloss | test_gini |
|------------|-----------:|-------------:|----------:|
| catboost   | 150.0      | 0.494        | 0.674     |
| evotrees   | 55.3       | 0.496        | 0.67      |
| lightgbm   | 67.0       | 0.495        | 0.673     |
| neurotrees | 291.0      | 0.487        | 0.686     |
| tabm       | 37.6       | 0.497        | 0.671     |
| xgboost    | 35.5       | 0.496        | 0.67      |
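For the classification tasks, the `test_gini` columns can be read as `2 * AUC - 1`. A minimal rank-based sketch of that metric follows; the repo's actual metric implementations may differ:

```julia
# Mann-Whitney AUC: probability that a randomly chosen positive example
# is scored above a randomly chosen negative one (ties count as half).
function auc(scores::AbstractVector{<:Real}, labels::AbstractVector{Bool})
    pos = scores[labels]
    neg = scores[.!labels]
    wins = sum(p > n ? 1.0 : (p == n ? 0.5 : 0.0) for p in pos, n in neg)
    return wins / (length(pos) * length(neg))
end

# Gini coefficient derived from AUC.
gini(scores, labels) = 2 * auc(scores, labels) - 1
```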
