
Commit 16342ab

Merge pull request #113 from PythonPredictions/develop
Release (develop -> master) of milestone 1.1.0.
2 parents a44a693 + df7882e commit 16342ab

35 files changed

Lines changed: 6287 additions & 1144 deletions

.gitignore

Lines changed: 3 additions & 3 deletions
@@ -53,15 +53,15 @@ junit/
 *.mo
 *.pot
 
-# Django stuff:
+# Django stuff
 *.log
 local_settings.py
 
-# Flask stuff:
+# Flask stuff
 instance/
 .webassets-cache
 
-# Scrapy stuff:
+# Scrapy stuff
 .scrapy
 
 # Sphinx documentation

README.rst

Lines changed: 26 additions & 39 deletions
@@ -1,4 +1,6 @@
 
+.. image:: https://github.com/PythonPredictions/cobra/raw/master/material/logo.png
+    :width: 700
 
 .. image:: https://img.shields.io/pypi/v/pythonpredictions-cobra.svg
     :target: https://pypi.org/project/pythonpredictions-cobra/
@@ -9,26 +11,20 @@
 
 ------------------------------------------------------------------------------------------------------------------------------------
 
-=====
-cobra
-=====
-.. image:: material\logo.png
-    :width: 300
+**Cobra** is a Python package to build predictive models using linear or logistic regression with a focus on performance and interpretation. It consists of several modules for data preprocessing, feature selection and model evaluation. The underlying methodology was developed at `Python Predictions <https://www.pythonpredictions.com>`_ in the course of hundreds of business-related prediction challenges. It has been tweaked, tested and optimized over the years based on feedback from clients, our team, and academic researchers.
 
-**cobra** is a Python package to build predictive models using logistic regression with a focus on performance and interpretation. It consists of several modules for data preprocessing, feature selection and model evaluation. The underlying methodology was developed at Python Predictions in the course of hundreds of business-related prediction challenges. It has been tweaked, tested and optimized over the years based on feedback from clients, our team, and academic researchers.
-
-Main Features
+Main features
 =============
 
 - Prepare a given pandas DataFrame for predictive modelling:
 
   - partition into train/selection/validation sets
   - create bins from continuous variables
   - regroup categorical variables based on statistical significance
-  - replace missing values and
-  - add columns with incidence rate per category/bin
+  - replace missing values
+  - add columns where categories/bins are replaced with average of target values (linear regression) or with incidence rate (logistic regression)
 
-- Perform univariate feature selection based on AUC
+- Perform univariate feature selection based on RMSE (linear regression) or AUC (logistic regression)
 - Compute correlation matrix of predictors
 - Find the suitable variables using forward feature selection
 - Evaluate model performance and visualize the results
@@ -41,49 +37,40 @@ These instructions will get you a copy of the project up and running on your loc
 Requirements
 ------------
 
-This package requires the usual Python packages for data science:
-
-- numpy (>=1.19.4)
-- pandas (>=1.1.5)
-- scipy (>=1.5.4)
-- scikit-learn (>=0.23.1)
-- matplotlib (>=3.3.3)
-- seaborn (>=0.11.0)
-
-
-These packages, along with their versions are listed in ``requirements.txt`` and can be installed using ``pip``: ::
-
+This package requires only the usual Python libraries for data science, being numpy, pandas, scipy, scikit-learn, matplotlib, seaborn, and tqdm. These packages, along with their versions are listed in ``requirements.txt`` and can be installed using ``pip``: ::
 
     pip install -r requirements.txt
 
 
-**Note**: if you want to install cobra with e.g. pip, you don't have to install all of these requirements as these are automatically installed with cobra itself.
+**Note**: if you want to install Cobra with e.g. pip, you don't have to install all of these requirements as these are automatically installed with Cobra itself.
 
 Installation
 ------------
 
-The easiest way to install cobra is using ``pip``: ::
+The easiest way to install Cobra is using ``pip``: ::
 
     pip install -U pythonpredictions-cobra
 
-Contributing to cobra
-=====================
 
-We'd love you to contribute to the development of cobra! There are many ways in which you can contribute, the most common of which is to contribute to the source code or documentation of the project. However, there are many other ways you can contribute (report issues, improve code coverage by adding unit tests, ...).
-We use GitHub issue to track all bugs and feature requests. Feel free to open an issue in case you found a bug or in case you wish to see a new feature added.
+Documentation and extra material
+================================
+
+- A `blog post <https://www.pythonpredictions.com/news/the-little-trick-we-apply-to-obtain-explainability-by-design/>`_ on the overall methodology.
+
+- A `research article <https://doi.org/10.1016/j.dss.2016.11.007>`_ by Geert Verstraeten (co-founder Python Predictions) discussing the preprocessing approach we use in Cobra.
 
-For more details, check our `wiki <https://github.com/PythonPredictions/cobra/wiki/Contributing-guidelines-&-workflows>`_.
+- HTML documentation of the `individual modules <https://pythonpredictions.github.io/cobra.io/docstring/modules.html>`_.
 
-Help and Support
-================
+- A step-by-step `tutorial <https://pythonpredictions.github.io/cobra/tutorials/tutorial_Cobra_logistic_regression.ipynb>`_ for **logistic regression**.
 
-Documentation
--------------
+- A step-by-step `tutorial <https://pythonpredictions.github.io/cobra/tutorials/tutorial_Cobra_linear_regression.ipynb>`__ for **linear regression**.
 
-- HTML documentation of the `individual modules <https://pythonpredictions.github.io/cobra.io/docstring/modules.html>`_
-- A step-by-step `tutorial <https://pythonpredictions.github.io/cobra.io/tutorial.html>`_
+- Check out the Data Science Leuven Meetup `talk <https://www.youtube.com/watch?v=w7ceZZqMEaA&feature=youtu.be>`_ by one of the core developers (second presentation). His `slides <https://github.com/PythonPredictions/Cobra-DS-meetup-Leuven/blob/main/DS_Leuven_meetup_20210209_cobra.pdf>`_ and `related material <https://github.com/PythonPredictions/Cobra-DS-meetup-Leuven>`_ are also available.
+
+Contributing to Cobra
+=====================
 
-Outreach
--------------
+We'd love you to contribute to the development of Cobra! There are many ways in which you can contribute, the most common of which is to contribute to the source code or documentation of the project. However, there are many other ways you can contribute (report issues, improve code coverage by adding unit tests, ...).
+We use GitHub issues to track all bugs and feature requests. Feel free to open an issue in case you found a bug or in case you wish to see a new feature added.
 
-- Check out the Data Science Leuven Meetup `talk <https://www.youtube.com/watch?v=w7ceZZqMEaA&feature=youtu.be>`_ by one of the core developers (second presentation)
+For more details, check out our `wiki <https://github.com/PythonPredictions/cobra/wiki/Contributing-guidelines-&-workflows>`_.
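
Reviewer note on the README change above: the updated feature list says that categories/bins are replaced with the average of the target values (linear regression) or with the incidence rate (logistic regression). Both cases reduce to a per-category mean of the target, since the mean of a 0/1 target is its incidence rate. The following is a minimal pure-Python sketch of that idea only. It is not Cobra's implementation, and the function names and sample data are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def mean_target_per_category(categories, targets):
    """Map each category (or bin label) to the mean target observed for it."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    return {category: sums[category] / counts[category] for category in sums}


def encode(categories, mapping):
    """Replace each category by its fitted mean target value."""
    return [mapping[category] for category in categories]


# Hypothetical sample data: one binned predictor and two kinds of target.
age_bin = ["young", "young", "old", "old", "old", "young"]
churned = [1, 0, 0, 0, 1, 1]                  # binary target
spend = [10.0, 20.0, 40.0, 50.0, 60.0, 30.0]  # continuous target

incidence_rate = mean_target_per_category(age_bin, churned)
average_target = mean_target_per_category(age_bin, spend)
encoded_spend = encode(age_bin, average_target)
```

In practice such a mapping would presumably be fitted on the training partition only and then applied to the selection and validation partitions, so that target information does not leak into the evaluation data.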

cobra/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+from .version import __version__

cobra/evaluation/__init__.py

Lines changed: 4 additions & 3 deletions
@@ -8,8 +8,8 @@
 from .plotting_utils import plot_univariate_predictor_quality
 from .plotting_utils import plot_correlation_matrix
 
-from .evaluator import Evaluator
-
+# from .evaluator import Evaluator
+from .evaluator import ClassificationEvaluator, RegressionEvaluator
 
 __all__ = ["generate_pig_tables",
            "compute_pig_table",
@@ -18,4 +18,5 @@
            "plot_variable_importance",
            "plot_univariate_predictor_quality",
            "plot_correlation_matrix",
-           "Evaluator"]
+           "ClassificationEvaluator",
+           "RegressionEvaluator"]
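
Reviewer note on the change above: `Evaluator` is no longer exported from `cobra.evaluation`; `ClassificationEvaluator` and `RegressionEvaluator` are exported instead. Downstream code that must run against both older and newer releases could probe for whichever names are present. The helper below is a hypothetical sketch (the function `available_evaluators` is not part of Cobra). It relies only on the three class names shown in this diff and returns an empty dict when the package cannot be imported.

```python
import importlib

# Names taken from the diff above: two new exports and the removed one.
NEW_NAMES = ("ClassificationEvaluator", "RegressionEvaluator")
OLD_NAME = "Evaluator"


def available_evaluators(module_name="cobra.evaluation"):
    """Return {name: class} for the evaluator classes the module exposes.

    Hypothetical helper for illustration; returns {} if the module
    cannot be imported.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return {}
    found = {}
    for name in NEW_NAMES + (OLD_NAME,):
        cls = getattr(module, name, None)
        if cls is not None:
            found[name] = cls
    return found


evaluators = available_evaluators()
```

With a release that includes this commit, the result would be expected to contain the two new names and not `Evaluator`. Constructor arguments are not shown in this diff, so the sketch does not instantiate anything.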
