Commit ee2947f

Author: Stephen Hoover
DOC Update max_depth settings for CivisML (#240)
The next CivisML release (v2.2) will specify a max_depth for tree ensembles and will remove `max_depth=None` from depths searched during hyperband hyperparameter optimization.
1 parent eb3e341 commit ee2947f
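
As a rough illustration of what the message above describes, the sketch below records the tree-ensemble settings documented in this commit and shows how a caller might override them. The names `DOCUMENTED_DEFAULTS`, `HYPERBAND_MAX_DEPTH`, and `effective_parameters` are hypothetical and are not part of the `civis` client. The merge behaviour is an assumption about how user-supplied parameters combine with the defaults, not a description of CivisML internals.

```python
# Settings documented in this commit for the pre-defined tree ensembles
# (CivisML v2.2). Hypothetical lookup table for illustration only.
DOCUMENTED_DEFAULTS = {
    "random_forest_classifier": {"n_estimators": 500, "max_depth": 7},
    "extra_trees_classifier": {"n_estimators": 500, "max_depth": 7},
    "random_forest_regressor": {"n_estimators": 500, "max_depth": 7},
    "extra_trees_regressor": {"n_estimators": 500, "max_depth": 7},
}

# max_depth values searched by hyperband after this change;
# None (unbounded depth) is no longer in the list.
HYPERBAND_MAX_DEPTH = [1, 2, 3, 4, 6, 10]


def effective_parameters(model_name, overrides=None):
    """Merge caller overrides onto the documented defaults.

    Assumption: explicit overrides take precedence over defaults.
    """
    params = dict(DOCUMENTED_DEFAULTS.get(model_name, {}))
    params.update(overrides or {})
    return params


if __name__ == "__main__":
    # Documented default: depth capped at 7.
    print(effective_parameters("random_forest_classifier"))
    # Explicitly asking for unbounded trees again.
    print(effective_parameters("random_forest_classifier", {"max_depth": None}))
```

Under that assumption, a caller who still wants unbounded trees after v2.2 would set `max_depth=None` explicitly, for example through the `parameters` argument of `civis.ml.ModelPipeline`.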

2 files changed: +19 -17 lines changed


CHANGELOG.md

+2
@@ -30,6 +30,8 @@ This project adheres to [Semantic Versioning](http://semver.org/).
 - Switched to pip install-ing dependencies for building the documentation (#230)
 - Added a merge rule for the changelog to .gitattributes (#229)
 - Default to "all" API resources rather than "base".
+- Updated documentation on algorithm hyperparameters to reflect changes with
+  CivisML v2.2 release (#240)

 ## 1.8.1 - 2018-02-01
 ### Added

docs/source/ml.rst

+17-17
@@ -25,7 +25,7 @@ the following pip-installable dependencies enhance the capabilities of the
 - civisml-extensions
 - muffnn

-
+
 Install :mod:`pandas` if you wish to download tables of predictions.
 You can also model on :class:`~pandas.DataFrame` objects in your interpreter.

@@ -61,7 +61,7 @@ A :class:`~sklearn.pipeline.Pipeline` allows you to combine multiple
 modeling steps (such as missing value imputation and feature selection) into a
 single model. The :class:`~sklearn.pipeline.Pipeline` is treated as a unit -- for example,
 cross-validation happens over all steps together.
-
+
 You can define your model in two ways, either by selecting a pre-defined algorithm
 or by providing your own scikit-learn
 :class:`~sklearn.pipeline.Pipeline` or :class:`~sklearn.base.BaseEstimator` object.
@@ -86,17 +86,17 @@ Name Model Type Algorithm
8686
================================ ================ ================================================================================================================================== ==================================
8787
sparse_logistic classification `LogisticRegression <http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html>`_ ``C=499999950, tol=1e-08``
8888
gradient_boosting_classifier classification `GradientBoostingClassifier <http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html>`_ ``n_estimators=500, max_depth=2``
89-
random_forest_classifier classification `RandomForestClassifier <http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html>`_ ``n_estimators=500``
90-
extra_trees_classifier classification `ExtraTreesClassifier <http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesClassifier.html>`_ ``n_estimators=500``
91-
multilayer_perceptron_classifier classification `muffnn.MLPClassifier <https://github.com/civisanalytics/muffnn>`_
92-
stacking_classifier classification `civismlext.StackedClassifier <https://github.com/civisanalytics/civisml-extensions>`_
93-
sparse_linear_regressor regression `LinearRegression <http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html>`_
94-
sparse_ridge_regressor regression `Ridge <http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html>`_
89+
random_forest_classifier classification `RandomForestClassifier <http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html>`_ ``n_estimators=500, max_depth=7``
90+
extra_trees_classifier classification `ExtraTreesClassifier <http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesClassifier.html>`_ ``n_estimators=500, max_depth=7``
91+
multilayer_perceptron_classifier classification `muffnn.MLPClassifier <https://github.com/civisanalytics/muffnn>`_
92+
stacking_classifier classification `civismlext.StackedClassifier <https://github.com/civisanalytics/civisml-extensions>`_
93+
sparse_linear_regressor regression `LinearRegression <http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html>`_
94+
sparse_ridge_regressor regression `Ridge <http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html>`_
9595
gradient_boosting_regressor regression `GradientBoostingRegressor <http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html>`_ ``n_estimators=500, max_depth=2``
96-
random_forest_regressor regression `RandomForestRegressor <http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html>`_ ``n_estimators=500``
97-
extra_trees_regressor regression `ExtraTreesRegressor <http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesRegressor.html>`_ ``n_estimators=500``
98-
multilayer_perceptron_regressor regression `muffnn.MLPRegressor <https://github.com/civisanalytics/muffnn>`_
99-
stacking_regressor regression `civismlext.StackedRegressor <https://github.com/civisanalytics/civisml-extensions>`_
96+
random_forest_regressor regression `RandomForestRegressor <http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html>`_ ``n_estimators=500, max_depth=7``
97+
extra_trees_regressor regression `ExtraTreesRegressor <http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesRegressor.html>`_ ``n_estimators=500, max_depth=7``
98+
multilayer_perceptron_regressor regression `muffnn.MLPRegressor <https://github.com/civisanalytics/muffnn>`_
99+
stacking_regressor regression `civismlext.StackedRegressor <https://github.com/civisanalytics/civisml-extensions>`_
100100
================================ ================ ================================================================================================================================== ==================================
101101

102102
The "stacking_classifier" model stacks
@@ -151,7 +151,7 @@ By default, CivisML pre-processes data using the
 equal to the ``excluded_columns`` parameter. You can replace this
 with your own ETL by creating an object of class
 :class:`~sklearn.base.BaseEstimator` and passing it as the ``etl``
-parameter during training.
+parameter during training.

 By default, :class:`~civismlext.preprocessing.DataFrameETL`
 automatically one-hot encodes all categorical columns in the
@@ -214,7 +214,7 @@ distributions:
 +------------------------------------+--------------------+-----------------------------------------------------------------------------+
 | | random_forest_classifier | | ``n_estimators`` | | ``criterion: ['gini', 'entropy']`` |
 | | random_forest_regressor | | ``min = 100,`` | | ``max_features: truncexpon(b=10., loc=.01, scale=1./10.11)`` |
-| | extra_trees_classifier | | ``max = 1000`` | | ``max_depth: [1, 2, 3, 4, 6, 10, None]`` |
+| | extra_trees_classifier | | ``max = 1000`` | | ``max_depth: [1, 2, 3, 4, 6, 10]`` |
 | | extra_trees_regressor | | |
 | | RF step in stacking_classifier | | |
 | | RF step in stacking_regressor | | |
@@ -245,7 +245,7 @@ argument to :class:`~civis.ml.ModelPipeline` which will install the
 dependencies in your runtime environment. VCS support is also enabled
 (see `docs
 <https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support>`_.)
-Installing a remote git repository from, say, Github only requires passing the HTTPS
+Installing a remote git repository from, say, Github only requires passing the HTTPS
 URL in the form of, for example, ``git+https://github.com/scikit-learn/scikit-learn``.

 CivisML will run ``pip install [your package here]``. We strongly encourage you to pin
@@ -270,11 +270,11 @@ A simple example of how to do this with API looks as follows

 .. code-block:: python

-
+
    import civis
    password = 'abc123' # token copied from https://github.com/settings/tokens
    username = 'user123' # Github username
-   git_token_name = 'Github credential'
+   git_token_name = 'Github credential'

    client = civis.APIClient()
    credential = client.credentials.post(password=password,
