Commit f414182

Merge pull request #245 from janosh/master
Code style enforcement
2 parents edc349c + 3f5a463 commit f414182

20 files changed: 662 additions & 483 deletions

.pre-commit-config.yaml

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
1+
repos:
2+
- repo: https://github.com/pre-commit/mirrors-isort
3+
rev: v4.3.21
4+
hooks:
5+
- id: isort
6+
language_version: python3.7
7+
- repo: https://github.com/ambv/black
8+
rev: stable
9+
hooks:
10+
- id: black
11+
language_version: python3.7
12+
- repo: https://github.com/pre-commit/pre-commit-hooks
13+
rev: v2.3.0
14+
hooks:
15+
- id: flake8

CONTRIBUTING.md

Lines changed: 43 additions & 23 deletions
@@ -1,44 +1,64 @@
11
# Contributing to automatminer
2+
23
We love your input! We want to make contributing to automatminer as easy and transparent as possible, whether it's:
3-
* Reporting a bug
4-
* Discussing the current state of the code
5-
* Submitting a fix
6-
* Proposing or implementing new features
7-
* Becoming a maintainer
4+
5+
- Reporting a bug
6+
- Discussing the current state of the code
7+
- Submitting a fix
8+
- Proposing or implementing new features
9+
- Becoming a maintainer
810

911
## Reporting bugs, getting help, and discussion
12+
1013
At any time, feel free to start a thread on the automatminer [Discourse forum](https://hackingmaterials.discourse.group/c/matminer/automatminer).
1114

1215
If you are making a bug report, incorporate as many elements of the following as possible to ensure a timely response and avoid the need for followups:
13-
* A quick summary and/or background
14-
* Steps to reproduce - be specific! **Provide sample code.**
15-
* What you expected would happen, compared to what actually happens
16-
* The full stack trace of any errors you encounter
17-
* Notes (possibly including why you think this might be happening, or steps you tried that didn't work)
16+
17+
- A quick summary and/or background
18+
- Steps to reproduce - be specific! **Provide sample code.**
19+
- What you expected would happen, compared to what actually happens
20+
- The full stack trace of any errors you encounter
21+
- Notes (possibly including why you think this might be happening, or steps you tried that didn't work)
1822

1923
We love thorough bug reports as this means the development team can make quick and meaningful fixes. When we confirm your bug report, we'll move it to the GitHub issues where its progress can be further tracked.
2024

21-
## Contributing code modifications or additions through Github
22-
We use github to host code, to track issues and feature requests, as well as accept pull requests.
25+
## Contributing code modifications or additions through GitHub
26+
27+
We use GitHub to host code, to track issues and feature requests, as well as accept pull requests.
2328

24-
Pull requests are the best way to propose changes to the codebase. Follow the [Github flow](https://www.atlassian.com/git/tutorials/comparing-workflows/forking-workflow) for more information on this procedure.
29+
Pull requests are the best way to propose changes to the codebase. Follow the [GitHub flow](https://www.atlassian.com/git/tutorials/comparing-workflows/forking-workflow) for more information on this procedure.
2530

2631
The basic procedure for making a PR is:
27-
* Fork the repo and create your branch from master.
28-
* Commit your improvements to your branch and push to your Github fork (repo).
29-
* When you're finished, go to your fork and make a Pull Request. It will automatically update if you need to make further changes.
32+
33+
- Fork the repo on GitHub and clone it to your machine.
34+
35+
```sh
36+
git clone https://github.com/<your_github_name>/automatminer
37+
```
38+
39+
- Install both regular and development dependencies and setup the `git` pre-commit hook.
40+
41+
```sh
42+
pip install -r requirements.txt requirement && pre-commit install
43+
```
44+
45+
This step is important as your changes may otherwise contain style violations that will throw errors when running our CI on your pull request.
46+
- Commit your improvements and push to your GitHub fork.
47+
- When you're finished, go to your fork and make a pull request. It will automatically update if you need to make further changes.
3048

3149
### How to Make a **Great** Pull Request
50+
3251
We have a few tips for writing good PRs that are accepted into the main repo:
3352

34-
* Use the Google Code style for all of your code. Find an example [here.](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html)
35-
* Your code should have (4) spaces instead of tabs.
36-
* If needed, update the documentation.
37-
* **Write tests** for new features! Good tests are 100%, absolutely necessary for good code. We use the python `unittest` framework -- see some of the other tests in this repo for examples, or review the [Hitchhiker's guide to python](https://docs.python-guide.org/writing/tests/) for some good resources on writing good tests.
38-
* Understand your contributions will fall under the same license as this repo.
53+
- Use the Google Code style for all of your code. Find an example [here.](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html)
54+
- Your code should have (4) spaces instead of tabs.
55+
- If needed, update the documentation.
56+
- **Write tests** for new features! Good tests are 100%, absolutely necessary for good code. We use the python `unittest` framework -- see some of the other tests in this repo for examples, or review the [Hitchhiker's guide to python](https://docs.python-guide.org/writing/tests/) for some good resources on writing good tests.
57+
- Understand your contributions will fall under the same license as this repo.
3958

40-
When you submit your PR, our CI service will automatically run your tests.
59+
When you submit your PR, our CI service will automatically run your tests.
4160
We welcome good discussion on the best ways to write your code, and the comments on your PR are an excellent area for discussion.
4261

4362
#### References
44-
This document was adapted from the open-source contribution guidelines for Facebook's Draft, as well as briandk's [contribution template](https://gist.github.com/briandk/3d2e8b3ec8daf5a27a62).
63+
64+
This document was adapted from the open-source contribution guidelines for Facebook's Draft, as well as briandk's [contribution template](https://gist.github.com/briandk/3d2e8b3ec8daf5a27a62).
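
The guidelines in CONTRIBUTING.md ask for Google-style docstrings, four-space indentation, and tests written with `unittest`. A minimal sketch of what that can look like follows. `scale_values` and `TestScaleValues` are hypothetical names chosen for illustration and are not part of automatminer.

```python
import unittest


def scale_values(values, factor=2.0):
    """Multiply every value in a list by a constant factor.

    Hypothetical helper used only to illustrate the docstring style.

    Args:
        values (list of float): The numbers to scale.
        factor (float): The multiplier applied to each value.

    Returns:
        list of float: A new list with each value multiplied by factor.

    Raises:
        ValueError: If values is empty.
    """
    if not values:
        raise ValueError("values must not be empty")
    return [v * factor for v in values]


class TestScaleValues(unittest.TestCase):
    """Tests for the hypothetical scale_values helper."""

    def test_scales_each_value(self):
        self.assertEqual(scale_values([1.0, 2.0], factor=3.0), [3.0, 6.0])

    def test_empty_input_raises(self):
        with self.assertRaises(ValueError):
            scale_values([])
```

Saved as a test module, this can be run with `python -m unittest`.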

automatminer/__init__.py

Lines changed: 8 additions & 8 deletions
@@ -1,10 +1,10 @@
1-
from automatminer.preprocessing import DataCleaner, FeatureReducer
2-
from automatminer.automl import TPOTAdaptor, SinglePipelineAdaptor
3-
from automatminer.featurization import AutoFeaturizer
4-
from automatminer.pipeline import MatPipe
5-
from automatminer.presets import get_preset_config
1+
from automatminer.automl import SinglePipelineAdaptor, TPOTAdaptor # noqa
2+
from automatminer.featurization import AutoFeaturizer # noqa
3+
from automatminer.pipeline import MatPipe # noqa
4+
from automatminer.preprocessing import DataCleaner, FeatureReducer # noqa
5+
from automatminer.presets import get_preset_config # noqa
66

7-
__author__ = 'Alex Dunn, Qi Wang, Alex Ganose, Alireza Faghaninia, Anubhav Jain'
8-
__author_email__ = '[email protected]'
9-
__license__ = 'Modified BSD'
7+
__author__ = "Alex Dunn, Qi Wang, Alex Ganose, Alireza Faghaninia, Anubhav Jain"
8+
__author_email__ = "[email protected]"
9+
__license__ = "Modified BSD"
1010
__version__ = "2019.10.14"

automatminer/base.py

Lines changed: 7 additions & 3 deletions
@@ -3,9 +3,13 @@
33
"""
44
import abc
55
import logging
6+
7+
from automatminer.utils.log import (
8+
AMM_LOGGER_BASENAME,
9+
initialize_logger,
10+
initialize_null_logger,
11+
)
612
from sklearn.base import BaseEstimator
7-
from automatminer.utils.log import initialize_logger, \
8-
initialize_null_logger, AMM_LOGGER_BASENAME
913

1014
__authors__ = ["Alex Dunn <[email protected]>", "Alex Ganose <[email protected]>"]
1115

@@ -24,7 +28,7 @@ def logger(self):
2428
@logger.setter
2529
def logger(self, new_logger):
2630
"""Set a new logger.
27-
31+
2832
Args:
2933
new_logger (Logger, bool): A boolean or custom logger object to use
3034
for logging. Alternatively, if set to True, the default automatminer

automatminer/pipeline.py

Lines changed: 48 additions & 33 deletions
@@ -1,20 +1,25 @@
11
"""
22
The highest level classes for pipelines.
33
"""
4-
import os
54
import copy
5+
import os
66
import pickle
77
from typing import Dict
88

99
import pandas as pd
10-
11-
from automatminer.base import LoggableMixin, DFTransformer
10+
from automatminer.base import DFTransformer, LoggableMixin
1211
from automatminer.presets import get_preset_config
13-
from automatminer.utils.ml import regression_or_classification
14-
from automatminer.utils.pkg import check_fitted, set_fitted, \
15-
return_attrs_recursively, AutomatminerError, VersionError, get_version, \
16-
save_dict_to_file
1712
from automatminer.utils.log import AMM_DEFAULT_LOGGER
13+
from automatminer.utils.ml import regression_or_classification
14+
from automatminer.utils.pkg import (
15+
AutomatminerError,
16+
VersionError,
17+
check_fitted,
18+
get_version,
19+
return_attrs_recursively,
20+
save_dict_to_file,
21+
set_fitted,
22+
)
1823

1924

2025
class MatPipe(DFTransformer, LoggableMixin):
@@ -88,15 +93,23 @@ class MatPipe(DFTransformer, LoggableMixin):
8893
target (str): The name of the column where target values are held.
8994
"""
9095

91-
def __init__(self, autofeaturizer=None, cleaner=None, reducer=None,
92-
learner=None, logger=AMM_DEFAULT_LOGGER):
96+
def __init__(
97+
self,
98+
autofeaturizer=None,
99+
cleaner=None,
100+
reducer=None,
101+
learner=None,
102+
logger=AMM_DEFAULT_LOGGER,
103+
):
93104
transformers = [autofeaturizer, cleaner, reducer, learner]
94105
if not all(transformers):
95106
if any(transformers):
96-
raise AutomatminerError("Please specify all dataframe"
97-
"transformers (autofeaturizer, learner,"
98-
"reducer, and cleaner), or none (to use"
99-
"default).")
107+
raise AutomatminerError(
108+
"Please specify all dataframe"
109+
"transformers (autofeaturizer, learner,"
110+
"reducer, and cleaner), or none (to use"
111+
"default)."
112+
)
100113
else:
101114
config = get_preset_config("express")
102115
autofeaturizer = config["autofeaturizer"]
@@ -117,7 +130,7 @@ def __init__(self, autofeaturizer=None, cleaner=None, reducer=None,
117130
super(MatPipe, self).__init__()
118131

119132
@staticmethod
120-
def from_preset(preset: str = 'express', **powerups):
133+
def from_preset(preset: str = "express", **powerups):
121134
"""
122135
Get a preset MatPipe from a string using
123136
automatminer.presets.get_preset_config
@@ -238,8 +251,7 @@ def predict(self, df, ignore=None):
238251
return merged_df
239252

240253
@set_fitted
241-
def benchmark(self, df, target, kfold, fold_subset=None, cache=False,
242-
ignore=None):
254+
def benchmark(self, df, target, kfold, fold_subset=None, cache=False, ignore=None):
243255
"""
244256
If the target property is known for all data, perform an ML benchmark
245257
using MatPipe. Used for getting an idea of how well AutoML can predict
@@ -292,22 +304,26 @@ def benchmark(self, df, target, kfold, fold_subset=None, cache=False,
292304
if os.path.exists(cache_src):
293305
self.logger.warning(
294306
"Cache src {} already found! Ensure this featurized data "
295-
"matches the df being benchmarked.".format(cache_src))
307+
"matches the df being benchmarked.".format(cache_src)
308+
)
296309
self.logger.warning("Running pre-featurization for caching.")
297310
self.autofeaturizer.fit_transform(df, target)
298311
elif cache_src and not cache:
299312
raise AutomatminerError(
300313
"Caching was enabled in AutoFeaturizer but not in benchmark. "
301314
"Either disable caching in AutoFeaturizer or enable it by "
302-
"passing cache=True to benchmark.")
315+
"passing cache=True to benchmark."
316+
)
303317
elif cache and not cache_src:
304318
raise AutomatminerError(
305319
"MatPipe cache is enabled, but no cache_src was defined in "
306320
"autofeaturizer. Pass the cache_src argument to AutoFeaturizer "
307-
"or use the cache_src get_preset_config powerup.")
321+
"or use the cache_src get_preset_config powerup."
322+
)
308323
else:
309-
self.logger.debug("No caching being used in AutoFeaturizer or "
310-
"benchmark.")
324+
self.logger.debug(
325+
"No caching being used in AutoFeaturizer or " "benchmark."
326+
)
311327

312328
if not fold_subset:
313329
fold_subset = list(range(kfold.n_splits))
@@ -372,25 +388,20 @@ def summarize(self, filename=None) -> Dict[str, str]:
372388
"drop_na_targets",
373389
]
374390
cleaner_data = {
375-
attr: str(getattr(self.cleaner, attr))
376-
for attr in cleaner_attrs
391+
attr: str(getattr(self.cleaner, attr)) for attr in cleaner_attrs
377392
}
378393

379-
reducer_attrs = [
380-
"reducers",
381-
"reducer_params",
382-
]
394+
reducer_attrs = ["reducers", "reducer_params"]
383395
reducer_data = {
384-
attr: str(getattr(self.reducer, attr))
385-
for attr in reducer_attrs
396+
attr: str(getattr(self.reducer, attr)) for attr in reducer_attrs
386397
}
387398

388399
attrs = {
389400
"featurizers": self.autofeaturizer.featurizers,
390401
"ml_model": str(self.learner.best_pipeline),
391402
"feature_reduction": reducer_data,
392403
"data_cleaning": cleaner_data,
393-
"features": self.learner.features
404+
"features": self.learner.features,
394405
}
395406
if filename:
396407
save_dict_to_file(attrs, filename)
@@ -416,12 +427,16 @@ def save(self, filename="mat.pipe"):
416427

417428
temp_logger = copy.deepcopy(self._logger)
418429
loggables = [
419-
self, self.learner, self.reducer, self.cleaner, self.autofeaturizer
430+
self,
431+
self.learner,
432+
self.reducer,
433+
self.cleaner,
434+
self.autofeaturizer,
420435
]
421436
for loggable in loggables:
422437
loggable._logger = AMM_DEFAULT_LOGGER
423438

424-
with open(filename, 'wb') as f:
439+
with open(filename, "wb") as f:
425440
pickle.dump(self, f)
426441

427442
# Reassign live memory objects for further use in this object
@@ -446,7 +461,7 @@ def load(filename, logger=True, supress_version_mismatch=False):
446461
Returns:
447462
pipe (MatPipe): A MatPipe object.
448463
"""
449-
with open(filename, 'rb') as f:
464+
with open(filename, "rb") as f:
450465
pipe = pickle.load(f)
451466

452467
if pipe.version != get_version() and not supress_version_mismatch:
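
One thing the reformatting in this file makes easier to spot: Python joins adjacent string literals with no separator, so the message passed to `AutomatminerError` in `MatPipe.__init__` runs words together ("dataframetransformers", "learner,reducer"). The sketch below reproduces the concatenation with plain strings and shows one possible fix. The corrected wording is a suggestion, not part of this commit.

```python
# Adjacent string literals are concatenated exactly as written,
# with no space inserted between them.
as_committed = (
    "Please specify all dataframe"
    "transformers (autofeaturizer, learner,"
    "reducer, and cleaner), or none (to use"
    "default)."
)

# Suggested fix: end each fragment with a trailing space.
with_spaces = (
    "Please specify all dataframe "
    "transformers (autofeaturizer, learner, "
    "reducer, and cleaner), or none (to use "
    "default)."
)

print(as_committed)
print(with_spaces)
```

A formatter such as black only re-wraps the literals. It does not change their contents, so this kind of issue has to be caught in review.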
Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
1-
from .core import DataCleaner, FeatureReducer
1+
from .core import DataCleaner, FeatureReducer # noqa
