This document summarizes guidelines and best practices for contributions to the Python component of cuML, the machine learning library of the RAPIDS ecosystem. This is an evolving document, so contributions, clarifications, and issue reports are highly welcome.
Please start by reading:

- The section on thread safety in the C++ DEVELOPER_GUIDE.md
- PEP8 (adherence to this style is checked with flake8)
- The scikit-learn coding guidelines
- Make sure that the algorithm has been implemented on the C++ side. Refer to the C++ DEVELOPER_GUIDE.md for guidelines on developing in C++.
- Refer to the next section for the remaining steps.
- Create a corresponding `algoName.pyx` file inside the `python/cuml` folder.
- Ensure that the folder structure inside here reflects that of scikit-learn's. For example, `pca.pyx` should be kept inside the `decomposition` sub-folder of `python/cuml`.
- Match the corresponding scikit-learn interface as closely as possible. Refer to the scikit-learn developer guide on the API design of its objects for details.
- Always make sure your class inherits from `cuml.Base` as its parent/ancestor.
- Ensure that the estimator's output fields follow the 'underscore on both sides' convention explained in the documentation of `cuml.Base`. This allows it to support configurable output types.
For an in-depth guide to creating estimators, see the Estimator Guide.
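As a rough sketch of these conventions (all names below are illustrative and not part of cuML; the stand-in `Base` class merely mimics the role of `cuml.Base`, which in the real library also provides handle management and configurable output types):

```python
# Illustrative sketch only: this `Base` is a stand-in for `cuml.Base`.
class Base:
    def __init__(self, handle=None, output_type=None):
        self.handle = handle
        self.output_type = output_type


class CentroidEstimator(Base):
    """Toy estimator (hypothetical, not real cuML code)."""

    def __init__(self, n_clusters=8, handle=None, output_type=None):
        super().__init__(handle=handle, output_type=output_type)
        # Hyperparameters keep their plain names, mirroring scikit-learn.
        self.n_clusters = n_clusters

    def fit(self, X):
        # Fields estimated from the data use the underscore convention
        # documented in `cuml.Base`; real estimators expose them as array
        # attributes with configurable output types.
        self.cluster_centers_ = [sum(col) / len(col) for col in zip(*X)]
        return self


est = CentroidEstimator(n_clusters=1).fit([[1.0, 2.0], [3.0, 4.0]])
print(est.cluster_centers_)  # [2.0, 3.0]
```

Note the split: constructor arguments keep their plain names, while fields learned during `fit` carry the underscore naming.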
If you are calling CUDA runtime APIs inside `cuml.cuda`, any errors will raise a `cuml.cuda.CudaRuntimeError`. For example:

```python
from cuml.cuda import Stream, CudaRuntimeError

try:
    s = Stream()
    s.sync()
except CudaRuntimeError as cre:
    print("Cuda Error! '%s'" % str(cre))
```
TBD
We mostly follow PEP 257 style docstrings for documenting the interfaces.

The examples in the documentation are checked through doctest. To skip the check of an example's output, use the directive `# doctest: +SKIP`. Examples subject to numerical imprecision, or that can't be reproduced consistently, should be skipped.
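For instance, a docstring example whose output depends on random state can be marked with the directive (the `sample_noise` function below is purely illustrative):

```python
import doctest
import random


def sample_noise():
    """Return a pseudo-random value.

    The output below cannot be reproduced consistently, so the check
    is skipped:

    >>> sample_noise()  # doctest: +SKIP
    0.8444218515250481
    """
    return random.random()


# The skipped example is not executed, so doctest reports no failures.
results = doctest.testmod()
print(results.failed)  # 0
```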
We use pytest (https://docs.pytest.org/en/latest/) for writing and running tests. To see existing examples, refer to any of the `test_*.py` files in the folder `cuml/tests`.

Some tests are run against inputs generated with hypothesis. See the `cuml/testing/strategies.py` module for custom strategies that can be used to test cuml estimators with diverse inputs. For example, use the `regression_datasets()` strategy to test random regression problems.
When using hypothesis for testing, you must include at least one explicit example using the `@example` decorator alongside any `@given` strategies. This ensures that:

- Every test has at least one deterministic test case that always runs
- Critical edge cases are documented and tested consistently
- Test failures can be reproduced reliably

Note: While the explicit examples always run in CI, the hypothesis-generated test cases (from `@given` strategies) only run during nightly testing by default. This keeps CI runs fast while still maintaining thorough test coverage.
Example of a valid hypothesis test:

```python
import numpy as np
from hypothesis import example, given, strategies as st


@example(dtype=np.float32, sparse_input=False)  # baseline case, runs as part of PR CI
@example(dtype=np.float64, sparse_input=True)   # edge case, runs as part of PR CI
@given(
    dtype=st.sampled_from((np.float32, np.float64)),
    sparse_input=st.booleans(),
)  # strategy-based cases, only run during nightly tests
def test_my_estimator(dtype, sparse_input):
    # Test implementation
    pass
```
Test collection will fail if any test uses `@given` without an accompanying `@example`.
TODO: talk about enabling RMM here when it is ready
If you want to schedule the execution of two algorithms concurrently, it is better to create two separate streams, assign them to separate handles, and then schedule the algorithms using those handles:

```python
import cuml
from cuml.cuda import Stream

s1 = Stream()
h1 = cuml.Handle()
h1.setStream(s1)

s2 = Stream()
h2 = cuml.Handle()
h2.setStream(s2)

algo1 = cuml.Algo1(handle=h1, ...)
algo2 = cuml.Algo2(handle=h2, ...)
algo1.fit(X1, y1)
algo2.fit(X2, y2)
```
For more details on the underlying stream ordering behavior, refer to the corresponding section of the C++ DEVELOPER_GUIDE.md.
TODO: Add more details.
cuML code, including its Python operations, can be profiled. The `nvtx_benchmark.py` helper script produces a simple benchmark summary. To use it, run `python nvtx_benchmark.py "python test.py"`.
Here is an example with the following script:

```python
from cuml.datasets import make_blobs
from cuml.manifold import UMAP

X, y = make_blobs(n_samples=1000, n_features=30)

model = UMAP()
model.fit(X)
embeddings = model.transform(X)
```
Once benchmarked, its profiling can be summarized as:

```
datasets.make_blobs                      : 1.3571 s
manifold.umap.fit [0x7f10eb69d4f0]       : 0.6629 s
    |> umap::unsupervised::fit           : 0.6611 s
    |==> umap::knnGraph                  : 0.4693 s
    |==> umap::simplicial_set            : 0.0015 s
    |==> umap::embedding                 : 0.1902 s
manifold.umap.transform [0x7f10eb69d4f0] : 0.0934 s
    |> umap::transform                   : 0.0925 s
    |==> umap::knnGraph                  : 0.0909 s
    |==> umap::smooth_knn                : 0.0002 s
    |==> umap::optimization              : 0.0011 s
```