Description
Internally, cuml makes use of a single global logger with a single global log level. This lets internal code make logging calls like
/* A C++ log */
CUML_LOG_DEBUG("some debug log message")
or
# A python log
logger.debug("some debug log message")
Users may change the global log level by calling logger.set_level(logger.level_enum.info).
Seems simple enough on initial inspection. But then enter verbose.
Estimators and certain functions also take a verbose parameter that is intended to mirror sklearn's verbose parameter (where a higher integer means "log more"). In most estimators this is implemented by translating the verbose kwarg to a level_enum, then calling set_level (usually in C++), which mutates the global logger level.
In [1]: import cuml
In [2]: X, _ = cuml.datasets.make_blobs()
In [3]: tsne = cuml.TSNE(verbose=6) # very verbose
In [4]: cuml.internals.logger.get_level() # current log level
Out[4]: <level_enum.warn: 3>
In [5]: tsne.fit(X)
[2025-03-04 21:59:37.189] [CUML] [debug] Data size = (100, 2) with dim = 2 perplexity = 30.000000
[2025-03-04 21:59:37.201] [CUML] [debug] Getting distances.
[2025-03-04 21:59:37.410] [CUML] [debug] Now normalizing distances so exp(D) doesn't explode.
[2025-03-04 21:59:37.411] [CUML] [debug] Searching for optimal perplexity via bisection search.
[2025-03-04 21:59:37.605] [CUML] [debug] [t-SNE] KL divergence: 0.09647911041975021
Out[5]: TSNE()
In [6]: cuml.internals.logger.get_level() # global log level is mutated after fit!
Out[6]: <level_enum.trace: 0>
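The behavior in the session above can be modeled without cuml. What follows is a minimal sketch, assuming a hypothetical ToyLogger and ToyEstimator that are stand-ins for illustration and not cuml's classes. The enum values for trace and warn match the session output; the remaining values are assumed to follow the same ordering.

```python
from enum import IntEnum


class LevelEnum(IntEnum):
    # trace=0 and warn=3 match the session output above; the other
    # values are an assumption that follows the same ordering.
    trace = 0
    debug = 1
    info = 2
    warn = 3
    error = 4
    critical = 5
    off = 6


class ToyLogger:
    """Hypothetical stand-in for a process-wide logger with one level."""

    def __init__(self):
        self._level = LevelEnum.warn

    def get_level(self):
        return self._level

    def set_level(self, level):
        self._level = LevelEnum(level)


GLOBAL_LOGGER = ToyLogger()


class ToyEstimator:
    """Hypothetical estimator that applies its level via global state."""

    def __init__(self, level):
        self.level = level

    def fit(self):
        # The level is set for this call but never restored afterwards,
        # so it leaks into everything that logs later in the process.
        GLOBAL_LOGGER.set_level(self.level)
        return self


before = GLOBAL_LOGGER.get_level()
ToyEstimator(LevelEnum.trace).fit()
after = GLOBAL_LOGGER.get_level()
print(before.name, "->", after.name)
```

In this toy model, any code that logs after fit() returns inherits the estimator's level, which is the same effect seen in Out[6].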
Other estimators ignore the verbose parameter entirely and continue to log at whatever the global level was before.
It's also unclear when the global setting should be respected or used. If all estimators and many functions take a verbose parameter (controlling local verbosity), then wouldn't the local setting always override the global one (at least under the current definition, where the default verbose=False means "info")?
The situation is inconsistent at best, and the global mutation plus the translation between verbose and level_enum settings has led to bugs (see #6393).
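To make the translation step concrete, here is a rough sketch of what a verbose to level_enum conversion could look like. The mapping is an assumption chosen to be consistent with this issue (verbose=6 yields trace, verbose=False means "info"); the helper name verbose_to_level and the treatment of True as "debug" are hypothetical and not taken from cuml's source.

```python
from enum import IntEnum


class LevelEnum(IntEnum):
    # Assumed ordering; only trace=0 and warn=3 appear in the issue text.
    trace = 0
    debug = 1
    info = 2
    warn = 3
    error = 4
    critical = 5
    off = 6


def verbose_to_level(verbose):
    """Translate an sklearn-style verbose value to a LevelEnum.

    Hypothetical helper: the exact mapping is an assumption.
    """
    # bool is a subclass of int, so booleans must be checked first.
    if verbose is True:
        return LevelEnum.debug  # assumption: True means "debug"
    if verbose is False:
        return LevelEnum.info  # from the issue: False corresponds to "info"
    if not 0 <= verbose <= 6:
        raise ValueError(f"verbose must be a bool or in [0, 6], got {verbose!r}")
    # The two scales run in opposite directions: a higher verbose value
    # means "log more", while a lower enum value means "log more".
    return LevelEnum(6 - verbose)


print(verbose_to_level(6).name)
print(verbose_to_level(False).name)
```

The inverted scales are one plausible reason this kind of translation is easy to get wrong: passing a verbose integer where a level_enum is expected (or the reverse) is still a valid integer, so the mistake does not raise an error.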
In summary:
- cuml uses a global logger to handle all logging. This logger has a global level setting.
- cuml also wants to support a verbose parameter à la sklearn, where log levels are handled local to each estimator.
- The default for verbose is not "respect the global setting"; it is False (which corresponds to INFO). Even if verbose were handled perfectly, it is unclear when the global setting would ever be used under the current design.
- Some estimators (e.g. PCA) ignore the verbose setting completely.
- Others implement it by mutating the global level without resetting it (e.g. TSNE, UMAP, ...).
- We have 2 log level descriptors (verbose and level_enum), and failure to translate between them has led to subtle bugs (A few log level handling cleanups #6393).
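For contrast with the "mutate without resetting" pattern in the summary, a scoped override could look roughly like the sketch below. This is only an illustration of the general technique using the standard library's contextlib; ToyLogger and scoped_level are hypothetical names, integer levels are used for brevity (3 for warn and 0 for trace, as in the session output), and nothing here is meant to describe what cuml does or will do.

```python
from contextlib import contextmanager


class ToyLogger:
    """Hypothetical stand-in for a logger with a single global level."""

    def __init__(self, level=3):
        self._level = level

    def get_level(self):
        return self._level

    def set_level(self, level):
        self._level = level


@contextmanager
def scoped_level(logger, level):
    """Temporarily set the logger level, restoring the old one on exit."""
    previous = logger.get_level()
    logger.set_level(level)
    try:
        yield
    finally:
        # Runs on normal exit and when the body raises.
        logger.set_level(previous)


logger = ToyLogger(level=3)
with scoped_level(logger, 0):
    inside = logger.get_level()
outside = logger.get_level()
print(inside, outside)
```

A scoped override like this restores the previous level even if the wrapped call raises, but it does not by itself answer the open question of when the global setting should take precedence over verbose.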