Intro
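Hello @hyunwoongko! I needed a Korean chatbot framework, and this one is really well made!
I was impressed as I read through the code and the detailed docs. Thanks to you, I think I can build a chatbot with the features I want.
Problem
After [DistanceClassifier] training completes, the following error occurs. (It seems to happen while the classification metrics report is being generated for OOD.)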
...
[DistanceClassifier] Epoch : 10, ETA : 4.3569 sec
Traceback (most recent call last):
File "application.py", line 26, in <module>
kochat = KochatApi(
File "/workspace/.pyenv_mirror/user/3.8.19/lib/python3.8/site-packages/kochat/app/kochat_api.py", line 56, in __init__
self.__fit_intent()
File "/workspace/.pyenv_mirror/user/3.8.19/lib/python3.8/site-packages/kochat/app/kochat_api.py", line 153, in __fit_intent
self.intent_classifier.fit(self.dataset.load_intent(self.embed_processor))
File "/workspace/.pyenv_mirror/user/3.8.19/lib/python3.8/site-packages/kochat/proc/intent_classifier.py", line 44, in fit
report, _ = self.metrics.report(['in_dist', 'out_dist'], mode='ood')
File "/workspace/.pyenv_mirror/user/3.8.19/lib/python3.8/site-packages/sklearn/utils/_testing.py", line 317, in wrapper
return fn(*args, **kwargs)
File "/workspace/.pyenv_mirror/user/3.8.19/lib/python3.8/site-packages/kochat/utils/metrics.py", line 86, in report
classification_report(
File "/workspace/.pyenv_mirror/user/3.8.19/lib/python3.8/site-packages/sklearn/utils/validation.py", line 73, in inner_f
return f(**kwargs)
File "/workspace/.pyenv_mirror/user/3.8.19/lib/python3.8/site-packages/sklearn/metrics/_classification.py", line 1950, in classification_report
raise ValueError(
ValueError: Number of classes, 1, does not match size of target_names, 2. Try specifying the labels parameter
My thoughts
Looking at Metrics.report() in kochat/utils/metrics.py, it calls classification_report().
class Metrics:
    ...
    def report(self, label_dict: dict, mode: str) -> tuple:
        """
        Prints the classification report and the confusion matrix.
        These include Precision, Recall, F1 Score, Accuracy, etc.
        :return: model performance measured with various metrics
        """
        ...
        report = DataFrame(
            classification_report(
                y_true=label,
                y_pred=predict,
                target_names=list(label_dict),
                output_dict=True
            )
        )
        ...
The definition of classification_report() is shown below. The error is raised at the very end of this code.
def classification_report(y_true, y_pred, *, labels=None, target_names=None,
sample_weight=None, digits=2, output_dict=False,
zero_division="warn"):
"""Build a text report showing the main classification metrics.
Read more in the :ref:`User Guide <classification_report>`.
Parameters
----------
y_true : 1d array-like, or label indicator array / sparse matrix
Ground truth (correct) target values.
y_pred : 1d array-like, or label indicator array / sparse matrix
Estimated targets as returned by a classifier.
labels : array, shape = [n_labels]
Optional list of label indices to include in the report.
target_names : list of strings
Optional display names matching the labels (same order).
sample_weight : array-like of shape (n_samples,), default=None
Sample weights.
digits : int
Number of digits for formatting output floating point values.
When ``output_dict`` is ``True``, this will be ignored and the
returned values will not be rounded.
output_dict : bool (default = False)
If True, return output as dict
.. versionadded:: 0.20
zero_division : "warn", 0 or 1, default="warn"
Sets the value to return when there is a zero division. If set to
"warn", this acts as 0, but warnings are also raised.
Returns
-------
report : string / dict
Text summary of the precision, recall, F1 score for each class.
Dictionary returned if output_dict is True. Dictionary has the
following structure::
{'label 1': {'precision':0.5,
'recall':1.0,
'f1-score':0.67,
'support':1},
'label 2': { ... },
...
}
The reported averages include macro average (averaging the unweighted
mean per label), weighted average (averaging the support-weighted mean
per label), and sample average (only for multilabel classification).
Micro average (averaging the total true positives, false negatives and
false positives) is only shown for multi-label or multi-class
with a subset of classes, because it corresponds to accuracy otherwise.
See also :func:`precision_recall_fscore_support` for more details
on averages.
Note that in binary classification, recall of the positive class
is also known as "sensitivity"; recall of the negative class is
"specificity".
See also
--------
precision_recall_fscore_support, confusion_matrix,
multilabel_confusion_matrix
Examples
--------
>>> from sklearn.metrics import classification_report
>>> y_true = [0, 1, 2, 2, 2]
>>> y_pred = [0, 0, 2, 2, 1]
>>> target_names = ['class 0', 'class 1', 'class 2']
>>> print(classification_report(y_true, y_pred, target_names=target_names))
precision recall f1-score support
<BLANKLINE>
class 0 0.50 1.00 0.67 1
class 1 0.00 0.00 0.00 1
class 2 1.00 0.67 0.80 3
<BLANKLINE>
accuracy 0.60 5
macro avg 0.50 0.56 0.49 5
weighted avg 0.70 0.60 0.61 5
<BLANKLINE>
>>> y_pred = [1, 1, 0]
>>> y_true = [1, 1, 1]
>>> print(classification_report(y_true, y_pred, labels=[1, 2, 3]))
precision recall f1-score support
<BLANKLINE>
1 1.00 0.67 0.80 3
2 0.00 0.00 0.00 0
3 0.00 0.00 0.00 0
<BLANKLINE>
micro avg 1.00 0.67 0.80 3
macro avg 0.33 0.22 0.27 3
weighted avg 1.00 0.67 0.80 3
<BLANKLINE>
"""
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    labels_given = True
    if labels is None:
        labels = unique_labels(y_true, y_pred)  # this is where labels gets defined
        labels_given = False
    else:
        labels = np.asarray(labels)
    # labelled micro average
    micro_is_accuracy = ((y_type == 'multiclass' or y_type == 'binary') and
                         (not labels_given or
                          (set(labels) == set(unique_labels(y_true, y_pred)))))
    if target_names is not None and len(labels) != len(target_names):
        if labels_given:
            warnings.warn(
                "labels size, {0}, does not match size of target_names, {1}"
                .format(len(labels), len(target_names))
            )
        else:
            raise ValueError(
                "Number of classes, {0}, does not match size of "
                "target_names, {1}. Try specifying the labels "
                "parameter".format(len(labels), len(target_names))
            )  # the error is raised here!
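    ...
In other words, the error appears to occur because labels and target_names have different lengths. Metrics.report() calls classification_report() without passing labels (which looks intentional), so labels is None and is therefore defined as unique_labels(y_true, y_pred).
The examples in the unique_labels() docstring are as follows.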
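The length mismatch can be reproduced outside of Kochat. The snippet below is only a minimal sketch: the four-element arrays are made-up toy data standing in for my real y_true and y_pred, and the variable name error_message is just for illustration.

```python
from sklearn.metrics import classification_report

# Toy data (an assumption, not my real dataset): every sample is
# labelled and predicted as class 1, i.e. "out_dist".
y_true = [1, 1, 1, 1]
y_pred = [1, 1, 1, 1]

try:
    classification_report(
        y_true=y_true,
        y_pred=y_pred,
        # Two display names, but only one class occurs in the data.
        target_names=["in_dist", "out_dist"],
        output_dict=True,
    )
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

With labels left unset, the single observed class is compared against two target names, which is the same situation as in the traceback above.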
Examples
--------
>>> from sklearn.utils.multiclass import unique_labels
>>> unique_labels([3, 5, 5, 5, 7, 7])
array([3, 5, 7])
>>> unique_labels([1, 2, 3, 4], [2, 2, 3, 4])
array([1, 2, 3, 4])
>>> unique_labels([1, 2, 10], [5, 11])
array([ 1, 2, 5, 10, 11])
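So unique_labels(y_true, y_pred) appears to compute the union of the labels found in y_true and y_pred.
The problem arises when y_true and y_pred both contain only the same label, 1, i.e. out_dist. (Insufficient training is part of the problem, but even if every sample is classified as OOD, shouldn't training still be able to proceed?)
Printing y_true and y_pred gives [1 1 1 ... 1 1 1] and [1 1 1 ... 1 1 1] respectively, and both have the same length.
How can this error be resolved? I did my best to write up my trial and error, so I apologize if this is a bit disorganized. Thank you again for sharing such a great framework.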
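To check this reading, here is a small sketch that compares unique_labels() with a plain sorted set union (np.union1d) on toy arrays. The arrays are made up for illustration, and the equivalence is only claimed for simple integer labels like these.

```python
import numpy as np
from sklearn.utils.multiclass import unique_labels

# Toy arrays (an assumption): both contain only label 1.
y_true = np.array([1, 1, 1, 1])
y_pred = np.array([1, 1, 1, 1])

labels = unique_labels(y_true, y_pred)
union = np.union1d(y_true, y_pred)  # sorted set union of both arrays

print(labels, len(labels))
print(union, len(union))

# The docstring example with disjoint inputs behaves the same way.
mixed = unique_labels([1, 2, 10], [5, 11])
print(mixed)
```

When both inputs contain only label 1, the union has a single element, so len(labels) is 1 even though two target names are supplied.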
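One workaround I can think of is to pass labels explicitly, as the error message itself suggests. The following is only a sketch, not a confirmed fix for Kochat: it assumes class ids are 0..n-1 in the same order as label_dict (0 for in_dist, 1 for out_dist), and the toy arrays stand in for my real data.

```python
from sklearn.metrics import classification_report

label_dict = ["in_dist", "out_dist"]  # as passed from intent_classifier.py
y_true = [1, 1, 1, 1]  # toy data: everything is out_dist
y_pred = [1, 1, 1, 1]

report = classification_report(
    y_true=y_true,
    y_pred=y_pred,
    # Assumption: class ids are 0..n-1, in label_dict order.
    labels=list(range(len(label_dict))),
    target_names=list(label_dict),
    output_dict=True,
    zero_division=0,  # the absent class gets 0 instead of a warning
)

for name in label_dict:
    print(name, report[name])
```

With labels given, the class that never occurs still gets a row (with support 0), so the report can be built even when every sample falls into a single class.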