
Division of data into training and validation set & COCO Metric Callback not working with Keras CV implementation as expected #2509

Open
@alinaivanovaoff

Description

Discussed in #2126

Originally posted by Inshu32 November 6, 2023
I am trying to implement a KerasCV-based pipeline to train on a custom dataset, following the https://keras.io/examples/vision/yolov8/ example. I have an object detection problem with 6 classes, and I am facing two issues:
1. When dividing the dataset with `take` and `skip`, the data is split sequentially, so the first 2 classes end up in validation and the remaining 4 in training. The model is therefore trained and evaluated on different class distributions. I used `tf.data.Dataset.shuffle` to mitigate this, but the split still does not guarantee that all classes are represented in both the training and validation sets.
2. When running `fit` on the YOLO model, I expect the predictions to be evaluated by a COCO metric callback, for which I am using the following function:
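One way to guarantee that every class appears in both splits is to split per class (a stratified split) rather than shuffling globally and hoping for the best. A minimal sketch in plain Python, assuming for illustration that each image can be keyed by a single dominant class (a `tf.data` pipeline would apply the same idea before `take`/`skip`):

```python
import random
from collections import defaultdict

def stratified_split(samples, val_fraction=0.2, seed=42):
    """Split (sample, class_id) pairs so every class is represented in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, class_id in samples:
        by_class[class_id].append((sample, class_id))
    train, val = [], []
    for class_id, items in by_class.items():
        rng.shuffle(items)  # shuffle within each class, deterministically
        n_val = max(1, int(len(items) * val_fraction))  # at least one per class
        val.extend(items[:n_val])
        train.extend(items[n_val:])
    return train, val

# Toy data: 6 classes, 10 samples each (hypothetical file names)
data = [(f"img_{c}_{i}", c) for c in range(6) for i in range(10)]
train, val = stratified_split(data)
assert {c for _, c in train} == {c for _, c in val} == set(range(6))
```

Since detection images can contain several classes at once, a per-image key is only an approximation; but even a plain `ds.shuffle(buffer_size, seed=..., reshuffle_each_iteration=False)` before `take`/`skip` already avoids the purely sequential split described above.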

```python
class EvaluateCOCOMetricsCallback(keras.callbacks.Callback):
    def __init__(self, data, save_path):
        super().__init__()
        self.data = data
        self.metrics = keras_cv.metrics.BoxCOCOMetrics(
            bounding_box_format="xyxy",
            evaluate_freq=1e9,
        )
        self.save_path = save_path
        self.best_map = -1.0

    def on_epoch_end(self, epoch, logs):
        self.metrics.reset_state()
        for batch in self.data:
            images, y_true = batch[0], batch[1]
            y_pred = self.model.predict(images, verbose=0)
            self.metrics.update_state(y_true, y_pred)

        metrics = self.metrics.result(force=True)
        logs.update(metrics)

        current_map = metrics["MaP"]
        if current_map > self.best_map:
            self.best_map = current_map
            self.model.save(self.save_path)  # Save the model when mAP improves

        return logs
```

This produces the following error:

```
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node _wrapped__ConcatV2_N_13_device/job:localhost/replica:0/task:0/device:CPU:0}} ConcatOp : Dimension 1 in both shapes must be equal: shape[0] = [32,2,4] vs. shape[2] = [32,1,4] [Op:ConcatV2] name: concat
```

More detailed traceback:

```
File "/home/lib/python3.9/site-packages/keras_cv/metrics/object_detection/box_coco_metrics.py", line 262, in _compute_result
    _box_concat(self.ground_truths),
File "/home/lib/python3.9/site-packages/keras_cv/metrics/object_detection/box_coco_metrics.py", line 44, in _box_concat
    result[key] = tf.concat([b[key] for b in boxes], axis=0)
```
As I understand it, this is a problem with multiple bounding boxes per image. Ragged tensors solve this during training, but in the case above I think the prediction contains only one bounding box while the ground truth has two boxes for the same image. How can I solve this?
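The traceback points at `_box_concat`: batches whose boxes were padded to different maximum box counts (shape `[32,2,4]` vs `[32,1,4]` here) cannot be concatenated along the batch axis. A common workaround (hedged; the exact API depends on your KerasCV version) is to pad every batch to the same fixed box count, e.g. with `keras_cv.bounding_box.to_dense(boxes, max_boxes=N)`, before passing it to the metric. The padding idea itself, sketched in plain Python:

```python
def pad_boxes(batches, max_boxes, pad_value=-1.0):
    """Pad every image's box list to max_boxes rows of 4 coordinates,
    so that all batches stack into one uniform [num_images, max_boxes, 4]
    array. A sentinel value (here -1) marks the padded, non-real boxes."""
    padded = []
    for batch in batches:
        for boxes in batch:
            row = list(boxes) + [[pad_value] * 4] * (max_boxes - len(boxes))
            padded.append(row)
    return padded

# Two batches padded to different widths, like the [32,2,4] vs [32,1,4] error
batch_a = [[[0, 0, 10, 10], [5, 5, 15, 15]]]  # one image with 2 boxes
batch_b = [[[1, 1, 4, 4]]]                    # one image with 1 box
out = pad_boxes([batch_a, batch_b], max_boxes=2)
assert all(len(row) == 2 for row in out)  # uniform box count, concat now works
```

With a uniform `max_boxes`, the per-batch tensors the metric accumulates all share the same second dimension, so the `tf.concat` inside `_box_concat` no longer fails.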

