Problems using custom data sets #253

Closed
@xixioba

I am trying to test PV-RCNN on my own LiDAR data instead of the KITTI data. My annotations follow a format similar to the one used on Kaggle.
However, running the code fails with the following error:

File "***/OpenPCDet/pcdet/datasets/innovusion/innovusion_dataset.py", line 77, in __getitem__
    data_dict = self.prepare_data(data_dict=input_dict)
  File "***/OpenPCDet/pcdet/datasets/dataset.py", line 124, in prepare_data
    'gt_boxes_mask': gt_boxes_mask
  File "***/OpenPCDet/pcdet/datasets/augmentor/data_augmentor.py", line 93, in forward
    data_dict = cur_augmentor(data_dict=data_dict)
  File "***/OpenPCDet/pcdet/datasets/augmentor/database_sampler.py", line 179, in __call__
    sampled_boxes = np.stack([x['box3d_lidar'] for x in sampled_dict], axis=0).astype(np.float32)
  File "<__array_function__ internals>", line 6, in stack
  File "***/anaconda3/envs/ml/lib/python3.7/site-packages/numpy/core/shape_base.py", line 423, in stack
    raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack
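The last frame of the traceback can be reproduced in isolation. This is a minimal sketch, assuming that `sampled_dict` arrives at the `np.stack` call as an empty list (the variable name is taken from the traceback; the empty value is an assumption about what happens at runtime):

```python
import numpy as np

# Assumption: the sampler returned no database entries for this class.
sampled_dict = []

try:
    # Same expression as line 179 of database_sampler.py in the traceback.
    np.stack([x['box3d_lidar'] for x in sampled_dict], axis=0).astype(np.float32)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

If this matches the failure, the problem is not the stacking itself but that nothing was sampled for at least one class.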

I traced the error to the data augmentation code in pcdet/datasets/augmentor/database_sampler.py:

    def __call__(self, data_dict):
        """
        Args:
            data_dict:
                gt_boxes: (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]

        Returns:

        """
        gt_boxes = data_dict['gt_boxes']
        gt_names = data_dict['gt_names'].astype(str)
        existed_boxes = gt_boxes
        total_valid_sampled_dict = []
        for class_name, sample_group in self.sample_groups.items():
            if self.limit_whole_scene:
                num_gt = np.sum(class_name == gt_names)
                sample_group['sample_num'] = str(int(self.sample_class_num[class_name]) - num_gt)
            if int(sample_group['sample_num']) > 0:
                sampled_dict = self.sample_with_fixed_number(class_name, sample_group)  ### need help

                sampled_boxes = np.stack([x['box3d_lidar'] for x in sampled_dict], axis=0).astype(np.float32)

                if self.sampler_cfg.get('DATABASE_WITH_FAKELIDAR', False):
                    sampled_boxes = box_utils.boxes3d_kitti_fakelidar_to_lidar(sampled_boxes)

                iou1 = iou3d_nms_utils.boxes_bev_iou_cpu(sampled_boxes[:, 0:7], existed_boxes[:, 0:7])
                iou2 = iou3d_nms_utils.boxes_bev_iou_cpu(sampled_boxes[:, 0:7], sampled_boxes[:, 0:7])
                iou2[range(sampled_boxes.shape[0]), range(sampled_boxes.shape[0])] = 0
                iou1 = iou1 if iou1.shape[1] > 0 else iou2
                valid_mask = ((iou1.max(axis=1) + iou2.max(axis=1)) == 0).nonzero()[0]
                valid_sampled_dict = [sampled_dict[x] for x in valid_mask]
                valid_sampled_boxes = sampled_boxes[valid_mask]

                existed_boxes = np.concatenate((existed_boxes, valid_sampled_boxes), axis=0)
                total_valid_sampled_dict.extend(valid_sampled_dict)

        sampled_gt_boxes = existed_boxes[gt_boxes.shape[0]:, :]
        if total_valid_sampled_dict.__len__() > 0:
            data_dict = self.add_sampled_boxes_to_scene(data_dict, sampled_gt_boxes, total_valid_sampled_dict)

        data_dict.pop('gt_boxes_mask')
        return data_dict

The key function is sample_with_fixed_number(self, class_name, sample_group):

    def sample_with_fixed_number(self, class_name, sample_group):
        """
        Args:
            class_name:
            sample_group:
        Returns:

        """
        sample_num, pointer, indices = int(sample_group['sample_num']), sample_group['pointer'], sample_group['indices']
        if pointer >= len(self.db_infos[class_name]):
            indices = np.random.permutation(len(self.db_infos[class_name]))
            pointer = 0

        sampled_dict = [self.db_infos[class_name][idx] for idx in indices[pointer: pointer + sample_num]]
        pointer += sample_num
        sample_group['pointer'] = pointer
        sample_group['indices'] = indices
        return sampled_dict
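To see when this function can return an empty list, here is a standalone copy of its logic with `self.db_infos` passed in as an argument. It is only a sketch for experimenting outside OpenPCDet: the class name `'Car'`, the sample counts, and the dummy boxes are made-up test values, not taken from any real config.

```python
import numpy as np

def sample_with_fixed_number(db_infos, class_name, sample_group):
    """Standalone copy of the sampler logic; db_infos replaces self.db_infos."""
    sample_num = int(sample_group['sample_num'])
    pointer, indices = sample_group['pointer'], sample_group['indices']
    # Reshuffle once every entry of this class has been consumed.
    if pointer >= len(db_infos[class_name]):
        indices = np.random.permutation(len(db_infos[class_name]))
        pointer = 0

    sampled_dict = [db_infos[class_name][idx] for idx in indices[pointer: pointer + sample_num]]
    sample_group['pointer'] = pointer + sample_num
    sample_group['indices'] = indices
    return sampled_dict

# Case 1 (hypothetical): the database has no entries for the class.
empty_db = {'Car': []}
group = {'sample_num': '15', 'pointer': 0, 'indices': np.arange(0)}
empty_result = sample_with_fixed_number(empty_db, 'Car', group)

# Case 2 (hypothetical): three dummy entries, two requested.
full_db = {'Car': [{'box3d_lidar': np.zeros(7)} for _ in range(3)]}
group = {'sample_num': '2', 'pointer': 3, 'indices': np.arange(3)}
full_result = sample_with_fixed_number(full_db, 'Car', group)

print(len(empty_result), len(full_result))
```

With an empty per-class list the slice is empty, so the caller receives `[]` and the later `np.stack` call has nothing to stack.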

The code uses self.db_infos, which is specified by sampler_cfg.DB_INFO_PATH, but my data does not have such a file, so I am stuck here. What do I need to do to fix this, or is there a detailed explanation that would help me understand this code?
Note: my data annotation format is:

id confidence center_x center_y center_z width length height yaw class_name

Thank you all.
