Simple Question about object detection code

https://github.com/doc-doc/CoVGT/blob/cbc9fa7830b304f3c3f9c53040489ea9ad35a9aa/model/EncoderVid.py#L56-L71

Hi, I am reading your object detection code and found the snippet linked above in your EncoderVid.py.
Do you remember why you chose 5 dimensions (dim_bbox) for the positional embedding? Where does this convention come from (Faster R-CNN or Detectron)?

Thank you for your prompt response! Thanks for your great work!

    def forward(self, video_o):
        bsize, numc, numf, numr, fdim = video_o.shape

        video_o = video_o.view(bsize, numc*numf, numr, fdim)
        roi_feat = video_o[:, :, :, :self.dim_feat]
        roi_bbox = video_o[:, :, :, self.dim_feat:(self.dim_feat+self.dim_bbox)]

        bbox_pos = self.bbox_conv(roi_bbox.permute(
            0, 3, 1, 2)).permute(0, 2, 3, 1)

        bbox_features = torch.cat([roi_feat, bbox_pos], dim=-1)

        bbox_feat = self.tohid(bbox_features)

        return bbox_feat
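
To make the question concrete, here is a minimal sketch of one way a 5-dimensional box descriptor could be built: the four corner coordinates normalized by the frame size, plus the box's relative area. This is only my guess at the convention, and the repository does not confirm it. The helper name `bbox_to_5d` and its parameters are hypothetical and do not come from CoVGT.

```python
def bbox_to_5d(x1, y1, x2, y2, img_w, img_h):
    """Hypothetical helper (not from the CoVGT repo): encode a box as
    [x1/W, y1/H, x2/W, y2/H, relative_area]."""
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image width and height must be positive")
    # Normalize corner coordinates to the [0, 1] range.
    nx1, ny1 = x1 / img_w, y1 / img_h
    nx2, ny2 = x2 / img_w, y2 / img_h
    # Fraction of the frame covered by the box.
    rel_area = (nx2 - nx1) * (ny2 - ny1)
    return [nx1, ny1, nx2, ny2, rel_area]


# Example: a 200x200 pixel box inside a 500x400 frame.
box = bbox_to_5d(50, 100, 250, 300, img_w=500, img_h=400)
print(len(box))  # 5, which would match dim_bbox = 5 under this assumption
```

If the stored features follow a layout like this, the last `dim_bbox` values of each region vector would be these five numbers, which is what the `roi_bbox` slice in `forward` picks out.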


Simple Question about object detection code #11
