SMC (SenseMoCap) is a file format designed with multi-camera, multi-modal support in mind. Each SMC file is essentially an HDF5 database, which makes cross-platform and cross-language access straightforward (e.g. h5py for Python, H5Cpp for C++).
Each SMC file contains one sequence of 4D human data, with multiple data modalities organized in the following structure.
-
-
Attributes
- actor_id: actor id, int32 scalar
- action_id: action id, int32 scalar
- datetime_str: data collection time stamp, string (YYYY-MM-DD-hh-mm-ss)
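The `datetime_str` layout above maps directly onto a `strptime` pattern. The following is a minimal sketch, assuming the attribute has already been read into a Python `str` and that `hh` is a 24-hour value; `parse_smc_datetime` is a hypothetical helper (not part of MMHuman3D) and the sample value is made up.

```python
from datetime import datetime

# Pattern for YYYY-MM-DD-hh-mm-ss; treating hh as 24-hour is an assumption.
SMC_DATETIME_FORMAT = "%Y-%m-%d-%H-%M-%S"


def parse_smc_datetime(datetime_str: str) -> datetime:
    """Parse a datetime_str attribute into a naive datetime (hypothetical helper)."""
    return datetime.strptime(datetime_str, SMC_DATETIME_FORMAT)


# Made-up sample value, for illustration only.
stamp = parse_smc_datetime("2021-09-30-14-05-09")
print(stamp.isoformat())
```

A malformed string raises `ValueError`, which may be a convenient way to detect files with an unexpected attribute layout.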
-
- A JSON string holding the extrinsic parameters of N calibrated Kinects and M iPhones, with:
- Indexing cameras:
- Kinect Color Index: i*2, 0<=i<N
- Kinect Depth Index: i*2 + 1, 0<=i<N
- iPhone Index: N*2 + j, 0<=j<M
- Parameters:
- Extrinsic: cam2world transformation
- R: Rotation Matrix [3,3]
- T: Translation [3]
- Floor: floor parameter [4]
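The indexing rules and the cam2world convention above can be sketched as follows. This is an illustration under stated assumptions, not the MMHuman3D implementation: the helper names are hypothetical, and the JSON layout used here (camera index as a string key mapping to `R` and `T` entries) as well as the numeric values are made up for the example.

```python
import json


def kinect_color_index(i: int) -> int:
    """Camera index of the color camera of Kinect i (0 <= i < N)."""
    return i * 2


def kinect_depth_index(i: int) -> int:
    """Camera index of the depth camera of Kinect i (0 <= i < N)."""
    return i * 2 + 1


def iphone_index(j: int, num_kinects: int) -> int:
    """Camera index of iPhone j (0 <= j < M), placed after all Kinect cameras."""
    return num_kinects * 2 + j


def cam_to_world(point, R, T):
    """Apply p_world = R * p_cam + T using plain Python lists."""
    return [sum(R[r][c] * point[c] for c in range(3)) + T[r] for r in range(3)]


# Assumed layout (hypothetical): camera index as string -> {"R": 3x3, "T": 3}.
extrinsics_json = json.dumps({
    "0": {"R": [[1, 0, 0], [0, 1, 0], [0, 0, 1]], "T": [0.0, 0.0, 0.0]},
    "1": {"R": [[0, -1, 0], [1, 0, 0], [0, 0, 1]], "T": [1.0, 2.0, 3.0]},
})
extrinsics = json.loads(extrinsics_json)

# Look up the depth camera of Kinect 0 and move a camera-space point to world space.
cam = extrinsics[str(kinect_depth_index(0))]
print(cam_to_world([1.0, 0.0, 0.0], cam["R"], cam["T"]))
```

With N = 10 Kinects, for instance, the first iPhone would sit at index 20 under this scheme.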
-
-
Attributes
-
-
-
- Intrinsics (Dataset): K4A SDK factory calibrated intrinsic, float32, shape (15,)
- Resolution (Dataset): color camera resolution (width, height), uint16, shape (2,)
- MetricRadius (Dataset): metric radius from K4A SDK, float32 scalar
-
- Intrinsics (Dataset): K4A SDK factory calibrated intrinsic, float32, shape (15,)
- Resolution (Dataset): color camera resolution (width, height), uint16, shape (2,)
- MetricRadius (Dataset): metric radius from K4A SDK, float32 scalar
-
-
- Dataset with F(number of frames) color images
- RGBA Color image (byte array)
-
- Dataset with F(number of frames) depth images
- 16 bit depth image: 2D uint16 array with Shape H*W (576, 640)
-
- Dataset with F(number of frames) Infrared images
- 16 bit IR image: 2D uint16 array with Shape H*W (576, 640)
-
- Dataset with F(number of frames) body mask images from frame difference
- 8 bit body mask image: 2D uint8 array with Shape H*W (576, 640)
-
- Dataset with F(number of frames) body mask images from K4A Body Tracking SDK
- 8 bit body mask image: 2D uint8 array with Shape H*W (576, 640)
-
- Dataset with F(number of frames) Skeleton data from K4A Body Tracking SDK
- JSON: as specified in K4A SDK
-
- Background Images for matting. Available from v3 data
- Color : Same as Kinect Color
- Depth: Same as Kinect Depth
-
-
-
-
Attributes
- num_frame: number of iPhone frames, int32 scalar; approximately (number of Kinect frames) * 2 + 4
- color_resolution: iPhone RGB resolution (width, height), int32, shape (2,)
- depth_resolution: iPhone depth resolution (width, height), int32, shape (2,)
-
-
- Dataset with F(number of frames) color images
- RGBA Color image (byte array)
-
- Dataset with F(number of frames) depth images (from iPhone LiDAR)
- 16 bit depth image: 2D uint16 array with Shape H*W (192, 256)
-
- Dataset with F(number of frames) confidence maps
- 8 bit confidence: 2D uint8 array with Shape H*W (192, 256)
-
- Dataset with F(number of frames) body mask from Apple ARKit
- 8 bit body mask: 2D uint8 array with Shape H*W (192, 256)
-
- Dataset with F(number of frames) camera information
- JSON with camera intrinsics, timestamp etc
-
-
-
- Attributes
- num_frame: number of frames for 3D key points
- convention: convention for key points
- created_time: creation timestamp
- keypoints3d (Dataset): 3D keypoints computed with triangulate_optim
- Keypoints3d_mask (Dataset): corresponding mask
-
-
Kinect (Group)
-
- DeviceID
- 2D keypoints reprojected from the 3D keypoints; length aligned with the 3D keypoints
-
-
iPhone (Group)
-
- DeviceID
- 2D keypoints reprojected from the 3D keypoints; length aligned with the 3D keypoints
-
-
-
- Attributes
- num_frame: number of SMPL frames
- created_time: creation timestamp
- global_orient (Dataset): Global Orientation: Nx3
- body_pose (Dataset): Body Pose: Nx23x3
- betas (Dataset): SMPL Betas: 1x10
- transl (Dataset): Global Translation: Nx3
- keypoints3d (Dataset): SMPL Keypoints: Nx3
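The shapes listed above imply 1 + 23 = 24 rotations per frame. The sketch below shows one way to combine `global_orient` and `body_pose` into a flat 72-value pose vector per frame, assuming the rotations are stored as axis-angle vectors (the usual SMPL convention, not stated in this document). `full_pose` is a hypothetical helper and the input data is dummy zeros.

```python
def full_pose(global_orient, body_pose):
    """Concatenate per-frame global_orient (3 values) and body_pose (23 x 3)
    into one flat list of 72 values per frame (hypothetical helper)."""
    if len(global_orient) != len(body_pose):
        raise ValueError("global_orient and body_pose must share the frame count N")
    frames = []
    for root, joints in zip(global_orient, body_pose):
        if len(root) != 3 or len(joints) != 23 or any(len(j) != 3 for j in joints):
            raise ValueError("expected shapes Nx3 and Nx23x3")
        # Root rotation first, then the 23 body joint rotations in order.
        frames.append(list(root) + [v for joint in joints for v in joint])
    return frames


# Dummy data with N = 2 frames (all zeros, for illustration only).
N = 2
global_orient = [[0.0, 0.0, 0.0] for _ in range(N)]
body_pose = [[[0.0, 0.0, 0.0] for _ in range(23)] for _ in range(N)]
poses = full_pose(global_orient, body_pose)
print(len(poses), len(poses[0]))
```

Note that `betas` has shape 1x10, so a single shape vector is shared by all N frames of the sequence.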
-
Please first install MMHuman3D following the installation guide.
To read a .smc file, you may refer to the instructions below:
from mmhuman3d.data.data_structures.smc_reader import SMCReader
# Initialize a smc reader
smc_reader = SMCReader('/path/to/pxxxxxx_axxxxxx.smc')
# Get images
# Kinect IDs: from 0 to 9
# iPhone ID: 0; vertical: images are transformed from landscape to vertical
kinect_images = smc_reader.get_color(device='Kinect', device_id=0)
iphone_images = smc_reader.get_color(device='iPhone', device_id=0, vertical=True)
# Get extrinsics
kinect_extrinsics = smc_reader.get_kinect_color_extrinsics(kinect_id=0)
iphone_extrinsics = smc_reader.get_iphone_extrinsics(iphone_id=0)
# Get depth images
iphone_depth = smc_reader.get_iphone_depth(iphone_id=0)
# Get 2D keypoints
kinect_keypoints2d = smc_reader.get_keypoints2d(device='Kinect', device_id=0)
iphone_keypoints2d = smc_reader.get_keypoints2d(device='iPhone', device_id=0, vertical=True)
# Get 3D keypoints
keypoints3d = smc_reader.get_keypoints3d(device='Kinect', device_id=0)
# Get SMPL
smpl = smc_reader.get_smpl(device='Kinect', device_id=0)