Useful statistics for training models #20

@naga-karthik

I computed some statistics about the dataset that could be useful when training the segmentation models.

Subjects with and without lesions

Knowing the exact number of each is useful for designing a curriculum learning strategy: train the model for a few "warm-up" epochs only on the subjects with lesions, then gradually introduce the subjects without lesions.

Number of subjects without lesion: 56

['sub-m969884', 'sub-m139339', 'sub-m456943', 'sub-m831195', 'sub-m376420', 'sub-m824387', 'sub-m116619', 'sub-m598270', 'sub-m769014', 'sub-m804794', 'sub-m362157', 'sub-m927055', 
'sub-m665521', 'sub-m906416', 'sub-m162129', 'sub-m709160', 'sub-m852614', 'sub-m902962', 'sub-m659205', 'sub-m843987', 'sub-m128628', 'sub-m884947', 'sub-m012474', 'sub-m053662', 
'sub-m998939', 'sub-m373162', 'sub-m711452', 'sub-m073580', 'sub-m380212', 'sub-m597981', 'sub-m116500', 'sub-m790028', 'sub-m300747', 'sub-m991840', 'sub-m987382', 'sub-m936588',
 'sub-m747612', 'sub-m854598', 'sub-m838132', 'sub-m431499', 'sub-m387058', 'sub-m737478', 'sub-m090343', 'sub-m627960', 'sub-m441629', 'sub-m339071', 'sub-m206271', 'sub-m550628', 
'sub-m472036', 'sub-m553941', 'sub-m358902', 'sub-m826180', 'sub-m491476', 'sub-m554105', 'sub-m919335', 'sub-m299563']
Number of subjects with lesion: 163

['sub-m703984', 'sub-m545591', 'sub-m052556', 'sub-m552033', 'sub-m425924', 'sub-m531168', 'sub-m508874', 'sub-m205815', 'sub-m990877', 'sub-m484245', 'sub-m757346', 'sub-m868134', 
'sub-m723132', 'sub-m738530', 'sub-m322775', 'sub-m362600', 'sub-m707812', 'sub-m463857', 'sub-m597865', 'sub-m378204', 'sub-m026506', 'sub-m818513', 'sub-m718495', 'sub-m572861', 
'sub-m563712', 'sub-m977362', 'sub-m978163', 'sub-m829931', 'sub-m991145', 'sub-m295736', 'sub-m159764', 'sub-m531317', 'sub-m158425', 'sub-m360832', 'sub-m243433', 'sub-m142435', 
'sub-m221398', 'sub-m762797', 'sub-m724575', 'sub-m786260', 'sub-m560928', 'sub-m275415', 'sub-m818091', 'sub-m808926', 'sub-m522051', 'sub-m117189', 'sub-m556439', 'sub-m774069', 
'sub-m220491', 'sub-m434248', 'sub-m916671', 'sub-m694074', 'sub-m222399', 'sub-m839135', 'sub-m350871', 'sub-m763939', 'sub-m739531', 'sub-m793289', 'sub-m205610', 'sub-m023917', 
'sub-m310073', 'sub-m778290', 'sub-m717470', 'sub-m631090', 'sub-m704693', 'sub-m354066', 'sub-m772796', 'sub-m094254', 'sub-m698534', 'sub-m063690', 'sub-m757043', 'sub-m556894', 
'sub-m595577', 'sub-m573737', 'sub-m168132', 'sub-m356340', 'sub-m356026', 'sub-m816146', 'sub-m751383', 'sub-m944619', 'sub-m663069', 'sub-m698817', 'sub-m126053', 'sub-m621782', 
'sub-m909606', 'sub-m508941', 'sub-m673334', 'sub-m785774', 'sub-m978546', 'sub-m085197', 'sub-m312155', 'sub-m492109', 'sub-m798409', 'sub-m104714', 'sub-m993488', 'sub-m751075', 
'sub-m040509', 'sub-m843491', 'sub-m949797', 'sub-m977227', 'sub-m469393', 'sub-m558234', 'sub-m474555', 'sub-m878455', 'sub-m043194', 'sub-m664123', 'sub-m527202', 'sub-m029034', 
'sub-m087754', 'sub-m545924', 'sub-m809689', 'sub-m779887', 'sub-m403171', 'sub-m275864', 'sub-m569425', 'sub-m729353', 'sub-m617186', 'sub-m701054', 'sub-m333631', 'sub-m315309', 
'sub-m027847', 'sub-m707324', 'sub-m397667', 'sub-m339845', 'sub-m941876', 'sub-m841476', 'sub-m846990', 'sub-m870870', 'sub-m251271', 'sub-m243881', 'sub-m220667', 'sub-m124504', 
'sub-m172680', 'sub-m901378', 'sub-m245390', 'sub-m886317', 'sub-m094503', 'sub-m979943', 'sub-m640779', 'sub-m493131', 'sub-m379862', 'sub-m438239', 'sub-m730546', 'sub-m762599', 
'sub-m781551', 'sub-m072533', 'sub-m189434', 'sub-m115467', 'sub-m438273', 'sub-m838420', 'sub-m986156', 'sub-m644597', 'sub-m819426', 'sub-m329161', 'sub-m479421', 'sub-m684459', 
'sub-m504077', 'sub-m157227', 'sub-m551363', 'sub-m034619', 'sub-m412427', 'sub-m037477', 'sub-m292834']
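
The warm-up idea described above could be sketched roughly as follows. This is only an illustration, not the schedule used in this project: the function name `curriculum_subjects`, the parameters `warmup_epochs` and `ramp_epochs`, and the linear ramp are all assumptions. The two input lists would be the subject lists printed above.

```python
import random


def curriculum_subjects(with_lesion, without_lesion, epoch,
                        warmup_epochs=5, ramp_epochs=10, seed=0):
    """Return the subject IDs to train on at a given 0-indexed epoch.

    Hypothetical schedule: only lesion subjects during warm-up, then a
    linearly growing share of the lesion-free subjects.
    """
    if epoch < warmup_epochs:
        fraction = 0.0
    else:
        # Fraction of lesion-free subjects to include, capped at 1.0.
        fraction = min(1.0, (epoch - warmup_epochs + 1) / ramp_epochs)

    n_extra = round(fraction * len(without_lesion))

    # A fixed seed gives the same shuffled order at every epoch, so the
    # set of lesion-free subjects only grows and never swaps members.
    shuffled = list(without_lesion)
    random.Random(seed).shuffle(shuffled)

    return list(with_lesion) + shuffled[:n_extra]


# Toy example with placeholder IDs (not real subjects from the dataset).
lesion = ["sub-a", "sub-b", "sub-c"]
no_lesion = ["sub-w", "sub-x", "sub-y", "sub-z"]
for epoch in range(7):
    subjects = curriculum_subjects(lesion, no_lesion, epoch,
                                   warmup_epochs=2, ramp_epochs=4)
    print(epoch, len(subjects))
```

With 163 lesion subjects and 56 lesion-free subjects, the default arguments would add roughly 6 lesion-free subjects per epoch after the warm-up. Whether a linear ramp is the right choice here would need to be checked experimentally.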

Min/max sizes of the images and labels

To decide on an optimal cropping size, it is useful to know the largest and smallest dimensions across all subjects:

min along each dimension: [ 18  42 201]
max along each dimension: [160 175 713]

EDIT: Note that these dimensions correspond to the sizes of the preprocessed images that have been cropped using the spinal cord segmentation mask.
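
For reference, per-axis extents like the ones above can be obtained with a few lines of NumPy, and a fixed patch size can then be enforced by center-cropping or zero-padding each volume. The sketch below is an assumption about how this might be done, not the script that produced the numbers in this issue: `shape_extents` and `center_crop_or_pad` are hypothetical helper names, and the shapes are toy values rather than real image sizes.

```python
import numpy as np


def shape_extents(shapes):
    """Return (min, max) along each dimension for a list of image shapes."""
    arr = np.asarray(shapes)
    return arr.min(axis=0), arr.max(axis=0)


def center_crop_or_pad(volume, target_shape):
    """Center-crop axes that are too large and zero-pad axes that are too small."""
    out = np.zeros(target_shape, dtype=volume.dtype)
    src, dst = [], []
    for size, target in zip(volume.shape, target_shape):
        n = min(size, target)           # number of voxels copied on this axis
        src_start = (size - n) // 2     # offset into the input volume
        dst_start = (target - n) // 2   # offset into the output volume
        src.append(slice(src_start, src_start + n))
        dst.append(slice(dst_start, dst_start + n))
    out[tuple(dst)] = volume[tuple(src)]
    return out


# Toy shapes (hypothetical, not taken from the dataset).
toy_shapes = [(4, 6, 10), (8, 5, 12)]
mins, maxs = shape_extents(toy_shapes)
print("min along each dimension:", mins)
print("max along each dimension:", maxs)

volume = np.ones((4, 6, 10), dtype=np.float32)
patch = center_crop_or_pad(volume, (6, 4, 10))
print(patch.shape)
```

In practice the shapes would be read from the preprocessed image files. Given how far apart the minimum and maximum are along every axis, any single crop size will both pad the small volumes and crop the large ones.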
