Scalability and performance

We'd like to run an experiment to measure how different CPU, GPU, and memory configurations affect performance.

A proposed course of work (feedback encouraged):

We could deploy an instance of DSA on several AWS EC2 instance types and compare two timings on each: the time for the initial superpixel and feature generation, and the time for a few training iterations.
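
To make the two timings comparable across instances, something like the following could wrap each phase. This is only a sketch: `generate_features` and `train_iteration` are hypothetical placeholders for whatever actually triggers those steps in DSA, not real DSA functions.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Append the wall-clock duration of the enclosed block to results."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results.append({"phase": label, "seconds": time.perf_counter() - start})


def run_benchmark(generate_features, train_iteration, iterations=3):
    """Time one feature-generation pass, then a few training iterations.

    Both callables are placeholders (assumptions) supplied by the caller.
    """
    results = []
    with timed("superpixel_and_features", results):
        generate_features()
    for i in range(iterations):
        with timed(f"training_iteration_{i + 1}", results):
            train_iteration()
    return results
```

Each result row could then be written out alongside the instance type and image count so runs can be compared later.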

Possible things to compare to produce some benchmarks:
- number of CPUs or cores
- availability of a GPU (possibly trying different GPU classes)
- memory (often coupled with CPU count)
- images on local block storage versus S3
- number of images. We could have a few test sets, for example by taking a large number of images from the TCGA collection and measuring the speed for different numbers of images in each configuration. Ideally, we'd try sets that substantially exceed the number of CPU cores so the work can saturate the hardware. I might try powers of 2 or 4 (e.g., 1 image, 4, 16, 64, ...)
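
As a rough sketch of how the comparison grid above could be enumerated, the snippet below crosses instance types, storage backends, and image-set sizes that grow by powers of 4. The function names and the example values in the usage note are illustrative assumptions, not a recommendation of specific hardware.

```python
from itertools import product


def image_set_sizes(max_images, base=4):
    """Return 1, base, base**2, ... up to and including max_images."""
    if base < 2:
        raise ValueError("base must be at least 2")
    sizes = []
    n = 1
    while n <= max_images:
        sizes.append(n)
        n *= base
    return sizes


def benchmark_matrix(instance_types, storage_backends, max_images, base=4):
    """Build one run description per (instance type, storage, image count)."""
    sizes = image_set_sizes(max_images, base)
    return [
        {"instance_type": it, "storage": st, "num_images": n}
        for it, st, n in product(instance_types, storage_backends, sizes)
    ]
```

For example, `benchmark_matrix(["c5.4xlarge", "g4dn.xlarge"], ["ebs", "s3"], 64)` would describe 16 runs (2 instance types, 2 storage backends, 4 image-set sizes).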

Ideally we'd have some infrastructure-as-code way to deploy this so that we can reproduce the results, at least for deploying to a specific EC2 instance type and uploading our data, even if we kick off the individual jobs manually.
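
One lightweight option, short of a full Terraform or CloudFormation setup, would be to keep the launch parameters in code. The sketch below only builds the keyword arguments that could be passed to boto3's EC2 `run_instances` call; it does not contact AWS. The AMI ID, key name, root device name, volume size, and tag values are all placeholders to be replaced, and `launch_params` is a hypothetical helper.

```python
def launch_params(instance_type, ami_id, key_name,
                  volume_gb=200, run_label="dsa-benchmark"):
    """Build keyword arguments for an EC2 run_instances request.

    All defaults are assumptions; the root device name in particular
    depends on the chosen AMI.
    """
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": key_name,
        "MinCount": 1,
        "MaxCount": 1,
        "BlockDeviceMappings": [
            {
                "DeviceName": "/dev/sda1",  # assumption: varies by AMI
                "Ebs": {
                    "VolumeSize": volume_gb,
                    "VolumeType": "gp3",
                    "DeleteOnTermination": True,
                },
            }
        ],
        "TagSpecifications": [
            {
                "ResourceType": "instance",
                "Tags": [
                    {"Key": "Name", "Value": f"{run_label}-{instance_type}"}
                ],
            }
        ],
    }
```

With boto3 installed and credentials configured, this could be used as `boto3.client("ec2").run_instances(**launch_params(...))`. Committing the parameters next to the benchmark results would record which configuration produced which numbers.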


Scalability and performance #122
