
Scalability and performance #122

@manthey

We'd like to run an experiment to measure how CPU, GPU, and memory resources affect performance.

A proposed course of work (feedback encouraged):

We could deploy an instance of DSA on several AWS EC2 instance types and, on each, compare the time for the initial superpixel and feature generation and the time for a few training iterations.
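
As a rough sketch of how those two timings could be collected, something like the following might work. The `generate_features` and `train_iteration` callables are hypothetical stand-ins for whatever actually triggers superpixel/feature generation and a training iteration in a deployed instance; they are not part of any existing API.

```python
import time


def run_benchmark(generate_features, train_iteration, n_iterations=3):
    """Time one feature-generation pass and a few training iterations.

    Both callables are hypothetical placeholders supplied by the caller.
    Returns wall-clock durations in seconds.
    """
    results = {}

    # Time the initial superpixel and feature generation as a single step.
    start = time.perf_counter()
    generate_features()
    results["feature_generation_s"] = time.perf_counter() - start

    # Time each training iteration separately so variance is visible.
    iteration_times = []
    for _ in range(n_iterations):
        start = time.perf_counter()
        train_iteration()
        iteration_times.append(time.perf_counter() - start)
    results["training_iteration_s"] = iteration_times

    return results


if __name__ == "__main__":
    # Stand-in workloads; replace with calls into the real deployment.
    timings = run_benchmark(lambda: time.sleep(0.01), lambda: time.sleep(0.005))
    print(timings)
```

Recording each training iteration on its own, rather than only a total, makes it easier to spot warm-up effects such as the first iteration being slower than the rest.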

Possible things to compare to produce some benchmarks:

  • number of CPUs or cores
  • availability of a GPU (possibly try different GPU classes)
  • memory (often coupled with CPU count)
  • images on local block storage versus S3
  • number of images. We could have a few test sets, perhaps drawn from a large number of images in the TCGA collection, and measure the speed with different numbers of images in each configuration. Ideally, we'd like to try sets that substantially exceed the number of CPU cores so the work can saturate the hardware. I might try powers of 2 or 4 (e.g., 1 image, 4, 16, 64, ...)
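
To make the comparison grid concrete, here is a minimal sketch that enumerates the runs as the cross product of instance type, storage backend, and image-set size, with set sizes growing by powers of 4. The instance type names and the `max_images` value are illustrative placeholders, not a proposed final selection.

```python
from itertools import product


def image_set_sizes(max_images, base=4):
    """Return set sizes 1, base, base**2, ... up to and including max_images."""
    sizes = []
    n = 1
    while n <= max_images:
        sizes.append(n)
        n *= base
    return sizes


def benchmark_matrix(instance_types, storage_backends, max_images, base=4):
    """Enumerate every (instance type, storage, image count) combination."""
    return [
        {"instance_type": inst, "storage": storage, "n_images": n}
        for inst, storage, n in product(
            instance_types, storage_backends, image_set_sizes(max_images, base)
        )
    ]


if __name__ == "__main__":
    # Placeholder choices: one CPU-only and one GPU instance type.
    runs = benchmark_matrix(
        instance_types=["c5.4xlarge", "g4dn.xlarge"],
        storage_backends=["local", "s3"],
        max_images=64,
    )
    for run in runs:
        print(run)
```

With two instance types, two storage backends, and set sizes 1, 4, 16, and 64, this yields 16 runs; passing `base=2` would give the finer-grained powers-of-2 series instead.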

Ideally we'd have an infrastructure-as-code way to deploy this so that the results are reproducible, at least for deploying to a specific EC2 instance type and uploading our data, even if we kick off the individual jobs manually.
