Skip to content

[RFC] Batteries Included - Phase 3 #6323

Open
@datumbox

Description

@datumbox

🚀 The feature

Note: To track the progress of the project check out this board.

This is the 3rd phase of TorchVision's modernization project (see phase 1 and 2). We aim to keep TorchVision relevant by ensuring it provides off-the-shelf all the necessary primitives, model architectures and recipe utilities to produce SOTA results for the supported Computer Vision tasks.

1. New Primitives

To enable our users to reproduce the latest state-of-the-art research we will enhance TorchVision with the following data augmentations, layers, losses and other operators:

Data Augmentations

Losses

Operators added in PyTorch Core

2. New Architectures & Model Iterations

To ensure that our users have access to the most popular SOTA models, we will add the following architectures along with pre-trained weights:

Image Classification

Video Classification

3. Improved Training Recipes & Pre-trained models

To ensure that are users can have access to strong baselines and SOTA weights, we will improve our training recipes to incorporate the newly released primitives and offer improved pre-trained models:

Reference Scripts

Pre-trained weights

  • Improve the accuracy of Video models

Other Candidates

There are several other Operators (#5414), Losses (#2980), Augmentations (#3817) and Models (#2707) proposed by the community. Here are some potential candidates that we could implement depending on bandwidth. Contributions are welcome for any of the below:

cc @datumbox @vfdev-5

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions