JSoC 2021: warp AD and its application -- the roadmap #11

@johnnychen94

Abstract: Build a trainable warp pipeline and apply it to image processing and computer vision.

Author: @SomTambe

Mentors: @johnnychen94 and @DhairyaLGandhi

Introduction

A backward-mode warp operation involves two functions: a backward coordinate map ϕ and a value estimator τ. The warp is applied pointwise: Y[p] = τ(ϕ(p), X).

Depending on whether we make them trainable, there are four options:

  • [Option 1] ϕ and τ are not trainable
  • [Option 2] ϕ is trainable and τ is not trainable
  • [Option 3] ϕ is not trainable and τ is trainable
  • [Option 4] ϕ and τ are trainable

Option 1 is the typical, classical case: ϕ is built heuristically and τ is typically an interpolation/extrapolation scheme.

To make options 2, 3, and 4 work, we need to backpropagate three gradients: ∂L/∂Wϕ, ∂L/∂Wτ, and ∂L/∂X, where L is the loss and Wϕ and Wτ are the trainable parameters of ϕ and τ.
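
As a sketch of where these three gradients come from, apply the chain rule to Y[p] = τ(ϕ(p), X). The notation below is an assumption for illustration: W_ϕ and W_τ are the trainable parameters of ϕ and τ, L is a scalar loss on Y, and q_p is the sampled coordinate for output pixel p.

```latex
\begin{aligned}
q_p &= \phi(p;\, W_\phi), \qquad Y[p] = \tau(q_p, X;\, W_\tau), \\
\frac{\partial L}{\partial W_\tau} &= \sum_p \frac{\partial L}{\partial Y[p]}\,
    \frac{\partial \tau(q_p, X; W_\tau)}{\partial W_\tau}, \\
\frac{\partial L}{\partial X} &= \sum_p \frac{\partial L}{\partial Y[p]}\,
    \frac{\partial \tau(q_p, X; W_\tau)}{\partial X}, \\
\frac{\partial L}{\partial W_\phi} &= \sum_p \frac{\partial L}{\partial Y[p]}\,
    \frac{\partial \tau(q, X; W_\tau)}{\partial q}\bigg|_{q = q_p}
    \frac{\partial \phi(p; W_\phi)}{\partial W_\phi}.
\end{aligned}
```

The last line needs τ to be differentiable with respect to the sampling coordinate q. Linear interpolation is piecewise differentiable, so it qualifies; nearest-neighbor lookup has zero gradient almost everywhere, so it cannot train ϕ.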

Applications

These are just simple, naive ideas; there are more possibilities, and more issues will need to be solved along the way.

Registration and mapping

This is a direct application of option 2.

Idea: instead of using a predefined transformation ϕ, we can train a small network to do the job.

Issues that need to be identified and addressed:

  • Real distortions are usually non-linear and elastic, so a single linear transformation can't do the job. How do we design the forward pass of ϕ so that the model can overfit real-world images?
  • How do we train it on batches?
  • How many parameters do we need, and how does it perform in terms of both loss and running time?
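
A toy version of the option 2 idea, as a sketch only: a 1-D signal, a single trainable shift `t` in ϕ(p) = p + t instead of a small network, fixed linear interpolation as τ, and plain gradient descent with a hand-written gradient. The names, the sine test signal, and the step size are all assumptions chosen so the example runs with the standard library alone. It does not address elastic distortions or batching.

```python
import math

def lerp(X, q):
    """Fixed value estimator tau: linear interpolation of the 1-D signal X at q.
    Returns the interpolated value and its derivative with respect to q."""
    i = int(math.floor(q))
    i = max(0, min(i, len(X) - 2))  # clamp so that X[i + 1] exists
    slope = X[i + 1] - X[i]
    return X[i] + (q - i) * slope, slope

def loss_and_grad(X, t, target):
    """Forward pass Y[p] = tau(phi(p), X) with phi(p) = p + t,
    plus dL/dt for the mean squared error against target."""
    n = len(target)
    loss, grad = 0.0, 0.0
    for p in range(n):
        y, dy_dq = lerp(X, p + t)
        r = y - target[p]
        loss += r * r / n
        # Chain rule: dL/dY[p] * dtau/dq * dphi/dt, where dphi/dt = 1.
        grad += 2.0 * r * dy_dq / n
    return loss, grad

# Smooth test signal and a target produced by a known shift (both assumptions).
X = [math.sin(2 * math.pi * i / 20) for i in range(48)]
true_t = 1.3
target = [lerp(X, p + true_t)[0] for p in range(40)]

t = 0.0  # trainable parameter W_phi
for _ in range(200):
    loss, grad = loss_and_grad(X, t, target)
    t -= 5.0 * grad  # gradient-descent step; the step size is hand-picked

print(t, loss)
```

The gradient with respect to `t` only exists because τ exposes its slope in q, which is the part of the adjoint that a fixed, non-trainable warp never needs.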

Super resolution

This is a direct application of option 3.

Idea: instead of using a fixed-weight interpolation function τ, we can train a very small network to do the job.

Issues that need to be identified and addressed:

  • Performance: unlike a typical CNN, the interpolation network τ requires length(Y) forward passes to generate the entire image, which is a gigantic cost.
  • How do we train it on batches?
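
A toy version of the option 3 idea, again only a sketch: ϕ is fixed (each output sample sits halfway between two input samples) and τ is a two-weight linear model rather than a network. The names, the sine training signal, and the learning rate are assumptions. The inner loop makes the cost issue visible: τ is evaluated once per output sample.

```python
import math

def tau(W, X, q):
    """Trainable value estimator: a weighted sum of the two samples around q.
    W = [0.5, 0.5] would reproduce linear interpolation at a half-sample offset."""
    i = int(math.floor(q))
    return W[0] * X[i] + W[1] * X[i + 1]

def phi(p):
    # Fixed coordinate map: output sample p reads from position p + 0.5.
    return p + 0.5

omega = math.pi / 4
X = [math.sin(omega * i) for i in range(33)]                # low-resolution input
target = [math.sin(omega * (p + 0.5)) for p in range(32)]   # true half-sample values

W = [0.0, 0.0]  # trainable parameters W_tau
n = len(target)
for _ in range(500):
    grad = [0.0, 0.0]
    for p in range(n):  # one forward pass of tau per output sample
        q = phi(p)
        i = int(math.floor(q))
        r = tau(W, X, q) - target[p]
        grad[0] += 2.0 * r * X[i] / n      # dL/dW[0] for the mean squared error
        grad[1] += 2.0 * r * X[i + 1] / n  # dL/dW[1]
    W = [W[0] - 0.5 * grad[0], W[1] - 0.5 * grad[1]]

print(W)
```

For this particular signal the loss is minimized at W[0] = W[1] = 1 / (2 cos(ω/2)), slightly above 0.5, since sin(ωp) + sin(ω(p+1)) = 2 sin(ω(p+0.5)) cos(ω/2). So even a two-parameter τ can beat fixed linear interpolation on data it is fitted to.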

Roadmap

This roadmap is organized very coarsely. How we approach each task belongs in other, more specific issues.

Framework support:

  • AD support for warp in our playground, DiffImages. This is a must before we approach concrete applications.
  • Move the associated adjoints to upstream packages: ImageCore, ImageTransformations, Interpolations, and others. This is for maintenance purposes, so we can leave it as post-cleanup work.

Applications:

  • Image registration, or a similar project, to apply our option 2 idea
  • Super resolution, or a similar project, to apply our option 3 idea

We should treat each application as a separate project, so every application idea should be managed in its own repo with an associated Project.toml and Manifest.toml.

Timeline and evaluation

Our progress is currently well behind schedule due to unfamiliarity with AD and warp mechanisms. It's hard to set a strict timeline for research projects, so this is a loose one for reference purposes:

  • We should make our warp AD work before Aug 7. This is the main focus of the mid-term evaluation.
  • Build one small demo for each application idea before the final evaluation.
  • Making it work in practice (i.e., beating existing heavy CNN-based models) is left to future work. It does not belong to the scope of this JSoC project evaluation.
