Description
Abstract: build a trainable warp pipeline and apply it to image processing and computer vision tasks.
author: @SomTambe
Mentors: @johnnychen94 and @DhairyaLGandhi
Introduction
A backward-mode warp operation involves two functions: a backward coordinate map ϕ and a value estimator τ. The warp operation is applied pointwise: Y[p] = τ(ϕ(p), X).
Depending on whether we make them trainable or not, there are four options:
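To make the definition Y[p] = τ(ϕ(p), X) concrete, here is a minimal sketch in plain Python. It is not the ImageTransformations API: the names `warp`, `bilinear`, and `shift_right_half` are hypothetical, and bilinear interpolation with a constant fill value is only one assumed choice for τ.

```python
import math

def bilinear(q, X, fill=0.0):
    """Value estimator tau: bilinearly interpolate X at real-valued q = (row, col)."""
    r, c = q
    h, w = len(X), len(X[0])
    if not (0 <= r <= h - 1 and 0 <= c <= w - 1):
        return fill  # extrapolation: constant fill outside the image
    r0, c0 = math.floor(r), math.floor(c)
    r1, c1 = min(r0 + 1, h - 1), min(c0 + 1, w - 1)
    fr, fc = r - r0, c - c0
    top = (1 - fc) * X[r0][c0] + fc * X[r0][c1]
    bottom = (1 - fc) * X[r1][c0] + fc * X[r1][c1]
    return (1 - fr) * top + fr * bottom

def warp(X, phi, tau, out_shape):
    """Backward warp: Y[p] = tau(phi(p), X) for every output index p."""
    h, w = out_shape
    return [[tau(phi((i, j)), X) for j in range(w)] for i in range(h)]

def shift_right_half(p):
    """Hypothetical coordinate map phi: output p reads half a pixel to its left."""
    i, j = p
    return (i, j - 0.5)

X = [[0.0, 2.0, 4.0],
     [6.0, 8.0, 10.0]]
Y = warp(X, shift_right_half, bilinear, (2, 3))
# Y == [[0.0, 1.0, 3.0], [0.0, 7.0, 9.0]]
```

Note that the loop runs over output indices p, not input indices: each output pixel pulls its value from the input, which is what makes this the backward mode.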
- [Option 1] ϕ and τ are not trainable
- [Option 2] ϕ is trainable and τ is not trainable
- [Option 3] ϕ is not trainable and τ is trainable
- [Option 4] ϕ and τ are trainable
Option 1 is the typical and classical case: ϕ is built heuristically and τ is typically an interpolation/extrapolation scheme.
To make options 2, 3, and 4 work, we need to backpropagate three gradients: ∂L/∂Wϕ, ∂L/∂Wτ, and ∂L/∂X, where Wϕ and Wτ are the trainable parameters of ϕ and τ, and L is the loss.
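As a sketch of what these three gradients look like, the chain rule can be applied to the pointwise definition. The notation below is an assumption made for illustration: Wϕ and Wτ are taken to be the parameters of ϕ and τ, L is a scalar loss on Y, q_p is shorthand for the sampled coordinate, and ϕ is assumed not to depend on X.

```latex
\begin{aligned}
q_p &= \phi(p;\, W_\phi), \qquad Y[p] = \tau(q_p, X;\, W_\tau), \\
\frac{\partial L}{\partial W_\phi}
  &= \sum_p \frac{\partial L}{\partial Y[p]}\,
     \frac{\partial \tau}{\partial q}\bigg|_{q = q_p}\,
     \frac{\partial \phi(p;\, W_\phi)}{\partial W_\phi}, \\
\frac{\partial L}{\partial W_\tau}
  &= \sum_p \frac{\partial L}{\partial Y[p]}\,
     \frac{\partial \tau(q_p, X;\, W_\tau)}{\partial W_\tau}, \\
\frac{\partial L}{\partial X}
  &= \sum_p \frac{\partial L}{\partial Y[p]}\,
     \frac{\partial \tau(q_p, X;\, W_\tau)}{\partial X}.
\end{aligned}
```

The first line shows why training ϕ requires τ to be differentiable with respect to the sampled coordinate q, not only with respect to the pixel values.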
Applications
These are just very simple and naive ideas. There are more possibilities, and there will be more issues that need to be solved.
Registration and mapping
This is a direct application of option 2.
Idea: instead of using a predefined transformation ϕ, we can train a small network to do the job.
Issues that need to be identified and addressed:
- Real distortions are usually non-linear and elastic, so a single linear transformation can't do the job. How do we design the forward pass of ϕ so that the model can overfit real-world images?
- How do we train it on batches?
- How many parameters do we need, and how does it perform in terms of both loss and running time?
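A toy version of the option 2 idea can be sketched in one dimension. This is a deliberately simplified assumption, not the proposed design: ϕ(p) = p + t has a single trainable translation t instead of a small network, τ is fixed linear interpolation, and the gradient is written by hand rather than obtained from AD. All function names are hypothetical.

```python
import math

def linear_interp(X, q):
    """Fixed tau: linear interpolation of X at q. Returns (value, d value / d q)."""
    i0 = min(max(math.floor(q), 0), len(X) - 2)  # clamp so both taps are valid
    slope = X[i0 + 1] - X[i0]
    return X[i0] + (q - i0) * slope, slope

def warp_shift(X, t):
    """Y[p] = tau(phi(p), X) with phi(p) = p + t. Also returns dY[p]/dt."""
    pairs = [linear_interp(X, p + t) for p in range(len(X))]
    return [v for v, _ in pairs], [s for _, s in pairs]

def fit_shift(X, target, t=0.0, lr=1.0, steps=200):
    """Gradient descent on L = 0.5 * sum((Y - target)^2) with respect to t."""
    for _ in range(steps):
        Y, dY_dt = warp_shift(X, t)
        # Chain rule: dL/dt = sum_p (Y[p] - target[p]) * dtau/dq * dphi/dt, dphi/dt = 1.
        grad = sum((y - z) * g for y, z, g in zip(Y, target, dY_dt))
        t -= lr * grad
    return t

# A smooth bump as the moving signal; the fixed signal is the bump shifted by 2.3.
X = [math.exp(-((i - 20) / 4.0) ** 2) for i in range(40)]
true_shift = 2.3
target, _ = warp_shift(X, true_shift)
learned_shift = fit_shift(X, target)
```

With this smooth signal and starting point, gradient descent recovers the shift. Real registration would need a richer ϕ, as the first issue above points out, and a loss landscape this benign should not be expected in general.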
Super resolution
This is a direct application of option 3.
Idea: instead of using a fixed-weight interpolation function τ, we can train a very small network to do the job.
Issues that need to be identified and addressed:
- The performance: unlike a typical CNN, the interpolation network τ requires length(Y) forward passes to generate the entire image, which is gigantic.
- How to train it on batches.
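The option 3 idea, and the cost noted in the first issue, can be illustrated with a 1D toy. The parametrization here is an assumption chosen for brevity, not a proposed architecture: τ is a linear model over four neighbouring samples whose weights depend linearly on the fractional offset, initialized to plain linear interpolation and trained with full-batch gradient descent. All names are hypothetical.

```python
import math

def tau(q, X, W):
    """Trainable value estimator: 4 neighbours of q, weights linear in the offset f.
    Returns (value, features); the features are reused for the gradient."""
    i0 = math.floor(q)
    f = q - i0
    feats = []
    for k in range(4):
        x = X[min(max(i0 - 1 + k, 0), len(X) - 1)]  # clamp at the borders
        feats += [x, f * x]
    return sum(w * v for w, v in zip(W, feats)), feats

def upsample2x(X, W):
    """Y[p] = tau(phi(p), X) with phi(p) = p / 2: one tau call per output sample."""
    return [tau(p / 2.0, X, W) for p in range(2 * len(X) - 1)]

def mse(X, target, W):
    outs = upsample2x(X, W)
    return sum((y - t) ** 2 for (y, _), t in zip(outs, target)) / len(target)

def train(X, target, W, lr=0.1, steps=300):
    """Full-batch gradient descent on the mean squared error with respect to W."""
    for _ in range(steps):
        grad = [0.0] * len(W)
        for (y, feats), t in zip(upsample2x(X, W), target):
            for k, v in enumerate(feats):
                grad[k] += 2.0 * (y - t) * v / len(target)
        W = [w - lr * g for w, g in zip(W, grad)]
    return W

high_res = [math.sin(0.3 * i) for i in range(63)]  # ground-truth signal
X = high_res[::2]                                  # low-resolution input
W0 = [0.0, 0.0, 1.0, -1.0, 0.0, 1.0, 0.0, 0.0]     # equals linear interpolation
W = train(X, high_res, W0)
initial_mse, trained_mse = mse(X, high_res, W0), mse(X, high_res, W)
```

`upsample2x` makes the cost visible: τ is evaluated once per output sample, so generating Y takes length(Y) evaluations. Even for this tiny model that count grows with the output size, which is the performance concern raised above.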
Roadmap
This roadmap is organized very coarsely. The details of how we approach each task belong in separate, more specific issues.
Framework support:
- AD support for warp in our playground DiffImages. This is a must before we approach concrete applications.
- Move the associated adjoints to upstream packages: ImageCore, ImageTransformations, Interpolations, and others. This is for maintenance purposes, so we can leave it as post-cleanup work.
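To illustrate what an adjoint for warp has to compute on the ∂L/∂X path, here is a minimal 1D sketch in plain Python. It assumes a fixed set of sampling coordinates and linear interpolation, and it is not the rule as it would be written for the Julia packages above. The function names are hypothetical.

```python
import math

def taps(q, n):
    """Linear-interpolation taps for coordinate q: [(index, weight), ...]."""
    i0 = min(max(math.floor(q), 0), n - 2)
    f = q - i0
    return [(i0, 1.0 - f), (i0 + 1, f)]

def warp(X, coords):
    """Forward pass: Y[p] = sum_i w_i * X[i], a gather from X."""
    return [sum(w * X[i] for i, w in taps(q, len(X))) for q in coords]

def warp_adjoint_X(dY, coords, n):
    """Adjoint w.r.t. X: scatter each dL/dY[p] back onto the taps it read."""
    dX = [0.0] * n
    for g, q in zip(dY, coords):
        for i, w in taps(q, n):
            dX[i] += w * g
    return dX

X = [1.0, 2.0, 4.0, 8.0, 16.0]
coords = [0.25, 1.5, 2.0, 3.75]   # phi(p) for each output index p
dY = [1.0, -1.0, 0.5, 2.0]        # incoming gradient dL/dY

Y = warp(X, coords)
dX = warp_adjoint_X(dY, coords, len(X))

# Dot-product check: <warp(X), dY> must equal <X, adjoint(dY)> for a linear map.
lhs = sum(y * g for y, g in zip(Y, dY))
rhs = sum(x * g for x, g in zip(X, dX))
```

Because the warp is linear in X when ϕ is fixed, the forward gather and the backward scatter are exact transposes of each other. That makes the dot-product identity a convenient unit test for a hand-written adjoint.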
Applications:
- image registration or other similar project to apply our option 2 idea
- super resolution or other similar project to apply our option 3 idea
We should treat each application as a separate project, so every application idea should be managed in its own repo with an associated Project.toml and Manifest.toml.
Timeline and evaluation
Our progress is currently delayed considerably because of our unfamiliarity with AD and warp mechanisms. It's hard to set a strict timeline for a research project, so this is a loose one for reference purposes:
- We should make our warp AD work before Aug 7. This is the main focus of the mid-term evaluation.
- Build one small demo for each application idea before the final evaluation.
- Making it work in practice (i.e., beating the existing heavy CNN-based models) is left to future work and is outside the scope of this JSoC project evaluation.