CapsNet
- Due to subsampling (pooling), spatial relationships between features in an image are lost.
- Convolutional networks handle an image poorly if it appears in a different position, rotated, or upside down -- hence the need for data augmentation, as used in AlexNet.
- Invariance vs. equivariance
- The goal of subsampling (pooling) is to make the network invariant to small changes in spatial position.
- It is better to aim for equivariance: if the image is rotated, the network's representation should change correspondingly, i.e. it should learn the transformation (see the formalization below).
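As a brief formalization of the distinction (the symbols $f$, $T$, and $T'$ are illustrative, not from these notes): for an input transformation $T$ (e.g. a rotation) and a feature map $f$,

```latex
% Invariance: the output does not change under the transformation
f(T(x)) = f(x)
% Equivariance: the output transforms predictably, via a corresponding T'
f(T(x)) = T'(f(x))
```

Pooling aims at the first property; capsules aim at the second.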
- Idea: the human brain presumably achieves translation invariance in a better way than pooling.
- The brain has different modules, called capsules in this analogy, each of which handles a different kind of stimulus.
- In a CNN, routing through the network is done via pooling: the pipeline is convolution, non-linearity, and routing by pooling.
- A better way to route data is needed.
- Basic idea: instead of adding another layer, nest a layer within a layer. The nested layer is called a capsule: a group of neurons whose activity is treated as a single vector.
- Thus, instead of making the network deeper by stacking layers, make it deep in terms of nesting, i.e. its inner structure.
- The capsule-based model is more robust to translation and rotation.
- Layer-based squashing:
- In a typical NN, each neuron's output is individually squashed by a non-linearity such as ReLU.
- Instead of applying the non-linearity to a single neuron, group neurons into a capsule and apply the non-linearity to the capsule's whole output vector (a sketch of the squashing function follows).
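A minimal NumPy sketch of such a vector squashing non-linearity (the form used in Sabour et al.'s CapsNet paper); the function name, `eps`, and the shapes in the usage example are illustrative choices:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squash a capsule's output vector s:
    v = (||s||^2 / (1 + ||s||^2)) * (s / ||s||).
    Short vectors shrink toward zero, long vectors approach unit length,
    so a capsule's length can be read as an activation probability."""
    squared_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    norm = np.sqrt(squared_norm + eps)  # eps avoids division by zero
    return (squared_norm / (1.0 + squared_norm)) * (s / norm)

# Illustrative usage: 10 capsules, each an 8-dimensional vector.
capsules = np.random.randn(10, 8)
v = squash(capsules)
print(np.linalg.norm(v, axis=-1))  # every length lies in [0, 1)
```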
- Dynamic routing:
- Replace scalar-output feature detectors with vector-output capsules.
- Replace max-pooling with routing-by-agreement (a sketch follows).
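A minimal NumPy sketch of routing-by-agreement, assuming the prediction vectors `u_hat` (input capsule i's prediction `W_ij @ u_i` for each output capsule j) have been computed beforehand; the names, shapes, and three-iteration default are illustrative:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Vector non-linearity from the squashing sketch above.
    sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_routing(u_hat, num_iterations=3):
    """u_hat: prediction vectors, shape (num_in, num_out, out_dim).
    Returns output capsule vectors v, shape (num_out, out_dim)."""
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))              # routing logits, start uniform
    for _ in range(num_iterations):
        c = softmax(b, axis=1)                   # coupling coefficients per input capsule
        s = (c[..., None] * u_hat).sum(axis=0)   # weighted sum of predictions
        v = squash(s)                            # candidate output capsules
        b += (u_hat * v[None]).sum(axis=-1)      # strengthen routes whose predictions agree with v
    return v

# Illustrative usage: 32 input capsules routing to 10 output capsules of dim 16.
u_hat = np.random.randn(32, 10, 16)
print(dynamic_routing(u_hat).shape)  # (10, 16)
```

The agreement term is a dot product: an input capsule's coupling to an output capsule grows when its prediction points in the same direction as that capsule's current output, which is how routing replaces the hard winner-take-all choice of max-pooling.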