pytorch.org/tutorials/beginner/basics/intro
- Consistent API
- Automates common deep-learning machinery (e.g. gradient computation)
- PyTorch and TensorFlow are roughly equivalent now; it used to be that PyTorch was easier to get started with and TensorFlow was better for production
- PyTorch is more cutting edge and more Pythonic

- Tensor: a multi-dimensional array object
- torch.nn documentation: https://pytorch.org/docs/stable/nn.html
- nn modules contain Tensors; the layer categories in the docs:

- Containers
- Convolution Layers
- Pooling layers
- Padding Layers
- Non-linear Activations (weighted sum, nonlinearity)
- Non-linear Activations (other)
- Normalization Layers
- Recurrent Layers
- Transformer Layers
- Linear Layers
- Dropout Layers
- Sparse Layers
- Distance Functions
- Loss Functions
- Vision Layers
- Shuffle Layers
- DataParallel Layers (multi-GPU, distributed)
- Utilities
- Quantized Functions
- Lazy Modules Initialization
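A few of these categories can be composed in one line, e.g. a Container (`nn.Sequential`) holding Linear, Non-linear Activation, and Dropout layers. A minimal sketch, assuming PyTorch is installed; the layer sizes here are arbitrary:

```python
import torch
import torch.nn as nn

# A Container holding layers from several of the categories above
model = nn.Sequential(
    nn.Linear(4, 8),    # Linear Layers
    nn.ReLU(),          # Non-linear Activations
    nn.Dropout(p=0.5),  # Dropout Layers
    nn.Linear(8, 2),
)

x = torch.randn(3, 4)  # batch of 3 samples, 4 features each
out = model(x)
print(out.shape)       # torch.Size([3, 2])
```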
- A tensor with requires_grad=True is not just the data; it also carries its autograd (gradient) history. That is why you cannot convert such a tensor straight to NumPy with .numpy() - you have to .detach() it from the graph first
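This is easy to see directly (a minimal sketch, assuming PyTorch built with NumPy support):

```python
import torch

t = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)

# Direct conversion fails: the tensor carries autograd history
try:
    t.numpy()
except RuntimeError as e:
    print("cannot convert:", e)

# Detach from the autograd graph first, then convert
arr = t.detach().numpy()
print(arr)  # [ 1. -2.  3.]
```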
- ReLU turns everything less than 0 into 0 and leaves the rest unchanged: ReLU(x) = max(0, x)
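For example, applied elementwise to a small tensor (using the functional form `torch.relu`):

```python
import torch

x = torch.tensor([-2.0, -0.5, 0.0, 1.5])
y = torch.relu(x)  # negatives become 0, non-negatives pass through
print(y)
```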





