[FEATURE]: Integration of M5 Network Architecture for Audio Processing #1161

### Feature Name

Integrate the M5 network architecture into the Deep Learning Playground to improve its audio data processing. M5 is a convolutional network that operates directly on raw audio, and its design centers on the receptive field of the first layer's filters.

### Your Name

Surya Subramanian

### Description

We are building an audio trainspace in the Deep Learning Playground. As part of this, we need a convolutional neural network that processes raw audio data. The architecture we want to implement is modeled after the M5 network, which is described in detail in this paper: https://arxiv.org/pdf/1610.00087.pdf

Here is the Python code for the M5 network architecture:

(also available here: https://colab.research.google.com/github/pytorch/tutorials/blob/gh-pages/_downloads/audio_classifier_tutorial.ipynb#scrollTo=iXUe9kHdcV16)

```python
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv1d(1, 128, 80, 4)
        self.bn1 = nn.BatchNorm1d(128)
        self.pool1 = nn.MaxPool1d(4)
        self.conv2 = nn.Conv1d(128, 128, 3)
        self.bn2 = nn.BatchNorm1d(128)
        self.pool2 = nn.MaxPool1d(4)
        self.conv3 = nn.Conv1d(128, 256, 3)
        self.bn3 = nn.BatchNorm1d(256)
        self.pool3 = nn.MaxPool1d(4)
        self.conv4 = nn.Conv1d(256, 512, 3)
        self.bn4 = nn.BatchNorm1d(512)
        self.pool4 = nn.MaxPool1d(4)
        self.avgPool = nn.AvgPool1d(30)  # expects a 512x30 feature map per clip and averages it down to 512x1
        self.fc1 = nn.Linear(512, 10)
        
    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(self.bn1(x))
        x = self.pool1(x)
        x = self.conv2(x)
        x = F.relu(self.bn2(x))
        x = self.pool2(x)
        x = self.conv3(x)
        x = F.relu(self.bn3(x))
        x = self.pool3(x)
        x = self.conv4(x)
        x = F.relu(self.bn4(x))
        x = self.pool4(x)
        x = self.avgPool(x)
        x = x.permute(0, 2, 1)  # (batch, 512, 1) -> (batch, 1, 512)
        x = self.fc1(x)  # (batch, 1, 10)
        return F.log_softmax(x, dim=2)  # log-probabilities over the 10 classes
```
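
The `AvgPool1d(30)` layer only works if exactly 30 time steps reach it, which ties the model to one input length. The following is a minimal sketch of that arithmetic using the standard output-length formulas for unpadded convolution and pooling. The helper names are hypothetical, and the 32000-sample clip (4 s at an assumed 8 kHz sample rate) is just one input length that happens to produce 30.

```python
def conv1d_out_len(length, kernel, stride=1):
    # Output length of a 1D convolution with no padding and no dilation.
    return (length - kernel) // stride + 1


def maxpool1d_out_len(length, kernel):
    # nn.MaxPool1d(k) uses stride=k by default, with no padding and floor rounding.
    return (length - kernel) // kernel + 1


def m5_feature_len(n_samples):
    """Number of time steps that reach avgPool in Net for a given input length."""
    length = n_samples
    # (kernel, stride) of conv1..conv4; each conv is followed by MaxPool1d(4).
    for kernel, stride in [(80, 4), (3, 1), (3, 1), (3, 1)]:
        length = conv1d_out_len(length, kernel, stride)
        length = maxpool1d_out_len(length, 4)
    return length


print(m5_feature_len(32000))  # 30
print(m5_feature_len(16000))  # 14
```

A 16000-sample clip leaves only 14 time steps, which is shorter than the pooling window of 30, so clips would need to be padded or cropped to a fixed length before they reach `Net` as written.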

The task is to integrate this model into our training directory. The model should be callable from the `audio.py` route. The files `training/core/training.py` and `training/core/dl_model.py` should be useful starting points for this integration.

This is fairly open-ended, so feel free to play around with it! Let me know if you have any questions.

So in summary:

### Objectives

- **Neural Network Integration:** Implement the M5 network architecture within the Deep Learning Playground's training module to handle audio data processing effectively.
  
- **Model Components:** Define the neural network layers including convolutional, batch normalization, pooling, and fully connected layers as per the M5 architecture specifications.
  
- **Compatibility:** Ensure seamless integration of the M5 network architecture with the existing training pipeline and trainers in the audio route (`audio.py`).

### Implementation Details

- Create a new neural network class `Net` in `training/core/dl_model.py` with the M5 network architecture specifications.
  
- Include layers for convolution, batch normalization, pooling, and fully connected layers as described in the M5 architecture.
  
- Update the training pipeline in `training/core/training.py` to utilize the `Net` model for audio data processing tasks.
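
One possible shape for the steps above is sketched here, as a starting point rather than the intended implementation. The names `M5Sketch`, `conv_block`, `train_step`, and `n_classes` are hypothetical and do not come from the existing `training/core` code. The sketch also makes two deliberate changes to `Net`: it swaps `AvgPool1d(30)` for `nn.AdaptiveAvgPool1d(1)` so the input length is not fixed, and it returns `(batch, n_classes)` instead of `(batch, 1, n_classes)` so the output can be passed straight to `F.nll_loss`, the loss that pairs with `log_softmax`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch, out_ch, kernel, stride=1):
    # Conv -> BatchNorm -> ReLU -> MaxPool(4): the unit Net repeats four times.
    return nn.Sequential(
        nn.Conv1d(in_ch, out_ch, kernel, stride),
        nn.BatchNorm1d(out_ch),
        nn.ReLU(),
        nn.MaxPool1d(4),
    )


class M5Sketch(nn.Module):
    """Hypothetical variant of Net with a configurable number of classes."""

    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(1, 128, 80, stride=4),
            conv_block(128, 128, 3),
            conv_block(128, 256, 3),
            conv_block(256, 512, 3),
        )
        # Averages over however many time steps remain (assumption: this
        # replaces the fixed AvgPool1d(30) used in Net).
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.fc = nn.Linear(512, n_classes)

    def forward(self, x):
        x = self.pool(self.features(x))  # (batch, 512, 1)
        x = self.fc(x.squeeze(-1))       # (batch, n_classes)
        return F.log_softmax(x, dim=1)


def train_step(model, optimizer, waveforms, labels):
    # One optimization step on a batch of (batch, 1, n_samples) waveforms.
    model.train()
    optimizer.zero_grad()
    loss = F.nll_loss(model(waveforms), labels)
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    # Smoke test on random data standing in for a real audio batch.
    model = M5Sketch(n_classes=10)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    waveforms = torch.randn(4, 1, 16000)
    labels = torch.randint(0, 10, (4,))
    print(train_step(model, optimizer, waveforms, labels))
```

How the route in `audio.py` selects and constructs the model depends on the existing trainer code, so that wiring is left out here.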

