The AclNet-int8 model is a variant of the AclNet model that has been quantized and fine-tuned with NNCF (Neural Network Compression Framework). It is designed to perform sound classification.
The AclNet-int8 model is trained on an internal dataset of environmental sounds covering 53 classes, which are listed in the file `<omz_dir>/data/dataset_classes/aclnet_53cl.txt`.
For details about the model, see this paper.
The model input is a segment of PCM audio samples in N, C, 1, L format.
The model output for AclNet-int8 is the sound classifier output for the 53 different environmental sound classes from the internal sound database.
| Metric | Value |
|---|---|
| Type | Classification |
| GFLOPs | 2.71 |
| MParams | 1.41 |
| Source framework | PyTorch* |

| Metric | Value |
|---|---|
| Top 1 | 87.1% |
| Top 5 | 93.0% |
Metrics were computed on an internal validation dataset according to the following publication and paper.
Audio, name - `result.1`, shape - `1, 1, 1, L`, format is `N, C, 1, L`, where:
- `N` - batch size
- `C` - channel
- `L` - number of PCM samples (minimum value is 16000)
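The input layout described above can be sketched in Python with NumPy. This is a minimal illustration, not the preprocessing used by Open Model Zoo: the helper name `prepare_input` is hypothetical, and zero-padding short clips up to the 16000-sample minimum is an assumption, since the text only states the minimum length. Sample rate and amplitude scaling are not specified here and are left to the caller.

```python
import numpy as np

MIN_SAMPLES = 16000  # minimum value of L stated in the input description


def prepare_input(samples, min_len=MIN_SAMPLES):
    """Arrange mono PCM samples as an N, C, 1, L array with N = C = 1.

    Hypothetical helper: zero-padding short clips is an assumption,
    not documented model behavior.
    """
    audio = np.asarray(samples, dtype=np.float32)
    if audio.ndim != 1:
        raise ValueError("expected a 1-D sequence of mono PCM samples")
    if audio.size < min_len:
        # Append zeros so that L reaches the documented minimum.
        audio = np.pad(audio, (0, min_len - audio.size))
    # Add the batch, channel, and singleton axes in front of L.
    return audio.reshape(1, 1, 1, -1)
```

Clips longer than the minimum keep their length, so `L` varies with the input and the model may need to be reshaped accordingly before inference.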
Sound classifier (see labels file, `<omz_dir>/data/dataset_classes/aclnet_53cl.txt`), name - `486`, shape - `1, 53`, output data format is `N, C`, where:
- `N` - batch size
- `C` - predicted softmax scores for each class in the [0, 1] range
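To make the output format concrete, the sketch below ranks one row of the `1, 53` score array and pairs the best entries with labels. The function name `top_k` is hypothetical, and the labels are assumed to be one class name per entry in the same order as the scores; the format of the labels file is not described here, so loading it is left out.

```python
def top_k(scores, labels, k=5):
    """Return the k highest-scoring (label, score) pairs, best first.

    `scores` is one row of the N, C output (53 softmax values);
    `labels` is assumed to be aligned with it index by index.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    # Sort class indices by descending score and keep the first k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels[i], scores[i]) for i in ranked[:k]]
```

With a batch size of 1, the row to pass in is the first (and only) element along the `N` axis of the output.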
You can download models and if necessary convert them into OpenVINO™ IR format using the Model Downloader and other automation tools as shown in the examples below.
An example of using the Model Downloader:
`omz_downloader --name <model_name>`
An example of using the Model Converter:
`omz_converter --name <model_name>`
The model can be used in the following demos provided by the Open Model Zoo to show its capabilities:
The original model is distributed under Apache License, Version 2.0.