Embedded Machine Learning Toolbox

The EML Toolbox is a collection of scripts by the Christian Doppler Laboratory for Embedded Machine Learning for automated and simplified training and inference of neural networks. We collect and unify scripts that help us in our everyday work to set up software and frameworks.

The purpose of using unified scripts on different hardware platforms is twofold. First, considering requirements regarding latency, accuracy and environmental factors, we want to find the best combination of:

Neural network
Neural network optimization
Hardware
Hardware configuration

Second, we want to estimate or measure the effect of a particular network on multiple hardware to see where it works well.

Overview

The EML Toolbox provides efficient cross-hardware measurements.

Examples:

Xavier running 3 days with 250 task spooler jobs for 33 networks
Perform inference on multiple devices in parallel

Architecture

The architecture consists of several parts:

Toolbox: It provides the infrastructure through general python scripts for data preparation, training, inference and evaluation. Further, it provides common interfaces for the hardware inference engines and model optimizers. The interfaces are then utilized by execution scripts.
Network optimizers: Plug-in pruning and quantization tools that can be applied to a network to optimize its performance for a certain hardware device
Network estimators: In a network search, estimators make it possible to test a certain hardware without having to implement the network on it. This saves engineering effort and makes the process faster
Hardware: The embedded hardware devices, which are connected to the toolbox and available for network tests
Hardware configuration optimizers: For each hardware, it is possible to set up the hardware for minimum latency, minimum power or minimum energy consumption

Implementation and Interfaces

The EML Toolbox is built around uniformity and common interfaces to minimize the need for customization.

Folder Structure of the Datasets

We use a uniform folder structure for the datasets that is compatible with the folder structure of the EML projects. With this structure, all relative paths in the execution scripts can be left unchanged. Each dataset is put in a dataset root folder at a suitable location.

Folder structure: link

Folder Structure for the EML Projects

We use a uniform folder structure. The advantage is that we can copy prepared scripts into this folder structure and execute them without changing anything in the scripts. This reduces the engineering effort of setting up new projects. The folder has the following structure, starting from some base folder, e.g. $HOME:

./automl # Optional: AutoML EfficientDet repository
./demonstration_projects # Projects with complete scripts and data, which are used to test the environments
- ./demonstration_projects/[YOUR DEMO]
./eml_projects # Project folder for custom projects
- ./eml_projects/[YOUR PROJECT1]
- ./eml_projects/[YOUR PROJECT2]
./eml-tools # Scripts base repository
./eml-tools-samples # Repository with sample projects for debugging EML Tools
./models # Optional: Tensorflow Object Detection API repository
./protobuf # Optional: Protobuf as part of the Tensorflow Object Detection API
./envs # Python environments (Note 20210719: new, optional interface addition to handle multiple environments)
- ./envs/tf24 # Tensorflow 2.4 python virtual environment
- ./envs/[YOUR ENVIRONMENT]

The same structure is used for training as well as inference.
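
Because the scripts rely on this layout, it can help to check a base folder before running them. The following is a minimal sketch, not part of the toolbox: the function name check_workspace is hypothetical, and treating every folder that is not marked optional above as required is an assumption.

```python
from pathlib import Path

# Folder names taken from the layout above. Which ones count as "required"
# is an assumption: everything not marked optional in the listing.
REQUIRED_FOLDERS = (
    "demonstration_projects",
    "eml_projects",
    "eml-tools",
    "eml-tools-samples",
)
OPTIONAL_FOLDERS = ("automl", "models", "protobuf", "envs")


def check_workspace(base):
    """Report which expected folders are missing below the base folder."""
    base = Path(base)
    return {
        "missing_required": [n for n in REQUIRED_FOLDERS if not (base / n).is_dir()],
        "missing_optional": [n for n in OPTIONAL_FOLDERS if not (base / n).is_dir()],
    }
```

Calling check_workspace(Path.home()) returns two lists. An empty "missing_required" list means the base folder matches the layout as far as this sketch can tell.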

Each individual project uses the common folder structure from Template Folder Structure for Tensor Flow, which is similar to the standard Tensorflow 2 workspace.

Interface Network Folder and File Names

Much information is encoded in the folder name of a network. Many evaluation tools read this information from its position in the name. Therefore, it is important to keep to these conventions in order to avoid customization of the tools. The following naming convention is based on the structure of Tensorflow 2.

Each model shall be put into a separate folder. Model file names shall be kept the same, e.g. saved_model.pb, while the folder name holds the information about the network.

Network folder name convention: [FRAMEWORK]_[NETWORKNAME]_[RESOLUTION_X]x[RESOLUTION_Y]_[DATASET]_[CUSTOM_PARAMETER_1]_[CUSTOM_PARAMETER_2]_..._[CUSTOM_PARAMETER_n]

[FRAMEWORK]:

cf: Caffe
tf2: Tensorflow 2
tf2oda: Tensorflow 2 Object Detection API
dk: Darknet
pt: PyTorch
tf2ke: Tensorflow 2 Keras

If the dataset is unknown, [DATASET] is set to "ND".

Examples:

tf2_mobilenetV2_224x224_coco_D100
pt_refinedet_480x360_imagenet_LR03_WR04
tf2oda_ssdmobilenetv2_320x320_pedestrian
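
Since evaluation tools read the fields by position, a small parser illustrates how a folder name decomposes. This is a sketch, not the toolbox's own parser: the names NetworkInfo and parse_network_folder_name are hypothetical, and it assumes that fields are separated by underscores (as in the examples above) and that the network name itself contains no underscore.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Framework abbreviations as listed in the convention above.
FRAMEWORKS = {
    "cf": "Caffe",
    "tf2": "Tensorflow 2",
    "tf2oda": "Tensorflow 2 Object Detection API",
    "dk": "Darknet",
    "pt": "PyTorch",
    "tf2ke": "Tensorflow 2 Keras",
}


@dataclass
class NetworkInfo:
    framework: str
    network: str
    resolution_x: int
    resolution_y: int
    dataset: Optional[str]  # None when the folder name says "ND"
    custom_parameters: List[str] = field(default_factory=list)


def parse_network_folder_name(name: str) -> NetworkInfo:
    """Split a network folder name into its positional fields."""
    parts = name.split("_")
    if len(parts) < 4:
        raise ValueError(f"Expected at least 4 fields in '{name}'")
    framework, network, resolution, dataset = parts[:4]
    if framework not in FRAMEWORKS:
        raise ValueError(f"Unknown framework abbreviation '{framework}'")
    width, sep, height = resolution.partition("x")
    if not (sep and width.isdigit() and height.isdigit()):
        raise ValueError(f"Resolution '{resolution}' is not of the form WxH")
    return NetworkInfo(
        framework=framework,
        network=network,
        resolution_x=int(width),
        resolution_y=int(height),
        dataset=None if dataset == "ND" else dataset,
        custom_parameters=parts[4:],
    )
```

For example, parse_network_folder_name("pt_refinedet_480x360_imagenet_LR03_WR04") yields the framework "pt", the network "refinedet", a resolution of 480 by 360, the dataset "imagenet" and the custom parameters "LR03" and "WR04".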

Interface Hardware Modules for Inference

For the connection of platform specific inference engines and model optimizers, we use the following hardware module interfaces: Hardware Module Interfaces

Execution Scripts

Within the common folder structure, scripts are executed in a uniform way. In the ./scripts-and-guides repository, there are two types of scripts: generalized python scripts and specialized .bat and .sh scripts. The python scripts are used identically on each hardware. The shell scripts are adapted to a certain hardware or to a certain application of the python scripts. The following types of scripts, which form the toolbox core, can be found here:

EML Tools Scripts:

Converters: Data conversion scripts from e.g. VOC->Coco
Data Processing Tools: Renaming tools, partitioning of images into train-validation-test sets
Hardware Modules: Common Inference scripts for each hardware, where the interface scripts often access common packages
- NVIDIA Trt: Specific inference and model conversion scripts for the NVIDIA platform
- Intel OpenVino: Specific inference and model conversion scripts for the Intel platform
- Hardware Module Interfaces: Interface definitions for the hardware to be able to connect to the EML Toolbox
Inference Evaluation Tools: General inference and model evaluation tools, e.g. TF2 inference engine
Power Measurements: Power measurement scripts
Training Tools: Tools for training models on a server
Visualization: Visualization tools adapted to the interfaces for the EML Toolbox
Template Folder Structure for Tensor Flow: TF2 Template folder structure

For the implementation on target devices, each hardware module uses its own separate repository. The following hardware module repositories are available (as of 2021-12-08):

Additional Toolbox Extras:

Sample Projects: To learn and debug the toolbox, sample projects are provided.

Guides how to use the Toolbox

The following guides will help the user to set up and execute the EML Toolbox.

The following video tutorial (Videolink) guides the user through setting up a standard workspace with a standard example (Oxford Pets) on the Tensorflow 2 Object Detection API. First, the folder structure is set up. Then the images are copied. The images are then prepared for training with scripts from the EML Toolbox, followed by training on the server. Finally, the model is tested on a local PC and the results are evaluated.

Datasets for Verifying the Functionality

On Kaggle, we published a dataset that is used to test and debug the EML Tools pipeline. This dataset is a subset of the Oxford Pets dataset (source https://www.robots.ox.ac.uk/~vgg/data/pets/). It contains the two classes cats and dogs. Faulty images have been removed. Annotations for Pascal VOC, Coco, TFRecords and Yolo are included to provide a complete dataset for testing object detection pipelines.

Link: https://www.kaggle.com/alexanderwendt/oxford-pets-cleaned-for-eml-tools

Setup Folder Structure for Inference

Go to the individual hardware repositories for detailed setup guides and setup scripts.

The complete folder structure should look like this:

Execute Inference on Intel Hardware

If everything is set up correctly and there are exported models in ./exported-models, check that the paths in the scripts are correct and execute ./add_folder_inftf2_jobs.sh. The script will do the following:

Read the model names from ./exported-models
For each model name, create a copy of ./tf2_inf_eval_saved_model_TEMPLATE.sh and replace TEMPLATE with the model name
Add the copied and renamed script to the task spooler
As the task starts, run inference on the model whose name is part of the file name
Write the results into ./results with a folder for each model and hardware as well as result files according to Hardware Module Interfaces
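
The first three steps above can be sketched in Python. This is an illustration under stated assumptions, not the contents of ./add_folder_inftf2_jobs.sh: the function name create_inference_jobs is hypothetical, the sketch assumes that TEMPLATE is replaced in the file name only, and it assumes that the task spooler binary is called tsp (as on Debian and Ubuntu; other installations name it ts). The sketch returns the spooler commands instead of running them.

```python
from pathlib import Path


def create_inference_jobs(exported_models_dir, template_script, output_dir,
                          spooler_cmd="tsp"):
    """Create one renamed script copy per model and return the spooler commands."""
    template_script = Path(template_script)
    output_dir = Path(output_dir)
    template_text = template_script.read_text()
    commands = []
    # Every sub-folder of the exported-models folder is treated as one model.
    for model_dir in sorted(Path(exported_models_dir).iterdir()):
        if not model_dir.is_dir():
            continue
        # Replace TEMPLATE in the file name with the model (folder) name.
        script_name = template_script.name.replace("TEMPLATE", model_dir.name)
        script_path = output_dir / script_name
        script_path.write_text(template_text)
        script_path.chmod(0o755)  # make the copy executable
        commands.append([spooler_cmd, str(script_path)])
    return commands
```

Each returned command can be passed to subprocess.run to queue the job. Returning the commands first makes it easy to inspect what would be queued before anything is executed.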

For further details, go to the individual hardware repositories for detailed setup guides and setup scripts.

Results

After successful inference of multiple networks and configurations on one device, you get results in a file that looks like this. The results consist of model information, measured latencies and evaluation metrics such as mAP and recall. With this information, visualizations can be created. Below is an example of a mAP/latency graph that compares IntelNUC CPU and GPU for different resolutions of TF2ODA SSD-MobileNetV2.

With the results, it is also possible to calculate the gain of a certain network class with a certain hardware model optimization. Below is a figure showing the relative latency and mAP of executing SSD-MobileNetV2 with different settings on Tensorflow 2 vs. executing it in OpenVino with FP16 or FP32 quantization.
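
Such a relative comparison can be computed by dividing each measurement of the optimized run by the corresponding measurement of the baseline run. The following is a minimal sketch: the function name relative_to_baseline and the field names latency_ms and map are hypothetical and do not reflect the actual result file format.

```python
def relative_to_baseline(baseline, optimized, keys=("latency_ms", "map")):
    """Return optimized/baseline ratios for the selected measurement keys."""
    ratios = {}
    for key in keys:
        if baseline[key] == 0:
            raise ValueError(f"Baseline value for '{key}' is zero")
        ratios[key] = optimized[key] / baseline[key]
    return ratios
```

A latency ratio below 1 means the optimized configuration is faster than the baseline, while a mAP ratio below 1 means that accuracy was lost.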

Requirements for Connected Projects

The following requirements shall be implemented to be compatible with the EML Toolbox. If you are new to this topic, it is recommended to follow this process:

Set up the target system exactly as described in a reliable guide, using its example networks
As soon as inference is possible with the standard method, adapt the folder structure and the execution scripts

EML-IF 1: The training project shall be set up with a virtual environment (venv) on EDA02 for training with at least demo data. A training demo or the real project shall be executable without any changes to the start script.

EML-IF 2: The following folder structure shall be used for the training and inference project unless customization is necessary: https://github.com/embedded-machine-learning/scripts-and-guides/tree/main/scripts/template_workspace

EML-IF 3: Training and optimization scripts shall have the following structure: https://github.com/embedded-machine-learning/scripts-and-guides/blob/main/scripts/training/README.md#training-files-structure

EML-IF 4: Exported models after training shall use the following naming convention: https://github.com/embedded-machine-learning/scripts-and-guides/blob/main/scripts/README.md#interface-network-folder-and-file-names

EML-IF 5: The inference project shall be set up on at least one inference device with demo or real validation data. The project shall be executable without any changes to the start script.

EML-IF 6: All networks shall implement the following interface for latency measurements: https://github.com/embedded-machine-learning/scripts-and-guides/tree/main/scripts/hardwaremodules/interfaces#Interface-for-Hardware-Module-Developers

EML-IF 7: If applicable, all networks shall implement the following interface for object detection measurements: https://github.com/embedded-machine-learning/scripts-and-guides/tree/main/scripts/hardwaremodules/interfaces#Object-Detection-Interface

Upcoming

Hardware Platforms

Edge TPU USB Stick

Networks

YoloV4
