Add missing training support to onert Python API (experimental module)#15175
Merged
chunseoklee merged 2 commits into Samsung:master on Apr 17, 2025
This commit integrates previously omitted modifications for training support in the onert Python API.
- Expose new experimental training functionalities by updating the package's public API:
  - Modified `__init__.py` to include the `experimental` submodule.
  - Added the experimental module, which imports train components and exposes `TrainSession`, `traininfo`, `DataLoader`, `optimizer`, `losses`, and `metrics`.
- Implemented a flexible DataLoader in the experimental training module:
  - Supports input from file paths or NumPy arrays.
  - Handles loading of both .npy and raw binary files with configurable shapes and data types.
  - Includes batching logic and a split method for training/validation separation.
- Improved training compiler behavior in `TrainingCompiler.cc`:
  - Adjusted the shape validation to accept unspecified dimensions (using `ir::Shape::kUnspecifiedDim`) in addition to dimensions of value 1.

ONE-DCO-1.0-Signed-off-by: ragmani <ragmani0216@gmail.com>
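For readers who want a feel for the batching and train/validation split described above, here is a minimal sketch in plain NumPy. It is not the `DataLoader` added by this PR: `split_arrays` and `iter_batches` are hypothetical helper names, and taking the validation samples from the tail of the data is an assumption made only for illustration.

```python
import numpy as np


def split_arrays(inputs, labels, validation_split):
    """Split paired arrays into (train, validation) parts.

    Hypothetical helper: the validation part is taken from the tail,
    which is an assumption, not necessarily what the onert DataLoader does.
    """
    if not 0.0 <= validation_split < 1.0:
        raise ValueError("validation_split must be in [0, 1)")
    if len(inputs) != len(labels):
        raise ValueError("inputs and labels must have the same length")
    n_val = int(len(inputs) * validation_split)
    n_train = len(inputs) - n_val
    train = (inputs[:n_train], labels[:n_train])
    val = (inputs[n_train:], labels[n_train:])
    return train, val


def iter_batches(inputs, labels, batch_size):
    """Yield (input_batch, label_batch) pairs; the last batch may be smaller."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(inputs), batch_size):
        end = start + batch_size
        yield inputs[start:end], labels[start:end]


if __name__ == "__main__":
    # Dummy data standing in for 100 samples with 4 features each.
    x = np.arange(100 * 4, dtype=np.float32).reshape(100, 4)
    y = np.arange(100, dtype=np.int64)
    (train_x, train_y), (val_x, val_y) = split_arrays(x, y, 0.2)
    n_batches = sum(1 for _ in iter_batches(train_x, train_y, 10))
    print(len(train_x), len(val_x), n_batches)
```

With 100 samples, a 0.2 validation split, and a batch size of 10, this sketch leaves 80 training samples in 8 batches and 20 validation samples.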
I tested with the Python samples.
python3 runtime/onert/sample/minimal-python/experimental/src/train_with_dataset.py -m mobilenetv2 -i out/imagenet_a.test.input.100.bin -l out/imagenet_a.test.output.100.bin --data_length 100 --optimizer adam --loss cce --learning_rate 0.01 --batch_size 10 --validation_split=0.2
Load data
== training parameter ==
- learning_rate = 0.01
- batch_size = 10
- loss_info = {loss = CategoricalCrossentropy, reduction = sum over batch size}
- optimizer = Adam
- num_of_trainable_ops = -1
========================
Epoch 1/5 - Train time: 704.429ms/step - IO time: 0.067ms/step - Train Loss: 10.7749 - Validation Loss: 10.1255 - CategoricalAccuracy: 0.0000
Epoch 2/5 - Train time: 679.521ms/step - IO time: 0.059ms/step - Train Loss: 6.1418 - Validation Loss: 12.0664 - CategoricalAccuracy: 0.0000
Epoch 3/5 - Train time: 712.286ms/step - IO time: 0.060ms/step - Train Loss: 5.7052 - Validation Loss: 14.5072 - CategoricalAccuracy: 0.0000
Epoch 4/5 - Train time: 741.144ms/step - IO time: 0.056ms/step - Train Loss: 5.4454 - Validation Loss: 15.3301 - CategoricalAccuracy: 0.0000
Epoch 5/5 - Train time: 771.123ms/step - IO time: 0.073ms/step - Train Loss: 6.6274 - Validation Loss: 17.4566 - CategoricalAccuracy: 0.0000
===================================
MODEL_LOAD takes 6.9752 ms
COMPILE takes 274.4315 ms
EXECUTE takes 29863.0246 ms
- Epoch 1 takes 5829.0490 ms
- Epoch 2 takes 5635.3495 ms
- Epoch 3 takes 5896.5318 ms
- Epoch 4 takes 6126.5195 ms
- Epoch 5 takes 6375.5748 ms
===================================
nnpackage mobilenetv2 trains successfully.
python3 runtime/onert/sample/minimal-python/experimental/src/train_step_with_dataset.py -m mobilenetv2 -i out/imagenet_a.test.input.100.bin -l out/imagenet_a.test.output.100.bin --data_length 100 --optimizer adam --loss cce --learning_rate 0.01 --batch_size 10
Load data
== training parameter ==
- learning_rate = 0.01
- batch_size = 10
- loss_info = {loss = CategoricalCrossentropy, reduction = sum over batch size}
- optimizer = Adam
- num_of_trainable_ops = -1
========================
Step 1/10 - Train time: 704.106 ms/step - Train Loss: 9.0140
Step 2/10 - Train time: 710.967 ms/step - Train Loss: 21.7883
Step 3/10 - Train time: 700.940 ms/step - Train Loss: 6.7287
Step 4/10 - Train time: 701.534 ms/step - Train Loss: 8.3510
Step 5/10 - Train time: 704.686 ms/step - Train Loss: 9.8825
Step 6/10 - Train time: 706.220 ms/step - Train Loss: 8.4540
Step 7/10 - Train time: 705.723 ms/step - Train Loss: 11.6198
Step 8/10 - Train time: 709.600 ms/step - Train Loss: 10.3613
Step 9/10 - Train time: 712.440 ms/step - Train Loss: 9.9402
Step 10/10 - Train time: 712.583 ms/step - Train Loss: 9.4323
===================================
Average Loss: 10.5572
CategoricalAccuracy: 0.0000
Average Time: 706.8799 ms/step
===================================
nnpackage mobilenetv2 trains successfully.
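As a sanity check on the step log above, the summary figures can be recomputed from the per-step values. The snippet below is only arithmetic on the numbers printed in the log; the variable names are illustrative and nothing here calls the onert API.

```python
# Per-step values copied from the train_step_with_dataset.py log above.
step_times_ms = [704.106, 710.967, 700.940, 701.534, 704.686,
                 706.220, 705.723, 709.600, 712.440, 712.583]
step_losses = [9.0140, 21.7883, 6.7287, 8.3510, 9.8825,
               8.4540, 11.6198, 10.3613, 9.9402, 9.4323]

data_length = 100  # --data_length
batch_size = 10    # --batch_size

# 100 samples in batches of 10 give the 10 steps seen in the log.
num_steps = data_length // batch_size

avg_loss = sum(step_losses) / len(step_losses)
avg_time_ms = sum(step_times_ms) / len(step_times_ms)

print(f"Steps: {num_steps}")
print(f"Average Loss: {avg_loss:.4f}")
print(f"Average Time: {avg_time_ms:.4f} ms/step")
```

The recomputed averages agree with the reported `Average Loss: 10.5572` and `Average Time: 706.8799 ms/step`.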
@zetwhite Could you please review this PR?
zetwhite
reviewed
Apr 17, 2025
zetwhite
reviewed
Apr 17, 2025
0e99b8e to
c330797
Compare
ragmani commented Apr 17, 2025
Comment on lines +142 to +147
    array = np.frombuffer(data, dtype=dtype)
    if array.size != expected_elements:
        raise ValueError(
            f"Raw data size does not match the expected shape: {shape}. "
            f"Expected {expected_elements} elements, got {array.size} elements.")
    return array.reshape(shape)
I keep only f.read() inside the with block, so the file stays open just long enough to load the bytes. Once the data is in memory, np.frombuffer, the size check, and reshape operate on that buffer (no open file needed), so they live outside the with block.
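A self-contained sketch of that pattern, for illustration only: `load_raw` is a hypothetical function name, and the way `expected_elements` is derived from `shape` is an assumption, since the surrounding lines of the real file are not shown in this thread.

```python
import numpy as np


def load_raw(path, shape, dtype=np.float32):
    """Load a raw binary file into an array of the given shape.

    Hypothetical helper that mirrors the reviewed snippet; it is not
    the actual onert DataLoader method.
    """
    # Assumption: the expected element count is the product of the shape.
    expected_elements = int(np.prod(shape))

    # Keep the file open only while the bytes are being read.
    with open(path, "rb") as f:
        data = f.read()

    # From here on, only the in-memory buffer is used; the file is closed.
    array = np.frombuffer(data, dtype=dtype)
    if array.size != expected_elements:
        raise ValueError(
            f"Raw data size does not match the expected shape: {shape}. "
            f"Expected {expected_elements} elements, got {array.size} elements.")
    return array.reshape(shape)
```

Note that `np.frombuffer` returns a read-only view over the bytes object, so a caller that needs to modify the data in place would have to copy it first.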
zetwhite
approved these changes
Apr 17, 2025
chunseoklee
reviewed
Apr 17, 2025
chunseoklee
approved these changes
Apr 17, 2025