| 1 | +# Collaboratively Train Yolo-v5 Using MistNet on COCO128 Dataset |
| 2 | + |
| 3 | +This case shows how to train a federated learning job with an aggregation algorithm named MistNet in a COCO128 |
| 4 | +object detection scenario. Data is scattered across different places (such as edge nodes, cameras, and others) and |
| 5 | +cannot be aggregated at the server because of data privacy and bandwidth constraints. As a result, we cannot use all |
| 6 | +the data for centralized training. In some cases, edge nodes have limited computing resources or no training capability |
| 7 | +at all, so the edge cannot produce updated weights through local training. Therefore, traditional algorithms (e.g., |
| 8 | +federated averaging), which aggregate the updated weights trained by different edge clients, cannot work in this |
| 9 | +scenario. MistNet is proposed to address this issue. |
| 10 | + |
| 11 | +MistNet partitions a DNN model into two parts: a lightweight feature extractor at the edge side that generates meaningful |
| 12 | +features from the raw data, and a classifier containing most of the model layers in the cloud, which is iteratively trained for |
| 13 | +specific tasks. MistNet achieves acceptable model utility while greatly reducing privacy leakage from the released |
| 14 | +intermediate features. |
| 15 | + |
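The partition described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not MistNet's or Sedna's implementation: the names `run`, `split_model`, and the toy layers are hypothetical, and the sketch omits whatever privacy mechanism is applied to the released features. It only shows that cutting a layer sequence at an index yields an edge-side extractor and a cloud-side classifier whose composition equals the full model.

```python
from functools import reduce
from typing import Callable, List, Sequence, Tuple

# A "layer" here is any function mapping a list of floats to a list of floats.
Layer = Callable[[List[float]], List[float]]


def run(layers: Sequence[Layer], x: List[float]) -> List[float]:
    """Apply the layers in order to the input."""
    return reduce(lambda acc, layer: layer(acc), layers, x)


def split_model(layers: Sequence[Layer], cut_layer: int) -> Tuple[List[Layer], List[Layer]]:
    """Split a layer sequence into (edge extractor, cloud classifier).

    `cut_layer` is a hypothetical parameter: the number of leading layers
    that stay on the edge. Both parts must be non-empty.
    """
    if not 0 < cut_layer < len(layers):
        raise ValueError("cut_layer must leave at least one layer on each side")
    return list(layers[:cut_layer]), list(layers[cut_layer:])


def scale(factor: float) -> Layer:
    """Toy stand-in for a linear layer."""
    return lambda xs: [factor * v for v in xs]


def relu(xs: List[float]) -> List[float]:
    """Toy activation layer."""
    return [max(0.0, v) for v in xs]


def total(xs: List[float]) -> List[float]:
    """Toy output layer that reduces the features to one value."""
    return [sum(xs)]


# Toy six-layer model: a short extractor on the edge, most layers in the cloud.
model: List[Layer] = [scale(2.0), relu, scale(0.5), relu, scale(3.0), total]
extractor, classifier = split_model(model, cut_layer=2)

raw = [1.0, -2.0, 3.0]
features = run(extractor, raw)             # computed on the edge node
prediction = run(classifier, features)     # computed on the cloud node
```

Only `features` would leave the edge node in this sketch; the raw input stays local, and the cloud trains and evaluates the remaining layers on the features it receives.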
| 16 | +## Object Detection Experiment |
| 17 | + |
| 18 | +> Assume that there are two edge nodes and a cloud node. Data on the edge nodes cannot be migrated to the cloud due to privacy issues. |
| 19 | +> Based on this scenario, we will demonstrate the object detection example. |
| 20 | + |
| 21 | +### Prepare Nodes |
| 22 | + |
| 23 | +``` |
| 24 | +CLOUD_NODE="cloud-node-name" |
| 25 | +EDGE1_NODE="edge1-node-name" |
| 26 | +EDGE2_NODE="edge2-node-name" |
| 27 | +``` |
| 28 | + |
| 29 | +### Install Sedna |
| 30 | + |
| 31 | +Follow the [Sedna installation document](/docs/setup/install.md) to install Sedna. |
| 32 | + |
| 33 | +### Prepare Dataset |
| 34 | + |
| 35 | +Download the [dataset](https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip) and partition it for the two edge nodes: |
| 36 | + |
| 37 | +``` |
| 38 | +wget https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip |
| 39 | +unzip coco128.zip -d data |
| 40 | +rm coco128.zip |
| 41 | +python partition.py ./data 2 |
| 42 | +``` |
| 43 | + |
| 44 | +Move `./data/1` to `/data` on `$EDGE1_NODE`: |
| 45 | + |
| 46 | +``` |
| 47 | +mkdir -p /data |
| 48 | +# run from the directory that contains ./data |
| 49 | +mv ./data/1 /data/ |
| 50 | +``` |
| 51 | + |
| 52 | +Move `./data/2` to `/data` on `$EDGE2_NODE`: |
| 53 | + |
| 54 | +``` |
| 55 | +mkdir -p /data |
| 56 | +# run from the directory that contains ./data |
| 57 | +mv ./data/2 /data/ |
| 58 | +``` |
| 59 | + |
| 60 | +### Prepare Images |
| 61 | + |
| 62 | +This example uses these images: |
| 63 | + |
| 64 | +1. aggregation worker: ```kubeedge/sedna-example-federated-learning-mistnet:v0.3.0``` |
| 65 | +2. train worker: ```kubeedge/sedna-example-federated-learning-mistnet-client:v0.3.0``` |
| 66 | + |
| 67 | +These images are generated by the script [build_image.sh](/examples/build_image.sh). |
| 68 | + |
| 69 | +### Create Federated Learning Job |
| 70 | + |
| 71 | +#### Create Dataset |
| 72 | + |
| 73 | +Create the dataset for `$EDGE1_NODE`: |
| 74 | + |
| 75 | +``` |
| 76 | +kubectl create -f - <<EOF |
| 77 | +apiVersion: sedna.io/v1alpha1 |
| 78 | +kind: Dataset |
| 79 | +metadata: |
| 80 | +  name: "edge1-surface-defect-detection-dataset" |
| 81 | +spec: |
| 82 | +  url: "/data/test.txt" |
| 83 | +  format: "txt" |
| 84 | +  nodeName: $EDGE1_NODE |
| 85 | +EOF |
| 86 | +``` |
| 87 | + |
| 88 | +Create the dataset for `$EDGE2_NODE`: |
| 89 | + |
| 90 | +``` |
| 91 | +kubectl create -f - <<EOF |
| 92 | +apiVersion: sedna.io/v1alpha1 |
| 93 | +kind: Dataset |
| 94 | +metadata: |
| 95 | +  name: "edge2-surface-defect-detection-dataset" |
| 96 | +spec: |
| 97 | +  url: "/data/test.txt" |
| 98 | +  format: "txt" |
| 99 | +  nodeName: $EDGE2_NODE |
| 100 | +EOF |
| 101 | +``` |
| 102 | + |
| 103 | +#### Create Model |
| 104 | + |
| 105 | +Create the directory `/model` on the host of `$EDGE1_NODE`: |
| 106 | + |
| 107 | +``` |
| 108 | +mkdir /model |
| 109 | +``` |
| 110 | + |
| 111 | +Create the directory `/model` on the host of `$EDGE2_NODE`: |
| 112 | + |
| 113 | +``` |
| 114 | +mkdir /model |
| 115 | +``` |
| 116 | + |
| 117 | +``` |
| 118 | +TODO: put pretrained model on nodes. |
| 119 | +``` |
| 120 | + |
| 121 | +Create the model: |
| 122 | + |
| 123 | +``` |
| 124 | +kubectl create -f - <<EOF |
| 125 | +apiVersion: sedna.io/v1alpha1 |
| 126 | +kind: Model |
| 127 | +metadata: |
| 128 | +  name: "yolo-v5-model" |
| 129 | +spec: |
| 130 | +  url: "/model/yolo.pb" |
| 131 | +  format: "pb" |
| 132 | +EOF |
| 133 | +``` |
| 134 | + |
| 135 | +#### Start Federated Learning Job |
| 136 | + |
| 137 | +``` |
| 138 | +kubectl create -f - <<EOF |
| 139 | +apiVersion: sedna.io/v1alpha1 |
| 140 | +kind: FederatedLearningJob |
| 141 | +metadata: |
| 142 | +  name: mistnet-on-mnist-dataset |
| 143 | +spec: |
| 144 | +  stopCondition: |
| 145 | +    operator: "or" # and |
| 146 | +    conditions: |
| 147 | +      - operator: ">" |
| 148 | +        threshold: 100 |
| 149 | +        metric: rounds |
| 150 | +      - operator: ">" |
| 151 | +        threshold: 0.95 |
| 152 | +        metric: targetAccuracy |
| 153 | +      - operator: "<" |
| 154 | +        threshold: 0.03 |
| 155 | +        metric: deltaLoss |
| 156 | +  aggregationTrigger: |
| 157 | +    condition: |
| 158 | +      operator: ">" |
| 159 | +      threshold: 5 |
| 160 | +      metric: num_of_ready_clients |
| 161 | +  aggregationWorker: |
| 162 | +    model: |
| 163 | +      name: "yolo-v5-model" |
| 164 | +    template: |
| 165 | +      spec: |
| 166 | +        nodeName: $CLOUD_NODE |
| 167 | +        containers: |
| 168 | +          - image: kubeedge/sedna-example-federated-learning-mistnet-on-mnist-dataset-aggregation:v0.4.0 |
| 169 | +            name: agg-worker |
| 170 | +            imagePullPolicy: IfNotPresent |
| 171 | +            env: # user defined environments |
| 172 | +              - name: "cut_layer" |
| 173 | +                value: "4" |
| 174 | +              - name: "epsilon" |
| 175 | +                value: "100" |
| 176 | +              - name: "aggregation_algorithm" |
| 177 | +                value: "mistnet" |
| 178 | +              - name: "batch_size" |
| 179 | +                value: "10" |
| 180 | +            resources: # user defined resources |
| 181 | +              limits: |
| 182 | +                memory: 2Gi |
| 183 | +  trainingWorkers: |
| 184 | +    - dataset: |
| 185 | +        name: "edge1-surface-defect-detection-dataset" |
| 186 | +      template: |
| 187 | +        spec: |
| 188 | +          nodeName: $EDGE1_NODE |
| 189 | +          containers: |
| 190 | +            - image: kubeedge/sedna-example-federated-learning-mistnet-on-mnist-dataset-train:v0.4.0 |
| 191 | +              name: train-worker |
| 192 | +              imagePullPolicy: IfNotPresent |
| 193 | +              env: # user defined environments |
| 194 | +                - name: "batch_size" |
| 195 | +                  value: "32" |
| 196 | +                - name: "learning_rate" |
| 197 | +                  value: "0.001" |
| 198 | +                - name: "epochs" |
| 199 | +                  value: "2" |
| 200 | +              resources: # user defined resources |
| 201 | +                limits: |
| 202 | +                  memory: 2Gi |
| 203 | +    - dataset: |
| 204 | +        name: "edge2-surface-defect-detection-dataset" |
| 205 | +      template: |
| 206 | +        spec: |
| 207 | +          nodeName: $EDGE2_NODE |
| 208 | +          containers: |
| 209 | +            - image: kubeedge/sedna-example-federated-learning-mistnet-on-mnist-dataset-train:v0.4.0 |
| 210 | +              name: train-worker |
| 211 | +              imagePullPolicy: IfNotPresent |
| 212 | +              env: # user defined environments |
| 213 | +                - name: "batch_size" |
| 214 | +                  value: "32" |
| 215 | +                - name: "learning_rate" |
| 216 | +                  value: "0.001" |
| 217 | +                - name: "epochs" |
| 218 | +                  value: "2" |
| 219 | +              resources: # user defined resources |
| 220 | +                limits: |
| 221 | +                  memory: 2Gi |
| 222 | +EOF |
| 223 | +``` |
| 224 | + |
| 225 | +``` |
| 226 | +TODO: show the benefit of MistNet, for example by comparing the results of FedAvg and MistNet. |
| 227 | + |
| 228 | +``` |
| 229 | + |
| 230 | +### Check Federated Learning Status |
| 231 | + |
| 232 | +``` |
| 233 | +kubectl get federatedlearningjob mistnet-on-mnist-dataset |
| 234 | +``` |
| 235 | + |
| 236 | +### Check Federated Learning Train Result |
| 237 | + |
| 238 | +After the job completes, you will find the generated model in the directory `/model` on `$EDGE1_NODE` and `$EDGE2_NODE`. |