|
112 | 112 | - [What each stage contributes](#what-each-stage-contributes) |
113 | 113 | - [Presentation guide](#presentation-guide) |
114 | 114 | - [Final note](#final-note) |
115 | | - |
116 | | ---- |
117 | | - |
118 | | -## Project overview |
119 | | - |
120 | | -ROBOT Sentinel is a distributed systems project that simulates predictive diagnostics in a Factory 4.0 environment. In this solution, robot clients send telemetry to a central TCP/IP server, which processes the data with a trained Machine Learning model and returns a diagnostic result in real time. |
121 | | - |
122 | | -The repository is intentionally organized into **three evolutionary stages**, showing the progression from exploratory experimentation to a cleaner socket architecture and finally to the integrated version delivered in class. The final version combines networking, concurrency and intelligent diagnosis in a centralized server model. |
123 | | - |
124 | | ---- |
125 | | - |
126 | | -## Scenario and motivation |
127 | | - |
128 | | -The scenario represents an industrial environment in which dozens of robots operate with high precision and downtime is expensive. The main idea is to centralize intelligence in a more powerful server instead of making each robot process diagnostics locally. |
129 | | - |
130 | | -This design creates three core technical challenges: handling many simultaneous network connections, performing accurate real-time anomaly detection, and protecting shared counters or robot state from race conditions. These three concerns are directly reflected in the project architecture and code evolution. |
131 | | - |
132 | | ---- |
133 | | - |
134 | | -## Repository structure |
135 | | - |
136 | | -```text |
137 | | -. |
138 | | -├── 1_Robot_Fabi_Exploratory/ |
139 | | -│ ├── train_model.py |
140 | | -│ ├── servidor_central.py |
141 | | -│ └── no_sensor.py |
142 | | -│ |
143 | | -├── 2_Robot_Pedro_Exploratory/ |
144 | | -│ ├── robot_server.py |
145 | | -│ └── robot_client.py |
146 | | -│ |
147 | | -└── 3-ROBOT_FINAL/ |
148 | | - ├── Bot Status Identificator.pkl |
149 | | - ├── robot_server.py |
150 | | - └── robot_client.py |
151 | | -``` |
152 | | - |
153 | | - |
154 | | ---- |
155 | | - |
156 | | -## Project stages overview |
157 | | - |
158 | | -| Stage / Folder | Focus | Key files | Main concepts | |
159 | | -| :-- | :-- | :-- | :-- | |
160 | | -| `1_Robot_Fabi_Exploratory` | Full exploratory prototype | `train_model.py`, `servidor_central.py`, `no_sensor.py` | Model training, JSON communication, TCP server, `Lock`-protected global alert counter, simulated robot client with reconnection logic | |
161 | | -| `2_Robot_Pedro_Exploratory` | Cleaner socket base | `robot_server.py`, `robot_client.py` | Simplified client–server separation, cleaner TCP skeleton, architectural refactoring for later expansion | |
162 | | -| `3-ROBOT_FINAL` | Final integrated solution | `robot_server.py`, `robot_client.py`, `Bot Status Identificator.pkl` | Centralized ML inference, shared robot registry, safer concurrent access, operational command protocol, clean session termination | |
163 | | - |
164 | | - |
165 | | ---- |
166 | | - |
167 | | -## Technical foundations |
168 | | - |
169 | | -### TCP/IP communication |
170 | | - |
171 | | -The project uses **TCP/IP** because the system depends on reliable and ordered communication between the robots and the central server. In a predictive diagnostics scenario, message loss or disorder would compromise the returned diagnosis and the integrity of the shared state maintained by the server. |
172 | | - |
173 | | -This makes TCP a better fit than UDP for this assignment. The system is designed around request–response behavior and consistent server-side processing, not around best-effort or low-overhead delivery. |
174 | | - |
175 | | -### Multithreading |
176 | | - |
177 | | -One of the main requirements is that the server must handle multiple simultaneous network flows. The project addresses this with threads: each client connection is processed independently, so one busy or slow robot does not block the others. |
178 | | - |
179 | | -This design is fundamental in distributed systems because it allows concurrent client handling and better reflects what a real central industrial node would do when many robots send data at the same time. |
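The thread-per-connection idea described above can be sketched as follows. This is a minimal illustration, not the project's actual `robot_server.py`: the function names, the `ACK` reply and the use of an ephemeral port are all assumptions made for the example.

```python
import socket
import threading


def handle_client(conn: socket.socket, addr) -> None:
    """Serve one robot connection until it closes (runs in its own thread)."""
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # an empty read means the client closed the connection
                break
            # Placeholder reply; a real server would run the diagnosis here.
            conn.sendall(b"ACK " + data)


def serve(listener: socket.socket) -> None:
    """Accept connections in a loop, handing each one to a daemon thread."""
    while True:
        try:
            conn, addr = listener.accept()
        except OSError:  # the listening socket was closed, stop accepting
            break
        threading.Thread(
            target=handle_client, args=(conn, addr), daemon=True
        ).start()


def start_server(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Bind, listen and run the accept loop in a background thread.

    Port 0 asks the OS for any free port (an assumption for this sketch).
    """
    listener = socket.create_server((host, port))
    threading.Thread(target=serve, args=(listener,), daemon=True).start()
    return listener
```

Because every connection gets its own thread, a client that stays idle after connecting does not delay replies to the others.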
180 | | - |
181 | | -### Synchronization and shared state |
182 | | - |
183 | | -Concurrency creates the need for synchronization. In the first exploratory implementation, the server maintains a global alert counter and protects every update with `threading.Lock()`, so that increments from different threads are never lost or interleaved. |
184 | | - |
185 | | -This directly addresses the race-condition problem described in the briefing. Later, in the final stage, that same idea evolves into a more structured shared robot registry with synchronized access to preserve consistency under concurrent operations. |
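A minimal sketch of a `Lock`-protected counter, in the spirit of the one described above. The names `alert_count`, `alert_lock`, `register_alert` and `simulate` are hypothetical and are not taken from `servidor_central.py`.

```python
import threading

alert_count = 0                # shared across all client threads
alert_lock = threading.Lock()  # guards every update to alert_count


def register_alert() -> int:
    """Increment the shared counter under the lock and return the new value."""
    global alert_count
    with alert_lock:
        alert_count += 1
        return alert_count


def simulate(n_threads: int = 8, alerts_per_thread: int = 1000) -> int:
    """Raise alerts from several threads at once and return the final total."""
    def worker() -> None:
        for _ in range(alerts_per_thread):
            register_alert()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return alert_count
```

The read-modify-write in `alert_count += 1` is not atomic by itself, which is why the whole update sits inside the `with alert_lock:` block.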
186 | | - |
187 | | -### Machine Learning inference |
188 | | - |
189 | | -The server acts as a centralized classifier: it receives robot features, preprocesses them, runs inference and returns a diagnosis. That architecture appears clearly in the exploratory implementation and remains central to the final design. |
190 | | - |
191 | | -The training script builds a dataset with `temperatura`, `vibracao` and `rpm`, trains a `RandomForestClassifier`, prints a classification report and serializes the model as a `.pkl` file for later use by the server. This allows the model to be loaded at runtime without retraining. |
192 | | - |
193 | | ---- |
194 | | - |
195 | | -## Linux network tips |
196 | | - |
197 | | -Before testing the project, it is useful to verify local network information, port listening state and loopback connectivity. |
198 | | - |
199 | | -### Show IP addresses |
200 | | - |
201 | | -```bash |
202 | | -ip addr |
203 | | -``` |
204 | | - |
205 | | - |
206 | | -### Show listening ports |
207 | | - |
208 | | -```bash |
209 | | -ss -tuln |
210 | | -``` |
211 | | - |
212 | | - |
213 | | -### Show listening ports with processes |
214 | | - |
215 | | -```bash |
216 | | -sudo ss -tulpn |
217 | | -``` |
218 | | - |
219 | | - |
220 | | -### Test loopback connectivity |
221 | | - |
222 | | -```bash |
223 | | -ping localhost |
224 | | -ping 127.0.0.1 |
225 | | -``` |
226 | | - |
227 | | - |
228 | | -### Quick summary |
229 | | - |
230 | | -| Goal | Command | |
231 | | -| :-- | :-- | |
232 | | -| Show local IPs | `ip addr` | |
233 | | -| Show listening ports | `ss -tuln` | |
234 | | -| Show listening ports with process names | `sudo ss -tulpn` | |
235 | | -| Test connectivity | `ping localhost` | |
236 | | - |
237 | | - |
238 | | ---- |
239 | | - |
240 | | -## Stage 1 – `1_Robot_Fabi_Exploratory` |
241 | | - |
242 | | -This stage is the first complete prototype of the project. It already demonstrates the full flow from data generation to server inference and diagnostic response, making it the most conceptually complete exploratory step. |
243 | | - |
244 | | -### `train_model.py` |
245 | | - |
246 | | -The script creates a structured dataset with the features `temperatura`, `vibracao` and `rpm`, and the target `falha`. It then splits the data, trains a `RandomForestClassifier`, prints a classification report and exports both the CSV dataset and the serialized model file. |
247 | | - |
248 | | -Generated artifacts: |
249 | | - |
250 | | -- `exemplo_dados.csv` |
251 | | -- `modelo_falha_rf.pkl` |
252 | | - |
253 | | - |
254 | | -### `servidor_central.py` |
255 | | - |
256 | | -This server validates the model file before startup, loads it with `pickle`, listens on TCP, receives robot telemetry as JSON and classifies the data through a function that expects `temperatura`, `vibracao` and `rpm`. It returns either `"FALHA"` or `"NORMAL"` and also tracks the global number of alerts. |
257 | | - |
258 | | -A strong point of this implementation is its robustness. It handles invalid UTF-8, invalid JSON, missing required fields, numeric conversion errors and internal exceptions, all while protecting the shared counter with `threading.Lock()`. |
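The validation steps listed above could look roughly like this. It is a sketch under assumptions: the function name `parse_telemetry` and the choice to report every problem as `ValueError` are illustrative, and only the three field names come from the README.

```python
import json

REQUIRED_FIELDS = ("temperatura", "vibracao", "rpm")


def parse_telemetry(raw: bytes) -> dict[str, float]:
    """Decode and validate one telemetry message; raise ValueError on bad input."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("payload is not valid UTF-8") from exc

    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError("payload is not valid JSON") from exc

    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")

    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")

    try:
        # Accept numbers and numeric strings alike; reject everything else.
        return {name: float(payload[name]) for name in REQUIRED_FIELDS}
    except (TypeError, ValueError) as exc:
        raise ValueError("fields must be numeric") from exc
```

Funnelling every failure into a single exception type lets the connection handler answer with an error message and keep the thread alive instead of crashing on malformed input.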
259 | | - |
260 | | -### `no_sensor.py` |
261 | | - |
262 | | -This is a simulated robot client used for testing. It generates random sensor values periodically, sends them to the server in JSON format and logs the diagnosis returned by the server together with timestamps and the robot identifier. |
263 | | - |
264 | | -It also implements reconnection behavior for timeouts, refused connections, connection reset and unexpected failures, which makes it useful for resilience testing during demonstrations. |
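A reconnection loop of this kind might be sketched as below. The helper name `connect_with_retry`, the number of attempts and the fixed delay are assumptions for illustration; the actual policy in `no_sensor.py` may differ.

```python
import socket
import time


def connect_with_retry(
    host: str, port: int, attempts: int = 5, delay: float = 1.0
) -> socket.socket:
    """Try to connect, pausing between failed attempts.

    Raises ConnectionError once all attempts are exhausted.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return socket.create_connection((host, port), timeout=5)
        except OSError as exc:
            # ConnectionRefusedError, ConnectionResetError and socket timeouts
            # are all subclasses of OSError, so one clause covers them.
            last_error = exc
            print(f"attempt {attempt}/{attempts} failed: {exc}")
            if attempt < attempts:
                time.sleep(delay)
    raise ConnectionError(f"could not reach {host}:{port}") from last_error
```

During a demonstration, this makes it possible to stop and restart the server while a client is running and watch the client recover.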
265 | | - |
266 | | ---- |
267 | | - |
268 | | -## Stage 2 – `2_Robot_Pedro_Exploratory` |
269 | | - |
270 | | -This stage focuses on simplification and architectural clarity. Instead of keeping the first prototype’s heavier end-to-end structure, it isolates the socket communication into a cleaner server–client base that can be extended later with intelligence and synchronization logic. |
271 | | - |
272 | | -Its main value is educational and architectural. It turns the project into a more understandable network skeleton, which makes the final integrated version easier to develop, test and explain. |
273 | | - |
274 | | ---- |
275 | | - |
276 | | -## Stage 3 – `3-ROBOT_FINAL` |
277 | | - |
278 | | -This is the final version delivered for the course presentation. It extends the cleaner socket structure with centralized ML inference, robot registration, shared state and a command-based interaction model. |
279 | | - |
280 | | -The final version includes: |
281 | | - |
282 | | -- a pre-trained model file; |
283 | | -- a central threaded server; |
284 | | -- a client terminal for operational commands; |
285 | | -- shared robot tracking; |
286 | | -- support for clean disconnect behavior through a sentinel command. |
287 | | - |
288 | | ---- |
289 | | - |
290 | | -## System architecture |
291 | | - |
292 | | -The final system uses a centralized decision architecture. Robot clients act as distributed nodes that send operational data, while the server acts as the intelligence layer responsible for processing, diagnosis and state coordination. This directly matches the Factory 4.0 motivation in the project. |
293 | | - |
294 | | -### Server flow |
295 | | - |
296 | | -1. A robot client connects to the TCP server. |
297 | | -2. The server creates a dedicated thread to handle that connection. |
298 | | -3. The client may register, request the current robot list, send telemetry or terminate its session. |
299 | | -4. The server preprocesses the received payload and invokes the Machine Learning model. |
300 | | -5. The server returns a diagnosis and safely updates shared state whenever needed. |
301 | | - |
302 | | -### Architecture diagram |
303 | | - |
304 | | -```mermaid |
305 | | -flowchart LR |
306 | | - C1[Robot Client 1] |
307 | | - C2[Robot Client 2] |
308 | | - C3[Robot Client 3] |
309 | | - |
310 | | - S[Central TCP Server] |
311 | | - |
312 | | - T1[Thread 1] |
313 | | - T2[Thread 2] |
314 | | - T3[Thread 3] |
315 | | - |
316 | | - M[ML Model] |
317 | | - R[Shared Robot Registry] |
318 | | - L[Lock / RLock] |
319 | | - D[Diagnosis Response] |
320 | | - |
321 | | - C1 --> S |
322 | | - C2 --> S |
323 | | - C3 --> S |
324 | | - |
325 | | - S --> T1 |
326 | | - S --> T2 |
327 | | - S --> T3 |
328 | | - |
329 | | - T1 --> M |
330 | | - T2 --> M |
331 | | - T3 --> M |
332 | | - |
333 | | - T1 --> R |
334 | | - T2 --> R |
335 | | - T3 --> R |
336 | | - |
337 | | - R --> L |
338 | | - M --> D |
339 | | -``` |
340 | | - |
341 | | - |
342 | | ---- |
343 | | - |
344 | | -## Final server |
345 | | - |
346 | | -The final server is the central intelligence node of the system. It is responsible for accepting TCP connections, dispatching work per client, loading a trained model and coordinating shared information about connected robots. |
347 | | - |
348 | | -Its main responsibilities are: |
349 | | - |
350 | | -- accept client connections; |
351 | | -- manage one execution flow per client; |
352 | | -- interpret commands and telemetry payloads; |
353 | | -- perform ML inference; |
354 | | -- return diagnostics; |
355 | | -- maintain synchronized shared robot state. |
356 | | - |
357 | | -This stage is especially important because it turns the earlier exploratory ideas into a more presentable distributed solution. Instead of being only a classifier endpoint, the server now behaves more like an operational control node. |
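One way to keep shared robot state consistent is a small registry class guarded by an `RLock`. The class below is a hypothetical sketch: `RobotRegistry` and its method names are not taken from the final `robot_server.py`, which may organize this state differently.

```python
import threading


class RobotRegistry:
    """Thread-safe map of robot id -> metadata (hypothetical structure)."""

    def __init__(self) -> None:
        self._robots: dict[str, dict] = {}
        self._lock = threading.RLock()

    def register(self, robot_id: str, **info) -> bool:
        """Add a robot; return False if the id is already registered."""
        with self._lock:
            if robot_id in self._robots:
                return False
            self._robots[robot_id] = dict(info)
            return True

    def unregister(self, robot_id: str) -> None:
        """Remove a robot if present; unknown ids are ignored."""
        with self._lock:
            self._robots.pop(robot_id, None)

    def list_robots(self) -> list[str]:
        """Return a sorted snapshot so callers never iterate the live dict."""
        with self._lock:
            return sorted(self._robots)

    def register_and_list(self, robot_id: str) -> list[str]:
        """Nested acquisition: safe only because RLock is re-entrant."""
        with self._lock:
            self.register(robot_id)
            return self.list_robots()
```

With a plain `Lock`, `register_and_list` would deadlock when it calls `register` while already holding the lock; `RLock` lets the same thread acquire it again.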
358 | | - |
359 | | ---- |
360 | | - |
361 | | -## Final client |
362 | | - |
363 | | -The final client is an interactive terminal interface that communicates with the central server. It supports operational commands and inference input, making it appropriate both for testing and for classroom demonstration. |
364 | | - |
365 | | -Compared with the simulated sensor client from the first stage, this version is more explicit and user-driven. It allows the team to demonstrate robot registration, listing connected robots, sending model inputs and ending sessions cleanly. |
366 | | - |
367 | | ---- |
368 | | - |
369 | | -## How to run |
370 | | - |
371 | | -### Requirements |
372 | | - |
373 | | -- Python 3.10+ |
374 | | -- `joblib` |
375 | | -- `pandas` |
376 | | -- `scikit-learn` |
377 | | -- `numpy` |
378 | | -- `Bot Status Identificator.pkl` inside `3-ROBOT_FINAL/` |
379 | | - |
380 | | - |
381 | | -### Linux or macOS |
382 | | - |
383 | | -1. Navigate to the final folder: |
384 | | -```bash |
385 | | -cd /path/to/3-ROBOT_FINAL |
386 | | -``` |
387 | | - |
388 | | -2. Create and activate a virtual environment, then install the dependencies: |
389 | | -```bash |
390 | | -python3 -m venv .venv |
391 | | -source .venv/bin/activate |
392 | | -pip install joblib pandas scikit-learn numpy |
393 | | -``` |
394 | | - |
395 | | -3. Optionally test local connectivity: |
396 | | -```bash |
397 | | -ping localhost |
398 | | -# or |
399 | | -ping 127.0.0.1 |
400 | | -``` |
401 | | - |
402 | | -4. Start the server: |
403 | | -```bash |
404 | | -python3 robot_server.py |
405 | | -``` |
406 | | - |
407 | | -5. In another terminal, start one or more clients: |
408 | | -```bash |
409 | | -cd /path/to/3-ROBOT_FINAL |
410 | | -source .venv/bin/activate |
411 | | -python3 robot_client.py |
412 | | -``` |
413 | | - |
414 | | - |
415 | | -### VS Code |
416 | | - |
417 | | -Open the repository in VS Code, enter the `3-ROBOT_FINAL` folder and use the integrated terminal to run the same commands. During presentation, it is useful to keep one terminal for the server and two or three terminals for clients so concurrency becomes visible. |
418 | | - |
419 | | ---- |
420 | | - |
421 | | -## Commands supported |
422 | | - |
423 | | -The final interaction model is command-based, which improves clarity during testing and presentation. It supports registration, listing connected robots, sending inference data and terminating the session with a sentinel command. |
424 | | - |
425 | | -Typical commands: |
426 | | - |
427 | | -- `cadastro` |
428 | | -- `listar bots` |
429 | | -- feature vector payload |
430 | | -- `sair` |
431 | | - |
432 | | -Example payload: |
433 | | - |
434 | | -```text |
435 | | -10, 50, 70, 3, "v1.0" |
436 | | -``` |
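A server could tell commands apart from payloads along these lines. This is a sketch under assumptions: `parse_message` is a hypothetical helper, case-insensitive matching is a choice made here, and the meaning of each payload field is not specified in this README, so the values are only split and converted, not interpreted.

```python
import csv

COMMANDS = ("cadastro", "listar bots", "sair")


def parse_message(line: str):
    """Classify one client line as a command or a feature-vector payload."""
    text = line.strip()
    command = text.lower()
    if command in COMMANDS:
        return ("command", command)

    # Otherwise treat the line as comma-separated values.
    # skipinitialspace lets the csv module handle '3, "v1.0"' correctly.
    fields = next(csv.reader([text], skipinitialspace=True))
    values = []
    for field in fields:
        try:
            values.append(float(field))
        except ValueError:
            values.append(field)  # keep non-numeric fields as text
    return ("payload", values)
```

For the example payload above, the call returns the four numbers as floats followed by the string `v1.0`.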
437 | | - |
438 | | - |
439 | | ---- |
440 | | - |
441 | | -## What each stage contributes |
442 | | - |
443 | | -The first stage proves that the complete pipeline works: data generation, model training, TCP communication, diagnosis and concurrent alert counting. It is the most experimental stage, but also the one that demonstrates the most functionality end to end. |
444 | | - |
445 | | -The second stage improves software structure by simplifying the socket base. The third stage then combines architectural cleanliness with the intelligence and synchronization ideas introduced earlier, resulting in the final presented system. |
446 | | - |
447 | | ---- |
448 | | - |
449 | | -## Presentation guide |
450 | | - |
451 | | -### Suggested live demo |
452 | | - |
453 | | -A strong presentation flow is: |
454 | | - |
455 | | -1. Start the central server. |
456 | | -2. Open multiple client terminals. |
457 | | -3. Register clients with `cadastro`. |
458 | | -4. Show the result of `listar bots`. |
459 | | -5. Send one or more telemetry payloads. |
460 | | -6. Explain how the diagnosis is returned and how the shared state remains consistent under concurrent activity. |
461 | | - |
462 | | -### Quick oral explanations |
463 | | - |
464 | | -**Why TCP?** |
465 | | -Because the system needs reliable, ordered communication between robots and the central server. That is essential for valid diagnostics and consistent counters. |
466 | | - |
467 | | -**Why threads?** |
468 | | -Because the server must handle multiple robots simultaneously without blocking other connections. That is one of the main distributed-systems requirements. |
469 | | - |
470 | | -**Why `Lock` or `RLock`?** |
471 | | -Because shared state can be accessed by many threads at the same time. Synchronization prevents race conditions and preserves data integrity. |
472 | | - |
473 | | -**Why a `.pkl` model file?** |
474 | | -Because the model is trained beforehand and then loaded directly by the server at runtime, which makes real-time inference practical during execution and presentation. |
475 | | - |
476 | | ---- |
477 | | - |
478 | | -## Final note |
479 | | - |
480 | | -ROBOT Sentinel documents the evolution of a distributed predictive-diagnostics system from exploratory prototype to integrated final solution. The final architecture combines TCP communication, multithreading, synchronization and Machine Learning inference in a way that reflects the goals of a Factory 4.0 classroom project. |
481 | | - |
482 | | - |
483 | | - |