Commit cf9eab7

Author: BESS Solutions
feat(v0.6.0): Edge AI — AI-IDS + ONNX Dispatcher
Implements the first two capabilities from the BESSAI v2.0 roadmap (Q3 2026):

AI-IDS (src/interfaces/ai_ids.py):
- ModbusAnomalyDetector: IsolationForest + z-score timing ensemble
- Score 0-1 (0=normal, 1=anomalous), threshold=0.65 default
- Fail-safe: returns 0.0 before fit() (no false positives on startup)
- Alerts via structlog + bess_ids_alerts_total Prometheus counter

ONNX Dispatcher (src/interfaces/onnx_dispatcher.py):
- Loads dispatch_policy.onnx at edge (no internet required)
- Input: [soc_pct, power_kw, temp_c, hour_of_day] -> target_kw
- Graceful fallback: returns None if model missing -> SafetyGuard takes over
- Async context manager with Prometheus bess_onnx_inference_ms gauge

Metrics (src/interfaces/metrics.py):
- +4 new Prometheus metrics: IDS_ALERTS_TOTAL, IDS_ANOMALY_SCORE, ONNX_INFERENCE_MS, ONNX_DISPATCH_COMMANDS_TOTAL

ONNX model (models/dispatch_policy.onnx):
- Dummy linear model: target_kw = soc_pct * 0.8 (for testing)
- Generated by scripts/generate_dummy_onnx.py
- Replace with trained Ray RLlib export in production

Dependencies: numpy>=1.26.0, scikit-learn>=1.4.0, onnxruntime>=1.18.0

Tests: 73/73 passed in 11.89s (+19 new: test_ai_ids.py + test_onnx_dispatcher.py)
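
The fail-safe and threshold behaviour described for AI-IDS can be illustrated with a small sketch. It covers only a timing z-score component, not the IsolationForest half of the ensemble, and it is not the code in src/interfaces/ai_ids.py. The class name `TimingZScoreScorer`, the `z_cap` squashing constant, and the method names are assumptions made for illustration. Only the 0-1 score range, the 0.65 default threshold, and the 0.0 return before fit() come from the commit message.

```python
from __future__ import annotations

import statistics


class TimingZScoreScorer:
    """Hypothetical timing-only scorer; not the project's ModbusAnomalyDetector."""

    def __init__(self, threshold: float = 0.65, z_cap: float = 6.0) -> None:
        self.threshold = threshold  # default taken from the commit message
        self.z_cap = z_cap          # assumption: |z| >= z_cap maps to score 1.0
        self._mean: float | None = None
        self._std: float | None = None

    def fit(self, intervals_ms: list[float]) -> None:
        """Learn the baseline mean and spread of inter-request intervals."""
        if len(intervals_ms) < 2:
            raise ValueError("need at least two samples to estimate spread")
        self._mean = statistics.fmean(intervals_ms)
        # Guard against a zero spread so score() never divides by zero.
        self._std = max(statistics.stdev(intervals_ms), 1e-9)

    def score(self, interval_ms: float) -> float:
        """Return an anomaly score in [0, 1]; 0.0 until fit() has been called."""
        if self._mean is None or self._std is None:
            return 0.0  # fail-safe: no alerts before a baseline exists
        z = abs(interval_ms - self._mean) / self._std
        return min(z / self.z_cap, 1.0)

    def is_anomalous(self, interval_ms: float) -> bool:
        """Compare the score against the configured threshold."""
        return self.score(interval_ms) >= self.threshold


scorer = TimingZScoreScorer()
print(scorer.score(5000.0))  # 0.0, because fit() has not run yet
scorer.fit([100.0, 102.0, 98.0, 101.0, 99.0])
print(scorer.is_anomalous(100.5), scorer.is_anomalous(500.0))  # False True
```

Dividing the z-score by a cap is only one way to squash it into the 0-1 range; the actual ensemble may combine its two signals differently.
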
1 parent b9212d7 commit cf9eab7

9 files changed

Lines changed: 880 additions & 21 deletions

CHANGELOG.md

Lines changed: 18 additions & 21 deletions
@@ -7,34 +7,32 @@
 
 ---
 
-## 🤖 AGENT HANDOFF: Current project status (2026-02-19T15:00 -03:00)
+## 🤖 AGENT HANDOFF: Current project status (2026-02-19T15:09 -03:00)
 
 ### System context
 **BESSAI Edge Gateway** (`open-bess-edge`) is the edge component of an industrial battery management system (BESS). It acquires telemetry over **Modbus TCP** from Huawei SUN2000 inverters, runs safety validation, and publishes to **GCP Pub/Sub**, with observability through **OpenTelemetry** and **Prometheus**.
 
-### Code status: ✅ v0.5.0, COMPLETE AND VALIDATED
+### Code status: ✅ v0.6.0, COMPLETE AND VALIDATED
 
 | File | Status | Notes |
 |---|---|---|
 | `src/core/config.py` | ✅ Production | `INVERTER_IP` accepts IPs and hostnames. New: `HEALTH_PORT=8000` |
 | `src/core/safety.py` | ✅ Production | check_safety + watchdog_loop async |
 | `src/core/main.py` | ✅ Production | Integrated with HealthServer + Prometheus metrics |
 | `src/drivers/modbus_driver.py` | ✅ Production | pymodbus 3.12, struct-based encode/decode |
-| `src/interfaces/health.py` | **NEW** | HTTP server: /health (JSON) + /metrics (Prometheus) via aiohttp |
-| `src/interfaces/metrics.py` | **NEW** | Counters/Gauges: cycles, safety_blocks, SOC, power, cycle_duration |
+| `src/interfaces/health.py` | ✅ Production | HTTP server: /health (JSON) + /metrics (Prometheus) via aiohttp |
+| `src/interfaces/metrics.py` | **EXTENDED** | +4 AI metrics: IDS_ALERTS, IDS_SCORE, ONNX_MS, ONNX_CMDS |
+| `src/interfaces/ai_ids.py` | **NEW** | AI-IDS: IsolationForest + z-score ensemble, score 0-1, Prometheus alerts |
+| `src/interfaces/onnx_dispatcher.py` | **NEW** | Offline ONNX Runtime dispatcher, graceful fallback if no model is present |
 | `src/interfaces/pubsub_publisher.py` | ✅ Production | Async context manager, GCP Pub/Sub, JSON envelope |
 | `src/interfaces/otel_setup.py` | ✅ Production | TracerProvider + MeterProvider |
-| `infrastructure/docker/docker-compose.yml` | **IMPROVED** | +`monitoring` profile (Prometheus+Grafana), port 8000, HTTP healthcheck |
-| `infrastructure/prometheus/prometheus.yml` | **NEW** | Scrape config: gateway:8000 + otel-collector:8888 |
-| `infrastructure/grafana/provisioning/` | **NEW** | Prometheus datasource auto-provisioning |
-| `infrastructure/terraform/backend.tf` | **NEW** | GCS remote state config (ready to enable) |
-| `infrastructure/terraform/terraform.tfvars.example` | **NEW** | TF variables template |
-| `pyproject.toml` | **NEW** | Centralizes ruff/mypy/pytest/coverage config |
-| `docs/local_development.md` | **NEW** | Complete local development guide |
-| `.github/workflows/ci.yml` | **IMPROVED** | +`terraform-validate` job (no GCP credentials) |
-| `tests/test_health.py` | **NEW** | 9 tests for the /health and /metrics endpoints |
-
-**Test suite: 54/54 ✅ in 6.96s, Python 3.14 · pytest-asyncio 1.3.0**
+| `models/dispatch_policy.onnx` | **NEW** | Dummy model (SOC×0.8). Replace with a Ray RLlib export. |
+| `scripts/generate_dummy_onnx.py` | **NEW** | Generates the dummy model + built-in smoke test |
+| `infrastructure/docker/docker-compose.yml` | ✅ Production | `monitoring` profile (Prometheus+Grafana), port 8000 |
+| `infrastructure/prometheus/prometheus.yml` | ✅ Production | Scrape config: gateway:8000 + otel-collector:8888 |
+| `infrastructure/terraform/` | ✅ Production | apply executed, 18 resources in GCP |
+
+**Test suite: 73/73 ✅ in 11.89s, Python 3.14 · pytest-asyncio 1.3.0**
 
 ### 🐳 Docker stack: OPERATIONAL
 
@@ -66,13 +64,12 @@ docker compose -f infrastructure/docker/docker-compose.yml --profile simulator -
 
 ### 🟢 Next agent: continue here
 
-**All blockers resolved.** The full pipeline is operational:
-- lint (ruff) → test (54/54) → tf-validate → docker-build → docker-push → Artifact Registry
+**All blockers resolved.** The full pipeline is operational.
 
-**Next priority, BESSAI v2.0 (Q3 2026):**
-- Edge AI: ONNX Runtime (offline inference)
-- AI-IDS: Modbus intrusion detection
-- See roadmap: `docs/bessai_v2_roadmap.md`
+**Next priority, BESSAI v0.7.0 (Edge AI Phase 2):**
+- DRL Training: Ray RLlib (PPO/SAC) + Gymnasium + pandapower simulator
+- Federated Learning: Flower (flwr), only gradients leave the edge
+- See roadmap: `docs/bessai_v2_roadmap.md` (Phase 2 still in progress)
 
 ### 📂 Key file structure
 ```

models/dispatch_policy.onnx

265 Bytes
Binary file not shown.
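
The commit message says the dispatcher returns None when the model file is missing, so that SafetyGuard can take over. A minimal sketch of that fallback follows; it is not the contents of src/interfaces/onnx_dispatcher.py. The function name `predict_target_kw` and the lazy-import structure are assumptions. The feature order, the `input` tensor name, and the default model path are taken from the commit.

```python
from __future__ import annotations

from pathlib import Path


def predict_target_kw(
    soc_pct: float,
    power_kw: float,
    temp_c: float,
    hour_of_day: float,
    model_path: str = "models/dispatch_policy.onnx",
) -> float | None:
    """Hypothetical helper: return target_kw, or None if inference is unavailable."""
    if not Path(model_path).is_file():
        return None  # model missing: the caller falls back to its safety logic
    try:
        # Imported lazily so the caller still works without these packages.
        import numpy as np
        import onnxruntime as ort
    except ImportError:
        return None
    session = ort.InferenceSession(model_path)
    features = np.array([[soc_pct, power_kw, temp_c, hour_of_day]], dtype=np.float32)
    # "input" is the tensor name used by scripts/generate_dummy_onnx.py.
    outputs = session.run(None, {"input": features})
    return float(outputs[0][0][0])


# With no model at the given path, the helper signals "no decision".
print(predict_target_kw(90.0, 50.0, 25.0, 14.0, model_path="does/not/exist.onnx"))
```

Creating a session on every call keeps the sketch short; a long-running dispatcher would more plausibly load the session once and reuse it.
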

requirements.txt

Lines changed: 10 additions & 0 deletions
@@ -54,3 +54,13 @@ prometheus_client>=0.20.0
 # HTTP server - Health check & metrics endpoint
 # ---------------------------------------------------------------------------
 aiohttp>=3.9.5
+
+# ---------------------------------------------------------------------------
+# Edge AI - AI-IDS (Modbus anomaly detection) + ONNX inference engine
+# ---------------------------------------------------------------------------
+# Numerical computing (required by scikit-learn and onnxruntime)
+numpy>=1.26.0
+# IsolationForest for Modbus anomaly detection (AI-IDS)
+scikit-learn>=1.4.0
+# ONNX Runtime for offline dispatch policy inference
+onnxruntime>=1.18.0

scripts/generate_dummy_onnx.py

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
+"""
+scripts/generate_dummy_onnx.py
+================================
+Generates a minimal ONNX model for local testing of the ONNXDispatcher.
+
+Model: Linear dispatch policy
+  Input: [soc_pct, power_kw, temp_c, hour_of_day] (shape: 1x4, float32)
+  Output: [dispatch_target_kw] (shape: 1x1, float32)
+  Formula: target_kw = soc_pct * 0.8
+
+This is a placeholder model. In production, the ONNX model is produced by
+training a Ray RLlib policy and exporting it via onnxmltools or torch.onnx.
+
+Usage:
+    python scripts/generate_dummy_onnx.py
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import numpy as np
+
+try:
+    import onnx
+    from onnx import TensorProto, helper
+    _ONNX_AVAILABLE = True
+except ImportError:
+    _ONNX_AVAILABLE = False
+
+try:
+    import onnxruntime as ort
+    _ORT_AVAILABLE = True
+except ImportError:
+    _ORT_AVAILABLE = False
+
+
+def create_dummy_onnx_model() -> "onnx.ModelProto":
+    """Create a minimal linear dispatch ONNX model.
+
+    Graph: input (1x4) → MatMul (4x1 weights) → output (1x1)
+    Weights: [0.8, 0.0, 0.0, 0.0] → target_kw ≈ soc_pct * 0.8
+    """
+    # Weight matrix [4, 1]: only soc_pct (index 0) has weight 0.8
+    weights = np.array([[0.8], [0.0], [0.0], [0.0]], dtype=np.float32)
+    weight_tensor = helper.make_tensor(
+        name="W",
+        data_type=TensorProto.FLOAT,
+        dims=[4, 1],
+        vals=weights.flatten().tolist(),
+    )
+
+    # Graph definition
+    input_ = helper.make_tensor_value_info("input", TensorProto.FLOAT, [1, 4])
+    output = helper.make_tensor_value_info("output", TensorProto.FLOAT, [1, 1])
+
+    matmul_node = helper.make_node(
+        op_type="MatMul",
+        inputs=["input", "W"],
+        outputs=["output"],
+    )
+
+    graph = helper.make_graph(
+        nodes=[matmul_node],
+        name="DummyDispatchPolicy",
+        inputs=[input_],
+        outputs=[output],
+        initializer=[weight_tensor],
+    )
+
+    model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 17)])
+    model.doc_string = (
+        "Dummy dispatch policy for BESSAI testing. "
+        "target_kw = soc_pct * 0.8. "
+        "Replace with a trained Ray RLlib export in production."
+    )
+    onnx.checker.check_model(model)
+    return model
+
+
+def main() -> None:
+    if not _ONNX_AVAILABLE:
+        print("ERROR: 'onnx' package not installed. Run: pip install onnx")
+        raise SystemExit(1)
+
+    model = create_dummy_onnx_model()
+    output_path = Path("models/dispatch_policy.onnx")
+    output_path.parent.mkdir(parents=True, exist_ok=True)
+    onnx.save(model, str(output_path))
+    print(f"[OK] Dummy ONNX model saved: {output_path}")
+
+    # Quick smoke test with onnxruntime
+    if _ORT_AVAILABLE:
+        sess = ort.InferenceSession(str(output_path))
+        test_input = np.array([[90.0, 50.0, 25.0, 14.0]], dtype=np.float32)
+        outputs = sess.run(None, {"input": test_input})
+        target_kw = float(outputs[0][0][0])
+        print(f" Smoke test -> input SOC=90% -> dispatch_target_kw={target_kw:.2f} kW")
+        assert abs(target_kw - 72.0) < 1.0, f"Expected ~72.0, got {target_kw}"
+        print(" [OK] Smoke test passed.")
+    else:
+        print(" [WARN] onnxruntime not installed - skipping smoke test.")
+
+
+if __name__ == "__main__":
+    main()
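
The smoke test's expected value of about 72.0 kW follows directly from the weight vector: the MatMul reduces to a dot product in which only soc_pct has a non-zero weight. A dependency-free check of that arithmetic is sketched below; the helper name `dummy_policy` is hypothetical and is not part of the script.

```python
import math

# Weights from create_dummy_onnx_model(): only soc_pct (index 0) contributes.
WEIGHTS = [0.8, 0.0, 0.0, 0.0]


def dummy_policy(features: list[float]) -> float:
    """Pure-Python equivalent of the 1x4 by 4x1 MatMul (hypothetical helper)."""
    if len(features) != len(WEIGHTS):
        raise ValueError("expected [soc_pct, power_kw, temp_c, hour_of_day]")
    return sum(x * w for x, w in zip(features, WEIGHTS))


# Same input as the smoke test: SOC=90%, 50 kW, 25 C, hour 14.
target_kw = dummy_policy([90.0, 50.0, 25.0, 14.0])
assert math.isclose(target_kw, 72.0)
```

The ONNX model computes in float32, so its result may differ from this float64 figure in the low-order digits; the script's tolerance of 1.0 leaves ample room for that.
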
