Bilpapster
diff --git a/docs/concepts/compact-vs-native-data.rst b/docs/concepts/compact-vs-native-data.rst
Lines changed: 417 additions & 0 deletions
diff --git a/docs/concepts/data-quality.rst b/docs/concepts/data-quality.rst
Lines changed: 1 addition & 0 deletions
diff --git a/docs/concepts/index.rst b/docs/concepts/index.rst
Lines changed: 8 additions & 0 deletions
diff --git a/docs/concepts/stream-windows.rst b/docs/concepts/stream-windows.rst
Lines changed: 1 addition & 0 deletions
diff --git a/docs/examples/advanced-examples.rst b/docs/examples/advanced-examples.rst
Lines changed: 133 additions & 0 deletions
diff --git a/docs/examples/basic-examples.rst b/docs/examples/basic-examples.rst
Lines changed: 8 additions & 0 deletions
diff --git a/examples/compact_data.py b/examples/compact_data.py
Lines changed: 132 additions & 0 deletions
@@ -289,6 +289,7 @@ What's Next?
 
 Now that you understand how data quality concepts evolve for streaming data:
 
+- 📊 **Understand data formats**: :doc:`compact-vs-native-data` - How Stream DaQ handles different data representations seamlessly
 - 🪟 **Learn about windowing**: :doc:`stream-windows` - How to make infinite streams manageable
 - 📏 **Explore measures**: :doc:`measures-and-assessments` - The building blocks of Stream DaQ quality checks
 - 💡 **See it in action**: :doc:`../examples/index` - Real-world quality monitoring examples
@@ -40,6 +40,13 @@ Welcome to the conceptual heart of Stream DaQ! Understanding these core concepts
 
         **Stream processing principles** - Understand late arrivals, watermarks, and how Stream DaQ handles the complexity of real-time data.
 
+    .. grid-item-card:: 📊 Compact vs Native Data
+        :link: compact-vs-native-data
+        :link-type: doc
+        :class-header: bg-secondary text-white
+
+        **Data format strategies** - Learn when to use compact vs native formats and how Stream DaQ handles both seamlessly.
+
 The Big Picture
 ---------------------
 
@@ -168,5 +175,6 @@ Ready to dive deeper? Start with :doc:`data-quality` to understand why streaming
    stream-windows
    measures-and-assessments
    real-time-monitoring
+   compact-vs-native-data
 
 |made_with_love|
@@ -454,6 +454,7 @@ What's Next?
 
 Now that you understand how to slice infinite streams into manageable windows:
 
+- 📊 **Understand data formats**: :doc:`compact-vs-native-data` - How different data formats work seamlessly with windows
 - 📏 **Learn about measures**: :doc:`measures-and-assessments` - What to calculate within each window
 - ⚡ **Explore real-time concepts**: :doc:`real-time-monitoring` - Production considerations for windowed monitoring
 - 💡 **See windowing in action**: :doc:`../examples/index` - Real-world windowing patterns
 
@@ -1,6 +1,139 @@
 🧙‍♂️ Advanced Examples
 =============================
 
+Compact Data Monitoring Example
+--------------------------------
+
+Stream DaQ provides seamless support for compact data formats commonly used in IoT and resource-constrained environments. Instead of manually transforming compact data into individual records, Stream DaQ handles this automatically, allowing you to focus on defining meaningful quality measures.
+
+.. seealso::
+   
+   For conceptual background on compact vs native data formats, see :doc:`../concepts/compact-vs-native-data`.
+
+**What makes data "compact"?**
+
+Compact data represents multiple field values in a single record, typically using arrays or lists. This format is prevalent in IoT scenarios because it:
+
+- **Reduces bandwidth usage** by ~60% compared to individual field transmissions
+- **Minimizes storage requirements** on resource-constrained devices  
+- **Enables efficient batch transmission** of multiple sensor readings
+- **Optimizes network protocols** for wireless sensor networks
+
+**Common IoT scenarios using compact data:**
+
+- Environmental monitoring stations (temperature, humidity, pressure)
+- Industrial sensor networks (vibration, temperature, speed)
+- Smart building systems (occupancy, air quality, energy usage)
+- Vehicle telemetry (GPS coordinates, speed, fuel consumption, engine metrics)
+
+.. code-block:: python
+
+    # pip install streamdaq
+    
+    import pathway as pw
+    from streamdaq import DaQMeasures as dqm
+    from streamdaq import CompactData, Windows, StreamDaQ
+
+    # Configuration for compact IoT sensor data
+    FIELDS_COLUMN = "fields"
+    FIELDS = ["temperature", "humidity", "pressure"]  # IoT sensor measurements
+    VALUES_COLUMN = "values"
+    TIMESTAMP_COLUMN = "timestamp"
+
+    # Example compact data source (simulating IoT sensor network)
+    class CompactDataSource(pw.io.python.ConnectorSubject):
+        """Simulates IoT sensors sending data in the compact format."""
+        def run(self):
+            nof_fields = len(FIELDS)
+            nof_compact_rows = 5
+            timestamp = value = 0
+            for _ in range(nof_compact_rows):
+                message = {
+                    TIMESTAMP_COLUMN: timestamp,
+                    FIELDS_COLUMN: FIELDS,
+                    VALUES_COLUMN: [value + i for i in range(nof_fields)]
+                }
+                value += len(FIELDS)
+                timestamp += 1
+                self.next(**message)
+
+    # Define schema for compact data structure
+    schema_dict = {
+        TIMESTAMP_COLUMN: int,
+        FIELDS_COLUMN: list[str],
+        VALUES_COLUMN: list[int | None]  # Supports missing values
+    }
+    schema = pw.schema_from_dict(schema_dict)
+
+    # Create compact data stream
+    compact_data_stream = pw.io.python.read(
+        CompactDataSource(),
+        schema=schema,
+    )
+
+    # Configure Stream DaQ for automatic compact data handling
+    daq = StreamDaQ().configure(
+        window=Windows.sliding(duration=3, hop=1, origin=0),
+        source=compact_data_stream,
+        time_column=TIMESTAMP_COLUMN,
+        wait_for_late=1,  # Handle late IoT data arrivals
+        
+        # Stream DaQ automatically transforms compact to native format
+        compact_data=CompactData() \
+            .with_fields_column(FIELDS_COLUMN) \
+            .with_values_column(VALUES_COLUMN) \
+            .with_values_dtype(int)
+    )
+
+    # Define quality measures for individual sensor fields
+    # Notice: Direct field access despite compact input format!
+    daq.add(dqm.count('pressure'), name="readings") \
+       .add(dqm.missing_count('temperature') + 
+            dqm.missing_count('pressure') + 
+            dqm.missing_count('humidity'),
+            assess="<2", name="missing_readings") \
+       .add(dqm.is_frozen('humidity'), name="frozen_humidity_sensor")
+
+    # Start monitoring
+    daq.watch_out()
+
+**Stream DaQ's Automatic Transformation Benefits:**
+
+1. **No Manual Preprocessing**: Stream DaQ internally converts compact data to native format for quality analysis
+2. **Seamless Field Access**: Reference individual fields (``temperature``, ``humidity``, ``pressure``) directly in quality measures
+3. **Missing Value Handling**: Automatic support for ``None`` values common in real-world IoT scenarios  
+4. **Type Safety**: Configurable data type handling with validation
+5. **Temporal Alignment**: Proper time-based windowing despite compact input format
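To make the windowing and the ``missing_readings`` check from the example more concrete, here is a plain-Python sketch that does not use Stream DaQ or Pathway. It rests on assumptions that should be checked against the actual libraries: windows are taken to be half-open intervals ``[start, start + duration)`` with starts at ``origin``, ``origin + hop``, and so on, and the simulated source is altered so that every value divisible by 5 is missing (the commented-out variant in ``examples/compact_data.py``). The helper names ``make_rows``, ``window_starts`` and ``missing_per_window`` are hypothetical.

```python
# Illustrative sketch only: mimics the example's window settings in plain Python.
# Assumption: windows are half-open [start, start + DURATION), starting at ORIGIN.
DURATION, HOP, ORIGIN = 3, 1, 0
FIELDS = ["temperature", "humidity", "pressure"]


def make_rows(n_rows=5):
    """Recreate the simulated stream in native form; values divisible by 5 are missing."""
    rows, value = [], 0
    for timestamp in range(n_rows):
        values = [value + i if (value + i) % 5 > 0 else None for i in range(len(FIELDS))]
        rows.append({"timestamp": timestamp, **dict(zip(FIELDS, values))})
        value += len(FIELDS)
    return rows


def window_starts(timestamp):
    """Return the start of every window that contains the given timestamp."""
    starts = []
    start = ORIGIN
    while start <= timestamp:
        if timestamp < start + DURATION:
            starts.append(start)
        start += HOP
    return starts


def missing_per_window(rows):
    """Sum the missing readings of all fields for each window start."""
    counts = {}
    for row in rows:
        n_missing = sum(row[field] is None for field in FIELDS)
        for start in window_starts(row["timestamp"]):
            counts[start] = counts.get(start, 0) + n_missing
    return dict(sorted(counts.items()))


report = missing_per_window(make_rows())
for start, n_missing in report.items():
    verdict = "pass" if n_missing < 2 else "fail"  # mirrors assess="<2"
    print(f"window [{start}, {start + DURATION}): missing={n_missing} -> {verdict}")
```

Under these assumptions the first two windows each contain two missing readings and fail the ``<2`` check, while the later ones pass. Stream DaQ's real output may differ, for instance in how partially filled windows at the end of the stream are reported.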
+
+**Compact vs Native Data Comparison:**
+
+.. code-block:: json
+
+    // Compact format (1 record):
+    {
+        "timestamp": 1,
+        "fields": ["temperature", "humidity", "pressure"], 
+        "values": [23.5, 65.2, 1013.25]
+    }
+
+    // Equivalent native format (1 record, one key per field):
+    {"timestamp": 1, "temperature": 23.5, "humidity": 65.2, "pressure": 1013.25}
+
+**Why This Matters for IoT:**
+
+Without Stream DaQ's automatic handling, you would typically need to:
+
+- Manually unpack compact rows into individual field records
+- Handle missing values and data type conversions
+- Manage temporal alignment across different fields
+- Write custom transformation logic before quality monitoring
+
+Stream DaQ eliminates this preprocessing pipeline, allowing you to focus on defining meaningful quality measures rather than data transformation logic. This is especially valuable in resource-constrained environments where development time and computational efficiency are critical.
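For a sense of what that manual preprocessing might look like, here is a minimal plain-Python sketch of unpacking one compact record into one native record with one key per field (the layout shown in the docstring of ``examples/compact_data.py``). The helper ``unpack_compact`` and its parameters are hypothetical and written only for illustration; this is not Stream DaQ's internal implementation.

```python
# Illustrative sketch only: a hand-written unpacking step of the kind
# Stream DaQ is described as making unnecessary.
def unpack_compact(record, fields_column="fields", values_column="values", dtype=float):
    """Convert one compact record into one native record (one key per field)."""
    fields = record[fields_column]
    values = record[values_column]
    if len(fields) != len(values):
        raise ValueError("fields and values must have the same length")

    # Carry over every other column (e.g. the timestamp) unchanged.
    native = {
        key: value
        for key, value in record.items()
        if key not in (fields_column, values_column)
    }
    # Pair each field name with its value, keeping missing readings as None.
    for name, value in zip(fields, values):
        native[name] = None if value is None else dtype(value)
    return native


compact = {
    "timestamp": 1,
    "fields": ["temperature", "humidity", "pressure"],
    "values": [23.5, None, 1013.25],
}
native = unpack_compact(compact)
print(native)
```

Even this small helper has to decide how to treat missing values, type conversion, and mismatched list lengths, which is the kind of logic the ``CompactData`` configuration is meant to take off your hands.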
+
+For a complete working example with detailed comments, see the ``examples/compact_data.py`` file in the examples directory. To understand the conceptual differences between compact and native data formats, see :doc:`../concepts/compact-vs-native-data`.
+
 Schema Validation Example
 --------------------------
 
 
@@ -110,3 +110,11 @@ The trend measure calculates the slope of a linear regression line through the d
 Trend analysis complements traditional min-max and range checks for comprehensive data quality monitoring. While threshold checks validate current values, trend analysis ensures data consistency over time by detecting unexpected patterns or gradual shifts that could indicate sensor drift or measurement errors.
 
 Luckily, Stream DaQ offers a suite of over 30 data quality measures, including range conformance, profiling statistics, trend analysis and many more - making comprehensive data quality monitoring both powerful and effortless!
+
+**What's Next?**
+
+Ready for more advanced scenarios? Check out:
+
+- 🧙‍♂️ **Advanced Examples**: :doc:`advanced-examples` - Compact data handling, schema validation, and more
+- 📚 **Core Concepts**: :doc:`../concepts/index` - Deep dive into streaming data quality theory
+- 📊 **Data Formats**: :doc:`../concepts/compact-vs-native-data` - Understanding different data representations
@@ -0,0 +1,132 @@
+# pip install streamdaq
+
+import pathway as pw
+from streamdaq import DaQMeasures as dqm
+from streamdaq import CompactData, Windows, StreamDaQ
+
+# Configuration constants for compact data structure
+FIELDS_COLUMN = "fields"
+FIELDS = ["temperature", "humidity", "pressure"]  # simulating IoT sensor measurements
+VALUES_COLUMN = "values"
+TIMESTAMP_COLUMN = "timestamp"
+
+
+# We first need to define a data source sending compact data.
+# If you already have one, skip this part!
+class CompactDataSource(pw.io.python.ConnectorSubject):
+    """
+    Simulates an IoT sensor network sending data in the compact format.
+    
+    Example compact format:
+    {
+        "timestamp": 1,
+        "fields": ["temperature", "humidity", "pressure"],
+        "values": [23.5, 65.2, 1013.25]
+    }
+
+    vs. traditional native format:
+    {"timestamp": 1, "temperature": 23.5, "humidity": 65.2, "pressure": 1013.25}
+    """
+
+    def run(self):
+        nof_fields = len(FIELDS)
+        nof_compact_rows = 5  # how many compact data rows to send in this simulation
+        timestamp = value = 0
+        for _ in range(nof_compact_rows):
+            message = {
+                TIMESTAMP_COLUMN: timestamp,
+                FIELDS_COLUMN: FIELDS,
+                VALUES_COLUMN: [value + i for i in range(nof_fields)]
+                # VALUES_COLUMN: [value + i if (value + i) % 5 > 0 else None for i in range(nof_fields)]
+                # Swap in the line above to spice things up: every value divisible by 5 becomes a missing reading (None) ;)
+            }
+            value += len(FIELDS)
+            timestamp += 1
+            self.next(**message)
+
+
+# Define schema for the compact data structure
+schema_dict = {
+    TIMESTAMP_COLUMN: int,
+    FIELDS_COLUMN: list[str],
+    VALUES_COLUMN: list[int | None],  # Supports missing values (None) for real-world scenarios
+}
+schema = pw.schema_from_dict(schema_dict)
+
+# Create the compact data stream (simulating IoT sensor network)
+compact_data_stream = pw.io.python.read(
+    CompactDataSource(),
+    schema=schema,
+)
+
+print("The initial data source sends compact data, like this:")
+pw.debug.compute_and_print(compact_data_stream)
+
+# If you already have a compact data source, your job starts here!
+
+# Step 1: Configure Stream DaQ for compact data monitoring
+# Stream DaQ automatically handles the transformation from compact to native format,
+# eliminating the need for manual data preprocessing that would typically require:
+# - Unpacking compact rows into individual field records
+# - Handling missing values and data type conversions
+# - Managing temporal alignment across different fields
+daq = StreamDaQ().configure(
+    window=Windows.sliding(duration=3, hop=1, origin=0),  # 3-second sliding window with 1-second hop
+    source=compact_data_stream,
+    time_column=TIMESTAMP_COLUMN,
+    # Just define how your compact data is structured; Stream DaQ takes care of all the rest!
+    # This CompactData configuration tells Stream DaQ how to interpret your format
+    compact_data=CompactData()
+    .with_fields_column(FIELDS_COLUMN)
+    .with_values_column(VALUES_COLUMN)
+    .with_values_dtype(int),
+)
+
+# Step 2: Define data quality measures for IoT sensor monitoring
+# Notice how we can directly reference individual fields (temperature, humidity, pressure)
+# even though they arrive in compact format - Stream DaQ handles the unpacking automatically!
+daq.add(dqm.count("pressure"), name="readings") \
+    .add(
+        # Total missing readings per window across all fields
+        dqm.missing_count("temperature")
+        + dqm.missing_count("pressure")
+        + dqm.missing_count("humidity"),
+        assess="<2",  # We can tolerate at most one missing reading per window
+        name="missing_readings",
+    ) \
+    .add(dqm.is_frozen("humidity"), name="frozen_humidity_sensor")  # Detect stuck humidity sensor
+
+# Complete list of Data Quality Measures (dqm): https://github.com/Bilpapster/stream-DaQ/blob/main/streamdaq/DaQMeasures.py
+
+
+# Step 3: Kick off monitoring and let Stream DaQ do the work while you focus on what matters
+daq.watch_out()
+
+# IoT Compact Data Monitoring Benefits:
+#
+# 1. Bandwidth Efficiency:
+#    - Compact format reduces network traffic by ~60% compared to individual field transmissions
+#    - Critical for battery-powered sensors with limited connectivity
+#
+# 2. Automatic Transformation:
+#    - Stream DaQ internally converts compact data to native format for quality analysis
+#    - No manual preprocessing required - just specify the compact data structure
+#    - Handles missing values, data types, and temporal alignment automatically
+#
+# 3. Real-World IoT Scenarios:
+#    - Environmental monitoring stations (temperature, humidity, pressure)
+#    - Industrial sensor networks (vibration, temperature, speed)
+#    - Smart building systems (occupancy, air quality, energy usage)
+#    - Vehicle telemetry (GPS, speed, fuel consumption, engine metrics)
+#
+# 4. Quality Monitoring Without Complexity:
+#    - Apply the same quality measures as native data streams
+#    - Detect sensor failures, missing readings, and data anomalies
+#    - Monitor trends and patterns across multiple sensor types simultaneously
+#
+# Stream DaQ's compact data handling eliminates the typical IoT data preprocessing
+# pipeline, allowing you to focus on defining meaningful quality measures rather
+# than data transformation logic. This is especially valuable in resource-constrained
+# environments where development time and computational efficiency are critical!
+#
+# 📚 Learn More:
+# - Comprehensive compact data documentation: docs/examples/advanced-examples.rst
+# - Conceptual background: docs/concepts/compact-vs-native-data.rst