From 5bfa4df907bbe759c4c282126c5363db8fe10f8a Mon Sep 17 00:00:00 2001
From: bruno-f-cruz <7049351+bruno-f-cruz@users.noreply.github.com>
Date: Thu, 30 Jan 2025 10:42:43 -0800
Subject: [PATCH 1/4] Initial draft of harp file standard
---
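Note on the de-multiplexing strategy described in the Introduction of this patch: a minimal Python sketch of how a raw single-file capture could be split per register. It assumes the frame layout of the linked Binary Protocol document (MessageType, Length, Address, Port, PayloadType, payload, Checksum), that Length is a single byte counting the bytes after it, and that Checksum is the byte-wise sum of the preceding bytes modulo 256. The function name `demultiplex` is hypothetical. Verify these assumptions against the protocol spec before relying on them.

```python
from collections import defaultdict


def demultiplex(stream: bytes) -> dict[int, list[bytes]]:
    """Group complete Harp frames by register address (assumed layout)."""
    frames: dict[int, list[bytes]] = defaultdict(list)
    pos = 0
    while pos + 2 <= len(stream):
        length = stream[pos + 1]  # assumed: number of bytes after the Length field
        if length < 4:
            # Address, Port, PayloadType and Checksum need at least 4 bytes.
            raise ValueError(f"implausible length {length} at offset {pos}")
        end = pos + 2 + length
        if end > len(stream):
            break  # truncated trailing frame: stop without emitting it
        frame = stream[pos:end]
        if sum(frame[:-1]) & 0xFF != frame[-1]:
            raise ValueError(f"checksum mismatch at offset {pos}")
        frames[frame[2]].append(frame)  # byte 2 is the register address
        pos = end
    return dict(frames)
```

Each list in the result holds frames of one fixed format, so it can be written sequentially to its own binary file. Whether a file stores whole frames or only payloads should follow the text of the standard; the sketch keeps whole frames so that nothing is lost.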
HarpFileFormat.md | 61 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 61 insertions(+)
create mode 100644 HarpFileFormat.md
diff --git a/HarpFileFormat.md b/HarpFileFormat.md
new file mode 100644
index 0000000..fc76991
--- /dev/null
+++ b/HarpFileFormat.md
@@ -0,0 +1,61 @@
+
+
+# Standardized harp file format
+
+## Introduction
+
+This document defines a standardized file format for logging data from harp devices. The file format is based on the [Harp Binary Protocol](./BinaryProtocol-8bit.md) and is designed for efficient data logging and parsing.
+
+One of the main advantages of using a standardized binary communication protocol is that logging data from harp devices can be largely generalized. In theory, we could simply dump the binary data from the device straight into a single binary file and be done with it. However this is not always the most convenient way to log data. For instance, if one is interested in ingesting only a subset of messages (e.g. only the messages from a particular sensor connected to the harp device), the previous approach would require a post-processing step to filter out the messages of interest. Furthermore, each address, as per harp protocol spec, has potentially different data formats (e.g. U8 vs U16) or even different lengths if array registers are involved. This can make it very tedious to parse and analyze a binary file offline, since we will have to examine the header of each and every message in the file to determine how to extract its contents.
+
+This processing step could be entirely eliminated if we could ensure that all messages in a single binary file had the same format. Fortunately, for a any given harp device, the payload stored in a specific register will have a fixed type and length. This can be leveraged by simply saving messages from a specific register into a different files (also known as a de-multiplexing strategy).
+
+## harp file format
+
+For each device, we will define a "container" file format that will contain data from a single device, where each register will be saved in a separate binary file:
+
+```plaintext
+📦.harp
+ ┣ 📜_0_.bin
+ ┣ 📜_1_.bin
+ ┣ ...
+ ┗ 📜__.bin
+ ```
+---
+
+where:
+
+- the character "_" is reserved as a separator between fields.
+- `` should match the `device.yml` metadata file that fully defines the device and can be found in the repository of each device ([e.g.](https://raw.githubusercontent.com/harp-tech/device.behavior/main/device.yml)). This file can be seen as the "ground-truth" specification of the device. It is used to automatically generate documentation, interfaces and data ingestion tools. While this is not a strict requirement, it is highly recommended.
+- `` is an arbitrary name that identifies the device being used.
+- `` is the register number that is logged in the binary file.
+- `` is an optional suffix that can be co-opted by the user to add any additional information to the file name (e.g. a timestamp, a sequence number, etc).
+- `.harp` is the file extension for the container file.
+
+### The optional `device.yml` file
+
+Including the `device.yml` file that corresponds to the device interface used to log the device's data is recommended. This shall be achieved by simply appending a `device.yml` file to the `harp` file. The container thus becomes:
+```plaintext
+📦.harp
+ ┣ 📜_0_.bin
+ ┣ 📜_1_.bin
+ ┣ ...
+ ┣ 📜__.bin
+ ┗ 📜device.yml (Optional)
+```
+---
+
+## Best practices and application notes
+
+### Logging the device's initial configuration
+
+Most of the registers in a given harp device are not emitting period events. As such, it is impossible to know their state unless explicitly queried. This is particularly important for the configuration registers, which define the behavior of the device, as well as metadata registers (e.g. versions). Fortunately, the [Device specification](./Device.md) defines a feature for dumping the values of all registers during acquisition. This can achieved by sending a single message to the `R_OPERATION_CTRL` register with a Bit3 set to 1. This will trigger the device to send a volley of `READ` type messages with the contents of all registers.
+
+> [!IMPORTANT]
+> In your experiments, always validate that your logging routine has fully initialized before requesting a reading dump from the device. Failure to do so may result in missing data.
+
+
+## Release notes
+
+- v0.1
+ * First draft.
From e6ccd4260aa4dbc36c3b77026e393122b155c6b0 Mon Sep 17 00:00:00 2001
From: bruno-f-cruz <7049351+bruno-f-cruz@users.noreply.github.com>
Date: Thu, 6 Feb 2025 10:32:17 -0800
Subject: [PATCH 2/4] Capitalize `Harp`
---
HarpFileFormat.md | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/HarpFileFormat.md b/HarpFileFormat.md
index fc76991..bb36316 100644
--- a/HarpFileFormat.md
+++ b/HarpFileFormat.md
@@ -1,16 +1,16 @@
-# Standardized harp file format
+# Standardized Harp file format
## Introduction
-This document defines a standardized file format for logging data from harp devices. The file format is based on the [Harp Binary Protocol](./BinaryProtocol-8bit.md) and is designed for efficient data logging and parsing.
+This document defines a standardized file format for logging data from Harp devices. The file format is based on the [Harp Binary Protocol](./BinaryProtocol-8bit.md) and is designed for efficient data logging and parsing.
-One of the main advantages of using a standardized binary communication protocol is that logging data from harp devices can be largely generalized. In theory, we could simply dump the binary data from the device straight into a single binary file and be done with it. However this is not always the most convenient way to log data. For instance, if one is interested in ingesting only a subset of messages (e.g. only the messages from a particular sensor connected to the harp device), the previous approach would require a post-processing step to filter out the messages of interest. Furthermore, each address, as per harp protocol spec, has potentially different data formats (e.g. U8 vs U16) or even different lengths if array registers are involved. This can make it very tedious to parse and analyze a binary file offline, since we will have to examine the header of each and every message in the file to determine how to extract its contents.
+One of the main advantages of using a standardized binary communication protocol is that logging data from Harp devices can be largely generalized. In theory, we could simply dump the binary data from the device straight into a single binary file and be done with it. However this is not always the most convenient way to log data. For instance, if one is interested in ingesting only a subset of messages (e.g. only the messages from a particular sensor connected to the Harp device), the previous approach would require a post-processing step to filter out the messages of interest. Furthermore, each address, as per Harp protocol spec, has potentially different data formats (e.g. U8 vs U16) or even different lengths if array registers are involved. This can make it very tedious to parse and analyze a binary file offline, since we will have to examine the header of each and every message in the file to determine how to extract its contents.
-This processing step could be entirely eliminated if we could ensure that all messages in a single binary file had the same format. Fortunately, for a any given harp device, the payload stored in a specific register will have a fixed type and length. This can be leveraged by simply saving messages from a specific register into a different files (also known as a de-multiplexing strategy).
+This processing step could be entirely eliminated if we could ensure that all messages in a single binary file had the same format. Fortunately, for a any given Harp device, the payload stored in a specific register will have a fixed type and length. This can be leveraged by simply saving messages from a specific register into a different files (also known as a de-multiplexing strategy).
-## harp file format
+## Harp file format
For each device, we will define a "container" file format that will contain data from a single device, where each register will be saved in a separate binary file:
@@ -49,7 +49,7 @@ Including the `device.yml` file that corresponds to the device interface used to
### Logging the device's initial configuration
-Most of the registers in a given harp device are not emitting period events. As such, it is impossible to know their state unless explicitly queried. This is particularly important for the configuration registers, which define the behavior of the device, as well as metadata registers (e.g. versions). Fortunately, the [Device specification](./Device.md) defines a feature for dumping the values of all registers during acquisition. This can achieved by sending a single message to the `R_OPERATION_CTRL` register with a Bit3 set to 1. This will trigger the device to send a volley of `READ` type messages with the contents of all registers.
+Most of the registers in a given Harp device are not emitting period events. As such, it is impossible to know their state unless explicitly queried. This is particularly important for the configuration registers, which define the behavior of the device, as well as metadata registers (e.g. versions). Fortunately, the [Device specification](./Device.md) defines a feature for dumping the values of all registers during acquisition. This can achieved by sending a single message to the `R_OPERATION_CTRL` register with a Bit3 set to 1. This will trigger the device to send a volley of `READ` type messages with the contents of all registers.
> [!IMPORTANT]
> In your experiments, always validate that your logging routine has fully initialized before requesting a reading dump from the device. Failure to do so may result in missing data.
From 98a27acc82b347ccadb69bfb07a6a4ebfed022f5 Mon Sep 17 00:00:00 2001
From: brunocruz <7049351+bruno-f-cruz@users.noreply.github.com>
Date: Thu, 13 Feb 2025 08:50:45 -0800
Subject: [PATCH 3/4] Improve text clarity
Co-authored-by: glopesdev
---
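Note on the register dump paragraph reworded in this patch: a minimal Python sketch of the single message that sets `Bit3` of `R_OPERATION_CTRL`. Every constant below is an assumption to be checked against the Device specification and the Binary Protocol document: the register address (0x0A), the U8 payload type code (0x01), the Write message type (0x02), the device port (0xFF), and the checksum rule (sum of all preceding bytes modulo 256). The function name `build_dump_request` is hypothetical.

```python
WRITE = 0x02             # assumed MessageType code for a Write request
R_OPERATION_CTRL = 0x0A  # assumed address of the operation control register
DEVICE_PORT = 0xFF       # assumed port value addressing the device itself
PAYLOAD_U8 = 0x01        # assumed PayloadType code for an unsigned 8-bit value
DUMP_BIT = 1 << 3        # Bit3, as named in the text


def build_dump_request(current_ctrl: int = 0) -> bytes:
    """Build a Write frame that sets Bit3 while keeping the other bits."""
    value = (current_ctrl | DUMP_BIT) & 0xFF
    # Length counts Address, Port, PayloadType, one payload byte and Checksum.
    body = bytes([WRITE, 5, R_OPERATION_CTRL, DEVICE_PORT, PAYLOAD_U8, value])
    return body + bytes([sum(body) & 0xFF])
```

The current register value is OR-ed with the dump bit because writing the bit alone would also clear every other field of the register. Send the frame only once the logger is already recording, as the IMPORTANT note in the document advises.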
HarpFileFormat.md | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/HarpFileFormat.md b/HarpFileFormat.md
index bb36316..964e7e2 100644
--- a/HarpFileFormat.md
+++ b/HarpFileFormat.md
@@ -6,13 +6,13 @@
This document defines a standardized file format for logging data from Harp devices. The file format is based on the [Harp Binary Protocol](./BinaryProtocol-8bit.md) and is designed for efficient data logging and parsing.
-One of the main advantages of using a standardized binary communication protocol is that logging data from Harp devices can be largely generalized. In theory, we could simply dump the binary data from the device straight into a single binary file and be done with it. However this is not always the most convenient way to log data. For instance, if one is interested in ingesting only a subset of messages (e.g. only the messages from a particular sensor connected to the Harp device), the previous approach would require a post-processing step to filter out the messages of interest. Furthermore, each address, as per Harp protocol spec, has potentially different data formats (e.g. U8 vs U16) or even different lengths if array registers are involved. This can make it very tedious to parse and analyze a binary file offline, since we will have to examine the header of each and every message in the file to determine how to extract its contents.
+One of the main advantages of using a standardized binary communication protocol is that logging data from Harp devices can be largely generalized. Conceptually, because all Harp messages share a common standard structure, we can write all the binary data emitted from a device directly into a single binary file. However, this is not always the most convenient way to log data. For instance, if one is interested in ingesting only a subset of messages (e.g. only the messages from a particular sensor connected to the Harp device), this approach would require a post-processing step to filter out the messages of interest. Furthermore, each address, as per Harp protocol spec, has potentially different data formats (e.g. U8 vs U16) or even different lengths if array registers are involved. This can make it more complex to parse and analyze a binary file offline, since we will have to examine the header of each and every message in the file to determine how to extract its contents.
-This processing step could be entirely eliminated if we could ensure that all messages in a single binary file had the same format. Fortunately, for a any given Harp device, the payload stored in a specific register will have a fixed type and length. This can be leveraged by simply saving messages from a specific register into a different files (also known as a de-multiplexing strategy).
+This processing step could be entirely eliminated if we could ensure that all messages in a single binary file had the same format. Fortunately, for any given Harp device, the payload stored in a specific register address is guaranteed to have a fixed format. This can be leveraged in order to save messages from a specific register into different fixed-format files, by employing a de-multiplexing strategy.
## Harp file format
-For each device, we will define a "container" file format that will contain data from a single device, where each register will be saved in a separate binary file:
+For each device, we define a "container" file format which is essentially a folder that will store data from a single device, and where the payload from messages coming from each register is saved sequentially to a separate binary file:
```plaintext
📦.harp
@@ -23,18 +23,18 @@ For each device, we will define a "container" file format that will contain data
```
---
-where:
+The various components of this convention are detailed below.
-- the character "_" is reserved as a separator between fields.
+- the character `_` is reserved as a separator between fields.
- `` should match the `device.yml` metadata file that fully defines the device and can be found in the repository of each device ([e.g.](https://raw.githubusercontent.com/harp-tech/device.behavior/main/device.yml)). This file can be seen as the "ground-truth" specification of the device. It is used to automatically generate documentation, interfaces and data ingestion tools. While this is not a strict requirement, it is highly recommended.
- `` is an arbitrary name that identifies the device being used.
- `` is the register number that is logged in the binary file.
-- `` is an optional suffix that can be co-opted by the user to add any additional information to the file name (e.g. a timestamp, a sequence number, etc).
-- `.harp` is the file extension for the container file.
+- `` is an optional suffix that can be co-opted by the user to add any additional information to the file name (e.g. a timestamp, a sequence number, etc). If there is no ``, the final `_` should be omitted.
+- `.harp` is the extension for the container folder.
### The optional `device.yml` file
-Including the `device.yml` file that corresponds to the device interface used to log the device's data is recommended. This shall be achieved by simply appending a `device.yml` file to the `harp` file. The container thus becomes:
+Including the `device.yml` file that corresponds to the interface used to log the device's data is recommended. To do this, we place a `device.yml` file at the root of the container folder. The folder structure thus becomes:
```plaintext
📦.harp
┣ 📜_0_.bin
@@ -49,10 +49,10 @@ Including the `device.yml` file that corresponds to the device interface used to
### Logging the device's initial configuration
-Most of the registers in a given Harp device are not emitting period events. As such, it is impossible to know their state unless explicitly queried. This is particularly important for the configuration registers, which define the behavior of the device, as well as metadata registers (e.g. versions). Fortunately, the [Device specification](./Device.md) defines a feature for dumping the values of all registers during acquisition. This can achieved by sending a single message to the `R_OPERATION_CTRL` register with a Bit3 set to 1. This will trigger the device to send a volley of `READ` type messages with the contents of all registers.
+Most registers in a Harp device will not emit periodic events. As such, it is impossible to know their state unless explicitly queried. For configuration registers we do want to know this state, since it will define the behavior of the device at runtime. We also want to include metadata registers such as the device name and versions. Fortunately, the [Device specification](./Device.md) defines a feature for dumping the values of all registers during acquisition. By sending a single message to the `R_OPERATION_CTRL` register with `Bit3` set to 1, we can make the device send a rapid sequence of `READ` type messages with the contents of all registers.
> [!IMPORTANT]
-> In your experiments, always validate that your logging routine has fully initialized before requesting a reading dump from the device. Failure to do so may result in missing data.
+> In your experiments, always validate that your logging routine has fully initialized before requesting a read dump from the device. Failure to do so may result in missing data.
## Release notes
From dbed4b6b7f2ab9d548eb1054e725d91d55e70ae0 Mon Sep 17 00:00:00 2001
From: brunocruz <7049351+bruno-f-cruz@users.noreply.github.com>
Date: Thu, 27 Mar 2025 08:52:52 -0700
Subject: [PATCH 4/4] Remove .harp extension requirement from standard
Co-authored-by: glopesdev
---
HarpFileFormat.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/HarpFileFormat.md b/HarpFileFormat.md
index 964e7e2..4068c3c 100644
--- a/HarpFileFormat.md
+++ b/HarpFileFormat.md
@@ -15,7 +15,7 @@ This processing step could be entirely eliminated if we could ensure that all me
For each device, we define a "container" file format which is essentially a folder that will store data from a single device, and where the payload from messages coming from each register is saved sequentially to a separate binary file:
```plaintext
-📦.harp
+📦
┣ 📜_0_.bin
┣ 📜_1_.bin
┣ ...