Feature/1016 Create pics to xml script#1019
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a new utility script scripts/utils/pics_to_xml.sh designed to convert PICS log data (either in Python dict log format or raw KEY=0/1 lines) into XML PICS files grouped by cluster prefix. The reviewer provided several constructive suggestions to improve the script's robustness and parsing accuracy. These include enabling strict error handling (set -euo pipefail), refactoring short-circuit shell evaluations into standard if statements to prevent unexpected exits under set -e, checking argument counts to avoid unbound variable errors under set -u, enhancing the KEY=0/1 parser to handle whitespaces and carriage returns, removing an unused variable in the awk command, and expanding command reception parsing to support both Rsp and Rx suffixes.
| @@ -0,0 +1,273 @@ | |||
| #!/usr/bin/env bash | |||
| usage() { | ||
| echo "Usage: $0 <input_file> [output_dir]" | ||
| echo " $0 --text \"KEY=0/1 ...\" [-o output_dir]" | ||
| exit 1 | ||
| } | ||
|
|
||
| [ $# -eq 0 ] && usage |
There was a problem hiding this comment.
Under set -e, short-circuit evaluations like [ $# -eq 0 ] && usage can cause the shell to exit unexpectedly if the condition is false (since the overall expression returns a non-zero status). Rewriting this as a standard if statement is safer and more readable.
| usage() { | |
| echo "Usage: $0 <input_file> [output_dir]" | |
| echo " $0 --text \"KEY=0/1 ...\" [-o output_dir]" | |
| exit 1 | |
| } | |
| [ $# -eq 0 ] && usage | |
| usage() { | |
| echo "Usage: $0 <input_file> [output_dir]" | |
| echo " $0 --text \\\"KEY=0/1 ...\\\" [-o output_dir]" | |
| exit 1 | |
| } | |
| if [ $# -eq 0 ]; then | |
| usage | |
| fi |
| while [ $# -gt 0 ]; do | ||
| case "$1" in | ||
| --text) | ||
| [ -z "$2" ] && { echo "ERROR: --text requires a value" >&2; exit 1; } | ||
| TEXT_MODE=1 | ||
| INLINE_TEXT="$2" | ||
| shift 2 | ||
| ;; | ||
| -o|--output-dir) | ||
| [ -z "$2" ] && { echo "ERROR: -o requires a value" >&2; exit 1; } | ||
| OUTPUT_DIR="$2" | ||
| shift 2 | ||
| ;; | ||
| -*) | ||
| echo "ERROR: Unknown option: $1" >&2; usage | ||
| ;; | ||
| *) | ||
| if [ -z "$INPUT_FILE" ]; then | ||
| INPUT_FILE="$1" | ||
| elif [ -z "$OUTPUT_DIR" ]; then | ||
| OUTPUT_DIR="$1" # positional second arg kept for back-compat | ||
| else | ||
| echo "ERROR: Unexpected argument: $1" >&2; usage | ||
| fi | ||
| shift | ||
| ;; | ||
| esac | ||
| done |
There was a problem hiding this comment.
When parsing arguments, checking $2 directly (e.g., [ -z "$2" ]) when $# might be less than 2 can lead to unbound variable errors under set -u in some shells, and is generally less robust. Checking the argument count $# explicitly before accessing $2 is safer and more idiomatic. Also, using standard if statements instead of && short-circuits prevents unexpected exits under set -e.
while [ $# -gt 0 ]; do
case "$1" in
--text)
if [ $# -lt 2 ]; then
echo "ERROR: --text requires a value" >&2
exit 1
fi
TEXT_MODE=1
INLINE_TEXT="$2"
shift 2
;;
-o|--output-dir)
if [ $# -lt 2 ]; then
echo "ERROR: -o requires a value" >&2
exit 1
fi
OUTPUT_DIR="$2"
shift 2
;;
-*)
echo "ERROR: Unknown option: $1" >&2
usage
;;
*)
if [ -z "$INPUT_FILE" ]; then
INPUT_FILE="$1"
elif [ -z "$OUTPUT_DIR" ]; then
OUTPUT_DIR="$1" # positional second arg kept for back-compat
else
echo "ERROR: Unexpected argument: $1" >&2
usage
fi
shift
;;
esac
done| if [ "$TEXT_MODE" -eq 0 ]; then | ||
| [ -z "$INPUT_FILE" ] && usage | ||
| [ ! -f "$INPUT_FILE" ] && { echo "ERROR: File not found: $INPUT_FILE" >&2; exit 1; } | ||
| OUTPUT_DIR="${OUTPUT_DIR:-$(dirname "$INPUT_FILE")}" | ||
| else | ||
| [ -n "$INPUT_FILE" ] && { echo "ERROR: --text and a file path are mutually exclusive" >&2; exit 1; } | ||
| OUTPUT_DIR="${OUTPUT_DIR:-$(pwd)}" | ||
| fi |
There was a problem hiding this comment.
Using standard if statements instead of && short-circuits for validation checks is much safer under set -e and improves readability.
| if [ "$TEXT_MODE" -eq 0 ]; then | |
| [ -z "$INPUT_FILE" ] && usage | |
| [ ! -f "$INPUT_FILE" ] && { echo "ERROR: File not found: $INPUT_FILE" >&2; exit 1; } | |
| OUTPUT_DIR="${OUTPUT_DIR:-$(dirname "$INPUT_FILE")}" | |
| else | |
| [ -n "$INPUT_FILE" ] && { echo "ERROR: --text and a file path are mutually exclusive" >&2; exit 1; } | |
| OUTPUT_DIR="${OUTPUT_DIR:-$(pwd)}" | |
| fi | |
| if [ "$TEXT_MODE" -eq 0 ]; then | |
| if [ -z "$INPUT_FILE" ]; then | |
| usage | |
| fi | |
| if [ ! -f "$INPUT_FILE" ]; then | |
| echo "ERROR: File not found: $INPUT_FILE" >&2 | |
| exit 1 | |
| fi | |
| OUTPUT_DIR="${OUTPUT_DIR:-$(dirname "$INPUT_FILE")}" | |
| else | |
| if [ -n "$INPUT_FILE" ]; then | |
| echo "ERROR: --text and a file path are mutually exclusive" >&2 | |
| exit 1 | |
| fi | |
| OUTPUT_DIR="${OUTPUT_DIR:-$(pwd)}" | |
| fi |
| # KEY=0/1 format | ||
| grep -E '^[A-Za-z0-9_.]+=[01]$' "$SRCFILE" \ | ||
| | awk -F= '{print $1 "=" ($2=="1" ? "true" : "false")}' \ | ||
| > "$TMPFILE" |
There was a problem hiding this comment.
The current grep and awk pipeline is very strict and will completely ignore lines that have leading/trailing whitespace, spaces around the = sign, or Windows-style carriage returns (\r). We can make this parsing much more robust and efficient by using a single awk command that handles optional whitespace and strips carriage returns.
| # KEY=0/1 format | |
| grep -E '^[A-Za-z0-9_.]+=[01]$' "$SRCFILE" \ | |
| | awk -F= '{print $1 "=" ($2=="1" ? "true" : "false")}' \ | |
| > "$TMPFILE" | |
| # KEY=0/1 format (robustly handles spaces and carriage returns) | |
| awk -F'[ \\t]*=[ \\t]*' '/^[ \\t]*[A-Za-z0-9_.]+[ \\t]*=[ \\t]*[01]/ { | |
| val = $2; sub(/[ \\t\\r]+$/, "", val); | |
| key = $1; sub(/^[ \\t]+/, "", key); sub(/[ \\t]+$/, "", key); | |
| print key "=" (val == "1" ? "true" : "false") | |
| }' "$SRCFILE" > "$TMPFILE" |
| for PREFIX in $PREFIXES; do | ||
| OUTFILE="${OUTPUT_DIR}/${PREFIX}.xml" | ||
|
|
||
| awk -v prefix="$PREFIX" -v ts="$TIMESTAMP" -v outfile="$OUTFILE" ' |
There was a problem hiding this comment.
The outfile variable passed to awk via -v outfile="$OUTFILE" is never used inside the awk script, as the output is redirected in the shell using > "$OUTFILE" on line 267. Removing it keeps the command clean.
| awk -v prefix="$PREFIX" -v ts="$TIMESTAMP" -v outfile="$OUTFILE" ' | |
| awk -v prefix="$PREFIX" -v ts="$TIMESTAMP" ' |
| # commandsReceived: third segment starts with C, last segment is Rsp | ||
| if (np >= 4 && parts[2] ~ /^[SC]$/ && parts[3] ~ /^C/ && parts[np] == "Rsp") | ||
| return "commandsReceived" |
There was a problem hiding this comment.
In Matter PICS, received commands can be represented with either Rsp or Rx suffixes (e.g., OO.S.C00.Rx). Supporting both ensures that all received commands are correctly categorized under commandsReceived instead of falling back to manually.
| # commandsReceived: third segment starts with C, last segment is Rsp | |
| if (np >= 4 && parts[2] ~ /^[SC]$/ && parts[3] ~ /^C/ && parts[np] == "Rsp") | |
| return "commandsReceived" | |
| # commandsReceived: third segment starts with C, last segment is Rsp or Rx | |
| if (np >= 4 && parts[2] ~ /^[SC]$/ && parts[3] ~ /^C/ && (parts[np] == "Rsp" || parts[np] == "Rx")) | |
| return "commandsReceived" |
Summary
This PR adds a new utility scripts for converting PICS log data to XML PICS files.
Scripts — New PICS-to-XML utilities
Related Issue
#1016
Testing