Skip to content

Feature/1016 Create pics to xml script#1019

Open
rquidute wants to merge 2 commits into
v2.15.1-beta1+summer2026from
feature/1016-create-pics-to-xml-script
Open

Feature/1016 Create pics to xml script#1019
rquidute wants to merge 2 commits into
v2.15.1-beta1+summer2026from
feature/1016-create-pics-to-xml-script

Conversation

@rquidute
Copy link
Copy Markdown
Contributor

@rquidute rquidute commented Jun 3, 2026

Summary
This PR adds a new utility scripts for converting PICS log data to XML PICS files.

Scripts — New PICS-to-XML utilities

  • scripts/pics_to_xml.sh (bash): Converts PICS log data to XML PICS files. Accepts Python dict log format (PICSItem(...)) or KEY=0/1 lines. Supports file input, --text for unquoted inline pasting, and -o for output directory. Produces one XML per cluster prefix.

Related Issue

#1016

Testing

Screenshot 2026-06-03 at 13 20 24

@rquidute rquidute self-assigned this Jun 3, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new utility script scripts/utils/pics_to_xml.sh designed to convert PICS log data (either in Python dict log format or raw KEY=0/1 lines) into XML PICS files grouped by cluster prefix. The reviewer provided several constructive suggestions to improve the script's robustness and parsing accuracy. These include enabling strict error handling (set -euo pipefail), refactoring short-circuit shell evaluations into standard if statements to prevent unexpected exits under set -e, checking argument counts to avoid unbound variable errors under set -u, enhancing the KEY=0/1 parser to handle whitespaces and carriage returns, removing an unused variable in the awk command, and expanding command reception parsing to support both Rsp and Rx suffixes.

@@ -0,0 +1,273 @@
#!/usr/bin/env bash
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To make the script more robust and prevent silent failures, it is highly recommended to enable strict error handling using set -euo pipefail.

Suggested change
#!/usr/bin/env bash
#!/usr/bin/env bash
set -euo pipefail

Comment on lines +18 to +24
usage() {
echo "Usage: $0 <input_file> [output_dir]"
echo " $0 --text \"KEY=0/1 ...\" [-o output_dir]"
exit 1
}

[ $# -eq 0 ] && usage
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Under set -e, short-circuit evaluations like [ $# -eq 0 ] && usage can cause the shell to exit unexpectedly if the condition is false (since the overall expression returns a non-zero status). Rewriting this as a standard if statement is safer and more readable.

Suggested change
usage() {
echo "Usage: $0 <input_file> [output_dir]"
echo " $0 --text \"KEY=0/1 ...\" [-o output_dir]"
exit 1
}
[ $# -eq 0 ] && usage
usage() {
echo "Usage: $0 <input_file> [output_dir]"
echo " $0 --text \\\"KEY=0/1 ...\\\" [-o output_dir]"
exit 1
}
if [ $# -eq 0 ]; then
usage
fi

Comment on lines +34 to +61
while [ $# -gt 0 ]; do
case "$1" in
--text)
[ -z "$2" ] && { echo "ERROR: --text requires a value" >&2; exit 1; }
TEXT_MODE=1
INLINE_TEXT="$2"
shift 2
;;
-o|--output-dir)
[ -z "$2" ] && { echo "ERROR: -o requires a value" >&2; exit 1; }
OUTPUT_DIR="$2"
shift 2
;;
-*)
echo "ERROR: Unknown option: $1" >&2; usage
;;
*)
if [ -z "$INPUT_FILE" ]; then
INPUT_FILE="$1"
elif [ -z "$OUTPUT_DIR" ]; then
OUTPUT_DIR="$1" # positional second arg kept for back-compat
else
echo "ERROR: Unexpected argument: $1" >&2; usage
fi
shift
;;
esac
done
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When parsing arguments, checking $2 directly (e.g., [ -z "$2" ]) when $# might be less than 2 can lead to unbound variable errors under set -u in some shells, and is generally less robust. Checking the argument count $# explicitly before accessing $2 is safer and more idiomatic. Also, using standard if statements instead of && short-circuits prevents unexpected exits under set -e.

while [ $# -gt 0 ]; do
    case "$1" in
        --text)
            if [ $# -lt 2 ]; then
                echo "ERROR: --text requires a value" >&2
                exit 1
            fi
            TEXT_MODE=1
            INLINE_TEXT="$2"
            shift 2
            ;;
        -o|--output-dir)
            if [ $# -lt 2 ]; then
                echo "ERROR: -o requires a value" >&2
                exit 1
            fi
            OUTPUT_DIR="$2"
            shift 2
            ;;
        -*)
            echo "ERROR: Unknown option: $1" >&2
            usage
            ;;
        *)
            if [ -z "$INPUT_FILE" ]; then
                INPUT_FILE="$1"
            elif [ -z "$OUTPUT_DIR" ]; then
                OUTPUT_DIR="$1"   # positional second arg kept for back-compat
            else
                echo "ERROR: Unexpected argument: $1" >&2
                usage
            fi
            shift
            ;;
    esac
done

Comment on lines +63 to +70
if [ "$TEXT_MODE" -eq 0 ]; then
[ -z "$INPUT_FILE" ] && usage
[ ! -f "$INPUT_FILE" ] && { echo "ERROR: File not found: $INPUT_FILE" >&2; exit 1; }
OUTPUT_DIR="${OUTPUT_DIR:-$(dirname "$INPUT_FILE")}"
else
[ -n "$INPUT_FILE" ] && { echo "ERROR: --text and a file path are mutually exclusive" >&2; exit 1; }
OUTPUT_DIR="${OUTPUT_DIR:-$(pwd)}"
fi
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using standard if statements instead of && short-circuits for validation checks is much safer under set -e and improves readability.

Suggested change
if [ "$TEXT_MODE" -eq 0 ]; then
[ -z "$INPUT_FILE" ] && usage
[ ! -f "$INPUT_FILE" ] && { echo "ERROR: File not found: $INPUT_FILE" >&2; exit 1; }
OUTPUT_DIR="${OUTPUT_DIR:-$(dirname "$INPUT_FILE")}"
else
[ -n "$INPUT_FILE" ] && { echo "ERROR: --text and a file path are mutually exclusive" >&2; exit 1; }
OUTPUT_DIR="${OUTPUT_DIR:-$(pwd)}"
fi
if [ "$TEXT_MODE" -eq 0 ]; then
if [ -z "$INPUT_FILE" ]; then
usage
fi
if [ ! -f "$INPUT_FILE" ]; then
echo "ERROR: File not found: $INPUT_FILE" >&2
exit 1
fi
OUTPUT_DIR="${OUTPUT_DIR:-$(dirname "$INPUT_FILE")}"
else
if [ -n "$INPUT_FILE" ]; then
echo "ERROR: --text and a file path are mutually exclusive" >&2
exit 1
fi
OUTPUT_DIR="${OUTPUT_DIR:-$(pwd)}"
fi

Comment on lines +100 to +103
# KEY=0/1 format
grep -E '^[A-Za-z0-9_.]+=[01]$' "$SRCFILE" \
| awk -F= '{print $1 "=" ($2=="1" ? "true" : "false")}' \
> "$TMPFILE"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current grep and awk pipeline is very strict and will completely ignore lines that have leading/trailing whitespace, spaces around the = sign, or Windows-style carriage returns (\r). We can make this parsing much more robust and efficient by using a single awk command that handles optional whitespace and strips carriage returns.

Suggested change
# KEY=0/1 format
grep -E '^[A-Za-z0-9_.]+=[01]$' "$SRCFILE" \
| awk -F= '{print $1 "=" ($2=="1" ? "true" : "false")}' \
> "$TMPFILE"
# KEY=0/1 format (robustly handles spaces and carriage returns)
awk -F'[ \\t]*=[ \\t]*' '/^[ \\t]*[A-Za-z0-9_.]+[ \\t]*=[ \\t]*[01]/ {
val = $2; sub(/[ \\t\\r]+$/, "", val);
key = $1; sub(/^[ \\t]+/, "", key); sub(/[ \\t]+$/, "", key);
print key "=" (val == "1" ? "true" : "false")
}' "$SRCFILE" > "$TMPFILE"

for PREFIX in $PREFIXES; do
OUTFILE="${OUTPUT_DIR}/${PREFIX}.xml"

awk -v prefix="$PREFIX" -v ts="$TIMESTAMP" -v outfile="$OUTFILE" '
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The outfile variable passed to awk via -v outfile="$OUTFILE" is never used inside the awk script, as the output is redirected in the shell using > "$OUTFILE" on line 267. Removing it keeps the command clean.

Suggested change
awk -v prefix="$PREFIX" -v ts="$TIMESTAMP" -v outfile="$OUTFILE" '
awk -v prefix="$PREFIX" -v ts="$TIMESTAMP" '

Comment on lines +139 to +141
# commandsReceived: third segment starts with C, last segment is Rsp
if (np >= 4 && parts[2] ~ /^[SC]$/ && parts[3] ~ /^C/ && parts[np] == "Rsp")
return "commandsReceived"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In Matter PICS, received commands can be represented with either Rsp or Rx suffixes (e.g., OO.S.C00.Rx). Supporting both ensures that all received commands are correctly categorized under commandsReceived instead of falling back to manually.

Suggested change
# commandsReceived: third segment starts with C, last segment is Rsp
if (np >= 4 && parts[2] ~ /^[SC]$/ && parts[3] ~ /^C/ && parts[np] == "Rsp")
return "commandsReceived"
# commandsReceived: third segment starts with C, last segment is Rsp or Rx
if (np >= 4 && parts[2] ~ /^[SC]$/ && parts[3] ~ /^C/ && (parts[np] == "Rsp" || parts[np] == "Rx"))
return "commandsReceived"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant