JSON isn't being escaped correctly in at least one case #280

@ross-spencer

Given the signature file sig1.xml:

NB. the format name contains a backslash: Development\Signature

<?xml version="1.0" encoding="UTF-8"?>
<FFSignatureFile xmlns="http://www.nationalarchives.gov.uk/pronom/SignatureFile" Version="1767086900" DateCreated="2025-12-30T09:28:20+00:00">
  <InternalSignatureCollection>
  <InternalSignature ID="1" Specificity="Specific">
  <ByteSequence Reference="BOFoffset">
    <SubSequence MinFragLength="0" Position="1" SubSeqMaxOffset="0" SubSeqMinOffset="0">
      <Sequence>68656C6C6F</Sequence>
      <DefaultShift>6</DefaultShift>
      <Shift Byte="68">5</Shift>
      <Shift Byte="65">4</Shift>
      <Shift Byte="6C">2</Shift>
      <Shift Byte="6F">1</Shift>
    </SubSequence>
  </ByteSequence>
</InternalSignature>
</InternalSignatureCollection>
<FileFormatCollection>
  <FileFormat ID="1" Name="Development\Signature" PUID="dev/1" Version="1.0" MIMEType="application/octet-stream">
    <InternalSignatureID>1</InternalSignatureID>
    <Extension>ext</Extension>
  </FileFormat>
</FileFormatCollection></FFSignatureFile>

With the file stored in /tmp, build with roy:

roy build --noreports -extend /tmp/sig1.xml

Identify the file hello, whose contents are:

hello

We get valid YAML:

---
siegfried   : 1.11.2
scandate    : 2025-12-30T10:32:54+01:00
signature   : default.sig
created     : 2025-12-30T10:29:59+01:00
identifiers : 
  - name    : 'pronom'
    details : 'DROID_SignatureFile_V120.xml; container-signature-20240715.xml; built without reports; extensions: /tmp/sig1.xml'
---
filename : 'hello'
filesize : 6
modified : 2025-12-30T10:28:45+01:00
errors   : 
matches  :
  - ns      : 'pronom'
    id      : 'dev/1'
    format  : 'Development\Signature'
    version : '1.0'
    mime    : 'application/octet-stream'
    basis   : 'byte match at 0, 5'
    warning : 'extension mismatch'

But we don't get valid JSON from sf -json hello:

{"siegfried":"1.11.2","scandate":"2025-12-30T10:33:57+01:00","signature":"default.sig","created":"2025-12-30T10:29:59+01:00","identifiers":[{"name":"pronom","details":"DROID_SignatureFile_V120.xml; container-signature-20240715.xml; built without reports; extensions: /tmp/berg/sig1.xml"}],"files":[{"filename":"hello","filesize": 6,"modified":"2025-12-30T10:28:45+01:00","errors": "","matches": [{"ns":"pronom","id":"dev/1","format":"Development\Signature","version":"1.0","mime":"application/octet-stream","basis":"byte match at 0, 5","warning":"extension mismatch"}]}]}

jq reports the following:

jq: parse error: Invalid escape at line 1, column 456

This is simply due to the \ in Development\Signature.

There is an easy enough workaround: remove the backslash from the format name. To escape it properly in JSON, the formatter needs to recognize the backslash and emit \\ in its place.

Via: https://www.json.org/json-en.html

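The grammar defines the following escape sequences (the characters that may follow a backslash). The quotation mark, the backslash, and control characters below U+0020 must be escaped; escaping '/' is optional: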

escape
'"'
'\'
'/'
'b'
'f'
'n'
'r'
't'
'u' hex hex hex hex
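The escape list above could be handled by hand roughly as follows. This is a sketch under assumptions, not the formatter used by siegfried: escapeJSONString is a hypothetical name, '/' is left unescaped because the grammar makes that optional, and invalid UTF-8 in the input would come out as U+FFFD.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeJSONString (hypothetical helper) returns s as a quoted JSON string
// literal, escaping the quotation mark, the backslash, and control
// characters below U+0020.
func escapeJSONString(s string) string {
	var b strings.Builder
	b.WriteByte('"')
	for _, r := range s {
		switch r {
		case '"':
			b.WriteString(`\"`)
		case '\\':
			b.WriteString(`\\`)
		case '\b':
			b.WriteString(`\b`)
		case '\f':
			b.WriteString(`\f`)
		case '\n':
			b.WriteString(`\n`)
		case '\r':
			b.WriteString(`\r`)
		case '\t':
			b.WriteString(`\t`)
		default:
			if r < 0x20 {
				// Remaining control characters use the \uXXXX form.
				fmt.Fprintf(&b, `\u%04x`, r)
			} else {
				b.WriteRune(r)
			}
		}
	}
	b.WriteByte('"')
	return b.String()
}

func main() {
	fmt.Println(escapeJSONString(`Development\Signature`))
	// prints: "Development\\Signature"
	fmt.Println(escapeJSONString("a\tb\x01"))
	// prints: "a\tb\u0001"
}
```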

We'd need to check which of these the JSON formatter doesn't handle. I can write test cases given a spare moment. (Alternatively, consider switching to the default tooling?)
