Hello. I am trying to familiarize myself with DITA XML 1.3 and the DITA Open Toolkit (DITA-OT) without using proprietary XML editors.
As DITA states that its RELAX NG XML schemas are normative, I am trying to convert DITA-OT's document-type shells from .rng to .rnc files (XML syntax to compact syntax) using trang.
I am able to run the conversion from the CLI; however, the output contains additional text at every line break within the dita:moduleDesc section:
dita:moduleDesc [
"\x{a}" ~
" "
dita:moduleTitle [ "DITA Concept Shell" ]
"\x{a}" ~
" "
dita:headerComment [
xml:space = "preserve"
"\x{a}" ~
"=============================================================\x{a}" ~
" HEADER \x{a}" ~
"=============================================================\x{a}" ~
"Darwin Information Typing Architecture (DITA) Version 1.3 Plus Errata 02\x{a}" ~
"OASIS Standard\x{a}" ~
"16 January 2018 \x{a}" ~
"Copyright (c) OASIS Open 2018. All rights reserved. \x{a}" ~
"Source: http://docs.oasis-open.org/dita/dita/v1.3/errata02/csprd01/complete/part0-overview/dita-v1.3-errata02-csprd01-part0-overview-complete.html\x{a}" ~
"\x{a}" ~
"============================================================\x{a}" ~
" MODULE: DITA Concept Shell \x{a}" ~
" VERSION: 1.3 \x{a}" ~
" DATE: March 2014 \x{a}" ~
" \x{a}" ~
"=============================================================\x{a}" ~
"\x{a}" ~
"=============================================================\x{a}" ~
" PUBLIC DOCUMENT TYPE DEFINITION \x{a}" ~
" TYPICAL INVOCATION \x{a}" ~
" \x{a}" ~
" Refer to this file by the following public identifier or an \x{a}" ~
" appropriate system identifier \x{a}" ~
" \x{a}" ~
'PUBLIC "-//OASIS//DTD DITA Concept//EN"\x{a}' ~
"\x{a}" ~
"The public ID above refers to the latest version of this DTD.\x{a}" ~
" To refer to this specific version, you may use this value:\x{a}" ~
"\x{a}" ~
'PUBLIC "-//OASIS//DTD DITA 1.3 Concept//EN" \x{a}' ~
"\x{a}" ~
"=============================================================\x{a}" ~
"SYSTEM: Darwin Information Typing Architecture (DITA) \x{a}" ~
" \x{a}" ~
"PURPOSE: DTD to describe DITA Concepts \x{a}" ~
" \x{a}" ~
"ORIGINAL CREATION DATE: \x{a}" ~
" March 2001 \x{a}" ~
" \x{a}" ~
" (C) Copyright OASIS Open 2005, 2014. \x{a}" ~
" (C) Copyright IBM Corporation 2001, 2004. \x{a}" ~
" All Rights Reserved. \x{a}" ~
" \x{a}" ~
" UPDATES: \x{a}" ~
" 2006.06.07 RDA: Added indexing domain \x{a}" ~
" 2006.06.21 RDA: Added props attribute extensions \x{a}" ~
" 2008.02.12 RDA: Modify imbeds to use specific 1.2 version \x{a}" ~
" 2008.04.15 RDA: Added hazard domain \x{a}" ~
" 2014.03.12 RDA: Updated for DITA 1.3. Implemented as \x{a}" ~
" RELAX NG\x{a}" ~
"=============================================================\x{a}" ~
" "
]
"\x{a}" ~
" "
dita:moduleMetadata [
"\x{a}" ~
" "
dita:moduleType [ "topicshell" ]
"\x{a}" ~
" "
dita:moduleShortName [ "concept" ]
"\x{a}" ~
" "
dita:shellPublicIds [
"\x{a}" ~
" "
dita:dtdShell [
"-//OASIS//DTD DITA"
dita:var [ presep = " " name = "ditaver" ]
" Concept//EN"
]
"\x{a}" ~
" "
dita:rncShell [
"urn:oasis:names:tc:dita:rnc:concept.rnc"
dita:var [ presep = ":" name = "ditaver" ]
]
"\x{a}" ~
" "
dita:rngShell [
"urn:oasis:names:tc:dita:rng:concept.rng"
dita:var [ presep = ":" name = "ditaver" ]
]
"\x{a}" ~
" "
dita:xsdShell [
"urn:oasis:names:tc:dita:xsd:concept.xsd"
dita:var [ presep = ":" name = "ditaver" ]
]
"\x{a}" ~
" "
]
"\x{a}" ~
" "
]
"\x{a}" ~
" "
]
Specifically, the extraneous text I am referring to is this two-line pair:
"\x{a}" ~
" "
Interestingly, the number of spaces on the second line seems to be correlated with indentation/nesting: When this text appears right after an additional level of nesting, the number of quoted spaces increases by 2. When this text (snippet? string? artifact?) appears right before a ] that ends a level of nesting, the number of quoted spaces decreases by 2.
For example, the number of quoted spaces increases from 4 to 6 after line 69 (dita:moduleMetadata [).
On the line before this bracket is closed with ] (line 109), the number of quoted spaces decreases from 6 back down to 4.
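To make the pattern concrete, here is a minimal Python sketch of what I suspect is going on. It rests on an assumption I have not confirmed against trang itself: that each quoted newline-plus-spaces pair is simply the whitespace-only text (line break and indentation) sitting between child elements in the pretty-printed .rng annotation. The sample XML and its namespace URI are made up for illustration and are not copied from concept.rng.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for an annotation block in an .rng file.
# The namespace URI is a placeholder, not the real DITA one.
SAMPLE = (
    '<moduleDesc xmlns="urn:example:placeholder">\n'
    '  <moduleTitle>DITA Concept Shell</moduleTitle>\n'
    '  <moduleMetadata>\n'
    '    <moduleType>topicshell</moduleType>\n'
    '  </moduleMetadata>\n'
    '</moduleDesc>'
)


def whitespace_nodes(elem, depth=0, found=None):
    """Collect (depth, text) for every whitespace-only text node under elem."""
    if found is None:
        found = []
    # elem.text is the text between elem's start tag and its first child.
    if len(elem) and elem.text is not None and elem.text.strip() == "":
        found.append((depth, elem.text))
    for child in elem:
        whitespace_nodes(child, depth + 1, found)
        # child.tail is the text after child's end tag; it belongs to elem.
        if child.tail is not None and child.tail.strip() == "":
            found.append((depth, child.tail))
    return found


nodes = whitespace_nodes(ET.fromstring(SAMPLE))
for depth, text in nodes:
    # repr() makes the line break and the indentation visible.
    print(depth, repr(text))
```

In this toy input, the indentation in the collected text nodes grows by two spaces right after the nested element opens and shrinks again just before its end tag, which is the same rise and fall I see in the quoted spaces of the generated .rnc.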
The issue also seems to occur within a:documentation blocks.
trang generates a large number of other .rnc files during this conversion, and these additional .rnc files seem to exhibit the same problem.
I would like to know if there is a way to sidestep this problem. Simplifying the .rng file with jing first and then converting the resulting simplified RELAX NG XML syntax to .rnc seems to work; however, I seem to have a separate problem with that process and thus would prefer to know if I can directly convert DITA-OT document-type shells from .rng to .rnc without problems.
In case it matters, I am using Windows (10) PowerShell via Visual Studio Code's built-in terminal, and java -version (from the same terminal) has the following output:
openjdk version "23.0.2" 2025-01-21
OpenJDK Runtime Environment (build 23.0.2+7-58)
OpenJDK 64-Bit Server VM (build 23.0.2+7-58, mixed mode, sharing)
.zip containing the .rng and .rnc files relevant to this issue:
concept-rnc-and-rng.zip
Please let me know if I can supply any further information.
Edit:
It is likely that the given .rng file cannot be converted alone. If you would like to fully replicate my conversion environment, please download DITA-OT 4.2.4, extract the download, and navigate to .../dita-ot-4.2.4/plugins/org.oasis-open.dita.v1_3/rng/technicalContent/rng (substitute \ for / if on Windows) before running trang on concept.rng within this directory.
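For anyone reproducing this, the steps above can be sketched in Python as follows. This is only a sketch under assumptions: it supposes trang is available as a jar and invoked as java -jar with an input and an output file name, and the names dita_ot_root and trang_jar are placeholders for wherever you extracted DITA-OT 4.2.4 and wherever your trang jar lives.

```python
import subprocess
from pathlib import Path

# Location of the document-type shells inside the extracted DITA-OT 4.2.4 folder.
RNG_SUBDIR = Path("plugins/org.oasis-open.dita.v1_3/rng/technicalContent/rng")


def build_trang_command(trang_jar, rng_name="concept.rng"):
    """Build the argument list for converting rng_name to a sibling .rnc file."""
    rnc_name = str(Path(rng_name).with_suffix(".rnc"))
    # The jar path is made absolute because the command runs from another directory.
    return ["java", "-jar", str(Path(trang_jar).resolve()), rng_name, rnc_name]


def run_conversion(dita_ot_root, trang_jar):
    """Run trang on concept.rng from inside the shells directory."""
    workdir = Path(dita_ot_root) / RNG_SUBDIR
    return subprocess.run(build_trang_command(trang_jar), cwd=workdir, check=True)
```

Running from inside the rng directory matters because concept.rng includes other modules by relative path.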
Edit 2:
The problem is not limited to the dita:moduleDesc brackets. In other files, it also occurs elsewhere, in a pattern I am not sure I understand. Comments preceded by ## seem intact, but I suspect a:documentation might be related:
- Line 29 of svg-basic-clip.rnc: a:documentation [ "\x{a}" ~ " SVG.Clip.attrib\x{a}" ~ " " ]
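To get a clearer picture of where the pattern occurs, a small Python sketch like the one below could list every occurrence across the generated files. The regular expression is my own guess at what counts as the artifact (a string ending in the \x{a} escape, concatenated with ~ to a string containing only spaces), so it may over- or under-match. The function names are hypothetical.

```python
import re
from pathlib import Path

# A string literal ending in the \x{a} escape, followed by the ~ operator
# and a string literal that contains nothing but spaces.
ARTIFACT = re.compile(r'\\x\{a\}"\s*~\s*"( *)"')


def find_artifacts(rnc_text):
    """Return (line_number, space_count) for each suspected artifact."""
    hits = []
    for match in ARTIFACT.finditer(rnc_text):
        line_no = rnc_text.count("\n", 0, match.start()) + 1
        hits.append((line_no, len(match.group(1))))
    return hits


def scan_directory(root):
    """Print a per-file summary for every .rnc file below root."""
    for path in sorted(Path(root).rglob("*.rnc")):
        text = path.read_text(encoding="utf-8", errors="replace")
        hits = find_artifacts(text)
        if hits:
            print(f"{path}: {len(hits)} occurrence(s), first at line {hits[0][0]}")
```

Calling scan_directory on the folder that holds the generated .rnc files would show which modules are affected and at which lines.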
In the interest of clarity, I will provide the following .zip containing all of my generated .rnc files (disregard concept_simplified files) and some of the corresponding .rng files that were provided within the DITA-OT 4.2.4 distribution. They might not work without the full DITA-OT archive.
rng-files-with-rnc.zip