Description
Describe the bug
When using the latest version (1.13.0) of the DeepL Node.js library, I noticed an issue with XML tags.
With the source text
Please start your '<x id=p1>Basic</x>' plan by clicking the button '<x id=p2>Accept</x>'.
the translation comes back with different spacing and quote placement around the <x></x> tags.
To Reproduce
Steps to reproduce the behavior:
- API integrated correctly in NodeJS project
- Call:
const result = await translator.translateText("Please start your '<x id=p1>Basic</x>' plan by clicking the button '<x id=p2>Accept</x>'.", "en", "de", { tagHandling: 'xml' });
- The console output with the German translation:
Bitte starten Sie Ihren<x id=p1>'Basic</x>'-Plan, indem Sie auf die Schaltfläche<x id=p2>'Akzeptieren</x>' klicken.
- Analysis of the translation syntax:
- The original English tags are surrounded by single quotes: '<x id=p1>Basic</x>', while in the German output the opening quote is moved inside the tags: <x id=p1>'Basic</x>' ❌
- The original English opening tags have a space in front of them: your '<x id=p1>Basic</x>', while in the German output the opening tag is attached directly to the previous word: Ihren<x id=p1>'Basic</x>' ❌
The expected output is: Bitte starten Sie Ihren '<x id=p1>Basic</x>'-Plan, indem Sie auf die Schaltfläche '<x id=p2>Akzeptieren</x>' klicken.
- Added the parameters preserveFormatting: true, outlineDetection: true, and nonSplittingTags: ['x'], but each one individually, as well as every combination of them, produces the same German output string.
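The analysis above can also be checked mechanically. The following is a minimal sketch and not part of deepl-node: tagContexts and contextsPreserved are hypothetical helper names, and the regular expression assumes non-nested <x> tags as used in this report. It compares the two characters before and the one character after each tagged span in the source and in the translation.

```javascript
// Hypothetical helpers for this report; they do not call the DeepL API.

// For every <x ...>...</x> span, record the two characters before the
// opening tag and the single character after the closing tag.
function tagContexts(text) {
  const contexts = [];
  for (const match of text.matchAll(/<x\b[^>]*>.*?<\/x>/g)) {
    const start = match.index;
    const end = start + match[0].length;
    contexts.push({
      before: text.slice(Math.max(0, start - 2), start),
      after: text.slice(end, end + 1),
    });
  }
  return contexts;
}

// True when every tagged span has the same surrounding characters
// in the source and in the translation.
function contextsPreserved(source, translated) {
  return JSON.stringify(tagContexts(source)) === JSON.stringify(tagContexts(translated));
}

const source =
  "Please start your '<x id=p1>Basic</x>' plan by clicking the button '<x id=p2>Accept</x>'.";
const actual =
  "Bitte starten Sie Ihren<x id=p1>'Basic</x>'-Plan, indem Sie auf die Schaltfläche<x id=p2>'Akzeptieren</x>' klicken.";
const expected =
  "Bitte starten Sie Ihren '<x id=p1>Basic</x>'-Plan, indem Sie auf die Schaltfläche '<x id=p2>Akzeptieren</x>' klicken.";

console.log(contextsPreserved(source, actual));   // false
console.log(contextsPreserved(source, expected)); // true
```

For the actual API output, the text before each opening tag is the end of the previous word ("en", "he") instead of a space followed by a quote, which is exactly the difference described in the two ❌ items.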
Expected behavior
It's expected that formatting characters (like spaces) and other non-translatable characters (like quotes) around tags are preserved, especially when the option preserveFormatting is set to true.
Update
After creating this post I did some more testing. It appears that the (single) quotes might be the issue. When using double quotes, the same issue occurs.
However, when removing the quotes around the XML tags:
const result = await translator.translateText("Please start your <x id=p1>Basic</x> plan by clicking the button <x id=p2>Accept</x>.", "en", "de", { tagHandling: 'xml' });
The output maintains the spaces around the tags ✅:
Bitte starten Sie Ihren <x id=p1>Basic-Plan</x>, indem Sie auf die Schaltfläche <x id=p2>Akzeptieren</x> klicken.
Update 2
After creating this post I noticed that I hadn't quoted the value of the id attribute (see table "With Attributes" at https://developers.deepl.com/docs/xml-and-html-handling/xml). So basically my input string was malformed XML.
However, when adding quotes around p1 and p2, the API still returns the same erroneous output:
const result = await translator.translateText(`Please start your '<x id="p1">Basic</x>' plan by clicking the button '<x id="p2">Accept</x>'.`, "en", "de", { tagHandling: 'xml' });
Question
Why doesn't the API handle quotes around XML tags properly?
Screenshots
N/A
Desktop (please complete the following information):
- OS: macOS 14.5
Additional context
- npm deepl-node 1.13.0
- NodeJS 16.6.0