For lang-sma, the following TTS pipeline is specified:
export default function smaTextTTS(entry: StringEntry): Command {
    let x = hfst.tokenize("tokenise", entry, { model_path: "tokeniser-tts-cggt-desc.pmhfst" });
    x = divvun.blanktag("whitespace", x, { model_path: "analyser-gt-whitespace.hfst" });
    x = cg3.vislcg3("remove-lexicalised", x, { model_path: "generated-remove-lexicalised-compounds.bin" });
    x = cg3.vislcg3("valency", x, { model_path: "valency.bin" });
    x = cg3.vislcg3("mwe-dis", x, { model_path: "mwe-dis.bin" });
    x = cg3.mwesplit("mwesplit", x);
    x = cg3.vislcg3("disamb", x, { model_path: "disambiguator.bin" });
    x = cg3.vislcg3("functions", x, { model_path: "functions.bin" });
    x = cg3.vislcg3("deps", x, { model_path: "dependency.bin" });
    x = speech.normalize(
        "normaliser", x,
        {
            generator: "generator-tts-gt-norm.hfstol",
            analyzer: "analyser-gt-norm.hfstol",
            normalizers: {
                "Sem/Time-clock": "transcriptor-clock-digit2text.filtered.lookup.hfstol",
                "Sem/Date": "transcriptor-ttsdate-digit2text.filtered.lookup.hfstol",
                "Sem/Year": "transcriptor-ttsdate-digit2text.filtered.lookup.hfstol",
                "Arab": "transcriptor-numbers-digit2text.filtered.lookup.hfstol",
                "Roman": "transcriptor-numbers-digit2text.filtered.lookup.hfstol",
                "ABBR": "transcriptor-abbrevs2text.filtered.lookup.hfstol",
                "ACR": "transcriptor-abbrevs2text.filtered.lookup.hfstol",
                "Symbol": "transcriptor-symbols2text.filtered.lookup.hfstol",
                "Emoji": "transcriptor-emoji2text.filtered.lookup.hfstol"
            }
        }
    );
    x = speech.phon("text2phon", x, { model: "text2phontext.hfstol", tag_models: { "ACR": "acro2text.hfstol" } });
    x = cg3.sentences("phon", x, { mode: "phonological" });
    return x;
}
But after bundling, the normalisation step looks like this in the Divvun Runtime Playground (line breaks added for readability):
speech::normalize(analyzer = <path>"analyser-gt-norm.hfstol",
generator = <path>"generator-tts-gt-norm.hfstol",
normalizers = <{path}>{Sem/Time-clock: "transcriptor-clock-digit2text.filtered.lookup.hfstol",
Sem/Date: "transcriptor-ttsdate-digit2text.filtered.lookup.hfstol",
Sem/Year: "transcriptor-ttsdate-digit2text.filtered.lookup.hfstol",
Arab: "transcriptor-numbers-digit2text.filtered.lookup.hfstol",
Roman: "transcriptor-numbers-digit2text.filtered.lookup.hfstol",
ABBR: "transcriptor-abbrevs2text.filtered.lookup.hfstol",
ACR: "transcriptor-abbrevs2text.filtered.lookup.hfstol",
Symbol: "transcriptor-symbols2text.filtered.lookup.hfstol",
Emoji: "transcriptor-emoji2text.filtered.lookup.hfstol"}) -> string
Notice the curly braces {} for the first tag-specific normalisation (Sem/Time-clock). The effect of this can be seen by comparing the output for the following two sentences:
- with error:
Joekoen guhkiem, jis edtjebe jaehkedh dam 25 jaepien båeries nyjsenæjjam Kloemegistie.
Output:
"joekoen guhkiem, jis edtjebe jaehkedh dam 25 jaepien båeries nyjsenæjjam kloemegistie"
- without error:
Joekoen guhkiem, jis edtjebe jaehkedh dam 35 jaepien båeries nyjsenæjjam Kloemegistie.
Output:
"joekoen guhkiem, jis edtjebe jaehkedh dam golmeluhkievïjhte jaepien båeries nyjsenæjjam kloemegistie"
The difference is that 25 gets a Sem/Time-clock reading after disambiguation:
"<25>"
"25" Num Arab Sem/Time-clock Sg Nom <W:0.0> <sma>
whereas 35 does not (because 35 can't be an hour in our system):
"<35>"
"35" Num Arab Sg Gen Attr <W:0.0> <sma>
This difference in the disambiguated analysis determines which FST is used during normalisation: either the first one (the Sem/Time-clock entry with the curly braces), or another one (for 35, presumably the Arab entry). The one with the curly braces is not applied.
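To make the suspected selection step concrete, here is a minimal TypeScript sketch of how a tag-keyed normaliser table could be matched against a reading. It is hypothetical: `pickNormalizer`, the first-match rule, and the use of table order as priority are assumptions made for illustration, not the actual divvun-runtime implementation. Only the tag names and FST file names are taken from the pipeline above.

```typescript
// Hypothetical sketch, NOT the divvun-runtime implementation.
// The table keeps the same order as the `normalizers` map in the pipeline
// (abridged to the entries relevant for numbers).
const normalizers: Array<[string, string]> = [
    ["Sem/Time-clock", "transcriptor-clock-digit2text.filtered.lookup.hfstol"],
    ["Sem/Date", "transcriptor-ttsdate-digit2text.filtered.lookup.hfstol"],
    ["Sem/Year", "transcriptor-ttsdate-digit2text.filtered.lookup.hfstol"],
    ["Arab", "transcriptor-numbers-digit2text.filtered.lookup.hfstol"],
    ["Roman", "transcriptor-numbers-digit2text.filtered.lookup.hfstol"],
];

// Assumed rule: the first table entry whose tag occurs in the reading wins.
function pickNormalizer(
    tags: string[],
    table: Array<[string, string]>
): string | undefined {
    for (let i = 0; i < table.length; i++) {
        const tag = table[i][0];
        const fst = table[i][1];
        if (tags.indexOf(tag) !== -1) {
            return fst;
        }
    }
    return undefined; // no tag-specific normaliser applies
}

// Tags of the two disambiguated readings shown above.
const reading25 = ["Num", "Arab", "Sem/Time-clock", "Sg", "Nom"];
const reading35 = ["Num", "Arab", "Sg", "Gen", "Attr"];

console.log(pickNormalizer(reading25, normalizers));
console.log(pickNormalizer(reading35, normalizers));
```

Under these assumptions, the reading for 25 selects the Sem/Time-clock FST and the reading for 35 falls through to the Arab FST, which matches the observation that only the Sem/Time-clock path fails.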
If the digit2text conversion is done outside the divvun-runtime environment, the Sem/Time-clock FST delivers exactly the same output as the other one. Had it been applied, the output would have been correct even though the analysis is wrong (25 is not a clock hour here, it is an age in years):
echo 25 | hfst-lookup -q tools/tts/transcriptor-ttsdate-digit2text.filtered.lookup.hfstol
25 göökteluhkievïjhte 0.000000
echo 25 | hfst-lookup -q tools/tts/transcriptor-numbers-digit2text.filtered.lookup.hfstol
25 göökteluhkievïjhte 0.000000
The output is the same for both the Playground (above) and the CLI:
echo 'Joekoen guhkiem, jis edtjebe jaehkedh dam 25 jaepien båeries nyjsenæjjam Kloemegistie.' |\
divvun-runtime run -p tools/tts/bundle.drb
Output:
[
"joekoen guhkiem, jis edtjebe jaehkedh dam 25 jaepien båeries nyjsenæjjam kloemegistie",
"\\n",
]
And the working version:
echo 'Joekoen guhkiem, jis edtjebe jaehkedh dam 35 jaepien båeries nyjsenæjjam Kloemegistie.' |\
divvun-runtime run -p tools/tts/bundle.drb
Output:
[
"joekoen guhkiem, jis edtjebe jaehkedh dam golmeluhkievïjhte jaepien båeries nyjsenæjjam kloemegistie",
"\\n",
]