Commit be22248

[src] Make word alignment optional (#4802)
* Remove unused variable.

* cudadecoder: Make word alignment optional.

For CTC models using word pieces or graphemes, there is not enough positional information to use the word alignment. I tried marking every unit as "singleton" in word_boundary.txt, but this explodes the state space very, very often. See: nvidia-riva/riva-asrlib-decoder#3

With the "_" character in CTC models predicting word pieces, we at the very least know which word pieces begin a word and which ones are either in the middle or at the end of a word. The algorithm would still need to be rewritten, though, especially since "blank" is not a silence phoneme (it can appear between any two units).

I did look into using the lexicon-based word alignment. I don't have a specific complaint about it, but I did get a weird error where it couldn't create a final state at all in the output lattice, which caused Connect() to output an empty lattice. This may be because I wasn't quite sure how to handle the blank token. I treat it as its own phoneme, because of limitations in TransitionInformation, but this doesn't really make any sense.

Needless to say, while the CTM outputs of the cuda decoder will be correct from a WER point of view, their time stamps won't be correct. They probably never were in the first place for CTC models.
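
The remark about the "_" character can be made concrete with a minimal sketch. It assumes a convention where a leading "_" marks the first piece of a word and that CTC blanks have already been removed; the helper name GroupWordPieces is hypothetical and is not part of Kaldi or of this commit. The sketch recovers only which pieces belong to which word, not any timing, which is why it does not replace a real word alignment.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not a Kaldi function): groups word pieces into words.
// Assumption: a leading '_' marks the first piece of a word, and CTC blank
// tokens were removed before this is called.
std::vector<std::string> GroupWordPieces(
    const std::vector<std::string> &pieces) {
  std::vector<std::string> words;
  for (const std::string &piece : pieces) {
    const bool begins_word = !piece.empty() && piece[0] == '_';
    if (begins_word || words.empty()) {
      // Start a new word, dropping the boundary marker if present.
      words.push_back(begins_word ? piece.substr(1) : piece);
    } else {
      // Middle or final piece: append to the word in progress.
      words.back() += piece;
    }
  }
  return words;
}
```

For example, the pieces "_he", "llo", "_wor", "ld" would be grouped into the two words "hello" and "world".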
1 parent f6f4cca commit be22248

2 files changed: +8 -9 lines changed

src/cudadecoder/lattice-postprocessor.cc

+8 -7
@@ -78,13 +78,14 @@ bool LatticePostprocessor::GetPostprocessedLattice(
   KALDI_ASSERT(decoder_frame_shift_ != 0.0 &&
                "SetDecoderFrameShift() must be called (typically by pipeline)");
 
-  if (!word_info_)
-    KALDI_ERR << "You must set --word-boundary-rxfilename in the lattice "
-                 "postprocessor config";
-  // ok &=
-  // Ignoring the return false for now (but will print a warning),
-  // because the doc says we can, and it can happen when using endpointing
-  WordAlignLattice(clat, *tmodel_, *word_info_, max_states, out_clat);
+  if (word_info_) {
+    // ok &=
+    // Ignoring the return false for now (but will print a warning),
+    // because the doc says we can, and it can happen when using endpointing
+    WordAlignLattice(clat, *tmodel_, *word_info_, max_states, out_clat);
+  } else {
+    *out_clat = clat;
+  }
   return ok;
 }
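
The control flow introduced by this hunk can be sketched in isolation. The types and functions below (ToyLattice, ToyWordBoundaryInfo, ToyWordAlign, ToyPostprocess) are toy stand-ins invented for illustration and are not Kaldi APIs; the only thing the sketch is meant to show is the pattern of aligning when word-boundary information is present and copying the input through unchanged when it is not.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins, NOT Kaldi types: a "lattice" here is just a list of labels.
using ToyLattice = std::vector<std::string>;
struct ToyWordBoundaryInfo {
  std::string marker;  // hypothetical payload used by the toy alignment
};

// Stand-in for the alignment step: appends the marker to every label.
void ToyWordAlign(const ToyLattice &in, const ToyWordBoundaryInfo &info,
                  ToyLattice *out) {
  out->clear();
  for (const std::string &label : in) out->push_back(label + info.marker);
}

// Mirrors the shape of the patched code: align only if word-boundary info
// was provided; otherwise pass the input lattice through as-is.
void ToyPostprocess(const ToyLattice &in, const ToyWordBoundaryInfo *word_info,
                    ToyLattice *out) {
  assert(out != nullptr);
  if (word_info) {
    ToyWordAlign(in, *word_info, out);
  } else {
    *out = in;
  }
}
```

Calling ToyPostprocess with a null word_info leaves the output equal to the input, which corresponds to the new behavior when --word-boundary-rxfilename is not set.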

src/fstext/pre-determinize-inl.h

-2
@@ -689,11 +689,9 @@ typename Arc::StateId CreateSuperFinal(MutableFst<Arc> *fst) {
   typedef typename Arc::Weight Weight;
   assert(fst != NULL);
   StateId num_states = fst->NumStates();
-  StateId num_final = 0;
   std::vector<StateId> final_states;
   for (StateId s = 0; s < num_states; s++) {
     if (fst->Final(s) != Weight::Zero()) {
-      num_final++;
       final_states.push_back(s);
     }
   }
