1. In Table 1 (first row, last column), why can't teacher forcing reuse the KV cache?
2. Is neighbor forcing a form of "noisy KV cache" during training, as mentioned in https://arxiv.org/abs/2512.04677?