Can you provide detailed example for MSR dataset. I run into same problem as here https://github.com/google-research/lasertagger/issues/5#issuecomment-582414346 I don't understand how to tokenise this dataset. Can it be tokenised with FullTokenizer in BERT? Please help if you can.