0.2.0

@vsemionov vsemionov released this 24 Apr 20:23
  • Use FlashAttention
  • Use SentencePiece BPE tokenizer
  • Increase context length
  • Add gradient checkpointing
  • Support combining multiple datasets
  • Download pre-trained tokenizer
  • Optional encoding on the fly
  • Add greedy search in inference
  • Support continued prompts in inference (prompts that continue a sequence rather than start one)
  • Compute dataset and vocabulary statistics
  • Data validation
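
For readers unfamiliar with greedy search, the following is a minimal, framework-free sketch of the general technique: at each step, append the highest-scoring next token. It is not this project's implementation. The names `greedy_decode`, `next_token_scores`, `toy_scores`, and `eos_id` are hypothetical and exist only for illustration.

```python
from typing import Callable, List, Optional, Sequence


def greedy_decode(
    next_token_scores: Callable[[List[int]], Sequence[float]],
    prompt: Sequence[int],
    max_new_tokens: int,
    eos_id: Optional[int] = None,
) -> List[int]:
    """Extend `prompt` by repeatedly picking the highest-scoring next token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)
        # Greedy step: take the argmax over the vocabulary, no sampling.
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        # Stop early if an end-of-sequence token was produced.
        if eos_id is not None and next_id == eos_id:
            break
    return tokens


def toy_scores(tokens: List[int]) -> List[float]:
    """Hypothetical stand-in for a model: favors (last token + 1) mod 5."""
    vocab_size = 5
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


print(greedy_decode(toy_scores, [0], max_new_tokens=3))
```

Because the step is a plain argmax, the output is deterministic for a given prompt, which can be convenient when comparing checkpoints.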