All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Critical Fix: Resolved `IndexError: index out of range in self` that occurred when validation sequences exceeded the model's max_length
- Added automatic sequence truncation in GPT model forward method with warning logs
- Implemented max_length parameter in DataLoader with custom collate function
- Added position embedding size validation in checkpoint loading
- Enhanced error handling with detailed diagnostic messages and actionable solutions
- Fixed tensor contiguity issues by using `reshape()` instead of `view()` for loss calculation
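
As a rough illustration of the truncation and max_length items above, the following pure-Python sketch shows how a collate step might clip over-long sequences and pad the rest. The names `truncate_and_pad` and `pad_id`, and the list-based batch, are assumptions made for this example; the project's actual collate function operates on tensors and may differ in detail.

```python
import logging
from typing import List

logger = logging.getLogger(__name__)


def truncate_and_pad(batch: List[List[int]], max_length: int, pad_id: int = 0) -> List[List[int]]:
    """Clip each sequence to max_length, then right-pad to a common length.

    Hypothetical helper standing in for a DataLoader collate function.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")

    clipped = []
    for seq in batch:
        if len(seq) > max_length:
            # Truncate and warn, rather than letting the position lookup fail later.
            logger.warning(
                "Sequence of length %d truncated to max_length=%d", len(seq), max_length
            )
            seq = seq[:max_length]
        clipped.append(list(seq))

    # Pad to the longest remaining sequence, which is at most max_length.
    target = max((len(s) for s in clipped), default=0)
    return [s + [pad_id] * (target - len(s)) for s in clipped]


batch = truncate_and_pad([[1, 2, 3], [4, 5, 6, 7, 8, 9]], max_length=4)
print(batch)  # [[1, 2, 3, 0], [4, 5, 6, 7]]
```

Truncating before the batch reaches the model means no position index can exceed the position embedding table, which is the condition that raised the original `IndexError`.
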
- Added model configuration logging in Evaluator class (displays max_length, vocab_size, position embedding size)
- Evaluation script now extracts and uses max_length from loaded model
- Enhanced error messages provide clear guidance on fixing sequence length issues
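
The position embedding validation and the max_length extraction described above could look roughly like the sketch below. It works on a plain mapping from parameter names to shapes so that it runs without PyTorch. The key `position_embedding.weight` and both function names are hypothetical; real checkpoints may use different names.

```python
from typing import Dict, Tuple

# Hypothetical parameter name; adjust to match the actual checkpoint.
POS_KEY = "position_embedding.weight"


def infer_max_length(param_shapes: Dict[str, Tuple[int, ...]], key: str = POS_KEY) -> int:
    """Read max_length from the first dimension of the position embedding shape."""
    if key not in param_shapes:
        raise KeyError(f"'{key}' not found; available keys: {sorted(param_shapes)}")
    shape = param_shapes[key]
    if len(shape) != 2:
        raise ValueError(f"Expected a 2D position embedding, got shape {shape}")
    return shape[0]


def check_position_embedding(
    param_shapes: Dict[str, Tuple[int, ...]], configured_max_length: int, key: str = POS_KEY
) -> int:
    """Fail early, with an actionable message, if config and checkpoint disagree."""
    found = infer_max_length(param_shapes, key)
    if found != configured_max_length:
        raise ValueError(
            f"Position embedding holds {found} positions but the config sets "
            f"max_length={configured_max_length}. Set max_length={found} or "
            f"retrain with the larger value."
        )
    return found


# With PyTorch, the shapes could be collected from a state dict, for example:
#   shapes = {k: tuple(v.shape) for k, v in state_dict.items()}
shapes = {"token_embedding.weight": (32000, 256), POS_KEY: (512, 256)}
print(infer_max_length(shapes))  # 512
```
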
- Added comprehensive unit and integration tests for sequence length validation
- Updated README with position embedding troubleshooting section
- Added implementation guide for applying fixes to existing projects
- Created detailed test results documentation
- Updated smart defaults section to mention automatic sequence length handling
- Added 3 unit tests for sequence length validation
- Added 5 integration tests for evaluation with various sequence lengths
- All tests pass successfully with sequences at, exceeding, and far beyond max_length
- NANO Template (1M params) - For learning and quick experiments
- TINY Template (6M params) - For prototyping and small projects
- SMALL Template (100M params) - For production applications
- BASE Template (1B params) - For research and high-quality models
- Complete PyTorch training infrastructure
- Data preprocessing pipeline
- Tokenizer training (BPE, WordPiece, Unigram)
- Checkpoint management with auto-save
- TensorBoard integration
- Live training dashboard
- Interactive chat interface
- Model comparison tools
- Deployment scripts
- Automatic vocab size detection from tokenizer
- Model/data size mismatch warnings
- Overfitting detection during training
- Cross-platform path handling
- UTF-8 encoding support for Windows
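
To make the overfitting detection item above more concrete, here is one simple heuristic, sketched under assumptions: it flags a run when validation loss has risen on each of the last `patience` evaluations while training loss has kept falling. The function name `looks_overfit` and the `patience` parameter are illustrative and are not taken from the project.

```python
from typing import Sequence


def looks_overfit(
    train_losses: Sequence[float], val_losses: Sequence[float], patience: int = 3
) -> bool:
    """Return True if training loss fell and validation loss rose on each of
    the last `patience` steps. A heuristic, not the project's actual rule."""
    if patience < 1:
        raise ValueError("patience must be at least 1")
    if len(train_losses) != len(val_losses):
        raise ValueError("loss histories must have the same length")
    if len(train_losses) < patience + 1:
        return False  # Not enough history to judge.

    recent_train = list(train_losses[-(patience + 1):])
    recent_val = list(val_losses[-(patience + 1):])

    train_falling = all(b < a for a, b in zip(recent_train, recent_train[1:]))
    val_rising = all(b > a for a, b in zip(recent_val, recent_val[1:]))
    return train_falling and val_rising


train = [2.0, 1.5, 1.2, 1.0, 0.8]
val = [2.1, 1.8, 1.9, 2.0, 2.2]
print(looks_overfit(train, val))  # True
```
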
- WandB integration for experiment tracking
- HuggingFace Hub integration for model sharing
- SynthexAI integration for synthetic data
- Comprehensive README with examples
- Detailed project READMEs for scaffolded projects
- Contributing guidelines
- Troubleshooting guides
- Node.js 18+ required
- Python 3.8+ required
- PyTorch 2.0+ required
- Cross-platform support (Windows, macOS, Linux)
- Fixed data loading with 2D tensors
- Fixed vocab size mismatch (changed from a fixed 32K to auto-detection)
- Fixed Windows UTF-8 encoding issues
- Fixed deploy.py unicode escape errors
- Fixed chat.py cross-platform path handling
- Fixed model forward method to accept attention_mask
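
The encoding and path fixes above follow a common pattern, sketched below: build paths with `pathlib` rather than hand-written separators, and pass `encoding="utf-8"` explicitly instead of relying on the platform default. The helper names `save_text` and `load_text` are hypothetical and do not come from deploy.py or chat.py.

```python
import tempfile
from pathlib import Path


def save_text(base_dir, *parts, text):
    """Join path components portably and write the text as UTF-8."""
    path = Path(base_dir).joinpath(*parts)
    path.parent.mkdir(parents=True, exist_ok=True)
    # An explicit encoding avoids falling back to a legacy code page on Windows.
    path.write_text(text, encoding="utf-8")
    return path


def load_text(path):
    """Read a file back as UTF-8, regardless of the platform default."""
    return Path(path).read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    saved = save_text(tmp, "checkpoints", "notes.txt", text="loss ↓ 0.42 ✓")
    assert load_text(saved) == "loss ↓ 0.42 ✓"
```

A frequent source of unicode escape errors is a Windows path written in an ordinary string literal, where a sequence such as `\U` is parsed as an escape. Building paths from components, as above, sidesteps that; whether this was the exact cause in deploy.py is an assumption.
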
- Dashboard may show garbled emojis in Windows PowerShell (functionality works)
- PyTorch FutureWarning about torch.load (will be addressed in PyTorch 2.x)
- More model architectures (BERT, T5)
- Distributed training support
- Model quantization tools
- Fine-tuning templates
- Web UI for project management
- Automatic hyperparameter tuning
- 2.0.1 (2025-10-26) - Position embedding bug fix
- 1.0.0 (2025-01-24) - Initial release
- 0.1.0 (2025-01-20) - Beta release (internal)
For more details, see the full commit history.