A character-level Transformer model for translating English text to Sanskrit using PyTorch.
This project implements a complete neural machine translation system that converts English text to Sanskrit (Devanagari script) using a Transformer architecture with multi-head attention mechanisms.

- Transformer Architecture: Full encoder-decoder with multi-head attention
- Character-Level Translation: Fine-grained tokenization for better accuracy
- Complete Vocabulary: Comprehensive Sanskrit character set
```bash
pip install torch numpy matplotlib jupyter
cd transformer/
jupyter notebook final_transformer.ipynb
# Run all cells to train from scratch
```

- Type: Encoder-Decoder Transformer
- Dimensions: 512 (d_model), 2048 (FFN)
- Attention Heads: 8
- Layers: 1 (configurable)
- Vocabulary: 89 Sanskrit + 183 English characters
- Max Length: 200 characters
- Training: Adam optimizer, Cross-entropy loss
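The hyperparameters above can be sketched as a PyTorch module built around `nn.Transformer`; the class name, embedding layout, and vocabulary sizes used here are assumptions for illustration, not the project's actual code.

```python
import torch
import torch.nn as nn

# Assumed constants, taken from the spec list above
SRC_VOCAB = 183   # English characters
TGT_VOCAB = 89    # Sanskrit characters
D_MODEL   = 512
N_HEADS   = 8
FFN_DIM   = 2048
N_LAYERS  = 1
MAX_LEN   = 200

class CharTransformer(nn.Module):
    """Hypothetical encoder-decoder wrapper around nn.Transformer."""

    def __init__(self):
        super().__init__()
        self.src_embed = nn.Embedding(SRC_VOCAB, D_MODEL)
        self.tgt_embed = nn.Embedding(TGT_VOCAB, D_MODEL)
        self.pos_embed = nn.Embedding(MAX_LEN, D_MODEL)  # learned positions
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=N_HEADS,
            num_encoder_layers=N_LAYERS, num_decoder_layers=N_LAYERS,
            dim_feedforward=FFN_DIM, batch_first=True)
        self.out = nn.Linear(D_MODEL, TGT_VOCAB)

    def forward(self, src, tgt, tgt_mask=None):
        # Add positional embeddings to both streams (broadcast over batch)
        pos_s = torch.arange(src.size(1), device=src.device)
        pos_t = torch.arange(tgt.size(1), device=tgt.device)
        s = self.src_embed(src) + self.pos_embed(pos_s)
        t = self.tgt_embed(tgt) + self.pos_embed(pos_t)
        h = self.transformer(s, t, tgt_mask=tgt_mask)
        return self.out(h)  # (batch, tgt_len, TGT_VOCAB)
```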
- "I am here" → अहम् अत्र अस्मि
- "Do work don't expect result" → कर्म कुर्वन्तु फलं मा प्रत्याशयन्तु
- Framework: PyTorch 2.0+
- Training Data: English-Sanskrit parallel corpus
- Tokenization: Character-level with special tokens
- Attention: Multi-head self-attention + cross-attention
- Masking: Look-ahead and padding masks
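The two mask types listed above can be sketched as follows; the convention that `True` marks a blocked or padded position, and `pad_id=0`, are assumptions.

```python
import torch

def look_ahead_mask(size: int) -> torch.Tensor:
    # True above the diagonal = future positions the decoder may NOT attend to
    return torch.triu(torch.ones(size, size, dtype=torch.bool), diagonal=1)

def padding_mask(ids: torch.Tensor, pad_id: int = 0) -> torch.Tensor:
    # True wherever the token is padding (pad_id is an assumed convention)
    return ids == pad_id
```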
- Data Preparation: Filter valid sentence pairs
- Vocabulary Building: Character-level tokenization
- Model Initialization: Xavier uniform weights
- Training Loop: 50 epochs with checkpoint saving
- Evaluation: Real-time translation testing
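The vocabulary-building step above can be sketched as building a character-to-index map with special tokens; the specific token names (`<pad>`, `<sos>`, `<eos>`, `<unk>`) are assumptions.

```python
# Assumed special tokens; their exact names may differ in the notebook
SPECIAL = ["<pad>", "<sos>", "<eos>", "<unk>"]

def build_vocab(sentences):
    """Character-level vocabulary: every unique character gets an index."""
    chars = sorted({ch for s in sentences for ch in s})
    itos = SPECIAL + chars
    stoi = {ch: i for i, ch in enumerate(itos)}
    return stoi, itos

def encode(text, stoi):
    """Wrap a sentence in <sos>/<eos>, mapping unknown characters to <unk>."""
    unk = stoi["<unk>"]
    return [stoi["<sos>"]] + [stoi.get(ch, unk) for ch in text] + [stoi["<eos>"]]
```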
- Training: ~50 epochs on parallel corpus
- Inference: Real-time character generation
- Accuracy: Unreliable on small datasets; improves as the parallel corpus grows
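The character-by-character inference step above can be sketched as a greedy decoding loop; the token ids and the `model(src, tgt)` signature are assumptions.

```python
import torch

# Assumed special-token ids and maximum output length
SOS_ID, EOS_ID, MAX_LEN = 1, 2, 200

@torch.no_grad()
def greedy_decode(model, src):
    """Generate one character at a time until <eos> or MAX_LEN."""
    tgt = torch.tensor([[SOS_ID]])              # decoder starts with <sos>
    for _ in range(MAX_LEN - 1):
        logits = model(src, tgt)                # (1, cur_len, vocab)
        next_id = logits[0, -1].argmax().item()  # greedy: most likely next char
        tgt = torch.cat([tgt, torch.tensor([[next_id]])], dim=1)
        if next_id == EOS_ID:
            break
    return tgt[0, 1:].tolist()                  # drop the leading <sos>
```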