A sophisticated next-word prediction system using LSTM neural networks, trained on Shakespeare's Hamlet. This project demonstrates advanced NLP techniques with deep learning for text generation and word prediction.
- LSTM-based Neural Network: Advanced recurrent neural network architecture for sequence prediction
- Shakespeare's Hamlet Dataset: Trained on classic literature for rich linguistic patterns
- Early Stopping: Prevents overfitting with intelligent training termination
- Interactive Web Interface: Streamlit-powered UI for real-time predictions
- Model Persistence: Save and load trained models for future use
- Preprocessing Pipeline: Complete text tokenization and sequence preparation
NextWord-AI/
├── app.py # Streamlit web application
├── experiments.ipynb # Jupyter notebook with model development
├── next_word_lstm.h5 # Trained LSTM model
├── tokenizer.pickle # Saved tokenizer for text processing
├── hamlet.txt # Shakespeare's Hamlet text dataset
├── requirements.txt # Python dependencies
└── README.md # Project documentation
1. Clone the repository

   git clone https://github.com/CyberMage7/NextWord-AI.git
   cd NextWord-AI

2. Install dependencies

   pip install -r requirements.txt
Launch the Streamlit interface for interactive predictions:
streamlit run app.py

Navigate to http://localhost:8501 in your browser and enter text to get next-word predictions.
Open and run the experiments.ipynb notebook to:
- Download and preprocess the Hamlet dataset
- Train the LSTM model with early stopping
- Save the trained model and tokenizer
- Test predictions on custom text
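The save/load step above can be sketched with the standard `pickle` pattern. In the actual notebook a Keras `Tokenizer` object is pickled and the model is saved separately with `model.save("next_word_lstm.h5")`; here a plain word-index dict stands in so the sketch is self-contained:

```python
import pickle

# A plain word-index dict stands in for the Keras Tokenizer the
# notebook pickles; the save/load pattern is identical.
word_index = {"to": 2, "be": 3, "or": 4, "not": 5}

# persist the tokenizer alongside the model
# (the model itself is saved with model.save("next_word_lstm.h5"))
with open("tokenizer.pickle", "wb") as f:
    pickle.dump(word_index, f)

# restore it later for inference
with open("tokenizer.pickle", "rb") as f:
    restored = pickle.load(f)
```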
The LSTM model features:
- Embedding Layer: 100-dimensional word embeddings
- LSTM Layers: Two LSTM layers (150 and 100 units) with dropout
- Output Layer: Softmax activation for word probability distribution
- Early Stopping: Monitors validation loss with patience of 5 epochs
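The architecture above can be sketched in Keras as follows. The `vocab_size`, `max_len`, and dropout rate are placeholder assumptions; the real values come from the tokenized Hamlet corpus and the notebook's configuration:

```python
import tensorflow as tf
from tensorflow.keras.layers import Embedding, LSTM, Dropout, Dense
from tensorflow.keras.models import Sequential

vocab_size = 5000  # placeholder; actual value = len(tokenizer.word_index) + 1
max_len = 14       # placeholder; actual value = longest n-gram sequence

model = Sequential([
    Embedding(vocab_size, 100),               # 100-dimensional word embeddings
    LSTM(150, return_sequences=True),         # first LSTM layer
    Dropout(0.2),                             # assumed dropout rate
    LSTM(100),                                # second LSTM layer
    Dense(vocab_size, activation="softmax"),  # word probability distribution
])
model.compile(loss="categorical_crossentropy", optimizer="adam")
```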
- Data Collection: Downloads Shakespeare's Hamlet from NLTK corpus
- Preprocessing: Tokenizes text and creates n-gram sequences
- Sequence Padding: Ensures uniform input length
- Train/Test Split: 80/20 split for model validation
- Model Training: Uses categorical crossentropy loss with Adam optimizer
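The n-gram sequence and padding steps can be sketched without Keras. Each tokenized line expands into progressively longer prefixes, left-padded to a uniform length (as `pad_sequences` does by default); the last id of each row is the target word:

```python
def make_ngram_sequences(line_ids, max_len):
    """Expand one tokenized line into left-padded n-gram training rows."""
    rows = []
    for i in range(2, len(line_ids) + 1):
        ngram = line_ids[:i]
        padded = [0] * (max_len - len(ngram)) + ngram  # pre-padding
        rows.append(padded)
    return rows

# toy example: word ids for "to be or not to"
seqs = make_ngram_sequences([2, 3, 4, 5, 2], max_len=5)
# each row's last id is the target word, the rest are the input context
X = [row[:-1] for row in seqs]
y = [row[-1] for row in seqs]
```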
Input: "To be or not to"
Prediction: "be"
Input: "To be bad is better than"
Prediction: [context-dependent prediction]
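A prediction helper along these lines produces the outputs shown above. This is a hedged sketch, not the app's exact code: `predict_next_word` and `tokenizer_index` (a word-to-id dict) are hypothetical names, and the left-padding mirrors the training-time `pad_sequences` behavior:

```python
import numpy as np

def predict_next_word(model, tokenizer_index, text, max_len):
    """Return the most probable next word for the given text (sketch)."""
    token_ids = [tokenizer_index[w] for w in text.lower().split()
                 if w in tokenizer_index]
    token_ids = token_ids[-(max_len - 1):]  # keep the last max_len-1 tokens
    padded = [0] * (max_len - 1 - len(token_ids)) + token_ids  # left-pad
    probs = model.predict(np.array([padded]))[0]  # distribution over vocab
    next_id = int(np.argmax(probs))
    id_to_word = {i: w for w, i in tokenizer_index.items()}
    return id_to_word.get(next_id, "?")
```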
- Framework: TensorFlow/Keras
- Architecture: Sequential LSTM
- Optimizer: Adam
- Loss Function: Categorical Crossentropy
- Validation: Early stopping with best weight restoration
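The validation setup above corresponds to a standard Keras callback; a minimal sketch, assuming the callback is passed to `model.fit` alongside validation data:

```python
from tensorflow.keras.callbacks import EarlyStopping

# monitor validation loss, stop after 5 epochs without improvement,
# and roll back to the best-performing weights
early_stop = EarlyStopping(
    monitor="val_loss",
    patience=5,
    restore_best_weights=True,
)
# used as: model.fit(X, y, validation_data=..., callbacks=[early_stop])
```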
- tensorflow>=2.16.0
- pandas
- numpy
- scikit-learn
- matplotlib
- tensorboard
- streamlit
- scikeras
- nltk
- Fork the repository
- Create a feature branch (git checkout -b feature/enhancement)
- Commit changes (git commit -am 'Add new feature')
- Push to branch (git push origin feature/enhancement)
- Open a Pull Request
This project is licensed under the terms specified in the LICENSE file.
- Shakespeare's works via NLTK Gutenberg corpus
- TensorFlow team for the deep learning framework
- Streamlit for the web interface framework