By Ali Hakim Taşkıran and Ege Aybars Bozkurt
This repository contains a compact, interpretable language model based on a deep Residual GRU architecture, trained on the Wikitext dataset. It includes a real-time interactive GUI (app/) for text generation, designed for simplicity, transparency, and local execution—no internet or cloud required after setup.
- Architecture: 12-layer Residual GRU with embedding and hidden dimensions of 512 and 1024, respectively.
- Parameters: ~120M (lightweight compared to transformer-based LLMs).
- Training Data: Wikitext-103 (Wikipedia-derived text).
- Features:
- Residual connections between GRU layers for improved gradient flow.
- Dropout and layer-wise regularization.
- Training with both log loss and a log(log()) loss to reduce unigram over-confidence and amplify gradients associated with uncommon tokens.
- Output: Autoregressive token-by-token generation with streaming inference.
- File: `model.pt` contains the final trained weights (PyTorch state dict).
- Tokenizer: Byte-level BPE via Hugging Face `tokenizers`, serialized as `wikitext_tokenizer.pkl`.
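The stacked Residual GRU summarized above can be sketched as follows. This is a hedged illustration, not the repo's actual module: the class and attribute names (`ResidualGRULM`, `in_proj`, etc.) are assumptions, and the repo's layer layout may differ.

```python
# Minimal sketch of a 12-layer residual GRU language model
# (hypothetical names; the repo's real implementation may differ).
import torch
import torch.nn as nn

class ResidualGRULM(nn.Module):
    def __init__(self, vocab_size, emb_dim=512, hidden_dim=1024,
                 num_layers=12, dropout=0.1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.in_proj = nn.Linear(emb_dim, hidden_dim)
        # One single-layer GRU per block so a residual skip can wrap each one.
        self.layers = nn.ModuleList(
            nn.GRU(hidden_dim, hidden_dim, batch_first=True)
            for _ in range(num_layers)
        )
        self.drop = nn.Dropout(dropout)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, ids):
        x = self.in_proj(self.embed(ids))
        for gru in self.layers:
            out, _ = gru(x)
            x = x + self.drop(out)  # residual connection between GRU layers
        return self.head(x)  # next-token logits, shape (batch, seq, vocab)

# Tiny smoke test with a reduced config.
logits = ResidualGRULM(vocab_size=1000, num_layers=2)(
    torch.zeros(1, 8, dtype=torch.long))
```

The residual skip around each GRU block is what keeps gradients flowing through all 12 layers.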
🔍 Why GRU?
This model prioritizes architectural simplicity, debuggability, and low-resource deployment—ideal for research, education, or edge devices where transformers are overkill.
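The log(log()) objective mentioned in the features list is underspecified here; one plausible reading is applying a second logarithm to the per-token cross-entropy (which is itself a negative log-probability). The sketch below is that reading only, with an assumed epsilon for numerical safety; the repo's exact formulation may differ.

```python
# Hedged sketch of a "log of log loss" objective: a second log applied
# to the per-token cross-entropy. One possible interpretation only.
import torch
import torch.nn.functional as F

def double_log_loss(logits, targets, eps=1e-6):
    # Per-token cross-entropy: ce_i = -log p(target_i)
    ce = F.cross_entropy(logits, targets, reduction="none")
    # Second log rescales per-token losses before averaging, changing
    # how much easy (low-ce) vs. hard (high-ce) tokens weigh in.
    return torch.log(ce + eps).mean()

logits = torch.randn(4, 10)
targets = torch.randint(0, 10, (4,))
loss = double_log_loss(logits, targets)
```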
- Python ≥ 3.8
- Packages: `torch`, `tokenizers`, `tkinter`
Install dependencies:
pip install torch tokenizers
✅ On Ubuntu/Debian, ensure Tkinter is available:
sudo apt install python3-tk
The model checkpoint is not included in the repo to avoid duplication, but you can download it from:
Save it directly into the app/ folder.
You can also download it via command line (requires gdown):
cd app
pip install gdown
gdown "1sZwhT6fII8g_-DMkG7AKkrPWlVQQ3mqX" -O model.pt
From the app/ directory:
cd app
python infer-GUI.py
The interface will launch with:
- Device selector (CPU / CUDA)
- Prompt input box
- Controls for temperature, top-k, max tokens, and seed
- Real-time streaming output
Type a prompt like "Quantum computing enables..." and click ▶ Generate.
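The temperature, top-k, and seed controls above map onto a standard sampling loop. The sketch below shows one common way such a sampler works; the function name `sample_next` and its defaults are illustrative assumptions, not the GUI's actual API.

```python
# Hedged sketch of temperature / top-k sampling as exposed by the GUI
# controls (names and defaults are assumptions, not the repo's API).
import torch

def sample_next(logits, temperature=0.8, top_k=40):
    # Temperature scales the logits: <1 sharpens, >1 flattens.
    logits = logits / max(temperature, 1e-6)
    if top_k > 0:
        # Restrict sampling to the k most likely tokens.
        k = min(top_k, logits.size(-1))
        vals, idx = torch.topk(logits, k)
        probs = torch.softmax(vals, dim=-1)
        return idx[torch.multinomial(probs, 1)].item()
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, 1).item()

torch.manual_seed(0)  # the GUI's seed control pins this RNG state
tok = sample_next(torch.randn(100))  # token id from a 100-word vocab
```

Streaming output falls out naturally: each sampled token id is decoded and appended to the display before the next forward pass.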
- Code (`infer-GUI.py`): MIT License (feel free to use/modify).
- Model weights (`model.pt`): for research and personal use only.
- Tokenizer: derived from Wikitext; redistribution follows the original dataset terms.
⚠️ Do not redistribute `model.pt` without explicit permission.
- Inspired by classical RNN language modeling (Mikolov et al.)
- Tokenizer built with Hugging Face `tokenizers`
- GUI powered by Python's built-in `tkinter` for zero-dependency deployment