ResGRUin2025

Residual GRU Language Model — Inference GUI

By Ali Hakim Taşkıran and Ege Aybars Bozkurt

This repository contains a compact, interpretable language model based on a deep Residual GRU architecture, trained on the Wikitext dataset. It includes a real-time interactive GUI (app/) for text generation, designed for simplicity, transparency, and local execution—no internet or cloud required after setup.


🧠 Model Overview

  • Architecture: 12-layer Residual GRU with embedding and hidden dimensions of 512 and 1024, respectively.
  • Parameters: ~120M (lightweight compared to transformer-based LLMs).
  • Training Data: Wikitext-103 (Wikipedia-derived text).
  • Features:
    • Residual connections between GRU layers for improved gradient flow.
    • Dropout and layer-wise regularization.
    • Training with standard log loss plus a log(log()) objective to reduce unigram overconfidence and amplify gradients associated with uncommon tokens.
  • Output: Autoregressive token-by-token generation with streaming inference.
  • File: model.pt contains the final trained weights (PyTorch state dict).
  • Tokenizer: Byte-level BPE via Hugging Face tokenizers, serialized as wikitext_tokenizer.pkl.
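The stack described above can be sketched as follows. This is an illustrative reconstruction, not the repository's actual code: the class name, constructor arguments, and the input projection from the 512-dim embedding to the 1024-dim hidden size are assumptions based on the bullet points.

```python
import torch
import torch.nn as nn

class ResidualGRULM(nn.Module):
    """Sketch of a deep residual GRU language model.
    Defaults follow the README: 12 layers, 512-dim embeddings, 1024-dim hidden."""
    def __init__(self, vocab_size, d_embed=512, d_hidden=1024, n_layers=12, dropout=0.1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_embed)
        self.in_proj = nn.Linear(d_embed, d_hidden)  # assumed: lift embeddings to hidden size
        self.layers = nn.ModuleList(
            nn.GRU(d_hidden, d_hidden, batch_first=True) for _ in range(n_layers)
        )
        self.drop = nn.Dropout(dropout)
        self.head = nn.Linear(d_hidden, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) of token ids -> (batch, seq_len, vocab_size) logits
        x = self.in_proj(self.embed(tokens))
        for gru in self.layers:
            out, _ = gru(x)
            x = x + self.drop(out)  # residual connection between GRU layers
        return self.head(x)         # next-token logits
```

The residual sums let gradients bypass each GRU, which is what makes a 12-layer recurrent stack trainable in practice.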

🔍 Why GRU?
This model prioritizes architectural simplicity, debuggability, and low-resource deployment—ideal for research, education, or edge devices where transformers are overkill.

▶️ How to Run the App

1. Prerequisites

  • Python ≥ 3.8
  • Packages: torch, tokenizers, tkinter

Install dependencies:

pip install torch tokenizers

✅ On Ubuntu/Debian, ensure Tkinter is available:

sudo apt install python3-tk

2. Download model.pt

The model checkpoint is not included in the repo to avoid duplication, but you can download it from:

🔗 model.pt on Google Drive

Save it directly into the app/ folder.

You can also download it via command line (requires gdown):

cd app
pip install gdown
gdown "1sZwhT6fII8g_-DMkG7AKkrPWlVQQ3mqX" -O model.pt

3. Run the GUI

From the app/ directory:

cd app
python infer-GUI.py

The interface will launch with:

  • Device selector (CPU / CUDA)
  • Prompt input box
  • Controls for temperature, top-k, max tokens, and seed
  • Real-time streaming output

Type a prompt like "Quantum computing enables..." and click ▶ Generate.
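The temperature and top-k controls map onto a standard sampling step like the one below. This is a minimal sketch of the technique, not the app's actual code:

```python
import torch

def sample_next(logits, temperature=1.0, top_k=50):
    """Pick the next token id from a 1-D logits vector using
    temperature scaling and top-k filtering."""
    logits = logits / max(temperature, 1e-6)   # temperature < 1 sharpens, > 1 flattens
    if top_k and top_k < logits.numel():
        kth = torch.topk(logits, top_k).values[-1]
        logits = logits.masked_fill(logits < kth, float("-inf"))  # keep only top-k
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, 1).item()
```

Lower temperature and smaller top-k make the output more deterministic; the seed control fixes the random draw in `multinomial` for reproducible generations.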


📜 License

  • Code (infer-GUI.py): MIT License (feel free to use/modify).
  • Model weights (model.pt): For research and personal use only.
  • Tokenizer: Derived from Wikitext; redistribution follows original dataset terms.

⚠️ Do not redistribute model.pt without explicit permission.


🙌 Acknowledgements

  • Inspired by classical RNN language modeling (Mikolov et al.)
  • Tokenizer built with Hugging Face tokenizers
  • GUI powered by Python’s built-in tkinter for zero-dependency deployment
