Pretraining of GPT2

This project pretrains an LLM with the GPT2 124M architecture from scratch on a small dataset in ./data. The trained model can generate text up to a given number of tokens.

To make training feasible on low-end, CPU-only machines, the GPT2 architecture is pretrained with a context size of 256 on a small short-story text, ./data/the-verdict.txt. Alternatively, the OpenAI GPT2 pretrained weights can be used, as described in the Run the code section below. The model architecture configuration is given in model_info.txt.
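As a rough illustration of what "GPT2 124M with a 256 context size" implies, the sketch below estimates the parameter count from a configuration dictionary. The key names are hypothetical, and the estimate assumes biases in every linear layer and an output head that shares weights with the token embedding, as in the original GPT2. The values actually used by this repository are the ones in model_info.txt.

```python
# Hypothetical config keys; the repository's real values live in model_info.txt.
GPT2_SMALL_CONTEXT = {
    "vocab_size": 50257,    # GPT2 BPE vocabulary size
    "context_length": 256,  # from the README (the original GPT2 uses 1024)
    "emb_dim": 768,         # embedding width of the 124M model
    "n_heads": 12,          # attention heads (does not affect the count)
    "n_layers": 12,         # transformer blocks
}


def count_parameters(cfg):
    """Estimate trainable parameters, assuming biases everywhere and a tied output head."""
    d = cfg["emb_dim"]
    # Token embedding plus learned positional embedding.
    embeddings = cfg["vocab_size"] * d + cfg["context_length"] * d
    # Fused query/key/value projection plus the attention output projection.
    attention = (d * 3 * d + 3 * d) + (d * d + d)
    # Two-layer MLP with a 4x hidden expansion.
    feed_forward = (d * 4 * d + 4 * d) + (4 * d * d + d)
    # Two layer norms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d)
    per_block = attention + feed_forward + layer_norms
    final_norm = 2 * d
    return embeddings + cfg["n_layers"] * per_block + final_norm


print(f"{count_parameters(GPT2_SMALL_CONTEXT) / 1e6:.1f}M parameters")
```

Shrinking the context only reduces the positional embedding table, so the count stays close to 124M while the attention cost per step drops considerably.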

The dataset consists of 5145 tokens, of which 4608 are used in the training set.
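One way the 4608 figure could arise is sketched below. It assumes a 90% training split and non-overlapping windows of 256 tokens, where each window needs one extra token for the shifted next-token target. Neither assumption is confirmed by this README; load_data.py is the authority on what the repository actually does.

```python
TOTAL_TOKENS = 5145     # from the README
CONTEXT_SIZE = 256      # from the README
TRAIN_RATIO = 0.9       # assumption: 90% of the tokens go to training
STRIDE = CONTEXT_SIZE   # assumption: windows do not overlap


def tokens_in_full_windows(n_tokens, context_size, stride):
    """Count the tokens covered by full-length input windows.

    A window starting at i uses tokens[i : i + context_size] as input and
    tokens[i + 1 : i + context_size + 1] as target, so the last valid start
    is n_tokens - context_size - 1.
    """
    n_windows = len(range(0, n_tokens - context_size, stride))
    return n_windows * context_size


n_train = int(TOTAL_TOKENS * TRAIN_RATIO)
train_tokens = tokens_in_full_windows(n_train, CONTEXT_SIZE, STRIDE)
print(train_tokens)  # 4608, i.e. 18 windows of 256 tokens
```

Under these assumptions the leftover tokens at the end of the training split are dropped because they do not fill a complete window.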

Setup

Prerequisites are python<=3.13 and the uv package manager; setup instructions can be found here.
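The prerequisites can be checked before going further. The following is a minimal sketch and not part of the repository; check_prerequisites is a hypothetical helper that compares the running interpreter against the 3.13 limit and looks for a uv executable on PATH.

```python
import shutil
import sys

MAX_PYTHON = (3, 13)  # from the README: python<=3.13


def check_prerequisites(version=sys.version_info[:2], which=shutil.which):
    """Return a list of problems; an empty list means the prerequisites look fine."""
    problems = []
    if tuple(version) > MAX_PYTHON:
        found = ".".join(str(part) for part in version)
        problems.append(f"Python {found} is newer than 3.13")
    if which("uv") is None:
        problems.append("uv was not found on PATH")
    return problems


for problem in check_prerequisites():
    print("warning:", problem)
```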

Clone this repository

Either use the Download ZIP option or run git clone https://github.com/lukmanulhakeem97/llm-pretraining.git in a CLI tool.
Create a Python environment and install dependencies

Create the environment: uv venv [name] (the name is optional).

Navigate to the cloned repo directory and install the dependencies listed in pyproject.toml:

cd llm-pretraining

uv sync
Activate the venv (Windows path shown): .\.venv\Scripts\activate

Run the code

Generate text:

Download the pretrained model.pth from my Hugging Face Hub and place it in the cloned llm-pretraining directory.
Run inference.py with any starting prompt:

using model.pth: uv run inference.py "Be now, then will be "

using the OpenAI GPT2 pretrained weights: uv run inference.py "Be now, then will be " --load_openaigpt2_weight="yes"
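The two invocations above suggest a command line with one positional prompt and one optional flag. The sketch below shows how such an interface could be parsed with argparse. It is not the code in inference.py: the default value "no", the restriction to "yes"/"no", and the pick_weights helper are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the two invocations shown in this README."""
    parser = argparse.ArgumentParser(description="Generate text from a starting prompt.")
    parser.add_argument("prompt", help="starting text for generation")
    parser.add_argument(
        "--load_openaigpt2_weight",
        default="no",            # assumption: local model.pth is the default
        choices=["yes", "no"],   # assumption: only these two values are accepted
        help="use the OpenAI GPT2 pretrained weights instead of model.pth",
    )
    return parser


def pick_weights(args):
    """Hypothetical helper: name the weight source selected by the flag."""
    return "openai-gpt2" if args.load_openaigpt2_weight == "yes" else "model.pth"


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["Be now, then will be ", "--load_openaigpt2_weight=yes"])
print(repr(args.prompt), "->", pick_weights(args))
```

Keeping the trailing space inside the quoted prompt matters, because the tokenizer treats "be" and "be " differently.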

Pretraining:

Run uv run train.py; this generates model.pth.
