MiniGPT is a custom implementation of a ChatGPT-style model following Andrej Karpathy's tutorial here (working & good results).
See the outputs MiniGPT_large_output_*.txt in the /results folder.
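The core of the tutorial-style model is a single causal self-attention head. A minimal sketch is below; the names (Head, n_embd, head_size, block_size) are illustrative assumptions, not necessarily the exact identifiers used in MiniGPT.

```python
# Minimal causal self-attention head in the style of Karpathy's
# "Let's build GPT" tutorial (names and sizes are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Head(nn.Module):
    def __init__(self, n_embd, head_size, block_size):
        super().__init__()
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False)
        self.value = nn.Linear(n_embd, head_size, bias=False)
        # lower-triangular mask so each position attends only to the past
        self.register_buffer("tril", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        k, q, v = self.key(x), self.query(x), self.value(x)
        wei = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float("-inf"))
        wei = F.softmax(wei, dim=-1)
        return wei @ v

x = torch.randn(2, 8, 32)      # (batch, time, n_embd)
out = Head(32, 16, 8)(x)
print(out.shape)               # torch.Size([2, 8, 16])
```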
MiniGPT_torch is a MiniGPT implementation using PyTorch's nn.TransformerDecoder and nn.TransformerDecoderLayer (running, but poor results).
See the output MiniGPT_torch_output.txt in the /results folder.
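A sketch of a decoder-only LM built on these modules is below; hyperparameters and names are assumptions, not the file's actual code. Note that nn.TransformerDecoder expects an encoder "memory"; a common workaround for a GPT-style model is to feed the target sequence as its own memory with a causal mask on both paths, and forgetting either mask is a frequent source of the "running but poor results" symptom.

```python
# Hypothetical decoder-only LM on top of nn.TransformerDecoderLayer.
# All names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class TorchMiniGPT(nn.Module):
    def __init__(self, vocab_size, d_model=64, nhead=4, num_layers=2, block_size=32):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(block_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):
        B, T = idx.shape
        x = self.tok_emb(idx) + self.pos_emb(torch.arange(T, device=idx.device))
        # causal mask: -inf above the diagonal blocks attention to the future
        causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
        # feed the sequence as its own memory, causally masked on both paths
        h = self.decoder(tgt=x, memory=x, tgt_mask=causal, memory_mask=causal)
        return self.lm_head(h)

idx = torch.randint(0, 100, (2, 16))
logits = TorchMiniGPT(vocab_size=100)(idx)
print(logits.shape)  # torch.Size([2, 16, 100])
```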
Transformer is a full transformer implementation (encoder + decoder) from scratch (running & training, but not usable for inference).
Custom special tokens were added for two reasons: to avoid token id 0, which caused errors with tiktokenizer, and to trigger the translation by feeding [english input + french BOS token].
These custom special tokens are probably no longer needed now that BART is used as the tokenizer/encoder.
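The translation-triggering scheme can be sketched as a greedy decoding loop seeded with the source tokens plus a French BOS token. The token ids and the `model` callable below are toy stand-ins, not the project's real tokenizer or network.

```python
# Toy sketch of greedy decoding seeded with [english input + french BOS].
FR_BOS, EOS = 2, 3  # assumed special-token ids (the real ones avoid id 0)

def greedy_translate(model, english_ids, max_new_tokens=20):
    seq = list(english_ids) + [FR_BOS]   # seed: english input + french BOS
    for _ in range(max_new_tokens):
        next_id = model(seq)             # model predicts the next token id
        seq.append(next_id)
        if next_id == EOS:
            break
    return seq[len(english_ids) + 1:]    # return only the french tokens

# toy "model": echoes the source shifted by 100, then emits EOS
def toy_model(seq):
    src = seq[:seq.index(FR_BOS)]
    produced = len(seq) - seq.index(FR_BOS) - 1
    return src[produced] + 100 if produced < len(src) else EOS

print(greedy_translate(toy_model, [10, 11, 12]))  # [110, 111, 112, 3]
```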
Transformer_torch is a full transformer implementation (encoder + decoder) using nn.Transformer (running, but the translations are incorrect).
This file is standalone: it is not integrated into the current architecture, which uses main.py as the entry point.
To run it:
python src/models/transformer_torch.py
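For reference, a minimal sketch of the nn.Transformer seq2seq setup such a file is typically built on is below (dimensions and names are assumptions, not the file's actual code). Two frequent causes of the "runs but translates incorrectly" symptom are forgetting the causal tgt_mask during training, and not feeding decoded tokens back one step at a time at inference.

```python
# Hypothetical nn.Transformer seq2seq forward pass with teacher forcing.
import torch
import torch.nn as nn

d_model, vocab = 64, 100
emb = nn.Embedding(vocab, d_model)
model = nn.Transformer(d_model=d_model, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       batch_first=True)
head = nn.Linear(d_model, vocab)

src = torch.randint(0, vocab, (2, 10))   # english token ids
tgt = torch.randint(0, vocab, (2, 7))    # french token ids (teacher forcing)
T = tgt.shape[1]
# causal mask: position i in tgt may not attend to positions > i
tgt_mask = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
out = model(emb(src), emb(tgt), tgt_mask=tgt_mask)
logits = head(out)
print(logits.shape)  # torch.Size([2, 7, 100])
```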