Llama 2 for PSP - Docs

The core of the adaptation is to replace REU (RAM Expansion Unit) hardware access with ordinary pointer access into the PSP's main RAM.

Tokenizer.cpp vs tokenizer64.c

PSP's MIPS Architecture vs. x86 for PCs

The PSP's MIPS CPU is little-endian. The original C64 code that writes binary data (short and float) also writes it in little-endian format (least significant byte first), so those values can be read on the PSP without byte swapping.
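
As a minimal sketch of what this means in practice, the helpers below copy a short and a float straight out of a byte buffer with no swapping. The names read_i16, read_f32 and host_is_little_endian are hypothetical and do not come from the original sources; the sketch also assumes 32-bit IEEE 754 floats. memcpy is used instead of a pointer cast because MIPS CPUs generally do not tolerate misaligned loads.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers (not from the original sources): copy a value out of a
// byte buffer that was written least significant byte first. On a
// little-endian CPU the bytes are already in native order, so no swap is needed.
static int16_t read_i16(const uint8_t* p) {
    int16_t v;
    std::memcpy(&v, p, sizeof v);  // memcpy is safe for unaligned source bytes
    return v;
}

static float read_f32(const uint8_t* p) {
    static_assert(sizeof(float) == 4, "this sketch assumes 32-bit floats");
    float v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Portable runtime check that the host stores the low byte first.
static bool host_is_little_endian() {
    const uint16_t probe = 1;
    uint8_t first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Optional startup guard: the direct reads above are only valid on a
// little-endian host.
static void require_little_endian() {
    assert(host_is_little_endian());
}
```

Calling require_little_endian() once at startup would catch a build for a big-endian target, where the two read helpers would need an explicit byte swap.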

Transformer.cpp vs transformer64.c

The TransformerWeights64 struct uses typedef uint32_t REUPtr. An REUPtr is an absolute 32-bit address in the Commodore 64's REU memory.

On the PSP, I load the entire weights.psp file into a single memory block allocated with malloc. The REUPtr fields are therefore no longer absolute addresses in external memory; they become float pointers to different locations within this large memory block.
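
A rough sketch of such a loader, using only standard C I/O, could look like the following. The function name load_file and its signature are assumptions made for illustration; the actual loading code of the port is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical loader: reads a whole file into one malloc'd block.
// Returns nullptr on failure; the caller releases the block with free().
static unsigned char* load_file(const char* path, std::size_t* out_bytes) {
    assert(path != nullptr && out_bytes != nullptr);

    std::FILE* f = std::fopen(path, "rb");
    if (!f) return nullptr;

    // Determine the file size by seeking to the end.
    std::fseek(f, 0, SEEK_END);
    const long size = std::ftell(f);
    std::fseek(f, 0, SEEK_SET);
    if (size <= 0) {
        std::fclose(f);
        return nullptr;
    }

    const std::size_t bytes = static_cast<std::size_t>(size);
    unsigned char* block = static_cast<unsigned char*>(std::malloc(bytes));
    if (block && std::fread(block, 1, bytes, f) != bytes) {
        std::free(block);  // short read: treat as failure
        block = nullptr;
    }
    std::fclose(f);

    if (block) *out_bytes = bytes;
    return block;
}
```

The returned block would be obtained with something like load_file("weights.psp", &size) and kept alive for the whole inference run, since every weight pointer refers into it.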

I can either keep the REUPtr type but change its meaning to be an offset from the beginning of my weights memory block, or just change everything to float*.
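
The two options can be sketched side by side. In this sketch REUPtr is assumed to be a byte offset from the start of the block, and the struct WeightsPtrs, its field names, and the helpers resolve and bind are illustrative only; they are not the real members of TransformerWeights64.

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t REUPtr;  // kept from the C64 code, reinterpreted here as a byte offset

// Option 1 (assumed layout): turn an offset into a pointer at each use.
static float* resolve(void* base, REUPtr offset) {
    assert(offset % sizeof(float) == 0);  // floats must stay 4-byte aligned
    return reinterpret_cast<float*>(static_cast<unsigned char*>(base) + offset);
}

// Option 2: resolve once at load time and store plain float pointers.
// Field names are illustrative, not the real struct members.
struct WeightsPtrs {
    float* token_embedding;
    float* rms_att_weight;
};

static WeightsPtrs bind(void* base, REUPtr emb_off, REUPtr rms_off) {
    WeightsPtrs w;
    w.token_embedding = resolve(base, emb_off);
    w.rms_att_weight = resolve(base, rms_off);
    return w;
}
```

Keeping offsets leaves the struct layout identical to the C64 version, while plain float* fields avoid one addition per access and make the inference code read like ordinary array code.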

nnet.cpp vs nnet64.c

This file is the "brain" of the inference. It contains the implementations of the Transformer algorithms, such as matrix multiplication, normalization, and attention. Most of the heavy computational work occurs here. In my case, I removed the REU_getf and REU_putf calls and replaced them with direct memory access on the PSP.
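
To illustrate the kind of change involved, here is a sketch of a matrix-vector product written against ordinary pointers. The signature and the row-major layout are assumptions; the real nnet.cpp functions and the exact signatures of REU_getf and REU_putf are not shown in this document.

```cpp
#include <cassert>

// Matrix-vector product out = W * x, with W assumed to be stored row-major
// as d rows of n floats. Where the C64 version would fetch each weight
// through the REU, w here is an ordinary pointer into the weights block
// and is indexed directly.
static void matmul(float* out, const float* x, const float* w, int n, int d) {
    assert(out && x && w && n > 0 && d > 0);
    for (int i = 0; i < d; ++i) {
        const float* row = w + i * n;  // start of row i
        float acc = 0.0f;
        for (int j = 0; j < n; ++j) {
            acc += row[j] * x[j];
        }
        out[i] = acc;
    }
}
```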

math.c (from the c64 version)

The PSP port does not need it; the standard <math.h> is used instead.

generate.cpp vs generatec64.c

generate is the main loop that produces the text, token by token. It orchestrates the text generation process along with sampler64.c (or sampler.cpp).
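
The shape of such a loop can be sketched as follows. Everything here is hypothetical: forward stands in for the transformer step, sample for the sampler, and the stop-on-end-token rule is an assumption rather than a description of generate.cpp.

```cpp
#include <cassert>
#include <vector>

// Hypothetical generation loop. `forward` maps (token, position) to logits
// and `sample` maps logits to the next token id.
template <typename Forward, typename Sample>
std::vector<int> generate(Forward forward, Sample sample,
                          int first_token, int steps, int eos_token) {
    assert(steps >= 0);
    std::vector<int> out;
    int token = first_token;
    for (int pos = 0; pos < steps; ++pos) {
        const std::vector<float> logits = forward(token, pos);
        token = sample(logits);
        if (token == eos_token) break;  // assumed stop condition
        out.push_back(token);
    }
    return out;
}

// Greedy sampler used for illustration: index of the largest logit.
static int argmax(const std::vector<float>& logits) {
    assert(!logits.empty());
    int best = 0;
    for (int i = 1; i < static_cast<int>(logits.size()); ++i) {
        if (logits[i] > logits[best]) best = i;
    }
    return best;
}
```

A real sampler would usually add temperature or top-p sampling in place of argmax; the loop itself stays the same.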