ZigFormer Roadmap

This document outlines the features implemented in ZigFormer and the future goals for the project.

Important

This roadmap is a work in progress and is subject to change.

Core Architecture

  • Tokenization (word-based)
  • Vocabulary building
  • Embedding layer (token + positional)
  • Multi-head self-attention
  • Feed-forward network
  • Layer normalization
  • Residual connections
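As a point of reference for the attention item above, here is a minimal single-head sketch of scaled dot-product attention in plain Python. It is not taken from ZigFormer's source: the function name `attention`, the `causal` flag, and the list-of-lists layout are assumptions made for illustration, and a multi-head layer would run several such heads on projected inputs and concatenate the results.

```python
import math


def attention(q, k, v, causal=False):
    """Single-head scaled dot-product attention over lists of row vectors.

    q, k: lists of vectors with the same dimension d.
    v: list of value vectors, one per key.
    causal: if True, position i only attends to positions 0..i.
    """
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        limit = i + 1 if causal else len(k)
        # Dot product of the query with each visible key, scaled by sqrt(d).
        scores = [
            sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(d)
            for j in range(limit)
        ]
        # Numerically stable softmax over the visible positions.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        weights = [w / z for w in weights]
        # Weighted sum of the value vectors.
        out.append([
            sum(w * v[j][c] for j, w in enumerate(weights))
            for c in range(len(v[0]))
        ])
    return out
```

With identical keys every visible position gets the same weight, so each output row is simply the mean of the value vectors it is allowed to see.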

Training

  • Optimizer (Adam)
  • Gradient clipping
  • Cross-entropy loss
  • Training loop (pretraining and fine-tuning)
  • Learning rate scheduling
  • Model checkpointing (save and load)
  • Mini-batch training
  • Gradient accumulation
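Two of the items above, gradient clipping and gradient accumulation, can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not ZigFormer's implementation: gradients are modelled as flat lists of floats, clipping is done by global L2 norm (one common variant), and the helper names `clip_by_global_norm` and `accumulate` are hypothetical.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale grads so their L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the original norm.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads), norm
    scale = max_norm / norm
    return [g * scale for g in grads], norm


def accumulate(micro_batch_grads):
    """Average gradients from several micro-batches into one update.

    Dividing each contribution by the number of micro-batches makes the
    result match the gradient of one large batch of equal-sized parts.
    """
    n = len(micro_batch_grads)
    summed = [0.0] * len(micro_batch_grads[0])
    for grads in micro_batch_grads:
        for i, g in enumerate(grads):
            summed[i] += g / n
    return summed
```

A training step would typically accumulate over several micro-batches, clip the averaged gradient, and only then hand it to the optimizer.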

Inference

  • Greedy decoding
  • KV caching
  • Top-k and top-p sampling
  • Beam search
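To make the sampling item above concrete, the following is a small sketch of top-k and top-p (nucleus) filtering followed by sampling, written in plain Python. It is not ZigFormer's code; the function names, the convention that `top_k=0` disables the top-k filter, and the choice to apply top-k before top-p are all assumptions for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(logits, top_k=0, top_p=1.0):
    """Return {token_id: probability} after top-k and top-p filtering.

    top_k=0 keeps all tokens; top_p=1.0 keeps the full distribution.
    """
    probs = softmax(logits)
    # Token ids ordered from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]
    # Keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample(logits, top_k=0, top_p=1.0, rng=random):
    """Draw one token id from the filtered distribution."""
    dist = filter_top_k_top_p(logits, top_k, top_p)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]
```

Setting `top_k=1` reduces this to greedy decoding, since only the most probable token survives the filter.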

Usability

  • Command line interface
  • Multi-threading support
  • SIMD optimizations
  • Model loading from a checkpoint
  • Configuration file
  • Improved error handling and validation

GUI

  • Web server
  • Web interface
  • Model loading from a checkpoint
  • Configuration file
  • Improved error handling and validation
  • Markdown rendering and syntax highlighting
  • Interactive sampling controls (Top-k and Top-p)
  • Dark and light mode toggle
  • Model statistics display