dankit

Modern applied AI/ML
Projects grounded in practicality for today's world: multi-modal models, fine-tuning, agents, RAG, and other useful AI/ML tooling.

Currently working on reinforcement learning in unique and challenging domains. Ideas include vision-language models and CAPTCHA solving.

lambda_cloud MCP — View Lambda Cloud GPU capacity in near real time: stock, alerts, optional auto-launch (“Snipe”), and instance listing and termination. Built because GPU availability was the main bottleneck for months. Supports MCP so that users can agentically orchestrate the entire ML training lifecycle via text message, staying productive and in control even when they are away from the computer.

discord_style_sft — Multi-turn supervised fine-tuning on raw, noisy personal conversational data, with a focus on high-quality dialogue SFT that instills writing style, tone, and behavior. Uses Unsloth's training library for fused MoE kernels and LoRA on qwen3.5-35b-a3b, a training setup based on model interpretability research papers and my own layer/module probing via gradient analysis, vLLM for efficient inference and parallelized multi-modal evals, and more.

Legal Retrieval Augmented Generation — Locally hosted models for embeddings and ranking, Chroma and Elasticsearch, reciprocal rank fusion for hybrid retrieval, agentic search, and a chatbot layer. Reranker fine-tuning on Google Cloud Platform (Kubernetes Engine and Compute Engine) with ephemeral/spot GPU considerations, including the training infrastructure code. High performance on evals across the board: millions of embeddings and 250,000+ PDF pages from real-world data such as the US Code of Federal Regulations. 3,000 downloads on Hugging Face and growing.
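Reciprocal rank fusion, mentioned above for hybrid retrieval, can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function name, the document IDs, and the smoothing constant `k=60` (a commonly used default) are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is an assumed default, not a project setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense (embedding) retriever and a lexical one.
dense_hits = ["cfr-12", "cfr-7", "cfr-3"]
lexical_hits = ["cfr-7", "cfr-9", "cfr-12"]
fused = reciprocal_rank_fusion([dense_hits, lexical_hits])
```

Documents that rank well in both lists rise to the top, and no score calibration between the two retrievers is needed because only ranks are used.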

LLM VRAM calculator — Heavily vibe coded, but useful for telling whether training or inference will OOM. Contains interactive visuals showing how different parameters affect VRAM: sequence length, batch size, gradient checkpointing, LoRA, optimizers, float precisions, and model weights. Refactor planned.
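As a rough idea of the arithmetic such a calculator performs, here is a back-of-the-envelope sketch. It is not the calculator's actual formula: the function name and default byte counts are assumptions (16-bit weights and gradients, two fp32 Adam moments per trainable parameter), and it ignores activations, KV cache, and framework overhead, which depend on sequence length and batch size.

```python
def estimate_training_vram_gib(
    n_params,
    weight_bytes=2,          # assumed bf16/fp16 weights
    grad_bytes=2,            # assumed gradients in the same precision
    optimizer_bytes=8,       # assumed Adam: two fp32 moments per trainable parameter
    trainable_fraction=1.0,  # values below 1.0 approximate LoRA-style training
):
    """Estimate VRAM (GiB) for weights, gradients, and optimizer state only."""
    trainable = n_params * trainable_fraction
    total_bytes = n_params * weight_bytes + trainable * (grad_bytes + optimizer_bytes)
    return total_bytes / 2**30


# Hypothetical 8B-parameter model: full fine-tuning vs. training ~1% of the parameters.
full_gib = estimate_training_vram_gib(8e9)
lora_gib = estimate_training_vram_gib(8e9, trainable_fraction=0.01)
```

Under these assumptions, most of the memory in full fine-tuning goes to gradients and optimizer state rather than the weights themselves, which is why shrinking the trainable fraction helps so much.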

Foundations: applied ML theory & history

I spent a long stretch going deep on fundamentals, from math and classical ML to modern deep learning, with hands-on implementations. The main idea was to apply what I learned from research papers along the way.

language-model-pretraining — Roughly 450M-parameter modern dense transformer, trained on the order of 10B tokens; distributed data parallel on 8×A100 GPUs, my own training loops with checkpointing and model loading, PyTorch transformer implementation with SwiGLU, RMSNorm, Grouped Query Attention (GQA), RoPE, etc.
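Two of the components named above, RMSNorm and the SwiGLU feed-forward block, are small enough to sketch without a framework. The project itself uses PyTorch; this plain-Python version is only an illustration, and the helper names, the `eps` value, and the toy weights are assumptions.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Divide x by its root mean square, then apply a learned per-feature gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(matrix, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


# Toy inputs: unit gain for the norm, identity gate/up projections, summing down projection.
normed = rms_norm([3.0, 4.0], weight=[1.0, 1.0])
identity = [[1.0, 0.0], [0.0, 1.0]]
ffn_out = swiglu_ffn([0.0, 1.0], identity, identity, [[1.0, 1.0]])
```

Unlike LayerNorm, RMSNorm does not subtract the mean, so it needs only one statistic per token; SwiGLU replaces the single up-projection of a classic feed-forward block with a gated pair.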

llama_3.1_8b_base_sft — Instruction tuning the Llama 3.1 8B base checkpoint with LoRA, quantization exploration, complete with evals across the board (tinyMMLU, IFEval).
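To see why LoRA makes this kind of tuning cheap, it helps to count parameters. The sketch below is illustrative only: the 4096 x 4096 projection and rank 16 are assumed example values, not the settings used in this project.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns the update as B @ A, with A of shape (rank, d_in)
    and B of shape (d_out, rank).
    """
    return rank * d_in + d_out * rank


# Assumed example: a square 4096 x 4096 projection adapted at rank 16.
full_params = 4096 * 4096
adapter_params = lora_param_count(4096, 4096, rank=16)
trainable_ratio = adapter_params / full_params
```

For this example the adapter trains under 1% of the parameters of the matrix it modifies, while the frozen base weights can stay in a quantized format.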

Attention is all you need — After learning classical ML theory, worked up to understanding and implementing the original Transformer paper in PyTorch: sinusoidal positional encodings, LayerNorm, and different flavors of transformers (encoder-only, decoder-only, encoder-decoder). Along the way, picked up the statistics behind LayerNorm/RMSNorm and the geometry/trigonometry behind positional encodings, from sinusoidal to, eventually, RoPE.
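The sinusoidal positional encoding from that paper can be written out directly from its definition, PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). This is a minimal plain-Python sketch rather than the PyTorch implementation in the repository; the function name and the tiny dimensions are assumptions.

```python
import math


def sinusoidal_positional_encoding(num_positions, d_model):
    """Return a num_positions x d_model table of sinusoidal position encodings."""
    table = [[0.0] * d_model for _ in range(num_positions)]
    for pos in range(num_positions):
        # i walks over the even feature indices; each sin/cos pair shares one frequency.
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            table[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                table[pos][i + 1] = math.cos(angle)
    return table


# Tiny example: 3 positions, 4 features per position.
pe = sinusoidal_positional_encoding(num_positions=3, d_model=4)
```

Each sin/cos pair traces a circle at its own frequency, so moving by a fixed offset in position corresponds to a fixed rotation of that pair. This is the geometric idea that RoPE later applies directly to queries and keys.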

Other — The RAG project above touches on encoder-only transformers and their applications: from the original BERT model to variants (e.g. XLM-RoBERTa) for embeddings and reranking, plus how they're trained.
