Research papers for software engineers to transition to AI Engineering
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- IMAGEBIND: One Embedding Space To Bind Them All
- SONAR: Sentence-Level Multimodal and Language-Agnostic Representations
- FAISS library
- Facebook Large Concept Models
- Attention is All You Need
- FlashAttention
- Multi Query Attention
- Grouped Query Attention
- Google Titans outperform Transformers
- VideoRoPE: Rotary Position Embedding
## RLHF
- Deep Reinforcement Learning with Human Feedback
- Fine-Tuning Language Models with RLHF
- Training language models with RLHF
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
- Chain of thought
- Demystifying Long Chain-of-Thought Reasoning in LLMs
- Transformer Reasoning Capabilities
- Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
- Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
- Training Large Language Models to Reason in a Continuous Latent Space
- DeepSeek R1
- A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods
- Latent Reasoning: A Recurrent Depth Approach
- Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
- FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
- ByteDance 1.58
- Transformer Square
- Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
- 1B outperforms 405B
- Speculative Decoding
- RWKV: Reinventing RNNs for the Transformer Era
- Mamba
- Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
- Distilling Transformers to SSMs
- LoLCATs: On Low-Rank Linearizing of Large Language Models
- Think Slow, Fast
- Can AI be made to think critically
- Evolving Deeper LLM Thinking
- LLMs Can Easily Learn to Reason from Demonstrations: Structure, not content, is what matters!
- ViViT: A Video Vision Transformer
- Joint Embedding abstractions with self-supervised video masks
- Facebook VideoJAM (AI video generation)
- Automated Unit Test Improvement using Large Language Models at Meta
- Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering
- OpenAI o1 System Card
- LLM-powered bug catchers
- Chain-of-Retrieval Augmented Generation
- Swiggy Search
- Swarm by OpenAI
- Netflix Foundation Models
- Model Context Protocol
- Uber QueryGPT
I manage my lists here: https://interviewready.io/resources/