genai-study

Collection of papers and readings on LLMs, prompting, agents, alignment, and training.

Index (files + mini-summary)

Note: the summaries are meant to be "tweet-sized" (or slightly longer). Where authors or affiliation cannot be inferred with certainty from the filename, I mark them as "to be verified".

File	Title	Authors / Organization	Summary
Artificial Hivemind- The Open-Ended Homogeneity of Language Models (and Beyond) - Carniege:Washington - 2510.22954v1.pdf	Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)	Carnegie Mellon + University of Washington	Examines why many LLMs converge on "homogeneous" (similar) outputs for open-ended prompts and what that implies for diversity, innovation, and evaluation; discusses candidate mechanisms and possible ways to escape that convergence.
Attention is all you need - Google - 1706.03762v7.pdf	Attention Is All You Need	Vaswani et al. (Google)	Introduces the Transformer: replaces recurrence and convolutions with self-attention, which parallelizes training; enables larger models and improves translation and other sequence tasks.
AutoGen- Enabling Next-Gen LLM Applications via Multi-Agent Conversation - Microsoft - 2308.08155v2.pdf	AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation	Wu et al. (Microsoft Research)	Proposes a framework for multi-agent LLM systems (roles, tools, orchestration, and feedback) aimed at complex tasks; shows that splitting work across specialized agents improves quality and control.
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models - Google - 2201.11903v6.pdf	Chain-of-Thought Prompting Elicits Reasoning in Large Language Models	Wei et al. (Google)	Shows that prompting with worked reasoning steps (CoT) greatly improves performance on multi-step tasks from only a few examples, especially in large models; the generated steps also allow partial inspection of how an answer was reached.
Constitutional AI - Anthropic - 2212.08073v1.pdf	Constitutional AI: Harmlessness from AI Feedback	Bai et al. (Anthropic)	Trains aligned models with less reliance on human harmlessness labels: uses a "constitution" (a set of principles) to have the model critique and revise its own responses, then applies RL from AI feedback; aims for greater safety and robustness.
Constitutional AI_ Harmlessness from AI Feedback _ Anthropic.mhtml	Constitutional AI (post/note)	Anthropic	Complementary material (web format) on the Constitutional AI approach: principles that guide revisions and safety preferences, and how AI-generated feedback is used for training.
Discovering Latent Knowledge in Language Models - 2212.03827v2.pdf	Discovering Latent Knowledge in Language Models	Burns et al. (UC Berkeley + Peking University)	Investigates how to extract "latent knowledge" that the model appears to hold but does not always state; proposes an unsupervised probe over internal activations (Contrast-Consistent Search) and discusses tensions between truthfulness, calibration, and alignment.
Do large language models have a theory of mind.pdf	Do Large Language Models Have a Theory of Mind?	Authors (to be verified)	Evaluates whether LLMs solve "false belief" tasks and other theory-of-mind tests; analyzes whether performance reflects reasoning about mental states or shortcuts learned from dataset patterns.
Efficient Streaming Language Models with Attention Sinks - Meta:Nvidia - 2309.17453v4.pdf	Efficient Streaming Language Models with Attention Sinks	Xiao et al. (MIT + Meta AI + CMU + NVIDIA)	Identifies "attention sinks" (initial tokens that absorb a large share of attention) and keeps them in the KV cache alongside a sliding window to stabilize streaming autoregressive generation; maintains quality on long streams and avoids the degradation that appears when the window slides past the first tokens.
Generative Agents- Interactive Simulacra of Human Behavior - Google - 2304.03442v2.pdf	Generative Agents: Interactive Simulacra of Human Behavior	Park et al. (Stanford + Google)	Presents human-like agents with memory, reflection, and planning in a social sandbox; shows how routines, coordination, and narratives emerge from an LLM combined with memory and scheduling structures.
Grokking- Generalization Beyond Overfitting on Small Algorithmic Datasets - OpenAI : Google - 2201.02177v1.pdf	Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets	Power et al. (OpenAI + Google)	Describes "grokking": networks that first memorize (overfit) and, after much longer training, abruptly generalize on small algorithmic tasks; connects the phenomenon to regularization and optimization dynamics.
HyperConnections - ByteDance - 2409.19606v3.pdf	Hyper-Connections	Zhu et al. (ByteDance)	Proposes "hyper-connections", an alternative to residual connections with learnable connection strengths, to train deep networks more stably and efficiently; targets better gradient flow and performance at scale, and is useful as an architecture idea or training recipe.
Incentivizing Reasoning Capability in LLMs via Reinforcement Learning - Deepseek - 2501.12948v2.pdf	DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning	DeepSeek	Explores using RL rewards to push the model to solve tasks through multi-step reasoning (and to produce better reasoning traces); improves results on reasoning benchmarks while weighing cost and output-length trade-offs.
Large Language Diffusion Models (LLaDA) - 2502.09992v3.pdf	Large Language Diffusion Models (LLaDA)	Nie et al. (Renmin University of China + Ant Group)	Investigates diffusion applied to language (generation by iterative denoising of masked tokens rather than purely autoregressive decoding); discusses advantages and drawbacks in parallelization, control, and quality, and how to adapt the approach to text.
mHC - DeepSeek - 2512.24880v2.pdf	mHC	DeepSeek	Recent DeepSeek paper on a technique called "mHC" for improving training/inference and/or reasoning; useful for comparing modern scaling recipes. (Fine details to be verified.)
Muon is scalable for LLMs - Moonshot : Kimi : UCLA - 2502.16982v1.pdf	Muon is Scalable for LLM Training	Moonshot (Kimi) + UCLA	Shows how to scale the Muon optimizer to LLM training with good stability and throughput; discusses why it scales and in which regimes it outperforms alternatives such as AdamW.
On the Dangers of Stochastic Parrots - 3442188.3445922.pdf	On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?	Bender et al. (University of Washington + Google)	Critiques the risks of ever-larger LLMs: environmental and financial costs, biases encoded in huge uncurated datasets, fluent text being mistaken for understanding, and lack of accountability; proposes responsible practices and limits to "bigger is better".
Qwen3 Technical Report - 2505.09388v1.pdf	Qwen3 Technical Report	Qwen Team (Alibaba)	Technical report for the Qwen3 family: architecture, data/training, and evaluation; useful as a reference on scaling and quality-vs-cost trade-offs in open models.
Qwen3 vl Embedding Technical Report.pdf	Qwen3-VL Embedding Technical Report	Qwen Team (Alibaba)	Report on vision-language embeddings in the Qwen stack; describes objectives, training setup, and how the embeddings are evaluated and used for retrieval, grounding, and multimodal tasks.
REAC T- SYNERGIZING REASONING AND ACTING IN LANGUAGE MODELS - Google - 2210.03629v3.pdf	ReAct: Synergizing Reasoning and Acting in Language Models	Yao et al. (Princeton + Google Research)	Interleaves "thoughts" and "actions" (tool or environment use) to solve tasks: reasoning guides the actions, and observations correct the plan; improves interpretability and performance versus CoT alone or tool use alone.
Reasoning with Language Model is Planning with World Model - UCSC : UFL - 2305.14992v2.pdf	Reasoning with Language Model is Planning with World Model	Hao et al. (UC San Diego + University of Florida)	Frames LLM "reasoning" as planning over a world model: the LLM serves as both agent and world model, and reasoning steps are found by searching over trajectories (Monte Carlo Tree Search); helps explain why planning prompts work and where they fail.
RWKV- Reinventing RNNs for the Transformer Era - 2305.13048v2.pdf	RWKV: Reinventing RNNs for the Transformer Era	Peng et al. (affiliations to be verified)	Proposes RWKV, an RNN-style architecture that aims to match Transformer quality while offering more efficient, streaming-friendly inference; interesting for latency, memory, and long contexts.
Scaling Monosemanticity_ Extracting Interpretable Features from Claude 3 Sonnet.mhtml	Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet	Anthropic	Interpretability post/article: extracts more monosemantic internal "features" (one idea ≈ one feature) and studies how this scales to large models; aims to understand circuits and reduce opacity.
SCALING REINFORCEMENT LEARNING WITH LLMS - Kimi : Moonshot - 2501.12599v4.pdf	Kimi k1.5: Scaling Reinforcement Learning with LLMs	Moonshot (Kimi)	Discusses how to scale RL applied to LLMs (reward signals, stability, infrastructure, and evaluation) to improve reasoning and capabilities; useful as a post-SFT training playbook.
SELF-CONSISTENCY IMPROVES CHAIN OF THOUGHT REASONING IN LANGUAGE MODELS - Google - 2203.11171v4.pdf	Self-Consistency Improves Chain of Thought Reasoning in Language Models	Wang et al. (Google)	Instead of taking a single CoT chain, samples several and takes a majority vote over their final answers; improves reasoning accuracy without changing the model (at the price of higher inference cost).
Switch Transformers- Scaling to Trillion Parameter Models with Simple and Efficient Sparsity - Google - 2101.03961v3.pdf	Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity	Fedus et al. (Google)	Presents the "Switch" MoE: routes each token to a single expert to build enormous models at a controlled compute cost; achieves good scaling and training/inference efficiency compared with dense models.
Toolformer- Language Models Can Teach Themselves to Use Tools - Meta - 2302.04761v1.pdf	Toolformer: Language Models Can Teach Themselves to Use Tools	Schick et al. (Meta)	Teaches the model to call tools (search engine, calculator, etc.) using self-generated data: inserts candidate calls and keeps only those that improve the likelihood of the following text; increases factuality and capability without complex RL.
Tree of Thoughts- Deliberate Problem Solving with Large Language Models - Deepmind - 2305.10601v2.pdf	Tree of Thoughts: Deliberate Problem Solving with Large Language Models	Yao et al. (Princeton + Google DeepMind)	Generalizes CoT to search: generates multiple "thoughts" as nodes and explores a tree using evaluation heuristics; improves results on puzzles and planning by allowing backtracking and systematic exploration.
README.md	genai-study (README)	-	Curated index with links and short summaries of the readings in the repo.
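
As a reading aid for the self-consistency entry in the index, the "sample several chains and vote" idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: `self_consistent_answer` and the `sample_chain` callable are hypothetical names, and the sampler here is a hard-coded stub standing in for a real LLM call with temperature above zero.

```python
from collections import Counter
from typing import Callable, Optional, Tuple

# Hypothetical sampler type: each call returns (reasoning_text, final_answer).
Sampler = Callable[[], Tuple[str, str]]


def self_consistent_answer(sample_chain: Sampler, n_samples: int = 5) -> Optional[str]:
    """Sample n reasoning chains and return the most frequent final answer.

    Only the final answers are compared; the reasoning text is discarded.
    Ties are broken by whichever answer was sampled first.
    """
    answers = [sample_chain()[1] for _ in range(n_samples)]
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


# Stub sampler (assumption): replays fixed chains instead of calling a model.
_fake_chains = iter([
    ("3 + 5 = 8, then 8 + 10 = 18", "18"),
    ("3 + 5 = 9, then 9 + 8 = 17", "17"),
    ("10 + 8 = 18", "18"),
    ("5 + 3 + 10 = 18", "18"),
    ("3 * 5 + 5 = 20", "20"),
])

voted = self_consistent_answer(lambda: next(_fake_chains), n_samples=5)
print(voted)
```

With a real model, `sample_chain` would issue one sampled completion per call and parse out the final answer; inference cost grows linearly with `n_samples`, which is the trade-off the summary mentions.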

How to use

Open any file from the index and take notes in the README or in your favorite note-taking system.
If you like, I can convert this into tags (prompting / agents / RL / architecture / interpretability) and add filters by topic.

Faridmurzone/genai-study
