Grok Papers

AI research papers, explained as slides

Built with the same AI that research labs use to turn complex data into presentations. Linedot →

32 papers

Expertise level:
Simplified overview for general audiences
★ Hall of Fame

DeepSeek-V3 Technical Report

DeepSeek-AI·Dec 2024·420 citations

The Efficiency Manifesto. Introduced Multi-Head Latent Attention (MLA) and DeepSeekMoE, proving GPT-4 class models can be trained for $5.5M.

★ Hall of Fame

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Gu, A., Dao, T.·Dec 2023·1.5k citations

The Transformer Challenger. Proposed a modern State Space Model (SSM) architecture that offers linear scaling, influencing new "hybrid" architectures.
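The "selective" idea can be sketched in a few lines: the input gates how much state is kept, and the whole sequence is processed in one linear pass. This is a toy 1-D illustration, not the paper's hardware-aware parallel scan; all parameter choices are made up for clarity.

```python
import math

def selective_ssm(xs, a=0.9):
    """Toy selective state-space scan: an input-dependent gate decides
    whether to keep or overwrite the hidden state (the 'selection'
    mechanism), and the scan is a single O(n) pass over the sequence,
    unlike attention's O(n^2) pairwise interactions."""
    h, ys = 0.0, []
    for x in xs:
        gate = 1.0 / (1.0 + math.exp(-x))    # input-dependent selection
        h = (a * (1.0 - gate)) * h + gate * x  # keep vs. overwrite state
        ys.append(h)
    return ys
```

Because each step only touches the running state `h`, memory and compute grow linearly with sequence length.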

★ Hall of Fame

Direct Preference Optimization (DPO)

Rafailov, R., Sharma, A., Mitchell, E. et al.·May 2023·2.8k citations

Killed PPO. Simplified alignment by showing mathematically that the reward model is implicit in the policy, so you can optimize for human preferences directly without training a separate reward model.
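The resulting loss is a single line: a logistic loss on the policy's preference margin over a frozen reference model. A scalar sketch for one preference pair (the real objective averages over a dataset of token-level log-probabilities):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) pair. The log-probs come
    from the trained policy and a frozen reference model; no reward
    model or PPO rollout loop is needed."""
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))  # -log sigmoid
```

The loss shrinks as the policy shifts probability mass toward the chosen response relative to the reference; `beta` controls how far it may drift.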

★ Hall of Fame

QLoRA: Efficient Finetuning of Quantized LLMs

Dettmers, T., Pagnoni, A., Holtzman, A. et al.·May 2023·2.2k citations

The Democratizer. Combined 4-bit quantization with LoRA, allowing anyone to finetune a 65B parameter model on a single 48GB GPU.
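The core combination fits in a few lines: the base weights are frozen in 4-bit form, and only small low-rank adapter parameters are trained. This sketch uses naive uniform rounding and a rank-1 adapter purely for illustration; the paper's NF4 quantization and paged optimizers are much more involved.

```python
def quantize_4bit(w, scale=None):
    """Toy uniform 4-bit quantization: round each weight to one of
    ~16 levels (the paper uses the NF4 data type instead)."""
    scale = scale or max(abs(v) for v in w)
    return [round(v / scale * 7) / 7 * scale for v in w]

def qlora_forward(x, w_q, a, b, alpha=1.0):
    """y = x . W_frozen_4bit + alpha * (x . a) * b, a rank-1 LoRA
    update on scalars for clarity. Only a and b receive gradients;
    the quantized base weights w_q are never updated."""
    base = sum(xi * wi for xi, wi in zip(x, w_q))
    lora = sum(xi * ai for xi, ai in zip(x, a)) * b
    return base + alpha * lora
```

The memory win comes from storing the big matrix `w_q` at 4 bits per weight while the trainable adapter stays tiny.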

★ Hall of Fame

Voyager: An Open-Ended Embodied Agent with Large Language Models

Wang, G., Xie, Y., Jiang, Y. et al.·May 2023·950 citations

The Agent Blueprint. One of the first papers to successfully use an LLM to write code, execute it in Minecraft, fail, and self-correct via a feedback loop.
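That feedback loop is simple to express in outline. In this schematic, `generate_code` and `execute` are hypothetical callables standing in for the LLM and the Minecraft environment; the real system also maintains a skill library and an automatic curriculum.

```python
def self_correcting_agent(generate_code, execute, max_tries=3):
    """Voyager-style loop (schematic): ask the model for code, run it,
    and feed any error message back into the next generation attempt."""
    feedback = ""
    for _ in range(max_tries):
        code = generate_code(feedback)
        ok, result = execute(code)
        if ok:
            return result
        feedback = result  # the error becomes context for the next prompt
    return None
```

The key design choice is that execution errors are treated as data: each failure narrows the next attempt instead of ending the episode.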

★ Hall of Fame

Segment Anything (SAM)

Kirillov, A., Mintun, E., Ravi, N. et al.·Apr 2023·4.2k citations

Meta's promptable foundation model for image segmentation that generalizes zero-shot to unseen objects and images.

★ Hall of Fame

LLaMA: Open and Efficient Foundation Language Models

Touvron, H., Lavril, T., Izacard, G. et al.·Feb 2023·8.5k citations

Meta's release that kickstarted the open-source LLM race by proving smaller, better-trained models can rival giants.

★ Hall of Fame

Adding Conditional Control to Text-to-Image Diffusion Models (ControlNet)

Zhang, L., Rao, A., Agrawala, M.·Feb 2023·3.8k citations

Allowed precise structural control (edges, pose, depth) over diffusion generation.