Arxiv Papers
[QA] Cascade: Token-Sharded Private LLM Inference
7 mins; July 08, 2025
Cascade: Token-Sharded Private LLM Inference
35 mins; July 08, 2025
[QA] Real-TabPFN: Improving Tabular Foundation Models via Continued Pre-training With Real-World Data
7 mins; July 08, 2025
Real-TabPFN: Improving Tabular Foundation Models via Continued Pre-training With Real-World Data
10 mins; July 08, 2025
[QA] Strategic Intelligence in Large Language Models: Evidence from Evolutionary Game Theory
7 mins; July 07, 2025
Strategic Intelligence in Large Language Models: Evidence from Evolutionary Game Theory
34 mins; July 07, 2025
[QA] Fast and Simplex: 2-Simplicial Attention in Triton
7 mins; July 07, 2025
Fast and Simplex: 2-Simplicial Attention in Triton
17 mins; July 07, 2025
[QA] Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
7 mins; July 01, 2025
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
15 mins; July 01, 2025
[QA] DABstep: Data Agent Benchmark for Multi-step Reasoning
7 mins; July 01, 2025
DABstep: Data Agent Benchmark for Multi-step Reasoning
16 mins; July 01, 2025
[QA] Aha Moment Revisited: Are VLMs Truly Capable of Self-Verification in Inference-time Scaling?
8 mins; June 30, 2025
Aha Moment Revisited: Are VLMs Truly Capable of Self-Verification in Inference-time Scaling?
16 mins; June 30, 2025
[QA] LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs
8 mins; June 29, 2025
LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs
14 mins; June 29, 2025
[QA] Performance Prediction for Large Systems via Text-to-Text Regression
8 mins; June 29, 2025
Performance Prediction for Large Systems via Text-to-Text Regression
20 mins; June 29, 2025
[QA] From Memories to Maps: Mechanisms of In-Context Reinforcement Learning in Transformers
7 mins; June 29, 2025
From Memories to Maps: Mechanisms of In-Context Reinforcement Learning in Transformers
20 mins; June 29, 2025
[QA] OmniGen2: Exploration to Advanced Multimodal Generation
7 mins; June 29, 2025
OmniGen2: Exploration to Advanced Multimodal Generation
32 mins; June 29, 2025
[QA] OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling
7 mins; June 27, 2025
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling
25 mins; June 27, 2025
[QA] Potemkin Understanding in Large Language Models
8 mins; June 27, 2025
Potemkin Understanding in Large Language Models
17 mins; June 27, 2025
[QA] Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test
7 mins; June 26, 2025
Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test
18 mins; June 26, 2025
[QA] MMSearch-R1: Incentivizing LMMs to Search
8 mins; June 26, 2025
MMSearch-R1: Incentivizing LMMs to Search
18 mins; June 26, 2025
[QA] Thought Anchors: Which LLM Reasoning Steps Matter?
7 mins; June 25, 2025
Thought Anchors: Which LLM Reasoning Steps Matter?
15 mins; June 25, 2025
[QA] Scaling Speculative Decoding with LOOKAHEAD REASONING
8 mins; June 25, 2025
Scaling Speculative Decoding with LOOKAHEAD REASONING
22 mins; June 25, 2025
[QA] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
7 mins; June 23, 2025
Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
16 mins; June 23, 2025
[QA] Watermarking Autoregressive Image Generation
7 mins; June 22, 2025
Watermarking Autoregressive Image Generation
27 mins; June 22, 2025
[QA] Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
6 mins; June 22, 2025
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
11 mins; June 22, 2025
[QA] Flat Channels to Infinity in Neural Loss Landscapes
7 mins; June 21, 2025
Flat Channels to Infinity in Neural Loss Landscapes
15 mins; June 21, 2025
[QA] Approximating Language Model Training Data from Weights
7 mins; June 21, 2025
Approximating Language Model Training Data from Weights
21 mins; June 21, 2025
[QA] GenRecal: Generation after Recalibration from Large to Small Vision-Language Models
7 mins; June 19, 2025
GenRecal: Generation after Recalibration from Large to Small Vision-Language Models
17 mins; June 19, 2025
[QA] ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs
8 mins; June 19, 2025
ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs
12 mins; June 19, 2025
[QA] Sampling from Your Language Model One Byte at a Time
7 mins; June 17, 2025
Sampling from Your Language Model One Byte at a Time
13 mins; June 17, 2025
[QA] Don't throw the baby out with the bathwater: How and why deep learning for ARC
7 mins; June 17, 2025
Don't throw the baby out with the bathwater: How and why deep learning for ARC
32 mins; June 17, 2025
[QA] What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers
7 mins; June 16, 2025
What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers
19 mins; June 16, 2025
[QA] MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
8 mins; June 16, 2025
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
25 mins; June 16, 2025
[QA] Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation
8 mins; June 16, 2025
Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation
16 mins; June 16, 2025
[QA] TreeRL: LLM Reinforcement Learning with On-Policy Tree Search
7 mins; June 16, 2025
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search
19 mins; June 16, 2025
[QA] Solving Inequality Proofs with Large Language Models
8 mins; June 13, 2025
Solving Inequality Proofs with Large Language Models
23 mins; June 13, 2025
[QA] Reinforcement Learning Teachers of Test Time Scaling
7 mins; June 13, 2025
Reinforcement Learning Teachers of Test Time Scaling
22 mins; June 13, 2025
[QA] Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers
7 mins; June 12, 2025
Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers
18 mins; June 12, 2025
[QA] Spurious Rewards: Rethinking Training Signals in RLVR
7 mins; June 12, 2025
Spurious Rewards: Rethinking Training Signals in RLVR
30 mins; June 12, 2025
[QA] Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
8 mins; June 11, 2025
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
24 mins; June 11, 2025
[QA] Reinforcement Pre-Training
7 mins; June 11, 2025
Reinforcement Pre-Training
11 mins; June 11, 2025
[QA] Corrector Sampling in Language Models
7 mins; June 08, 2025
Corrector Sampling in Language Models
19 mins; June 08, 2025
[QA] Distillation Robustifies Unlearning
7 mins; June 08, 2025
Distillation Robustifies Unlearning
14 mins; June 08, 2025
[QA] Log-Linear Attention
7 mins; June 07, 2025
Log-Linear Attention
21 mins; June 07, 2025
[QA] Rewarding the Unlikely: Lifting GRPO Beyond Distribution Sharpening
7 mins; June 07, 2025
Rewarding the Unlikely: Lifting GRPO Beyond Distribution Sharpening
16 mins; June 07, 2025
[QA] Self-Challenging Language Model Agents
7 mins; June 07, 2025
Self-Challenging Language Model Agents
22 mins; June 07, 2025
[QA] Why Gradients Rapidly Increase Near the End of Training
7 mins; June 07, 2025
Why Gradients Rapidly Increase Near the End of Training
11 mins; June 07, 2025
[QA] GEM: Empowering LLM for both Embedding Generation and Language Understanding
7 mins; June 05, 2025
GEM: Empowering LLM for both Embedding Generation and Language Understanding
20 mins; June 05, 2025
[QA] HYPERSTEER: Activation Steering at Scale with Hypernetworks
7 mins; June 04, 2025
HYPERSTEER: Activation Steering at Scale with Hypernetworks
9 mins; June 04, 2025
[QA] Data Recipes for Reasoning Models
8 mins; June 04, 2025
Data Recipes for Reasoning Models
18 mins; June 04, 2025
[QA] Accelerating Diffusion LLMs via Adaptive Parallel Decoding
8 mins; June 04, 2025
Accelerating Diffusion LLMs via Adaptive Parallel Decoding
21 mins; June 04, 2025
[QA] Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
7 mins; June 04, 2025
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
16 mins; June 04, 2025
[QA] Esoteric Language Models
8 mins; June 03, 2025
Esoteric Language Models
34 mins; June 03, 2025
[QA] Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
8 mins; June 03, 2025
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
23 mins; June 03, 2025
[QA] ALPHAONE: Reasoning Models Thinking Slow and Fast at Test Time
7 mins; June 02, 2025
ALPHAONE: Reasoning Models Thinking Slow and Fast at Test Time
17 mins; June 02, 2025