Running out of time to catch up with new arXiv papers? We take the most impactful papers and present them as convenient podcasts. If you're a visual learner, we also offer these papers in an engaging video format. Our service fills the gap between overly brief paper summaries and time-consuming full-paper reads, giving you academic insights in a digestible, time-efficient format. Code behind this work: https://github.com/imelnyk/ArxivPapers
[QA] What Matters in Transformers? Not All Attention is Needed
8 mins; October 16, 2024
What Matters in Transformers? Not All Attention is Needed
16 mins; October 16, 2024
[QA] Language Models Encode Numbers Using Digit Representations in Base 10
7 mins; October 16, 2024
Language Models Encode Numbers Using Digit Representations in Base 10
10 mins; October 16, 2024
[QA] Don't Transform the Code, Code the Transforms: Towards Precise Code Rewriting using LLMs
7 mins; October 13, 2024
Don't Transform the Code, Code the Transforms: Towards Precise Code Rewriting using LLMs
6 mins; October 13, 2024
[QA] Do Unlearning Methods Remove Information from Language Model Weights?
8 mins; October 13, 2024
Do Unlearning Methods Remove Information from Language Model Weights?
17 mins; October 13, 2024
[QA] MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
7 mins; October 12, 2024
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
16 mins; October 12, 2024
[QA] Pixtral 12B
7 mins; October 12, 2024
Pixtral 12B
13 mins; October 12, 2024
[QA] Differential Transformer
7 mins; October 11, 2024
Differential Transformer
13 mins; October 11, 2024
[QA] GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
7 mins; October 11, 2024
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
12 mins; October 11, 2024
[QA] Efficient Dictionary Learning with Switch Sparse Autoencoders
7 mins; October 10, 2024
Efficient Dictionary Learning with Switch Sparse Autoencoders
15 mins; October 10, 2024
[QA] Visual Scratchpads: Enabling Global Reasoning in Vision
7 mins; October 10, 2024
Visual Scratchpads: Enabling Global Reasoning in Vision
26 mins; October 10, 2024
[QA] RL, but don't do anything I wouldn't do
7 mins; October 10, 2024
RL, but don't do anything I wouldn't do
16 mins; October 10, 2024
[QA] Restructuring Vector Quantization with the Rotation Trick
7 mins; October 10, 2024
Restructuring Vector Quantization with the Rotation Trick
23 mins; October 10, 2024
[QA] EnsemW2S: Can an Ensemble of LLMs be Leveraged to Obtain a Stronger LLM?
7 mins; October 07, 2024
EnsemW2S: Can an Ensemble of LLMs be Leveraged to Obtain a Stronger LLM?
18 mins; October 07, 2024
[QA] Density estimation with LLMs: a geometric investigation of in-context learning trajectories
8 mins; October 07, 2024
Density estimation with LLMs: a geometric investigation of in-context learning trajectories
12 mins; October 07, 2024
[QA] Teaching Transformers Modular Arithmetic at Scale
8 mins; October 06, 2024
Teaching Transformers Modular Arithmetic at Scale
13 mins; October 06, 2024
[QA] What Matters for Model Merging at Scale?
7 mins; October 06, 2024
What Matters for Model Merging at Scale?
24 mins; October 06, 2024
[QA] Depth Pro: Sharp Monocular Metric Depth in Less Than a Second
7 mins; October 04, 2024
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second
14 mins; October 04, 2024
[QA] Were RNNs All We Needed?
9 mins; October 04, 2024
Were RNNs All We Needed?
16 mins; October 04, 2024
[QA] OOD-CHAMELEON: Is Algorithm Selection for OOD Generalization Learnable?
8 mins; October 03, 2024
OOD-CHAMELEON: Is Algorithm Selection for OOD Generalization Learnable?
21 mins; October 03, 2024
[QA] Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
7 mins; October 03, 2024
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
18 mins; October 03, 2024
[QA] Automated Red Teaming with GOAT: the Generative Offensive Agent Tester
7 mins; October 02, 2024
Automated Red Teaming with GOAT: the Generative Offensive Agent Tester
13 mins; October 02, 2024
[QA] Not All LLM Reasoners Are Created Equal
7 mins; October 02, 2024
Not All LLM Reasoners Are Created Equal
9 mins; October 02, 2024
[QA] Law of the Weakest Link: Cross Capabilities of Large Language Models
7 mins; October 02, 2024
Law of the Weakest Link: Cross Capabilities of Large Language Models
16 mins; October 02, 2024
[QA] Realistic Evaluation of Model Merging for Compositional Generalization
8 mins; September 30, 2024
Realistic Evaluation of Model Merging for Compositional Generalization
21 mins; September 30, 2024
[QA] Emu3: Next-Token Prediction is All You Need
7 mins; September 30, 2024
Emu3: Next-Token Prediction is All You Need
17 mins; September 30, 2024
[QA] MIO: A Foundation Model on Multimodal Tokens
8 mins; September 30, 2024
MIO: A Foundation Model on Multimodal Tokens
19 mins; September 30, 2024
[QA] A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor?
7 mins; September 28, 2024
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor?
8 mins; September 28, 2024
[QA] Logic-of-Thought: Injecting Logic into Contexts for Full Reasoning in Large Language Models
8 mins; September 28, 2024
Logic-of-Thought: Injecting Logic into Contexts for Full Reasoning in Large Language Models
16 mins; September 28, 2024
[QA] Making Text Embedders Few-Shot Learners
7 mins; September 27, 2024
Making Text Embedders Few-Shot Learners
16 mins; September 27, 2024
[QA] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
6 mins; September 27, 2024
Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
8 mins; September 27, 2024
[QA] Infer Human's Intentions Before Following Natural Language Instruction
8 mins; September 27, 2024
Infer Human's Intentions Before Following Natural Language Instruction
27 mins; September 27, 2024
[QA] MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
7 mins; September 27, 2024
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
15 mins; September 27, 2024
[QA] Counterfactual Token Generation in Large Language Models
7 mins; September 25, 2024
Counterfactual Token Generation in Large Language Models
14 mins; September 25, 2024
[QA] Characterizing stable regions in the residual stream of LLMs
7 mins; September 25, 2024
Characterizing stable regions in the residual stream of LLMs
5 mins; September 25, 2024
[QA] Watch Your Steps: Observable and Modular Chains of Thought
7 mins; September 24, 2024
Watch Your Steps: Observable and Modular Chains of Thought
29 mins; September 24, 2024
[QA] Seeing Faces in Things: A Model and Dataset for Pareidolia
7 mins; September 24, 2024
Seeing Faces in Things: A Model and Dataset for Pareidolia
10 mins; September 24, 2024
[QA] Rule Extrapolation in Language Models: A Study of Compositional Generalization on OOD Prompts
8 mins; September 23, 2024
Rule Extrapolation in Language Models: A Study of Compositional Generalization on OOD Prompts
29 mins; September 23, 2024
[QA] Style over Substance: Failure Modes of LLM Judges in Alignment Benchmarking
7 mins; September 23, 2024
Style over Substance: Failure Modes of LLM Judges in Alignment Benchmarking
11 mins; September 23, 2024
[QA] LLM Surgery: Efficient Knowledge Unlearning and Editing in Large Language Models
7 mins; September 22, 2024
LLM Surgery: Efficient Knowledge Unlearning and Editing in Large Language Models
13 mins; September 22, 2024
[QA] Embedding Geometries of Contrastive Language-Image Pre-Training
7 mins; September 22, 2024
Embedding Geometries of Contrastive Language-Image Pre-Training
15 mins; September 22, 2024
[QA] Kolmogorov–Arnold Transformer
8 mins; September 20, 2024
Kolmogorov–Arnold Transformer
15 mins; September 20, 2024
[QA] Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
6 mins; September 20, 2024
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
11 mins; September 20, 2024
[QA] Re-Introducing LayerNorm: Geometric Meaning, Irreversibility and a Comparative Study with RMSNorm
7 mins; September 19, 2024
Re-Introducing LayerNorm: Geometric Meaning, Irreversibility and a Comparative Study with RMSNorm
12 mins; September 19, 2024
[QA] Is Tokenization Needed for Masked Particle Modelling?
7 mins; September 19, 2024
Is Tokenization Needed for Masked Particle Modelling?
20 mins; September 19, 2024
[QA] Finetuning Language Models to Emit Linguistic Expressions of Uncertainty
6 mins; September 18, 2024
Finetuning Language Models to Emit Linguistic Expressions of Uncertainty
12 mins; September 18, 2024
[QA] To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
7 mins; September 18, 2024
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
26 mins; September 18, 2024
[QA] On the limits of agency in agent-based models
8 mins; September 17, 2024
On the limits of agency in agent-based models
19 mins; September 17, 2024
[QA] Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
7 mins; September 17, 2024
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
15 mins; September 17, 2024
[QA] Finetuning CLIP to Reason about Pairwise Differences
7 mins; September 17, 2024
Finetuning CLIP to Reason about Pairwise Differences
16 mins; September 17, 2024
[QA] Think Twice Before You Act: Improving Inverse Problem Solving With MCMC
9 mins; September 15, 2024
Think Twice Before You Act: Improving Inverse Problem Solving With MCMC