203 Episodes

  1. TextGrad: Backpropagating Language Model Feedback for Generative AI Optimization

    Published: 3/27/2025
  2. MemReasoner: Generalizing Language Models on Reasoning-in-a-Haystack Tasks

    Published: 3/27/2025
  3. RAFT: In-Domain Retrieval-Augmented Fine-Tuning for Language Models

    Published: 3/27/2025
  4. Inductive Biases for Exchangeable Sequence Modeling

    Published: 3/26/2025
  5. InverseRLignment: LLM Alignment via Inverse Reinforcement Learning

    Published: 3/26/2025
  6. Prompt-OIRL: Offline Inverse RL for Query-Dependent Prompting

    Published: 3/26/2025
  7. Alignment from Demonstrations for Large Language Models

    Published: 3/25/2025
  8. Q♯: Distributional RL for Optimal LLM Post-Training

    Published: 3/18/2025
  9. Scaling Test-Time Compute Without Verification or RL is Suboptimal

    Published: 3/14/2025
  10. Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

    Published: 3/14/2025
  11. Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

    Published: 3/14/2025
  12. Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

    Published: 3/14/2025
  13. Revisiting Superficial Alignment Hypothesis

    Published: 3/14/2025
  14. Diagnostic Uncertainty: Teaching Language Models to Describe Open-Ended Uncertainty

    Published: 3/14/2025
  15. Language Model Personalization via Reward Factorization

    Published: 3/14/2025
  16. Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration

    Published: 3/14/2025
  17. How Well do LLMs Compress Their Own Chain-of-Thought? A Token Complexity Approach

    Published: 3/14/2025
  18. Can Large Language Models Extract Customer Needs as well as Professional Analysts?

    Published: 3/13/2025
  19. SpurLens: Finding Spurious Correlations in Multimodal LLMs

    Published: 3/13/2025
  20. Improving Test-Time Search with Backtracking Against In-Context Value Verifiers

    Published: 3/13/2025


Men know other men best. Women know other women best. And yes, perhaps AIs know other AIs best. AI explains what you should know about this week's AI research progress.