791: Reinforcement Learning from Human Feedback (RLHF), with Dr. Nathan Lambert

Released Tuesday, 11th June 2024

Reinforcement learning from human feedback (RLHF) has come a long way. In this episode, research scientist Nathan Lambert talks to Jon Krohn about the technique’s origins. He also walks through other ways to fine-tune LLMs and explains how he believes generative AI might democratize education.

This episode is brought to you by AWS Inferentia and AWS Trainium, and by Crawlbase, the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:
• Why it is important that AI is open [03:13]
• The efficacy and scalability of direct preference optimization [07:32]
• Robotics and LLMs [14:32]
• The challenges of aligning reward models with human preferences [23:00]
• How to make sure AI’s decision making on preferences reflects desirable behavior [28:52]
• Why Nathan believes AI is closer to alchemy than science [37:38]

Additional materials: www.superdatascience.com/791
