4 Comments
Feb 28 · Liked by Dwarkesh Patel

Your first question being about RL, wow you’re really trying to make me listen asap


I enjoy the podcast, but sadly I have to unsubscribe from the YouTube channel because the volume of shorts is just plain annoying. I'll have to manage with the Substack RSS (or the inevitable links from MR, Zvi, or others...)


Fascinating comment about chess pros being more efficient at move selection... not just more efficient than tree search like Stockfish, but more efficient than reinforcement learning + tree search like AlphaZero.

This raises the question of whether the efficiency gap is due to (a) a lack of breadth in the training data, (b) an incomplete or wrong reward function, or (c) RL + tree search being fundamentally insufficient.

My guess is that RL + tree search (or perhaps RL alone?) could be enough, BUT one probably can't get the correct RL objective function without building out a system of genes (which spawn phenomena such as emotions and intuition that underlie efficient move selection).

Said differently, if one doesn't build out the genes, then one needs to figure out the mechanism those genes use and replicate it another way... and for that, one needs RL + tree search + an understanding of that other mechanism (or mechanisms).
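For readers less familiar with what "RL + tree search" means here: below is a minimal toy sketch of the PUCT-style move selection that AlphaZero-type systems use, where a learned policy prior biases the search and visit statistics from many simulations refine it. This is an illustration only, not AlphaZero's actual implementation; the uniform priors, the C_PUCT value, the noisy evaluation function, and the example moves and values are all made up for the sketch. The efficiency point in the comment is that this loop needs hundreds of simulations per move, whereas a pro reaches comparable decisions after considering only a handful of candidates.

```python
import math
import random

C_PUCT = 1.5  # exploration constant (assumed value for the sketch)

def puct_score(q, prior, parent_visits, child_visits):
    """Mean value Q plus an exploration bonus weighted by the policy prior."""
    u = C_PUCT * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u

def select_move(legal_moves, priors, value_estimate, n_simulations=800):
    """Pick a move by running many simulations and tracking visit counts.

    `priors` would come from an RL-trained policy network; `value_estimate(move)`
    stands in for a rollout or value-network evaluation of the resulting position.
    """
    visits = {m: 0 for m in legal_moves}
    total_value = {m: 0.0 for m in legal_moves}

    for sim in range(1, n_simulations + 1):
        # Select the move maximizing the PUCT score given current statistics.
        def score(m):
            q = total_value[m] / visits[m] if visits[m] else 0.0
            return puct_score(q, priors[m], sim, visits[m])
        move = max(legal_moves, key=score)

        # Evaluate the move (noisy stand-in for a rollout / value net).
        total_value[move] += value_estimate(move)
        visits[move] += 1

    # Final choice: the most-visited move, as in AlphaZero-style play.
    return max(legal_moves, key=lambda m: visits[m])

# Toy usage: three candidate moves, uniform priors, noisy evaluations.
moves = ["e4", "d4", "c4"]
priors = {m: 1 / len(moves) for m in moves}
true_value = {"e4": 0.55, "d4": 0.53, "c4": 0.50}
noisy_eval = lambda m: true_value[m] + random.gauss(0, 0.1)

print(select_move(moves, priors, noisy_eval))  # needs many simulations to separate close moves
```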
