4 Comments
Feb 28 · Liked by Dwarkesh Patel

Your first question being about RL, wow you’re really trying to make me listen asap


I enjoy the podcast, but sadly I have to unsubscribe from the YouTube channel because the volume of shorts is just plain annoying. I'll have to manage with the Substack RSS (or the inevitable links from MR, Zvi, or others...)


Fascinating comment about chess pros being more efficient at move selection... not just more efficient than tree search like Stockfish, but more efficient than reinforcement learning + tree search like AlphaZero.

This raises the question of whether the efficiency gap is due to (a) a lack of breadth in the training data, (b) an incomplete or wrong reward function, or (c) RL + tree search being fundamentally insufficient.

My guess is that RL + tree search (or perhaps RL alone?) could be enough, BUT one probably can't get the correct RL objective function without building out a system of genes (which spawn phenomena such as emotions and intuition that underlie efficient move selection).

Said differently, if one doesn't build out the genes, then one needs to figure out the mechanism those genes use and replicate it another way... and for that, one needs RL + tree search + an understanding of that other mechanism (or mechanisms).
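For readers less familiar with what "RL + tree search" means here: below is a minimal toy sketch of the PUCT-style move selection that AlphaZero-type systems use, where a learned policy prior biases the search and visit statistics from many simulations refine it. This is an illustration only, not AlphaZero's actual implementation; the uniform priors, the C_PUCT value, the noisy evaluation function, and the example moves and values are all made up for the sketch. The efficiency point in the comment is that this loop needs hundreds of simulations per move, whereas a pro reaches comparable decisions after considering only a handful of candidates.

```python
import math
import random

C_PUCT = 1.5  # exploration constant (assumed value for the sketch)

def puct_score(q, prior, parent_visits, child_visits):
    """Mean value Q plus an exploration bonus weighted by the policy prior."""
    u = C_PUCT * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u

def select_move(legal_moves, priors, value_estimate, n_simulations=800):
    """Pick a move by running many simulations and tracking visit counts.

    `priors` would come from an RL-trained policy network; `value_estimate(move)`
    stands in for a rollout or value-network evaluation of the resulting position.
    """
    visits = {m: 0 for m in legal_moves}
    total_value = {m: 0.0 for m in legal_moves}

    for sim in range(1, n_simulations + 1):
        # Select the move maximizing the PUCT score given current statistics.
        def score(m):
            q = total_value[m] / visits[m] if visits[m] else 0.0
            return puct_score(q, priors[m], sim, visits[m])
        move = max(legal_moves, key=score)

        # Evaluate the move (noisy stand-in for a rollout / value net).
        total_value[move] += value_estimate(move)
        visits[move] += 1

    # Final choice: the most-visited move, as in AlphaZero-style play.
    return max(legal_moves, key=lambda m: visits[m])

# Toy usage: three candidate moves, uniform priors, noisy evaluations.
moves = ["e4", "d4", "c4"]
priors = {m: 1 / len(moves) for m in moves}
true_value = {"e4": 0.55, "d4": 0.53, "c4": 0.50}
noisy_eval = lambda m: true_value[m] + random.gauss(0, 0.1)

print(select_move(moves, priors, noisy_eval))  # needs many simulations to separate close moves
```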
