Adrià on AXRP!


Yet another new episode!

Suppose we're worried about AIs engaging in long-term plans that they don't tell us about. If we were to peek inside their brains, what should we look for to check whether this was happening? In this episode Adrià Garriga-Alonso talks about his work trying to answer this question.

Transcript
Video

in reply to Daniel Filan

Yesss got a lot of views on this one - I think I successfully managed to create a clickbait thumbnail.