Thoughts after reading lots of Steve Byrnes:

The important question is whether it is possible to get to superhuman coders within the current AI paradigm. I’ve previously argued that it is, but it’s easy to see that LLMs that merely know “the thing a human would output under similar circumstances” won’t get you there: imitation caps you at roughly human-level output.

So future LLMs must be very good at not knowing a thing and then figuring it out. This is possible in principle, and, as I’ve said previously, the idea is that RL on base models gets you there. Current models already do this in some cases.

It’s just that the “imitating humans” confounder is so strong that it’s often very difficult to tell which capabilities come from genuinely figuring things out and which from imitation.