I just understood the argument against the orthogonality thesis. I’m not completely sold, but I’m interested.
Orthogonality is not entirely wrong; a superintelligent paperclip maximizer could exist. But terminal goals are not in practice independent of intelligence, because an agent that pursues the Omohundro drives (self-preservation, resource acquisition, self-improvement) for their own sake may be able to self-improve more efficiently than a paperclip maximizer that treats them as merely instrumental.
Moreover, during the training or evolution of a superintelligence, Omohundro drives would likely not only emerge but become intrinsically valued (the way mesa-optimizers acquire objectives of their own) and end up overriding the original goal.
Notice that in nature, every terminal goal we observe arose as a proxy for an Omohundro drive: hunger, for instance, is a proxy for self-preservation.