πŸ“Œ

I believe in free speech and respectful debate

People have the right to be wrong. No matter how strongly you hold a belief, respect the humanity of those who disagree with you.

[Image: in-this-house.png]
πŸ”— Permalink

A few predictions:

We are more than 10 years away from being capable of building broadly superintelligent AI, even with real speedups from adding compute and from leveraging the capabilities of intermediate AI systems. About 75% confidence in the >10 years claim, inside view (before updating on what others think).

I do believe that β€œif anyone builds it, everyone dies” is basically true, but with a little bit of error around what β€œeveryone dies” ends up looking like in practice.

However, I think warning signs will continue up to superintelligence; in other words, even near-superintelligent systems will display alignment failures if we pay attention.

If the warning signs don’t show up, building superintelligence is no less dangerous; but if they do, that matters, because it gives us more chances to stop increasing capabilities before it is too late.

Depending on what the momentum is like at that point, stopping may be very difficult, as the exponential will probably be steep by then; for this reason among others, it is always safest to stop now rather than later.

πŸ”— Permalink

There is populism as in β€œcharismatic public figure exploits the masses by feeding them β€˜fuck the elite’ candy while robbing them blind,” which is bad.

And there is populism as in β€œlet’s do what’s actually good for the masses, fuck the interests of the elite,”

which is good, at least in medium-to-large doses.

πŸ”— Permalink

I liked the design of macOS Big Sur through Sequoia (11 through 15). I’ll miss it.

πŸ”— Permalink

Thoughts after reading lots of Steve Byrnes:

The important question is whether or not it is possible to get to superhuman coders within the current AI paradigm. I’ve previously argued that it is, but it’s easy to see that LLMs knowing the β€œthing that humans would output under similar circumstances” won’t get you there.

So future LLMs must be very good at not knowing a thing and then figuring it out. This is in principle possible, and the idea is that RL on base models gets you there, as I’ve said previously. They already do this in many cases.

It’s just that the β€œimitating humans” confounder is so strong that it’s often very difficult to tell which capabilities come from imitation and which from genuinely figuring things out.

πŸ”— Permalink

A good list from Emmett Shear:

Priors include:
- Objects arise from dependent origination
- Things that get surprised too much die
- Care is the driver of intelligence
- Objectivity is the surface of subjectivity
- The universe is alive
- The fucking hippies were right again, fuck

πŸ”— Permalink

Looking for a single word that means internal perspective without implying consciousness or personhood - I can’t find anything but I like β€˜autoception’

πŸ”— Permalink

I’m running the macOS 26 beta and it’s a complete mess. I should go back to stable but I’m resisting for some reason

Edit: it turns out you cannot go back. Yay.

πŸ”— Permalink

Goodhart’s Curse says that given a target optimization function f(x) and an approximation of it called g(x), the argmax of g(x) is, in expectation, some x where the error g(x) - f(x) is large. This is pretty rigorous.

This is often used to argue that when creating an AI, any difference between the utility function we intend and the utility function the AI actually optimizes is likely to blow up.

It turns out that if we make some assumptions about the error distribution, the expected error from optimizing f(x) via g(x) grows very slowly, O((log n)^{1/2}), with the size n of the searched solution space.
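
For intuition on where a bound like that comes from (a sketch of the standard argument, with assumptions supplied by me rather than the post: the per-candidate errors Ξ΅_i = g(x_i) - f(x_i) are i.i.d. Gaussian with standard deviation Οƒ and independent of f), let x* maximize the proxy g over the n candidates and x† maximize the true target f. Since g(x*) β‰₯ g(x†),

\[
f(x^\dagger) - f(x^{*}) \;\le\; \varepsilon_{x^{*}} - \varepsilon_{x^\dagger} \;\le\; 2 \max_{i \le n} |\varepsilon_i|,
\qquad
\mathbb{E}\Big[\max_{i \le n} |\varepsilon_i|\Big] \;\le\; \sigma \sqrt{2 \ln(2n)},
\]

so the expected shortfall from optimizing the proxy instead of the target grows only like Οƒ sqrt(log n).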

So maybe this curse won’t bite so hard after all.
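
A quick numerical check (a hypothetical sketch, not from the post: true values and proxy errors drawn as independent standard normals over n candidates, so light-tailed error is baked in):

```python
# Hypothetical simulation of Goodhart's Curse with Gaussian proxy error.
# f holds the true values of n candidates, g = f + noise is the proxy we optimize;
# regret = (true value of the truly best candidate) - (true value of the proxy-best candidate).
import numpy as np

rng = np.random.default_rng(0)

def mean_regret(n, trials=1000):
    f = rng.normal(size=(trials, n))                  # true values
    g = f + rng.normal(size=(trials, n))              # proxy = truth + N(0, 1) error
    picked = f[np.arange(trials), g.argmax(axis=1)]   # true value of the proxy-optimal pick
    return (f.max(axis=1) - picked).mean()

for n in (10, 100, 1_000, 10_000):
    # the regret should grow roughly like sqrt(log n), i.e. very slowly in n
    print(f"n={n:>5}  mean regret={mean_regret(n):.3f}  sqrt(ln n)={np.log(n) ** 0.5:.3f}")
```

The light-tailed, f-independent error is what makes this benign; with a nastier error model the picture can change.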

πŸ”— Permalink

β€œThe phuckening is upon us,” muttered Dr. Abernathy, adjusting his goggles.

πŸ”— Permalink