> the AGI design should be widely separated in the design space from any design that would constitute a hyperexistential risk
https://www.lesswrong.com/w/separation-from-hyperexistential-risk
Coming back to this again - it remains really really great.
I wonder if there is a more "viral", more focused version of it that talks only about perverse pessimization, and lists examples from EA and AI safety, like you hinted at in this tweet: https://x.com/RichardMCNgo/status/1858948025336688882
"A second major reason why perverse pessimization arises is that anti-X behavior can help someone rise within a (nominally) pro-X coalition. This is a kind of vice signalling, showing that you have enough power to directly defy the values of your own coalition."
This feels slightly off to me - in my head, it's not to signal power, but to signal that you are someone who "plays the game", that you are not a mark, that you don't have scruples and are therefore easier to cooperate with for others like you. People who care about winning at all costs need to find each other.
I've thought about this too. But you left out some of the most important drivers:
- Having any goal makes you want to grab power, which makes you more vulnerable to existing incentives (often the very incentives you didn't like!)
- Having any goal makes you want to grab power, and then your movement fills up with sociopaths who just want power.
- Having any goal makes you want to grab power, and it's hard to know when to stop: when to switch from grabbing power to spending it.
-> The more scared you are, the more likely you continue to grab power rather than stop at the relevant time.
- Fear-based motivations lower your bit-rate & make you see the world in black and white terms related to your fear.
-> This means they reduce the range of policies you'll consider or invent (Pause AI, Defund the police, etc)
-> This creates a bitter policy environment where either nothing can get done, or your opponents look more reasonable.
The first three might be considered interesting convergent *downsides* of instrumental goals. Ironic in our author's context, maybe! And I find I can use them to support generalising my intuitions that horrible but recoverable warning shots are likely.
> Fear-based motivations [induce single bit weights that] reduce the range of policies you'll consider or invent (Pause AI)
Don't mistake the simple brand for the wider range of thinking within, though. All solutions are up for discussion in said movement.
Oh, that’s one more I left out!
Ironically, unless a movement is a tightly controlled conspiracy, the loudest, most influential voices are likely to be the less discerning / most simple-minded / most extreme.
In part because that’s what carries well, but also because people tend to specialize in either nuance or messaging, not both.