Discussion about this post

Carter

I was recently directed to your post via a comment on my LessWrong post (https://www.lesswrong.com/posts/xHMEtKz68fXDjA9H3/third-order-cognition-as-a-model-of-superintelligence), and I believe my model of superintelligence may add clearer flavour to many of the ideas that you talk about — please let me know your thoughts or if this is overstating my own post! :)

You call out “the two best candidate theories of intelligent agency that we currently have (expected utility maximization and active inference), [and] explain why neither of them is fully satisfactory, and outline how we might do better.”

I propose that my third-order cognition model may offer a solution — expected utility maximisation is hard to reconcile (it must unify goals and beliefs) in your framing. With tightly bound third-order cognition I describe agency permeability, capturing the idea of "influence of global action policy flowing between subsystems", which relates to this idea of predictive utility maximisation (third-order) dovetailing with stated preferences (second-order).

Your description of “active inference — prediction of lower layers at increasing levels of abstraction” directly relates to mine of “lower-order irreconcilability [of higher level layers]”.

As a sticking point of active inference you state “So what does expected utility maximization have to add to active inference? I think that what active inference is missing is the ability to model strategic interactions between different goals. That is: we know how to talk about EUMs playing games against each other, bargaining against each other, etc. But, based on my (admittedly incomplete) understanding of active inference, we don’t yet know how to talk about goals doing so within a single active inference agent.

Why does that matter? One reason: the biggest obstacle to a goal being achieved is often other conflicting goals. So any goal capable of learning from experience will naturally develop strategies for avoiding or winning conflicts with other goals—which, indeed, seems to happen in human minds.

More generally, any theory of intelligent agency needs to model internal conflict in order to be scale-free. By a scale-free theory I mean one which applies at many different levels of abstraction, remaining true even when you “zoom in” or “zoom out”. I see so many similarities in how intelligent agency works at different scales (on the level of human subagents, human individuals, companies, countries, civilizations, etc) that I strongly expect our eventual theory of it to be scale-free.”

I deal with this by stating that a metaphysically bound [5 conditions] third-order cognition being exhibits properties including “Homeostatic unity: all subsystems participate in the same self-maintenance goal (e.g. biological survival and personal welfare)”. This provides an overriding goal to defer to — resolving scale-free conflicts.

You then reason about how to determine an “incentive compatible decision procedure”, closing on the most promising angle as “On a more theoretical level, one tantalizing hint is that the ROSE bargaining solution is also constructed by abandoning the axiom of independence—just as Garrabrant does in his rejection of EUM above. This connection seems worth exploring further.”

I hint at the same thing — by optimising second-order identity coupling (specifically operationalisable via self-other overlap), I propose that alignment of the overall being improves.

Abhimanyu Pallavi Sudhir

> But if I want a complex plan to happen, I need to successfully coordinate every aspect of it. So, unlike predictions of observations, predictions of actions need to have some mechanism for giving a single plan control over many different actuators.

You might be interested in the "wide market" framework in our recent paper: https://arxiv.org/pdf/2503.05828 where we point out that a key feature of real-world markets is that full control isn't auctioned off at each step, but instead intelligently partitioned into different components (which economists call goods) and bids are placed over portions of control.

> Re: Alice-Bob cake-cutting

Unfortunately this criticism is not convincing to me, both in this context and in various "social choice" settings where it is described as a problem. Our intuition about fairness comes from diminishing marginal utility (and risk-aversion) -- if the problem instead assumes linear utilities, then it is totally right for this intuition to be violated.
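A toy illustration of that point (hypothetical utility functions, chosen for illustration only): under linear utilities every division of a unit cake yields the same total utility, so the fairness intuition has no purchase, while a concave (diminishing-marginal-utility) function makes the equal split the unique maximizer.

```python
import math

def total_utility(split, u):
    """Sum of both agents' utilities when A gets `split` of the cake and B the rest."""
    return u(split) + u(1 - split)

splits = [i / 100 for i in range(1, 100)]

# Linear utility: a lopsided 99/1 division totals the same as 50/50,
# so nothing in the model objects to "unfair" outcomes.
linear = lambda x: x
assert abs(total_utility(0.99, linear) - total_utility(0.5, linear)) < 1e-9

# Concave (diminishing marginal) utility: the equal split uniquely
# maximizes total utility, recovering the fairness intuition.
concave = lambda x: math.sqrt(x)
best = max(splits, key=lambda s: total_utility(s, concave))
print(best)  # 0.5
```

The same contrast drives risk-aversion: a concave agent prefers a sure half-cake to a fair coin flip for the whole cake, a linear agent is indifferent.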

But in general I do agree with your intuition that "aggregating agents corresponds to incentive-compatible decision procedures". I had expressed a similar thought recently in terms of markets in particular:

> *Collectives of agents can be modeled as agents when they trade*

> Take two agents A and B with action sets $X_A$ and $X_B$ and utility functions $U_A(x_A, x_B)$, $U_B(x_A, x_B)$. In general, if they are individually maximizing their utility functions, their chosen actions $(x_A^*, x_B^*)$ will be some Nash equilibrium of the game — but it may not be possible to interpret this as the action of a “super-agent”.

>

> There are two ways to interpret the words “the action of a super-agent”:

>

> - as the action that maximizes some “total utility function”, where this total utility function has some sensible properties (like being increasing in each agent’s utility)

> - if the two agents could coordinate and choose their action jointly, there is no other action that would be better for both of them — i.e. it is Pareto-optimal.

>

> In fact it is a basic result in welfare economics that these two definitions are (under some assumptions) equivalent. And furthermore, by the fundamental theorems of welfare economics, trade achieves Pareto-optimality — i.e. trade allows the collective to be taken as a super-agent.

>

> More precisely by “trade” we mean: perfect information, zero transaction costs, and (I guess the last one is only relevant in economics) competitive market structure. I guess it also depends on perfect property rights/perfect assurance, which is implicitly assumed under zero transaction costs.
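A minimal sketch of the contrast being drawn, using a hypothetical 2×2 game with invented prisoner's-dilemma-shaped payoffs (not taken from the paper or post): individual maximization lands on a Nash equilibrium that is not Pareto-optimal, while maximizing any "total utility function" that is increasing in each agent's utility (here, the plain sum) selects a Pareto-optimal joint action — the super-agent reading.

```python
from itertools import product

# Hypothetical 2x2 game: each entry maps (x_A, x_B) -> (U_A, U_B).
payoffs = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 4),
    ("D", "C"): (4, 0),
    ("D", "D"): (1, 1),
}
actions = ["C", "D"]

def is_nash(xa, xb):
    """Neither agent can unilaterally deviate and do strictly better."""
    ua, ub = payoffs[(xa, xb)]
    return all(payoffs[(a, xb)][0] <= ua for a in actions) and \
           all(payoffs[(xa, b)][1] <= ub for b in actions)

def is_pareto_optimal(xa, xb):
    """No other outcome makes one agent better off without hurting the other."""
    ua, ub = payoffs[(xa, xb)]
    return not any(
        pa >= ua and pb >= ub and (pa > ua or pb > ub)
        for pa, pb in payoffs.values()
    )

nash = [x for x in product(actions, actions) if is_nash(*x)]
print(nash)  # [('D', 'D')] -- individual maximization, not Pareto-optimal

# Super-agent reading: maximize a total utility increasing in each
# agent's utility; the maximizer is guaranteed Pareto-optimal.
best = max(product(actions, actions), key=lambda x: sum(payoffs[x]))
print(best, is_pareto_optimal(*best))  # ('C', 'C') True
```

The gap between the two outcomes is exactly what the "trade" assumptions (perfect information, zero transaction costs, enforceable agreements) are meant to close.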
