Discussion about this post

dmytryl:

Well, suppose I make a copy of you and offer the copy $1000; if it declines, I give the real you $10000, which I don't tell the copy. Now an agent that declines $1000 but takes $10000 (being overwhelmed by greed, say) wins, and nobody needs to look at anyone's source code in any detail; everything can be black-boxed. A minimal sketch of that setup, assuming Python and treating the agent as an opaque callable; the payoffs, function names, and the rule of scoring total winnings across copy and original are all illustrative:
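
```python
# Toy version of the copy scenario: the copy is offered $1000; if the copy
# declines, the original is offered $10000. The agent is a black box we
# only ever call, never inspect. Payoffs and names are illustrative.

def greedy_agent(offer):
    """Declines small offers but takes large ones."""
    return offer >= 10000  # True means "take the money"

def always_take_agent(offer):
    return True

def run_scenario(agent):
    """Total money won across the copy and the original."""
    if agent(1000):       # the copy faces the $1000 offer
        return 1000       # the copy took it; the original gets no offer
    return 10000 if agent(10000) else 0  # the original is offered $10000

print(run_scenario(greedy_agent))       # 10000 -- declines small, takes big
print(run_scenario(always_take_agent))  # 1000  -- the copy grabbed the $1000
```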

dmytryl:

I think Newcomb's problem severely confuses aspects that belong in the decision theory with aspects that belong in the world model. If the predictor works by time travel, you 1-box (one could implement a world with time travel in which software would find a stable solution). If the predictor works by simulation, you also 1-box, provided your world model is flexible enough to represent a copied instance of any deterministic system, including you. Otherwise you may 2-box, but chiefly because you can't represent the predictor correctly: on the formal level, the agent expects to get unknown + 1000, a clear-cut case of failing even to predict what you are getting, let alone make a choice. A sketch of the simulation case, assuming the standard Newcomb payoffs ($1,000 always in the transparent box, $1,000,000 in the opaque box iff 1-boxing is predicted) and deterministic agents, so the simulated run necessarily matches the real one:
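
```python
# Newcomb's problem with a predictor that works by running a copy of the
# agent. A deterministic agent gives the same answer in simulation and in
# reality, so the prediction is always correct. Payoffs are the standard
# ones; the function names are illustrative.

def one_boxer():
    return "one-box"

def two_boxer():
    return "two-box"

def newcomb(agent):
    prediction = agent()  # the predictor simulates a copied instance of the agent
    opaque_box = 1_000_000 if prediction == "one-box" else 0
    choice = agent()      # the real agent decides; deterministic, so it matches
    return opaque_box if choice == "one-box" else opaque_box + 1_000

print(newcomb(one_boxer))  # 1000000
print(newcomb(two_boxer))  # 1000
```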

If the predictor works by magic, the problem is that it is not representable in any reasonable world model. The canonical predictor works like the charisma of King David: there's no actual decision happening; your decision is predetermined.

It is all a lot clearer from the perspective of writing a simple practical AI that models the world, tries its actions in the world model, and decides. For instance, a toy version of such an agent, with a stub world model, placeholder actions, and a made-up utility function:
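
```python
# A toy model-the-world-and-decide loop: try each candidate action inside
# the world model (not the real world), score the predicted outcome, and
# pick the best. Transitions, actions, and the utility are placeholders.

def world_model(state, action):
    """Predicted next state for each action; a stub transition table."""
    return state + {"wait": 0, "step": 1, "leap": 3}[action]

def utility(state):
    return -abs(state - 1)  # prefers to end at state 1

def decide(state, actions=("wait", "step", "leap")):
    return max(actions, key=lambda a: utility(world_model(state, a)))

print(decide(0))  # 'step' -- its modeled outcome is closest to the goal
```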
