Evidential Decision Theory’s Misstep

Lewis (1981) writes:
Within a single dependency hypothesis, so to speak, V-maximising is right. It is rational to seek good news by doing that which, according to the dependency hypothesis you believe, most tends to produce good results. That is the same as seeking good results. Failures of V-maximising appear only if, first, you are sensible enough to spread your credence over several dependency hypotheses, and second, your actions might be evidence for some dependency hypotheses and against others. That is what may enable the agent to seek good news not in the proper way, by seeking good results, but rather by doing what would be evidence for a good dependency hypothesis. That is the recipe for Newcomb problems. (p. 11)

This, I think, is not right. It misdiagnoses evidential decision theory’s mistake. I’ll save what I take to be a better diagnosis for another post. For now, I’ll try to outline a counterexample that shows Lewis to be on the wrong track.

Suppose you learn by conditionalization — the confidence that you would have in A were you to learn B is just c(A|B). Suppose also that dependency hypotheses are probabilistic full patterns. Some notes: a dependency hypothesis is a “maximally specific proposition about how the things [the agent] cares about do and do not depend causally on his present actions” (p.11). A probabilistic full pattern is a proposition of the form A_{1} \Box\rightarrow Ch_{1}&A_{2} \Box\rightarrow Ch_{2}&…, where A_{1}, A_{2},... are your available actions and Ch_{1}, Ch_{2},... are propositions specifying the chances at t_{1}, t_{2},..., respectively, where t_{i} is the time of realization of A_{i} (cf. p. 26). Suppose also that you care exclusively about money.

Here’s the case:
MEMORY WIPES AND INFORMATION VALVES. There are two buttons in front of you, A and B. You know that pressing button A has the following chancy impact:
  • Ch_{A}($100 DEPOSITED IN YOUR BANK)=1/2
  • Ch_{A}($0 DEPOSITED IN YOUR BANK)=1/2.
Similarly, pressing button B has this chancy impact:
  • Ch_{B}($90 DEPOSITED IN YOUR BANK)=1/2
  • Ch_{B}($0 DEPOSITED IN YOUR BANK)=1/2.

Whatever you do, your button-pressing memory will be nixed immediately afterward. You’ll then be presented with an envelope. The envelope will either be empty or contain a note informing you of which button you pressed. The contents of the envelope are determined as follows: a ball will be drawn at random from an urn containing 100 balls, numbered 1-100:

  • If ball #77 is drawn and you pressed A, but nothing was deposited in your bank account, then the envelope will be left empty;
  • If one of the other 99 balls is drawn and you pressed A, but nothing was deposited in your bank account, then the envelope will contain a note saying, “you pressed button A”;
  • If ball #77 is drawn, and you pressed A, and $100 was deposited in your bank account, then the envelope will contain a note saying, “you pressed button A”;
  • If one of the other 99 balls is drawn, and you pressed A, and $100 was deposited in your bank account, then the envelope will be left empty;
  • If ball #77 is drawn and you pressed B, but nothing was deposited in your bank account, then the envelope will contain a note saying, “you pressed button B”;
  • If one of the other 99 balls is drawn and you pressed B, but nothing was deposited in your bank account, then the envelope will be left empty;
  • If ball #77 is drawn, and you pressed B, and $90 was deposited in your bank account, then the envelope will be left empty;
  • If one of the other 99 balls is drawn, and you pressed B, and $90 was deposited in your bank account, then the envelope will contain a note saying, “you pressed button B”.

You know all of this.

Question: what option should you choose, according to evidential decision theory? Evidential decision theory prescribes V-maximizing, where V(X)=\sum_{Z}c(Z|X)\cdot V(ZX). Note: evidential decision theory is partition invariant; V(X) doesn’t depend on your choice of \left\{Z\right\}. We’re free to partition logical space as follows, then: \left\{D_{0}, D_{1}, D_{2},... \right\}, where D_{i} is the proposition that i dollars are deposited in your bank account. Let’s do so. Also, note that, prior to pushing the button, you’re certain that either $0, $90, or $100 will be deposited in your account, so we can restrict our attention to \left\{D_{0}, D_{90}, D_{100}\right\}. We have:

  • V(A)=\sum_{D_{i}}c(D_{i}|A)\cdot V(D_{i}A)=c(D_{0}|A)\cdot V(D_{0}A)+c(D_{90}|A)\cdot V(D_{90}A)+c(D_{100}|A)\cdot V(D_{100}A)=c(D_{0}|A)\cdot 0+0\cdot 90+c(D_{100}|A)\cdot 100.
  • V(B)=\sum_{D_{i}}c(D_{i}|B)\cdot V(D_{i}B)=c(D_{0}|B)\cdot V(D_{0}B)+c(D_{90}|B)\cdot V(D_{90}B)+c(D_{100}|B)\cdot V(D_{100}B)=c(D_{0}|B)\cdot 0+c(D_{90}|B)\cdot 90+0\cdot 100.

The important quantities, then, are c(D_{100}|A) and c(D_{90}|B). So how confident would you be in D_{100} if you were to learn that you pressed button A? Well, given that you pressed A, it was equally likely that $0 would be deposited as that $100 would be. And if nothing was deposited, it was extremely likely — 99/100-likely — that you would learn that you pressed A. If $100 was deposited, however, it was extremely unlikely — 1/100-likely — that you would learn that you pressed A. So it seems as though you ought to be extremely confident that nothing was deposited — 99/100-confident, to be precise — and have only the slightest shred of confidence (viz. 1/100) that $100 was deposited. Hence, c(D_{100}|A)=1/100.

The same goes for c(D_{90}|B), mutatis mutandis. How confident would you be in D_{90} if you were to learn that you pressed button B? Well, given that you pressed B, it was equally likely that $0 would be deposited as that $90 would be. And if nothing was deposited, it was extremely unlikely — 1/100-likely — that you would learn that you pressed B. If $90 was deposited, however, it was extremely likely — 99/100-likely — that you would learn that you pressed B. So it seems as though you ought to be extremely confident that $90 was deposited — 99/100-confident, to be precise — and have only the slightest shred of confidence (viz. 1/100) that nothing was deposited. Hence, c(D_{90}|B)=99/100.
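This reasoning is just Bayes’ theorem applied with the 50/50 chances from the case. A quick check (the function and variable names here are mine, introduced only for illustration):

```python
from fractions import Fraction as F

def posterior_given_note(p_deposit, p_note_if_deposit, p_note_if_nothing):
    """P(deposit | note received) by Bayes' theorem, with a 50/50 prior over
    deposit vs. no deposit fixed by the button's known chances."""
    num = p_deposit * p_note_if_deposit
    den = num + (1 - p_deposit) * p_note_if_nothing
    return num / den

# Pressing A: the note appears with chance 1/100 if $100 was deposited,
# and with chance 99/100 if nothing was deposited.
c_D100_given_A = posterior_given_note(F(1, 2), F(1, 100), F(99, 100))

# Pressing B: the note appears with chance 99/100 if $90 was deposited,
# and with chance 1/100 if nothing was deposited.
c_D90_given_B = posterior_given_note(F(1, 2), F(99, 100), F(1, 100))

print(c_D100_given_A)  # 1/100
print(c_D90_given_B)   # 99/100
```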

But then

  • V(A)=(1/100)\cdot 100=1
  • V(B)=(99/100)\cdot 90=89.1.
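Plugging these posteriors into the V formula confirms the arithmetic (a sketch; the variable names are mine, and dollar amounts stand in for value since you care only about money):

```python
from fractions import Fraction as F

# V(X) = sum over D_i of c(D_i | X) * V(D_i X), restricted to {D_0, D_90, D_100}.
V_A = F(99, 100) * 0 + F(1, 100) * 100  # c(D_0|A)*0 + c(D_100|A)*100
V_B = F(1, 100) * 0 + F(99, 100) * 90   # c(D_0|B)*0 + c(D_90|B)*90

print(V_A)  # 1
print(V_B)  # 891/10, i.e. 89.1
```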

This is wrong, though. You ought to press button A. At least that’s what a causal decision theorist will say. Pressing A tends to produce better financial outcomes than pressing B.
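The causal verdict falls straight out of the known chances, since each button’s expected payoff is just its chancy impact weighted by the amounts (a minimal check; names are mine):

```python
from fractions import Fraction as F

# Causal expected payoff under the single known dependency hypothesis.
U_A = F(1, 2) * 100 + F(1, 2) * 0  # pressing A: $100 with chance 1/2
U_B = F(1, 2) * 90 + F(1, 2) * 0   # pressing B: $90 with chance 1/2

print(U_A, U_B)      # 50 45
print(U_A > U_B)     # True: A tends to produce better financial outcomes
```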

Importantly, evidential decision theory goes wrong in MEMORY WIPES AND INFORMATION VALVES despite the fact that you’re certain of which dependency hypothesis obtains. You know exactly how the things you care about do and do not depend causally on your present actions; you know that pressing button A has the following chancy impact:

  • Ch_{A}($100 DEPOSITED IN YOUR BANK)=1/2
  • Ch_{A}($0 DEPOSITED IN YOUR BANK)=1/2.
and that pressing button B has this chancy impact:
  • Ch_{B}($90 DEPOSITED IN YOUR BANK)=1/2
  • Ch_{B}($0 DEPOSITED IN YOUR BANK)=1/2.

The lesson, I think, is that evidential decision theory is liable to make bad predictions whenever the proposition that you performed action A is, in Lewis’ sense, inadmissible evidence for some number of its potential outcomes (i.e. evidence that bears in a “direct enough” way on those outcomes; cf. Lewis 1980, p. 265). This is so whether or not you spread your credence over several dependency hypotheses.

Cases like this help to bring into focus just how evidential decision theory fails, I think. I’ll try my hand at offering a more complete diagnosis in a subsequent post.

3 Responses to Evidential Decision Theory’s Misstep

  1. dustin says:

    Good stuff, Jason.

    You assume: “the confidence that you would have in A were you to learn B is just c(A|B).” Let’s call this assumption ‘Condi’.

    You later reason: “The important quantities, then, are c(100|A) and c(90|B). So how confident would you be in 100 if you were to learn that you pressed button A? Well, given that you pressed A, it was equally likely that $0 would be deposited as that $100 would be. And if nothing was deposited, it was extremely likely — 99/100-likely — that you would learn that you pressed A. If $100 was deposited, however, it was extremely unlikely — 1/100-likely — that you would learn that you pressed A. So it seems as though you ought to be extremely confident that nothing was deposited — 99/100-confident, to be precise — and have only the slightest shred of confidence (viz. 1/100) that $100 was deposited. Hence, c(100|A)=1/100.”

    Given ‘Condi’, you figure that you can determine c(100|A) by first determining ‘the confidence that you would have in 100 were you to learn A’. However, for Condi to be at all plausible, we need to read it as follows:

    Condi: The confidence you would have in X were you to learn Y *and only* Y is c(X|Y).

    The trouble with applying Condi in your case is that, when you learn that you pressed A, you also learn something else: you learn that you learn that you pressed A. Indeed, it is precisely this extra piece of information that you use to determine what your credence in 100 would be were you to learn that you pressed A: “And if nothing was deposited, it was extremely likely — 99/100 -likely — that you would learn that you pressed A. [But you did learn that you pressed A, so…]”

    What you are using to determine c(100|A) is actually the credence that you would have in 100 were you to learn that you pressed A AND that you learned that you pressed A. But what Condi says is that to determine c(100|A) you should use the credence that you would have in 100 were you to learn ONLY that you pressed A.

    [I should also note that I don’t think it’s proper to determine the current conditional credences by first determining the proper subjunctive credences. I think that we should first determine the proper conditional credences and then use these to determine the proper subjunctive credences (in so far as we accept Condi). But my point above grants your way of proceeding.]

    It seems to me that, were I to learn that I pressed A *and only* that I pressed A, then I would have .5 credence in 100.

    I think.

  2. jpkonek says:

    Thanks for the feedback, Dustin! Sorry that I’m just now getting a chance to sit down and respond.

    First, clearly, you’re right to insert the “and only”-clause in Condi. I should have been more careful. But looking over it again, I shouldn’t have been focusing on Condi at all. So let me try again.

    Orthodox Bayesians endorse learning by conditionalization:

    (LC) If a rational agent S undergoes a learning experience in which the only new information she acquires is that B is certainly true, then her post-learning “posterior” probability will coincide with her pre-learning probability conditional on B, c(\cdot |B). (cf. Joyce (2009), 419)

    Now consider Greaves and Wallace’s (2006) argument for conditionalization. G&W show that conditionalizing maximizes expected accuracy, given some prima facie plausible constraints on accuracy. It’s tempting to think that the following undergirds (LC), then (or is at least one good candidate to do so):

    (LC*): If a rational agent S undergoes a learning experience in which the only new information she acquires is that B is certainly true, then she will adopt posterior credences that are, according to her prior best estimates, most accurate while counting B as certain.

    But you might think something more basic undergirds (LC). You might suggest: if you’re rational, then the posterior credences that are, according to your prior best estimates, most accurate while counting B as certain are the credences that you would advise yourself to have if you were to learn B and only B. If that’s right, then perhaps the following undergirds (LC):

    (LC**): If a rational agent S undergoes a learning experience in which the only new information she acquires is that B is certainly true, then S will adopt the posterior credences that \textrm{S}_{prior} would advise \textrm{S}_{posterior} to have if S were to learn B and only B.

    We can see the argument for conditionalization as follows, then:

    1. If a rational agent S undergoes a learning experience in which the only new information she acquires is that B is certainly true, then S will adopt the posterior credences that \textrm{S}_{prior} would advise \textrm{S}_{posterior} to have if S were to learn B and only B.

    2. The posterior credences that S would advise herself to have if she were to learn B and only B are the credences that are, according to S’s prior best estimates, most accurate while counting B as certain.

    3. The credences that are, according to S’s prior best estimates, most accurate while counting B as certain, are c(\cdot |B).

    4. Hence, if a rational agent S undergoes a learning experience in which the only new information she acquires is that B is certainly true, then S will adopt c(\cdot |B).

    Rather than focusing on Condi, I think we should focus on premise 1, which we might contort into Condi*.

    Condi*: if you’re rational, then c(A|B) = the confidence that \textrm{you}_{prior} would advise \textrm{you}_{posterior} to have in A if you were to learn B and only B.

    It seems that the argument goes through untouched using Condi*. Consider c(D_{100}|A). How confident would \textrm{you}_{prior} advise \textrm{you}_{posterior} to be in D_{100} if you were to learn that you pressed button A *and only* that? \textrm{You}_{prior} could still reason as follows: given that I pressed A, it was equally likely that $0 would be deposited as that $100 would be. And if nothing was deposited, it was extremely likely — 99/100-likely — that I would learn that I pressed A. If $100 was deposited, however, it was extremely unlikely — 1/100-likely — that I would learn that I pressed A. So, in the interest of accuracy, I (\textrm{me}_{prior}) advise \textrm{me}_{posterior} to be extremely confident that nothing was deposited — 99/100-confident, to be precise — and to have only the slightest shred of confidence (viz. 1/100) that $100 was deposited.

    Hence, by Condi*, c(D_{100}|A)=1/100. Notice that this doesn’t rely illicitly on the information that you learned that you pressed button A. Sure, that information figures into the reasoning of \textrm{you}_{prior}. But \textrm{you}_{prior} needn’t suppose that \textrm{you}_{posterior} ever *learns* that information. To highlight this, we could complicate the case so that opening the envelope will render you temporarily incapable of thinking about your own perceptual states, doxastic attitudes, etc. Then you will learn that you pressed button A, but you won’t learn that you learned that you pressed button A, or that you saw a note saying that you pressed button A, or anything of the sort. In this case, \textrm{you}_{prior} would still advise \textrm{you}_{posterior} to be 99/100-confident that nothing was deposited and 1/100-confident that $100 was deposited. But I don’t think any such complication is necessary.
