Syndicate this site: (RSS)

Best of Seven

My father asked a fairly straight forward question last week; "How do you calculate the probability of winning a seven game series, given that the probability of winning any individual game is p?"

There are two different ways to calculate it. The process that seems most like the real problem is to work out the odds of winning 3 of 3 and game 4, 3 of 4 and game 5, 3 of 5 and game 6, 3 of 6 and game 7 - the sum of these values is the likelyhood of winning a best of 7.

On the other hand, you can simply sum the probability of winning 4 of 7, 5 of 7, 6 of 7, 7 of 7, and get the same result.

But looking at the formulas, it wasn't obvious to me that had to be so.

First, a hand wavy demonstration. Let P(n) be the probability of winning the 4th game in game n. Let Z(x:y) be the probability of winning y games out of z. Logic alone should confirm for you that Z(x:0) + Z(x:1) + ... + Z(x:x) = 1 for any value x - if you play x games without draws, winning some number of games between 0 and x is inevitable.

Starting from the first equation

W
= P(4) + P(5) + P(6) + P(7)
= P(4)*1 + P(5)*1 + P(6)*1 + P(7)*1
= P(4)*[Z(3:0)+Z(3:1)+Z(3:2)+Z(3:3)] + P(5)*[Z(2:0)+Z(2:1)+Z(2:2)] + P(6)*[Z(1:0)+Z(1:1)]+P(7)
= P(4)Z(3:0) + P(5)Z(2:0) + P(6)Z(1:0) + P(7) + P(4)Z(3:1) + P(5)Z(2:1)+P(6)Z(1:1) + P(4)Z(3:2) + P(5)Z(2:2) + P(4)Z(3:3)
= Z(7:4) + Z(7:5) + Z(7:6) + Z(7:7)

In prose, starting from the probability of winning the series in exactly n games, you expand the appropriate representation of unity, regroup by the total number of wins, and end up with the sum of winning n games in 7.

So we can see that it works, though the above is a bit informal. It could, of course, be fleshed out into something rigorous. Instead I attacked the problem from the equations suggested by the statements of the different approaches, and tried to demonstrate that they were equivalent.

To do this, I leaned heavily on Feller's Introduction to Probability Theory and Its Applications, which dedicates a chapter to Combinatorial Analysis. This book has turned out to be quite handy, an excellent recommendation by... whoever it was that suggested it to me.

Notation: I'm not happy with html for expressing sums, so I'll use [x:y,z] to indicate a sum where index x ranges from y to z.

OK, what is the probability of winning at least n games out of 2n-1? The first step is to write out the expression, in terms of probabilities p and q. Of course, q is simply (1-p), so an expansion of that using the binomial theorem will get us to the first resting spot

= [x:n,2n-1] Z(2n-1:x)p^x q^(2n-1-x)
= [x:n,2n-1] Z(2n-1:x)p^x (1-p)^^(2n-1-x)
= [x:n,2n-1] Z(2n-1:x)p^x [y:0,2n-1-x] Z(2n-1-x:y)(-p)^y
= [x:n,2n-1] [y:0,2n-1-x] Z(2n-1:x)p^x Z(2n-1-x:y)(-p)^y
= [x:n,2n-1] [y:0,2n-1-x] Z(2n-1:x) Z(2n-1-x:y)p^(x+y)(-1)^y

The next step is to recombine those two Z() expressions. To do so, we take advantage of the following identity:

Z(a:x)Z(a-x:y)
= a!/(x!(a-x)!) * (a-x)!/(y!(a-x-y)!)
= a!/x! * (a-x)!/(a-x)! * 1!/(y!(a-x-y)!)
= a!/x! * (x+y)!/(x+y)! * 1!/(y!(a-x-y)!)
= a!/((x+y)!(a-x-y)!) * (x+y)!/x!y!
= Z(a:x+y)Z(x+y:x)

That turns out to be very handy if you can recombine Z() so that the expression does not depend on the indexes of the summation - you can then pull it through.

= [x:n,2n-1] [y:0,2n-1-x] Z(2n-1:x) Z(2n-1-x:y)p^(x+y)(-1)^y
= [x:n,2n-1] [y:0,2n-1-x] Z(2n-1:x+y) Z(x+y:y)p^(x+y)(-1)^y
= [x:n,2n-1] [y:0,2n-1-x] Z(2n-1:x+y) Z(x+y:x)p^(x+y)(-1)^y

p can have any value in the range 0 to 1. So to demonstrate that the two formulas are the same, what we need to do is demonstrate that the powers of p have the same coefficients. It makes sense then to regroup the expression above so that like powers of p are together. So we look at k=x+y; notice that to achieve k, given that y is non negative, we can restrict x to the range n to k.

= [x:n,2n-1] [y:0,2n-1-x] Z(2n-1:x+y) Z(x+y:x)p^(x+y)(-1)^y
= [k:n,2n-1][x:n,k] Z(2n-1:k) Z(k:x)p^(k)(-1)^(k-x)
= [k:n,2n-1]Z(2n-1:k)p^(k)(-1)^k [x:n,k]Z(k:x)(-1)^(-x)
= [k:n,2n-1]Z(2n-1:k)p^(k)(-1)^k [x:n,k]Z(k:x)(-1)^(x)

The last summation is eliminated by the following tricks. First, we notice that [x:n,k] is really [x:0,k]-[x:0,n-1]. But by the binomial theorem

[x:0,k]Z(k:x)(-1)^x
= (1 + (-1) )^x
= 0 ^x
=0

Therefore, we can make the following change:

= [k:n,2n-1]Z(2n-1:k)p^(k)(-1)^k [x:n,k]Z(k:x)(-1)^(x)
= [k:n,2n-1]Z(2n-1:k)p^(k)(-1)^k (-1)[x:0,n-1]Z(k:x)(-1)^(x)

We need a pair of identities to get rid of the last bit

Z(x:r-1) + Z(x:r)
= x!/((r-1)!(x-r+1)!) + x!/(r!(x-r)!)
= rx!/((r)!(x-r+1)(x-r)!) + x!/(r!(x-r)!)
=(r/(x-r+1)+1)(x!/(r!(x-r)!)
=((r+x-r+1)/(x-r+1))(x!/(r!(x-r)!)
= (x+1)/(x-r+1) * (x!/(r!(x-r)!)
= (x+1)!/(r!(x+1-r)!
= Z(x+1:r)

Now, since Z(x:0) = Z(x-1:0) = 1, and by the above Z(x,v) = Z(x-1,v-1)+Z(x-1,v), we conclude

[v:0,n](-1)^vZ(a:v)
= Z(a-1:0) + [v:1,n](-1)^vZ(a:v)
= Z(a-1:0) + [v:1,n](-1)^vZ(a-1:v-1)+Z(a-1,v)
= (-1)^nZ(a-1:n)
The leading term of the sum cancels out the previous....
= [k:n,2n-1]Z(2n-1:k)p^(k)(-1)^k (-1)[x:0,n-1]Z(k:x)(-1)^(x)
= [k:n,2n-1]Z(2n-1:k)p^(k)(-1)^k (-1)Z(k-1:n-1)(-1)^(n-1)
= [k:n,2n-1]Z(2n-1:k)p^(k)(-1)^k Z(k-1:n-1)(-1)^(n)
= [k:n,2n-1]Z(2n-1:k)Z(k-1:n-1)(-1)^(k+n)p^(k)

The alternative approach will get to the same answer - a bit more quickly this time, because the identities have been established.

[x:n,2n-1]Z(x-1,n-1)p^nq^(x-n)
= [x:n,2n-1]Z(x-1,n-1)p^n [y:0,x-n](-p)^yZ(x-n:y)
= [x:n,2n-1][y:0,x-n]Z(x-1,n-1)Z(x-n:y)p^(n+y) (-1)^y
= [k:n,2n-1][x:k,2n-1]Z(x-1,n-1)Z(x-n:k-n)p^(k) (-1)^(k-n)
= [k:n,2n-1][x:k,2n-1]Z(x-1,k-1)Z(k-1:n-1)p^(k) (-1)^(k-n)
= [k:n,2n-1]Z(k-1:n-1)p^(k) (-1)^(k-n)[x:k,2n-1]Z(x-1:k-1)
= [k:n,2n-1]Z(k-1:n-1)p^(k) (-1)^(k-n)[v:0,2n-1-k]Z(k+v-1:k-1)

Almost home, we just need one more twist on a previous identity

[v:0,n]Z(k+v-1:k-1)
=Z(k-1:k-1) + [v:1,n]Z(k+v-1:k-1)
=Z(k:k) + [v:1,n]Z(k+v:k) - Z(k+v-1:k)
=Z(k+n:k)

And cashing out, taking advantage of the fact that n is an integer in the final step.

= [k:n,2n-1]Z(k-1:n-1)p^(k) (-1)^(k-n)[v:0,2n-1-k]Z(k+v-1:k-1)
= [k:n,2n-1]Z(k-1:n-1)p^(k) (-1)^(k-n)Z(2n-1:k)
= [k:n,2n-1]Z(2n-1:k)Z(k-1:n-1) (-1)^(k-n) p^(k)
= [k:n,2n-1]Z(2n-1:k)Z(k-1:n-1) (-1)^(k+n) p^(k)


May 31, 2004 8:53 AM | TrackBack

Comments

Wow, this seems way more complicated that I believe it is. The chance of winning a seven game series given probability p of winning any one game can be obtained simply by the binomial distribution. It's the probability of winning at least four games in a seven game series. You just include the probability of winning 5 of 7, 6 of 7 and 7 of 7, even though you wouldn't play the extra games.

Comment by: David Pinto June 2,2004

Exactly right, David, that's what all this mess proves.

Comment by: Danil June 2,2004
Post a comment




Who are you?