Pythag win predictors - fitted coefficients

Excel was used to fit the equation below, using the tricks previously described. The form of each coefficient was chosen by first drawing a plot and seeing which trendline gave the right shape. The data points used for the fitting cover scoring rates from 0.1 to 20 runs in increments of 1/10 of a run (we assume that Tango's distribution continues to hold at this level of offense). That makes 19,900 or so data points in all; the mirror-image points were not needed, as the reflections are known to be anti-symmetric.
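The point count can be sanity-checked with a short sketch (Python here, not the Excel sheet itself). It assumes each data point is a pair of distinct scoring rates on the 0.1 to 20 grid, keeping only one side of the RS = RA diagonal; that reading of the anti-symmetry remark is an assumption, not something the post spells out.

```python
from itertools import combinations

# Scoring rates from 0.1 to 20.0 runs in steps of 0.1 (200 values).
# Built from integer tenths to avoid floating-point drift in the grid.
rates = [k / 10 for k in range(1, 201)]

# Assumption: one data point per unordered pair of distinct rates,
# i.e. one side of the RS = RA diagonal; the other side is its mirror image.
pairs = list(combinations(rates, 2))

print(len(rates))   # 200
print(len(pairs))   # 19900
```

200 rates give 200 * 199 / 2 = 19,900 pairs, which matches the count quoted above.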

W / Y
= .5232 X ^ -0.7234
- .0603 X ^ -1.8241 Y^2
+ .0036 X ^ -2.8565 Y^4
- .00008 X ^ -3.3888 Y^6
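For anyone who would rather evaluate the fit outside Excel, here is a minimal sketch. X and Y are not defined in this post, so the code assumes they are the rotated coordinates (RS+RA)/sqrt(2) and (RS-RA)/sqrt(2), which is consistent with the 9.72 runs per game to X = 6.8731 conversion used further down. It also takes W to be winning percentage minus .500, as in the linear predictor. The function name is made up for illustration.

```python
import math

# Fitted terms of W / Y as (coefficient, exponent on X, power of Y).
TERMS = [
    (0.5232, -0.7234, 0),
    (-0.0603, -1.8241, 2),
    (0.0036, -2.8565, 4),
    (-8e-5, -3.3888, 6),
]

def wins_above_500(rs, ra):
    """Estimated winning percentage minus .500, from runs scored/allowed per game."""
    # Assumption: X and Y are the run totals rotated by 45 degrees.
    x = (rs + ra) / math.sqrt(2)
    y = (rs - ra) / math.sqrt(2)
    # The series gives W / Y, so multiply through by Y at the end.
    return y * sum(c * x**a * y**p for c, a, p in TERMS)

print(round(0.5 + wins_above_500(5.0, 4.0), 3))
```

Because only even powers of Y appear in W / Y, swapping RS and RA flips the sign of W, which is the anti-symmetry the fit relies on.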

There's nothing magic about using 4 terms here; it is purely a matter of convenience. If we were to be truly rigorous, there would be one last term to ensure that W/X = .5.

As a check, let's consider the linear predictor derived by Ben Vollmayr-Lee.

p = 1/2 + n * ( x - 1/2 )
p - 1/2 = n * ( RS/(RS+RA) - 1/2 )
W = n/2 (RS-RA)/(RS+RA)
W = n/2 Y/X
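The rearrangement above is easy to check numerically. This sketch writes the linear predictor both ways, with W taken as p - 1/2; the function names are hypothetical, and n is passed in rather than fixed.

```python
def linear_win_pct(rs, ra, n):
    """Linear predictor: p = 1/2 + n * (x - 1/2), with x = RS / (RS + RA)."""
    x = rs / (rs + ra)
    return 0.5 + n * (x - 0.5)

def linear_excess(rs, ra, n):
    """Same predictor rearranged: W = p - 1/2 = (n/2) * (RS - RA) / (RS + RA)."""
    return (n / 2) * (rs - ra) / (rs + ra)

# Example: a team scoring 5 and allowing 4 runs per game, with an illustrative slope.
p = linear_win_pct(5.0, 4.0, n=1.8)
w = linear_excess(5.0, 4.0, n=1.8)
print(round(p, 3), round(w, 3))   # 0.6 0.1
```

Note that Y/X equals (RS-RA)/(RS+RA) for any common scaling of X and Y, so the last line of the derivation does not depend on how the two are normalized.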

He found a best fit of n = 1.819. Taking only the first-order term of the fitted estimator and rearranging:

W
= .5232 X ^ -0.7234 Y
= (.5232 X ^ 0.2766) (Y/X)
= (1.0464 X ^ 0.2766)/2 (Y/X)
n = 1.0464 X ^ 0.2766

If we look at the 2003 AL, the scoring average was roughly 9.72 runs per game. This translates to an X value of 6.8731 (9.72/√2), which gives n = 1.783, within about 2% of the 1.819 best fit.
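The league check can be reproduced in a few lines. This is a sketch under two assumptions: that X is (RS+RA)/sqrt(2), which is what turns 9.72 into 6.8731, and that the slope is read off the first-order term as n = 2 * .5232 * X^(1 - 0.7234). The function name `slope_n` is made up for illustration.

```python
import math

def slope_n(x, coeff=0.5232, exponent=-0.7234):
    """First-order slope n implied by W = coeff * X**exponent * Y = (n/2) * (Y/X)."""
    # Matching the two forms gives n/2 = coeff * X**(exponent + 1).
    return 2 * coeff * x ** (exponent + 1)

runs_per_game = 9.72                # 2003 AL scoring average quoted in the post
x = runs_per_game / math.sqrt(2)    # assumption: X = (RS + RA) / sqrt(2)

print(round(x, 4))                  # 6.8731
print(round(slope_n(x), 3))
```

Since the exponent on X is positive, n creeps upward with the scoring environment, so a single fitted n only holds near the league average it was fitted on.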

The coefficients associated with X are likely some function of the index of the term in the series, the number of innings in the game, and perhaps also Tango's magic coefficient. You could work it out by dickering with the input data.

June 26, 2004 12:33 AM
