
Another approach to guessing

I ended up attacking this yet again after my walk today, in the hopes of making further progress, and sure enough a few more things broke loose.

Again, the object of the exercise is to come up with a reasonable form of the solution, which has all the right properties, and enough flexibility left over to allow fitting.

Back to square one. To keep the math simple, I rescale the axes so that a team scoring an average number of runs and winning an average number of games is plotted as (0,0), and a team which scores all the runs and wins all the games is plotted as (1,1). We are looking for a function F, such that


F(0) = 0
F(1) = 1

This can be achieved in a straightforward manner: we simply define the function F in terms of some other function G, in such a way that F is guaranteed to have the correct properties, and then explore G.


F(x) = ( G(x) - G(0) ) / ( G(1) - G(0) )

What else do we know? F should be odd, therefore G(x) - G(0) is also odd. G(0) is a constant, so in theory G(x) could be some odd function plus a constant, but since we are just going to subtract it out anyway we might as well take that constant to be 0. G itself is then odd, and F is a touch simpler


F(x) = G(x) / G(1)
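Here is a minimal Python sketch of that construction. The name `normalize` and the particular G (an arctangent plus a constant) are my own hypothetical choices, used only to show that the endpoints are pinned no matter which G goes in.

```python
import math

def normalize(g):
    """Build F(x) = (G(x) - G(0)) / (G(1) - G(0)) from any G with G(1) != G(0)."""
    g0, g1 = g(0.0), g(1.0)
    if g1 == g0:
        raise ValueError("G(1) must differ from G(0)")
    return lambda x: (g(x) - g0) / (g1 - g0)

# Hypothetical G: an odd function plus a constant, chosen only for illustration.
f = normalize(lambda x: math.atan(3.0 * x) + 5.0)

print(f(0.0), f(1.0))      # endpoints are fixed by the construction
print(f(0.4), -f(-0.4))    # F is odd because G(x) - G(0) is odd
```

The constant 5.0 drops out of F entirely, which is why it is safe to take it to be 0.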

In the low run limit, F should be a line of slope 1. In the high run limit, F should look like a step function: +1 for positive numbers, -1 for negative numbers, almost straight up in the middle.

In other words, I'm looking for something like the Heaviside step function: H(x) = 0 when x < 0, H(x) = 1 when x > 0. So the G I'm looking for eventually turns into 2H-1 in the high run limit.

So we inspect the various Heaviside approximators to find one that looks right. We can eliminate at once any which don't fit the boundary conditions: values strictly between 0 and 1, and positive slope everywhere. So don't bother with the sine integral (Si) form, which overshoots and oscillates. It is also worth checking that the low run limit comes out right.

For instance, consider H(x) = 1/ ( 1 + exp(-x/t)). For simplicity, I choose to express N = 1/t, recognizing that N cannot be zero (think of N here as being related to the run environment - runs can be very scarce, but no runs at all makes no sense, so we are ok).


2H(x)-1
= 2 / ( 1 + exp(-Nx) ) - 1
= ( 2 - ( 1 + exp(-Nx) ) ) / ( 1 + exp(-Nx) )
= ( 1 - exp(-Nx) ) / ( 1 + exp(-Nx) )
= ( 1 - ( 1 - Nx + O(N^2) ) ) / ( 1 + exp(-Nx) )
= ( Nx + O(N^2) ) / ( 1 + exp(-Nx) )
≈ Nx / 2

In the last step we take the limit N -> 0. So at very small values of N

F(x)
= (2H(x) - 1 ) / ( 2H(1) - 1 )
= (Nx/2) / ( N/2)
= x

Which matches our low limit condition. Life is good.
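A quick numerical check of both limits, as a sketch in Python. The function name `F` and the sample values of N are my own picks. Note that ( 1 - exp(-Nx) ) / ( 1 + exp(-Nx) ) is the same thing as tanh(Nx/2), which is a handy way to sanity-check the arithmetic.

```python
import math

def F(x, N):
    """F(x) = (2H(x) - 1) / (2H(1) - 1) with H(x) = 1 / (1 + exp(-N x))."""
    g = lambda u: 2.0 / (1.0 + math.exp(-N * u)) - 1.0
    return g(x) / g(1.0)

xs = [-1.0, -0.5, -0.1, 0.1, 0.5, 1.0]
for N in (0.01, 2.0, 50.0):   # illustrative values: tiny, moderate, large
    print(N, [round(F(x, N), 4) for x in xs])
```

With a tiny N the printed row should track x itself, and with a large N it should flatten toward -1 and +1 on either side of the origin.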

How do we choose N? We choose N by picking the value such that the slope of the curve at the origin is correct. It may be possible to actually calculate the desired N value for the appropriate slope. Our small N approximation is really a small Nx approximation, so it also holds for small x at any N: near the origin 2H(x) - 1 behaves like Nx/2, and dividing by 2H(1) - 1 gives the slope of F at the origin:


F'(0)
= (N/2) ( 1 + exp(-N) ) / ( 1 - exp(-N) )

For high offense environments, where the slope is quite steep at the origin, it looks like N = 2F'(0) is a safe approximation. But the slope isn't nearly that steep in baseball, so it makes sense to plug the problem into Excel.
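For anyone without a spreadsheet handy, the same inversion can be sketched in Python with plain bisection. This is a stand-in for a solver, not a description of what the spreadsheet does; the names `slope_at_origin` and `solve_N`, the bracket, and the sample slopes are all assumptions of mine.

```python
import math

def slope_at_origin(N):
    """F'(0) = (N/2) (1 + exp(-N)) / (1 - exp(-N)), valid for N > 0."""
    return (N / 2.0) * (1.0 + math.exp(-N)) / (1.0 - math.exp(-N))

def solve_N(target_slope, lo=1e-6, hi=100.0, tol=1e-10):
    """Bisection for the N whose origin slope equals target_slope (> 1)."""
    if not slope_at_origin(lo) < target_slope < slope_at_origin(hi):
        raise ValueError("target slope outside the bracketed range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope_at_origin(mid) < target_slope:
            lo = mid      # slope too shallow, so N must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

for s in (1.5, 3.0, 10.0):    # hypothetical target slopes
    # exact root next to the steep-slope shortcut N = 2 F'(0)
    print(s, solve_N(s), 2.0 * s)
```

Bisection works here because the slope grows steadily with N. The shortcut always lands a bit above the exact root, and the gap shrinks as the slope gets steeper.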

From my previous notes, in a 9.7 run environment, the slope is about 1.78, which Solver claims comes from an N value of 2 (a Pythagorean coincidence, I hope). So the proper equation would look like


F(x) = ( 1 - exp(-2x) ) ( 1 + exp(-2) ) / ( ( 1 + exp(-2x) ) ( 1 - exp(-2) ) )

Can we do better? Sure - this choice of Heaviside approximation was arbitrary. The next step is to examine several; for each, set the slope appropriately, then measure which is the best fit along the entire curve.
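As a rough sketch of how that comparison could be set up, here is some Python that tunes several step-like shapes to the same origin slope. The candidate list, the function names, and the numerical tuning are all my own assumptions, and there is no real win/run data in it, so it only shows how differently the shapes bend once the slope is matched. Whatever scale values it prints come from this sketch's own slope matching, not from the spreadsheet.

```python
import math

# Hypothetical candidates: each is odd, increasing, and levels off for large |u|.
CANDIDATES = {
    "logistic":  lambda u: math.tanh(u / 2.0),          # same as 2H - 1
    "arctan":    lambda u: math.atan(u),
    "erf":       lambda u: math.erf(u),
    "algebraic": lambda u: u / math.sqrt(1.0 + u * u),
}

def make_F(g, k):
    """F(x) = G(k x) / G(k), which pins F(0) = 0 and F(1) = 1."""
    return lambda x: g(k * x) / g(k)

def origin_slope(g, k, h=1e-6):
    """Central-difference estimate of F'(0)."""
    F = make_F(g, k)
    return (F(h) - F(-h)) / (2.0 * h)

def tune_k(g, target, lo=1e-3, hi=1e3, steps=200):
    """Bisect on the scale k until the origin slope matches target."""
    for _ in range(steps):
        mid = math.sqrt(lo * hi)   # geometric midpoint, since k spans decades
        if origin_slope(g, mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

target = 1.78   # the origin slope quoted above
for name, g in CANDIDATES.items():
    k = tune_k(g, target)
    F = make_F(g, k)
    print(name, round(k, 4), [round(F(x), 4) for x in (0.25, 0.5, 0.75)])
```

With real data, the last loop would be the place to compute a fit error for each candidate instead of printing sample points.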

Tune in next time for those results.

September 11, 2004 10:11 AM
