I ended up attacking this yet again after my walk today, in the hopes of making further progress, and sure enough a few more things broke loose.
Again, the object of the exercise is to come up with a reasonable form of the solution, which has all the right properties, and enough flexibility left over to allow fitting.
Back to square one. To keep the math simple, I rescale the axes so that a team scoring an average number of runs and winning an average number of games is plotted as (0,0), and a team which scores all the runs and wins all the games is plotted as (1,1). We are looking for a function F, such that
F(0) = 0
F(1) = 1
This can be achieved in a straightforward manner; we simply define the function F in terms of some other function G, in such a way that F is guaranteed to have the correct properties, and then explore G.
F(x) = ( G(x) - G(0) ) / ( G(1) - G(0) )
What else do we know? F should be odd, therefore G(x) - G(0) is also odd. G(0) is a constant, so in theory G(x) could be some odd function plus a constant, but since we are just going to subtract it out anyway we might as well take that constant to be 0. G itself is then odd, and F is a touch simpler
F(x) = G(x) / G(1)
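The normalization above can be sketched in a few lines of Python. The helper name `normalize` and the placeholder G (x^3 + x, chosen only because it is odd and increasing) are my own illustrative assumptions, not anything from the notes.

```python
def normalize(G):
    """Build F(x) = G(x) / G(1) from an odd function G (so G(0) = 0)."""
    g1 = G(1.0)
    if g1 == 0:
        raise ValueError("G(1) must be nonzero")
    return lambda x: G(x) / g1

# Placeholder G, purely illustrative: any odd, increasing function works here.
F = normalize(lambda x: x ** 3 + x)

# F passes through (0, 0) and (1, 1) and inherits oddness from G.
print(F(0.0), F(1.0), F(-1.0))
```

Whatever odd G gets plugged in, the two boundary conditions come for free, which is the point of defining F this way.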
In the low run limit, F should be a line of slope 1. In the high run limit, F should look like a step function - +1 for positive numbers, -1 for negative numbers, almost straight up in the middle.
In other words, I'm looking for something like the Heaviside Step Function, which looks like a step: H(x) = 0 when x < 0, H(x) = 1 when x > 0. So the G I'm looking for eventually turns into 2H-1 in the high run limit.
So we inspect the various Heaviside approximators to find one that looks right. We can eliminate at once any which don't fit the boundary conditions - strictly between 0 and 1, and positive slope everywhere. So don't bother with Si. It may also be worth checking that the low limit is correct.
For instance, consider H(x) = 1/ ( 1 + exp(-x/t)). For simplicity, I choose to express N = 1/t, recognizing that N cannot be zero (think of N here as being related to the run environment - runs can be very scarce, but no runs at all makes no sense, so we are ok).
2H(x)-1
= 2 / ( 1+exp(-Nx) ) - 1
= ( 2 - ( 1 + exp(-Nx) ) ) / ( 1 + exp(-Nx) )
= ( 1 - exp(-Nx) ) / ( 1 + exp(-Nx) )
= ( 1 - ( 1 - Nx + O(N^2) ) ) / ( 1 + exp(-Nx) )
= ( Nx + O(N^2) ) / ( 1 + exp(-Nx) )
= Nx / 2 + O(N^2), since 1 + exp(-Nx) tends to 2 for small N
F(x)
= (2H(x) - 1 ) / ( 2H(1) - 1 )
= (Nx/2) / ( N/2)
= x
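A quick numerical check of both limits, as a rough sketch. It relies on the identity 2H(x) - 1 = tanh(Nx/2) for the logistic H, which avoids overflow in exp for large N; the sample N and x values are arbitrary choices of mine.

```python
import math

def F(x, N):
    """F(x) = (2H(x) - 1) / (2H(1) - 1) with H(x) = 1 / (1 + exp(-N x)).

    Uses 2H(x) - 1 = tanh(N x / 2), which is the same function written
    in a numerically safer form.
    """
    return math.tanh(N * x / 2.0) / math.tanh(N / 2.0)

# Small N should give back the line F(x) = x; large N should look like a step.
for N in (0.01, 1.0, 10.0, 100.0):
    print(N, [round(F(x, N), 4) for x in (0.1, 0.5, 0.9)])
```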
How do we choose N? We pick the value for which the slope of the curve at the origin is correct. It may be possible to calculate the desired N directly: the expansion 2H(x) - 1 = Nx/2 used above holds for small x as well as for small N, so the slope at the origin must be:
F'(0)
= (N/2) ( 1 + exp(-N) ) / ( 1 - exp(-N) )
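The slope formula has no closed-form inverse, but it increases steadily with N, so a plain bisection will recover N from a target slope. This is a minimal sketch of that idea, useful for cross-checking a Solver result; the function names, the bracket [1e-6, 50] and the example target of 1.5 are all assumptions of mine.

```python
import math

def slope_at_origin(N):
    """F'(0) = (N/2) (1 + exp(-N)) / (1 - exp(-N)), for N > 0."""
    # -expm1(-N) equals 1 - exp(-N) but stays accurate for tiny N.
    return (N / 2.0) * (1.0 + math.exp(-N)) / (-math.expm1(-N))

def solve_N(target_slope, lo=1e-6, hi=50.0, tol=1e-10):
    """Bisect for the N whose slope at the origin equals target_slope.

    The slope tends to 1 as N shrinks and grows with N, so the target
    must be greater than 1 and below slope_at_origin(hi).
    """
    if not slope_at_origin(lo) < target_slope < slope_at_origin(hi):
        raise ValueError("target slope is outside the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope_at_origin(mid) < target_slope:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Arbitrary example target; substitute the slope measured from real data.
N = solve_N(1.5)
print(N, slope_at_origin(N))
```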
From my previous notes, in a 9.7 run environment, the slope is about 1.78, which Solver claims comes from an N value of 2 (a Pythagorean coincidence, I hope). So the proper equation would look like
F(x) = (1 - exp(-2x)) (1 + exp(-2)) / ( (1 + exp(-2x)) (1 - exp(-2)) )
Can we do better? Sure - this choice of Heaviside approximation was arbitrary. The next step is to examine several; for each, set the slope appropriately, then measure which is the best fit along the entire curve.
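One way that comparison might be set up, as a sketch only. The three candidate shapes (logistic, arctan, erf) are my own picks of smooth odd step approximations, not a list from these notes, and `sse` expects the caller to supply real (x, y) data; none is invented here.

```python
import math

# Odd sigmoid shapes g(u), each paired with its derivative at u = 0.
# Illustrative selection only (an assumption, not the author's list).
CANDIDATES = {
    "logistic": (lambda u: math.tanh(u / 2.0), 0.5),
    "arctan": (math.atan, 1.0),
    "erf": (math.erf, 2.0 / math.sqrt(math.pi)),
}

def origin_slope(g, dg0, N):
    """Slope at x = 0 of F(x) = g(N x) / g(N)."""
    return N * dg0 / g(N)

def match_slope(g, dg0, target, lo=1e-6, hi=200.0):
    """Bisect for N so that F'(0) equals target (needs target > 1)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if origin_slope(g, dg0, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def make_F(g, N):
    """Return F(x) = g(N x) / g(N), which satisfies F(0) = 0 and F(1) = 1."""
    return lambda x: g(N * x) / g(N)

def sse(F, points):
    """Sum of squared errors of F over caller-supplied (x, y) pairs."""
    return sum((F(x) - y) ** 2 for x, y in points)
```

With the slope fixed, each candidate gets its own N from `match_slope`, and the one with the smallest `sse` over the observed points would be the best fit along the whole curve.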
Tune in next time for those results.
September 11, 2004 10:11 AM