Dosage: Pedigree &
The Relationship Between Time, Distance
Evidence for a Record-Breaking Preakness
The following is an update of an unpublished article originally written in 1982 about the fatigue characteristics of race horses. It presents a mathematical model of fatigue in the race horse and applies it to a real-world situation that many racing fans will recall. As an example, the methodology is applied to Secretariat's Preakness Stakes in an effort to confirm mathematically what we already know - that he broke the track record despite the "official" result. The article is technical in nature, but hopefully the principles behind the numbers and equations will be clear. The intent is to show that time, distance, and fatigue are inextricably intertwined.
At 5:40 PM Eastern Daylight Time, in Maryland on a Saturday in May, a chestnut colt of imposing physical and historical stature broke from the gate and leisurely settled in behind the other horses racing to the clubhouse turn. Within a few seconds, the colt would initiate a move of such spectacular proportions that he would circle the entire field and take command entering the backstretch, to be hand-ridden the rest of the way to an overwhelming victory. In one of racing’s most memorable displays of power and grace, Secretariat had won the second leg of the Triple Crown on his march toward immortality. A malfunctioning teletimer clicked off 1:55 for the mile and three-sixteenths, one second slower than Canonero II’s mark set two years earlier. An immediate controversy arose as two Daily Racing Form clockers had separately timed the race one and three-fifths seconds faster. A debate followed, and evidence was presented in an attempt to resolve the dispute. Two days after the race it was learned that the official track clocker had timed the race manually, catching Secretariat under the wire in 1:54.2, still over the track mark. Pimlico officials compromised and lowered the time to that recorded by the track clocker. To this day, the result chart of the race lists the track time as official but includes, parenthetically, the faster Daily Racing Form time.
Secretariat’s reputation as a runner is not challenged if the official time is correct, just as Mark Spitz’s reputation as a swimmer would be secure had he gotten only six world records rather than seven in his sweep of the 1972 Olympics. On the other hand, a record Preakness in 1:53.2 combined with the record-shattering performances in the Derby and the Belmont would elevate the 1973 Triple Crown to an even higher plane of unparalleled achievement. The purity and romance of the thing is so compelling that it should not be ignored.
Events such as this, and the discussion which surrounds them, add zest to the already colorful world of the turf. Hearts beat more rapidly when people argue whether Codex really did foul Genuine Risk or whether Ruffian would have beaten Foolish Pleasure. But these are judgment calls, never to be resolved. Secretariat’s Preakness is another matter. The time of a race transcends judgment since time is intimately woven into the fabric of physical laws. It exists independently of a malfunctioning electronic timer or stopwatch on any given day. Is it possible, then, to predict the time of a race based on scientific calculations? Perhaps so.
In the spring of 1981, Peter S. Riegel, a research engineer at Battelle Memorial Institute and a competitive long-distance runner, published an article in American Scientist magazine (Volume 69, p 285) in which he derived an equation to express the relationship between distance and the time of world-record performances in several human athletic events. The equation takes the form $T = bD^{M}$ [note the resemblance to the famous Einstein equation $E = mc^{2}$], where T=time, D=distance, b is a constant which provides a measure of relative speed, and M is a "fatigue factor", so-called because its value determines the rate at which average speed changes with distance, and thus the time required to finish a race. A mathematically equivalent form of the equation is $\log T = \log b + M \log D$, which many will recognize as the equation for a straight line. The best way to explain a log (short for logarithm) for those unfamiliar with the term is by example. We know that $10^{2} = 100$ and that $10^{3} = 1000$. A log, in this case, is simply the power to which the number 10 must be raised to get another number. In the examples, log 100=2 and log 1000=3 since these are the powers to which 10 must be raised to get 100 and 1000, respectively. It is not necessary to understand the mathematical significance of the concept. It is more important to understand that logs can be used as mathematical tools and are simply alternative expressions of common numbers.
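As a brief worked step (a sketch in the notation above, not reproduced from Riegel's article), taking base-10 logarithms of both sides of the power law gives the straight-line form, and dividing distance by time shows why M governs how average speed changes with distance:

```latex
\begin{align*}
T &= b\,D^{M} \\
\log T &= \log b + M \log D \\
\frac{D}{T} &= \frac{D^{\,1-M}}{b}
\end{align*}
```

When M is greater than 1, the exponent 1 - M is negative, so average speed falls as distance grows; when M equals 1, average speed is the same at every distance.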
Riegel found that when plotting log T versus log D for a number of events, the results did, in fact, approximate a straight line, confirming the correctness of the equation as a description of the relationship between distance and time. Thus, if one knows the times of races run at various distances, these values can be plotted in a straight line and log b and M can be calculated.
Figure 1 is a graphical representation of the log-log equation. The vertical axis is log T and the horizontal axis is log D. Line A is derived by an operation called linear regression in which the data points (shown as the x-marks along either side of line A) are used to generate the best straight line defined by the points. A measure of how closely the points fit the theoretical line generated is called the correlation coefficient. The maximum value for the correlation coefficient is 1.00000, in which case all of the data points lie exactly on the line. In all other cases where data points deviate from the line, the correlation coefficient is something less than 1.00000 but can be relatively close to it. In a log-log plot, small differences in the correlation coefficient reflect rather large deviations of time and distance from the line. The point at which line A crosses the vertical axis is the intercept, equal to log b in our straight line equation. The "fatigue factor", M, is called the slope of the line and is the change in log T per unit of log D. For example, point 1 on line A is defined by the coordinates $\log T_1$ and $\log D_1$. Similarly, point 2 on line A is defined by the coordinates $\log T_2$ and $\log D_2$. The slope of line A is then the difference between $\log T_2$ and $\log T_1$ divided by the difference between $\log D_2$ and $\log D_1$. If log T increases at the same rate as log D, then the slope is 1. If log T increases twice as fast as log D, then the slope is 2, and so on. The higher the value for M (the greater the slope), the more time it takes to cover an additional distance relative to a lower value of M. Therefore, large slopes imply a greater fatigue with increasing distance. For our purpose it is only important to recognize the concepts of slope and intercept since these two values define the characteristics of any straight line. If we plug these values into the log-log equation, we can calculate the distance associated with any given time, and vice versa.
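The regression described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's original calculation: the function name `fit_loglog` is hypothetical, and the input points are synthetic, generated from the power law with made-up values of b and M purely to show that the fit recovers them.

```python
import math

def fit_loglog(distances, times):
    """Least-squares line through the points (log10 D, log10 T).

    Returns (slope M, intercept log b, correlation coefficient r).
    """
    xs = [math.log10(d) for d in distances]
    ys = [math.log10(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical illustration only: times generated from T = b * D**M
# with made-up values b = 11.0 and M = 1.08 (not real racing data).
b_true, m_true = 11.0, 1.08
distances = [6.0, 8.0, 10.0, 12.0]                  # furlongs
times = [b_true * d ** m_true for d in distances]   # seconds

slope, intercept, r = fit_loglog(distances, times)
print(f"M = {slope:.4f}, b = {10 ** intercept:.4f}, r = {r:.5f}")
```

Because the synthetic points lie exactly on a power-law curve, the fitted slope and intercept reproduce the chosen M and log b, and the correlation coefficient comes out at its maximum. With real race times the points scatter about the line and the coefficient falls slightly below 1.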
Figure 1 also contains a second line, B. The reader will note that line B is parallel to line A but has a higher intercept. The implication here is that, if lines A and B represent the time-distance equations derived from separate races for two runners, both have the same value for M (the same slope) or the same "fatigue factor". They differ only in terms of relative speed (different intercepts). The runner represented by line A is faster at all distances along the log D scale. The key to comprehension of the differences in the two lines lies in the fact that the further up the log T scale we go, the slower the time, while the further to the right we go on the log D scale, the longer the distance.
Whereas Mr. Riegel’s interests lie in human racing, mine lie in horse racing, and we have applied his equation to American records on dirt and grass to see if the correlations hold. Tables 1 and 2 list the respective records and include the slopes, intercepts, and correlation coefficients for the lines produced by the data via regression analysis. The fit in both cases is outstanding, with correlation coefficients in excess of 0.9999. In fact, over the range from five furlongs to one and one-half miles, no American dirt record deviates by more than two-fifths of a second, and no American turf record deviates by more than three-fifths of a second, from the times predicted by the lines. The applicability of the equation to horse racing is, therefore, confirmed.
TABLE 1. BEST STRAIGHT LINE DERIVED FROM AMERICAN DIRT RECORDS
|DISTANCE||ACTUAL TIME||PROJECTED TIME|
TABLE 2. BEST STRAIGHT LINE DERIVED FROM AMERICAN TURF RECORDS
|DISTANCE||ACTUAL TIME||PROJECTED TIME|
What, you may ask, does any of this have to do with Secretariat’s Preakness? Well, we have available, in one form or another, records of the time it has taken for individual horses to run various distances. Correlations may be derived from the information analogous to those from the American records just discussed. From the correlation we may predict what time it should take to cover a given distance based on demonstrated ability at other distances. Furthermore, fractional time, or pace, is another set of data points relating time to distance, but now in one race. Similar correlations here may also lead to some predictive capability. In the following discussion, we will attempt to show that on the basis of overall form and pace, Secretariat did actually set a new track and Preakness record, and in so doing, completed the greatest three-race feat in the history of American racing.
Before presenting arguments in support of the hypothesis, I would emphasize that the slope and intercept obtained from a set of time-distance data points are uniquely characteristic of a horse’s performance. They are a model of his performance over a number of races through analysis of several races run at different distances or they define his performance in one race through an analysis of the pace. The method then becomes a powerful tool for comparison of time-distance relationships among different horses or for one horse in different races. A relative measure of speed, as well as an absolute indication of running style, is contained in the values of the slope and intercept.
In Figure 2 we have other examples which illustrate the point. Represented in this graph are three lines, A, B, and C, defined by the time-distance equations for three horses, A, B, and C. The slope of line B is greater than the slope for line A while the intercepts are the reverse. Thus, when comparing horse A to horse B we conclude that to the left of the crossover point of the two lines (at shorter distances), horse B is faster. To the right of the crossover point (at longer distances), horse A is faster. There may be some relationship here to competitive distance potential.
Let us assume that horses B and C are racing against one another over a distance corresponding to the point where lines B and C intersect and that these lines signify the time-distance equations for their pace in the race. It is apparent that they finish in the same time. Furthermore, we can recognize from the different slopes of lines B and C that horse B is a front-runner since he is faster at the intermediate distances while horse C comes from behind. These illustrations provide some idea of the possible utility of this kind of analysis as a mathematical model of performance.
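The crossover points in Figure 2 can also be located algebraically. As a sketch, with subscripts A and B (labels introduced here for illustration) marking each horse's intercept and slope, setting the two log-log lines equal gives the distance $D^{*}$ at which both horses record the same time:

```latex
\begin{align*}
\log b_A + M_A \log D^{*} &= \log b_B + M_B \log D^{*} \\
\log D^{*} &= \frac{\log b_A - \log b_B}{M_B - M_A} \\
D^{*} &= \left(\frac{b_A}{b_B}\right)^{1/(M_B - M_A)}
\end{align*}
```

The expression is defined only when the slopes differ; parallel lines, such as A and B in Figure 1, never cross.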
Table 3 lists the times recorded for three Triple Crown winners - Seattle Slew, Affirmed, and Secretariat - in their best routes as three-year-olds through the Belmont Stakes but excluding the Preakness. We select their best times in winning races in order to ensure a consistency of effort along with optimum form. Since they each won the Preakness, that race would fall in the same category. In addition, we consider only routes so that the discontinuity in time observed by speed handicappers between sprints and routes does not become a complicating factor. For each horse, we solve the equation for the best straight line defined by the data, and from it project a time at the Preakness distance of nine and one-half furlongs.
TABLE 3. FINAL TIME OF PREAKNESS PREDICTED FROM BEST STRAIGHT LINE DETERMINED BY BEST THREE-YEAR-OLD FORM THROUGH THE BELMONT STAKES
|DISTANCE||SEATTLE SLEW||AFFIRMED||SECRETARIAT|
|8 1/2f|| ||1:42.3 4)|| |
|9f||1:47.2 1)||1:48.0 5)|| |
|10f||2:02.1 2)||2:01.1 2)||1:59.2 2)|
|12f||2:29.3 3)||2:26.4 3)||2:24.0 3)|
|9 1/2f (predicted)||1:54.3||1:54.4||1:52.3|
|9 1/2f (actual)||1:54.2 7)||1:54.2 7)||1:54.2 or 1:53.2 7)|
|1) Flamingo Stakes|
|2) Kentucky Derby|
|3) Belmont Stakes|
|4) San Felipe Stakes|
|5) Santa Anita Derby|
|6) Gotham Stakes|
|7) Preakness Stakes|
The results show that the final predicted times for Seattle Slew and Affirmed fall within two-fifths of a second of their actual official times - a very good approximation. Secretariat, by contrast, shows a predicted final time one and four-fifths seconds under the official track time and four-fifths of a second under the Daily Racing Form time. None of Secretariat’s final times in his other routes is more than three-fifths of a second off the time predicted from the line: 1:33.3 for eight furlongs, 1:58.4 for 10 furlongs, and 2:24.2 for 12 furlongs. A deviation of four-fifths of a second is reasonable for the hand timing of the Daily Racing Form clockers, but one and four-fifths seconds seems extremely unlikely. When the 1:54.2 time is added to the data for Secretariat’s other routes, the resulting line has a correlation coefficient of 0.99869; adding the 1:53.2 time instead gives a correlation coefficient of 0.99955, a much better fit.
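The projection method can be illustrated with Seattle Slew's three routes from Table 3. The sketch below is not the author's original computation, and it rests on two assumptions: the digit after the decimal point is read as fifths of a second (consistent with the arithmetic in the text, where 1:55 less one and three-fifths seconds is 1:53.2), and base-10 logarithms are used (the choice of base does not affect the projected time). The function names `to_seconds`, `fit_loglog`, and `predict_seconds` are hypothetical.

```python
import math

def to_seconds(clock):
    """Convert 'M:SS.f' to seconds, reading the final digit as fifths."""
    minutes, rest = clock.split(":")
    seconds, fifths = rest.split(".")
    return int(minutes) * 60 + int(seconds) + int(fifths) / 5

def fit_loglog(distances, times):
    """Least-squares slope M and intercept log b of log10 T against log10 D."""
    xs = [math.log10(d) for d in distances]
    ys = [math.log10(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_seconds(slope, intercept, distance):
    """Time in seconds projected from the fitted line at a given distance."""
    return 10 ** (intercept + slope * math.log10(distance))

# Seattle Slew's routes from Table 3: (furlongs, time in fifths notation).
slew_routes = [(9.0, "1:47.2"), (10.0, "2:02.1"), (12.0, "2:29.3")]
distances = [d for d, _ in slew_routes]
times = [to_seconds(t) for _, t in slew_routes]

slope, intercept = fit_loglog(distances, times)
predicted = predict_seconds(slope, intercept, 9.5)   # Preakness distance
official = to_seconds("1:54.2")

print(f"M = {slope:.4f}, projected 9 1/2f time = {predicted:.2f} s")
print(f"official time = {official:.1f} s, difference = {official - predicted:.2f} s")
```

Under these assumptions the projection lands close to the 1:54.3 listed for Seattle Slew in Table 3, about one-fifth of a second from his official 1:54.2.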
Pace analysis of each of the classics won by the three most recent Triple Crown winners, using the same methodology, also supports the faster Preakness for Secretariat. Table 4 contains the recorded fractions of the three champions in each of the races, including the track and Daily Racing Form times in the 1973 edition of the Preakness. Comparison of correlation coefficients for the nine races reveals that the 1:54.2 Preakness for Secretariat provides the only value below 0.9999; in other words, the poorest fit of the data among all of these races. A 1:53.2 Preakness gives a correlation coefficient right in line with the others. More compelling than better fit, however, is the implied finishing power of Secretariat in his classics. The Preakness, at nine and one-half furlongs, is the only race of the three in which fractions are recorded for the last three-sixteenths of a mile (i.e., at one mile and at the wire). The slower Preakness time requires :18.4 for the last three-sixteenths of a mile while the Daily Racing Form time requires :17.4. Calculation of the running times for the last three-sixteenths of a mile in the Derby and the Belmont (using the best straight line equation for these races) indicates fractions of :17.2 and :18.3, respectively. If Secretariat, noted for his powerful stretch moves, had covered the distance in 1:54.2, he would have had to register a final fraction slower than that for the Belmont, a race which is five-sixteenths of a mile longer than the Preakness. The last fraction of :17.4 associated with the Daily Racing Form time is certainly more consistent with Secretariat’s other classic performances.
TABLE 4. FRACTIONAL TIMES AND BEST STRAIGHT LINE PARAMETERS FOR THE PREAKNESS STAKES
|KENTUCKY DERBY||BELMONT STAKES||PREAKNESS STAKES|
|DISTANCE||SEATTLE SLEW||AFFIRMED||SECRETARIAT||SEATTLE SLEW||AFFIRMED||SECRETARIAT||SEATTLE SLEW||AFFIRMED||SECRETARIAT 1)||SECRETARIAT 2)|
|1) Official final time|
|2) DRF final time|
|3) Calculated from the equation for the best straight-line fit of the data|
|4) Actual time from the 3/16ths pole to the finish|
A third analysis, also using pace, involves prediction of final time based on the best straight line determined by the early fractions. In this case we exclude the final time and project it from the line obtained by consideration of the early calls only - an application of the concept that "pace makes the race". The fractions are those presented in Table 4 and the results are listed in Table 5. Each of Secretariat’s Triple Crown races is included. Also included, for direct comparison, are the Preakness results for Seattle Slew and Affirmed. In every example other than Secretariat’s Preakness, the predicted time is no more than three-fifths of a second off the actual final time. The Daily Racing Form time for Secretariat is just two-fifths of a second from the predicted time, while the official track time deviates by one and two-fifths seconds. The latter result is inconsistent with the other data.
TABLE 5. FINAL TIMES OF CLASSIC RACES PREDICTED FROM BEST STRAIGHT LINE DETERMINED BY EARLY PACE *
|HORSE||RACE||PREDICTED FINAL TIME||ACTUAL FINAL TIME|
|Secretariat||Preakness Stakes||1:53.0||1:54.2 or 1:53.2|
|Seattle Slew||Preakness Stakes||1:53.4||1:54.2|
|* Calculated from early fractions only, excluding final time.|
These analyses don't necessarily prove that the track and stakes records were shattered in the 1973 Preakness. They do provide, however, convincing evidence that such a conclusion is correct. The possibility remains that Secretariat ran a bad race even while winning. The power of his stretch drive may have been an illusion. If so, then Sham, who finished second in a "game effort" according to the result chart and equaled the track record in the Derby while second in that one, must have fallen apart at the end as well. This hypothetical scenario is not a likely one. Secretariat maintained his two and one-half length advantage through the lane while Sham was drawing clear from the remainder of the field to finish eight lengths ahead of the third-place finisher, Our Native. These were convincing, decisive performances. The evidence is strong that the Daily Racing Form clockers came closer to recording the real time of the race than did the Pimlico clocker. Many of us who witnessed the race will always believe that Secretariat swept the Triple Crown with three record-breaking efforts. The techniques outlined above support that belief.