``Best Change'' Rates for All Test Positions
Table 1.4: ``Best Change'' Rates of Belle, Crafty, and DarkThought.

| Search Depth | Belle 1985 | (Standard Error) | Crafty 1997 | (Standard Error) | DarkThought 1998 | (Standard Error) |
|--------------|------------|------------------|-------------|------------------|------------------|------------------|
| 2            | - -        | - -              | 38.78%      | (2.63%)          | 35.28%           | (2.58%)          |
| 3            | - -        | - -              | 36.73%      | (2.60%)          | 39.65%           | (2.64%)          |
| 4            | 33.10%     | (2.23%)          | 30.61%      | (2.49%)          | 31.78%           | (2.51%)          |
| 5            | 33.10%     | (2.23%)          | 30.32%      | (2.48%)          | 29.45%           | (2.46%)          |
| 6            | 27.70%     | (2.12%)          | 27.41%      | (2.41%)          | 24.49%           | (2.32%)          |
| 7            | 29.50%     | (2.16%)          | 24.49%      | (2.32%)          | 21.28%           | (2.21%)          |
| 8            | 26.00%     | (2.07%)          | 22.45%      | (2.25%)          | 25.07%           | (2.34%)          |
| 9            | 22.60%     | (1.98%)          | **18.37%**  | (2.09%)          | 21.57%           | (2.22%)          |
| 10           | **17.70%** | (1.81%)          | 17.20%      | (2.04%)          | 24.20%           | (2.31%)          |
| 11           | 18.10%     | (1.82%)          | 16.62%      | (2.01%)          | **17.49%**       | (2.05%)          |
| 12           | - -        | - -              | 16.91%      | (2.02%)          | 15.45%           | (1.95%)          |
| 13           | - -        | - -              | 14.58%      | (1.91%)          | 16.62%           | (2.01%)          |
| 14           | - -        | - -              | 15.45%      | (1.95%)          | **13.70%**       | (1.86%)          |

Table 1.4 summarizes the ``Best Change'' rates
BC(i) and their estimated standard deviations (standard errors)
as observed in our experiment for all
343 corrected test positions at search depths of 2-14 plies.
The percentages of DARKTHOUGHT closely resemble the corresponding
numbers of CRAFTY from 1997 for the same set of positions
and search depths as well as the numbers of BELLE from 1985 for a
different set of 447 test positions and search depths of 4-11
plies [163]. For the convenience of the reader and in
order to make our subsequent discussions more transparent, we also
include the numbers of BELLE and CRAFTY in
Table 1.4 showing them side-by-side with our own new
data of DARKTHOUGHT.
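
As a rough cross-check of the error columns, the tabulated standard errors are consistent with the usual binomial estimate sqrt(p(1-p)/n) for an observed rate p over n test positions. The sketch below assumes that formula (the text above does not state it explicitly); the function name `standard_error` and the selection of sample entries are illustrative only.

```python
import math


def standard_error(rate, positions):
    """Binomial standard error of an observed rate over `positions` trials.

    Assumption: each test position is treated as an independent trial.
    """
    return math.sqrt(rate * (1.0 - rate) / positions)


# (program, search depth, observed "Best Change" rate, number of test positions)
# Rates and position counts are taken from Table 1.4 and the surrounding text.
samples = [
    ("Belle", 4, 0.3310, 447),
    ("Crafty", 2, 0.3878, 343),
    ("DarkThought", 14, 0.1370, 343),
]

for program, depth, rate, positions in samples:
    error = standard_error(rate, positions)
    print(f"{program:12s} depth {depth:2d}: {rate:.2%} ({error:.2%})")
```

Under this assumption the three sample entries come out at 2.23%, 2.63%, and 1.86%, matching the corresponding cells of Table 1.4.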
The table illustrates that BELLE, CRAFTY, and DARKTHOUGHT
feature very similar ``Best Change'' behaviours on average. This is
quite surprising if you consider the substantial differences of the
three programs regarding such fundamental issues as node expansion,
position evaluation, and search strategy. The experimental results of
DARKTHOUGHT support the pioneering findings of Hyatt and Newborn
at high search depths of 12-14 plies in particular. For these
search depths the ``Best Change'' rates of both CRAFTY and
DARKTHOUGHT stayed range-bound around 16%. As a tentative conclusion
we conjecture that the three columns of Table 1.4 taken
together provide convincing empirical evidence that the very gradual
decreases of the ``Best Change'' rates at high search depths are not
mere artifacts of specific implementations but rather represent a
general phenomenon of chess programs which rely on depth-first
alpha-beta search with iterative deepening. Despite the overall
similarities, however, two numbers of DARKTHOUGHT caught our
attention because they differ notably from those of BELLE and
CRAFTY.
- Drop below 20%.
- The ``Best Change'' rates of both BELLE and CRAFTY dropped to
17%-18% at least one iteration earlier than that of
DARKTHOUGHT (see numbers marked by ** in
Table 1.4), which stayed well above 20% up to and including
iteration #10. We attribute the less stable behaviour of
DARKTHOUGHT to the increased selectivity of its search as compared
with the two other programs. While the standard errors still leave some
room for doubting the statistical significance of the drops below 20%,
Appendix 1.5.10 dispels these concerns by
deriving 80%-confident and 90%-confident bounds on the ``Best Change''
probabilities of BELLE, CRAFTY, and DARKTHOUGHT.
- Iteration #14.
- The ``Best Change'' rates of CRAFTY remained surprisingly constant
at roughly 15%-17% from iteration #9 onwards. DARKTHOUGHT
only behaved like this from iteration #11 to iteration #13 and then
recorded another drop of its ``Best Change'' rate to 13.7% for the
final iteration #14 (see number marked by ** in
Table 1.4). This constitutes the first experimental
result reported so far which hints at the validity of the intuitive
notion that the average ``Best Change'' rates should taper off even
further at search depths beyond 14 plies. The experimental results of
CRAFTY do not really support this notion because the ``Best Change''
rate of CRAFTY does not decrease but rather increases again for
iteration #14. Unfortunately, it remains unclear whether the
special behaviour of DARKTHOUGHT signals a consistent trend towards
lower ``Best Change'' rates at search depths beyond 14 plies or whether
it is just a fluctuation at the end of our data curve. The statistical
calculations of Appendix 1.5.10 do not suffice to set
the outlying data point apart because the 80%-confident and the
90%-confident upper bounds on the ``Best Change'' probability of
DARKTHOUGHT in iteration #14 equal 15.34% and 16.26% respectively.
Hence, new experiments with search depths of at least 16 plies are
needed to resolve this interesting question.
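
The remark in the first item above, that the standard errors leave room for doubt about the drops below 20%, can be made concrete with a small sketch. It expresses the gap between each marked rate and the 20% level in units of that rate's standard error. This is only an illustration and not the calculation of Appendix 1.5.10; the helper name `errors_below_threshold` is hypothetical.

```python
def errors_below_threshold(rate_pct, error_pct, threshold_pct=20.0):
    """Distance of a rate below the threshold, in units of its standard error."""
    return (threshold_pct - rate_pct) / error_pct


# First rates below 20% for each program, as marked by ** in Table 1.4:
# label -> (rate in percent, standard error in percent)
marked = {
    "Belle, iteration 10": (17.70, 1.81),
    "Crafty, iteration 9": (18.37, 2.09),
    "DarkThought, iteration 11": (17.49, 2.05),
}

for label, (rate_pct, error_pct) in marked.items():
    distance = errors_below_threshold(rate_pct, error_pct)
    print(f"{label}: {distance:.2f} standard errors below 20%")
```

The three marked rates lie roughly 1.27, 0.78, and 1.22 standard errors below 20%, which is why the standard errors alone do not settle the question.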
Created by Ernst A. Heinz, Thu Dec 16 23:28:11 EST 1999