Conclusion

Repeating Hyatt and Newborn's experiment with our own chess program DARKTHOUGHT confirmed their findings regarding the ``Best Change'' rates of CRAFTY. Based on the experimental results of CRAFTY and DARKTHOUGHT taken together, we are confident in projecting that modern chess programs steadily discover new best moves in at least 16% of all searches on average, even at high search depths of 11-14 plies. Surprisingly, the experiments do not provide any conclusive empirical evidence for the intuitive notion that the ``Best Change'' rates taper off continuously with increasing search depth. Instead, the rates remained range-bound within 15%-17% for most of the deep searches investigated in the experiments. If anything, it was primarily the behaviour of DARKTHOUGHT, with a drop to roughly 13.7% new best moves on average in iteration #14, that hinted at decreasing ``changes of mind'' for search depths of 15 plies and more. Further experiments with markedly higher search depths of at least 16 plies are needed to resolve the interesting question of whether this signals a consistent trend towards even lower ``Best Change'' rates at search depths beyond 14 plies or was just a fluctuation at the end of our data curve.

At this point we take the opportunity to call on all prospective experimenters to derive or record at least as much data as we did for the experiments with CRAFTY and DARKTHOUGHT.^1.11 Studying the rates of ``Fresh Best'' moves and ``(I - 2) Best'' moves will surely be worth the additional effort. These rates revealed novel traits of and insights into the search behaviour of both CRAFTY and DARKTHOUGHT. The ``Fresh Best'' rates led to the astonishing observation that, regardless of the actual search depth, a sizable 30%-50% of all new best moves on average represented ``fresh ideas'' which the programs had never deemed best before. This finding supports the validity of Newborn's hypothesis about the playing strength of chess programs as originally formulated in 1985. The ``(I - 2) Best'' rates taught us about the continuing search instabilities of modern chess programs at odd and even search depths respectively. These odd/even instabilities decreased only at high search depths of 9-14 plies in positions with reduced material, as found mostly in endgames and late middlegames.
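
For experimenters who wish to record the same data, the three rates might be derived from per-iteration best-move logs roughly as sketched below. This is a minimal illustration rather than the bookkeeping actually used in CRAFTY or DARKTHOUGHT. The function name `rates_at`, the log layout (one list of best moves per position, with list index 0 holding iteration 1) and the sample moves are all hypothetical. We further assume that ``(I - 2) Best'' denotes a new best move which equals the best move of iteration I - 2, and that both the ``Fresh Best'' and ``(I - 2) Best'' rates are taken relative to the number of new best moves rather than to all searches.

```python
from typing import Dict, List


def rates_at(positions: List[List[str]], i: int) -> Dict[str, float]:
    """Compute the three rates at list index i over all logged positions.

    positions[p][k] is the best move of position p after iteration k + 1
    (hypothetical log layout, not taken from either program).
    """
    if not positions:
        raise ValueError("no positions logged")
    if i < 1:
        raise ValueError("need at least one earlier iteration to compare against")

    changes = fresh = back_two = 0
    for best in positions:
        current = best[i]
        if current == best[i - 1]:
            continue  # same best move as in the previous iteration
        changes += 1  # "Best Change": a new best move was found
        if current not in best[:i]:
            fresh += 1  # "Fresh Best": never deemed best before
        if i >= 2 and current == best[i - 2]:
            back_two += 1  # assumed "(I - 2) Best": back to the move of I - 2

    return {
        "best_change": changes / len(positions),
        # Assumption: the next two rates are relative to the new best moves.
        "fresh_best": fresh / changes if changes else 0.0,
        "i_minus_2_best": back_two / changes if changes else 0.0,
    }


# Made-up sample logs for four positions and four iterations each.
sample = [
    ["e4", "e4", "d4", "d4"],    # fresh new best move in iteration 3
    ["Nf3", "c4", "Nf3", "Nf3"],  # flips back to the move of iteration 1
    ["e4", "e4", "e4", "e4"],    # never changes its mind
    ["d4", "c4", "c4", "c4"],    # changes only in iteration 2
]
print(rates_at(sample, 2))
```

Keeping the complete best-move history per position, instead of only a running counter of changes, is what makes the ``Fresh Best'' and ``(I - 2) Best'' rates recoverable after the fact.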

Last but not least, we would like to mention that repeating Hyatt and Newborn's experiment was not only scientifically rewarding but also practically useful. For what could better demonstrate the effective scalability of DARKTHOUGHT than to ``go deep'' at fixed search depths of 14 plies on 343 test positions from real games in less than five days?

Created by Ernst A. Heinz, Thu Dec 16 23:28:11 EST 1999