The ``Best Change'' behaviours of chess programs are, in the terms of standard statistics, typical count data for a binary-valued random variable. The count probabilities of binary-valued random variables generally follow binomial distributions. For sufficiently large sample sizes n and success counts m with m > 4 and n - m > 4, however, the corresponding normal distributions approximate the awkward-to-handle binomial distributions well enough for practical purposes. Classic engineering statistics [88] derives the following lower and upper bounds on the success probability P for given values of m and n and any desired level of confidence, as specified by the single-sided percentile z of the N(0, 1) normal distribution.
(1.1) |
(1.2) |
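The two bound formulas did not survive in this copy of the text, so the following is only a sketch of how such bounds can be computed. It assumes the Wilson score form of the normal approximation, which is one standard way to bound a binomial success probability P from m, n and z; whether [88] uses exactly this form is not confirmed here. The function name `wilson_bounds` is hypothetical.

```python
import math


def wilson_bounds(m, n, z):
    """Return (lower, upper) bounds on the success probability P.

    ASSUMPTION: Wilson score form of the normal approximation; this is
    not confirmed to be the exact formula of Equations (1.1) and (1.2).
    m: observed success count, n: sample size,
    z: single-sided percentile of the N(0, 1) distribution.
    """
    if not 0 <= m <= n or n <= 0:
        raise ValueError("need 0 <= m <= n and n > 0")
    z2 = z * z
    # Centre of the interval, shifted from m/n towards 1/2.
    centre = (m + z2 / 2.0) / (n + z2)
    # Half-width derived from the normal approximation of the binomial.
    half_width = z * math.sqrt(m * (1.0 - m / n) + z2 / 4.0) / (n + z2)
    return centre - half_width, centre + half_width
```

A call such as `wilson_bounds(60, 343, 0.8416)` would return an 80%-confident lower and upper bound for a sample of n = 343 with a purely illustrative success count of m = 60; that count is not taken from the tables.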
With the help of these formulas we determined 80%-confident (z = 0.8416) and 90%-confident (z = 1.2816) bounds on the ``Best Change'' probabilities of BELLE (n = 447), CRAFTY (n = 343), and DARKTHOUGHT (n = 343). For BELLE we calculated the success counts m from its ``Best Change'' rates in Table 1.4. For CRAFTY and DARKTHOUGHT we used the absolute ``Best Change'' numbers of Table 1.5 and Table 1.6 directly as the observed success counts m. For all three programs, the resulting bounds clearly discriminate the drops of the ``Best Change'' rates below 20% with at least 80% confidence (see Table 1.13, where >= denotes lower bounds and <= upper bounds).
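The quoted z values and the conversion from a rate to a count can be checked with a few lines of Python. This is a minimal sketch: the helper names `z_for_confidence` and `count_from_rate` are hypothetical, and rounding the product of rate and n to the nearest integer is an assumption about how a count is recovered from a published rate.

```python
from statistics import NormalDist


def z_for_confidence(confidence):
    """Single-sided percentile z of N(0, 1) for a confidence level in (0, 1)."""
    return NormalDist(mu=0.0, sigma=1.0).inv_cdf(confidence)


def count_from_rate(rate, n):
    """Recover a success count m from a rate and the sample size n.

    ASSUMPTION: the count is the nearest integer to rate * n.
    """
    return round(rate * n)


# The two confidence levels used in the text.
z80 = round(z_for_confidence(0.80), 4)
z90 = round(z_for_confidence(0.90), 4)

# Illustrative only: a 20% rate on BELLE's sample size, not a table value.
m_example = count_from_rate(0.20, 447)
```

Here `z80` and `z90` reproduce the values 0.8416 and 1.2816 quoted above for the 80% and 90% confidence levels.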