Benchmarking the Numerical Accuracy of Analyse-it
This document describes the performance of Analyse-it for Microsoft Excel version 4.00 against the NIST Statistical Reference Datasets (StRD).
Summary
Tested against the industry-recognised NIST StRD, Analyse-it performed consistently amongst the best and outperformed some of the more popular, well-known statistical packages.
Downloads
Download Evaluating the Numerical Accuracy of Analyse-it (.pdf)
Download Analyse-it NIST StRD Validation workbooks (.zip)
In response to industry concerns about the numerical accuracy of statistical software, the Statistical Engineering and Mathematical and Computational Sciences Divisions of NIST’s Information Technology Laboratory developed datasets with certified values for a variety of statistical methods.
For more information about the datasets see:
https://www.itl.nist.gov/div898/strd/
The results obtained from statistical software packages can be compared against the certified values. The certified values were computed using ultra-high-precision floating-point arithmetic and are accurate to 15 significant digits.
Most statistical packages use IEEE 754 double-precision (64-bit) floating-point arithmetic and, because of the finite precision, round-off and truncation errors involved in numerical operations, cannot reproduce the certified values exactly. A good measure of the accuracy of a result x against the certified value c is therefore the log relative error (LRE), the negative base-10 logarithm of the absolute relative error:
LRE = -log10 (|x - c| / |c|)
if c ≠ 0, otherwise,
LRE = -log10 |x|
LRE is approximately the number of significant digits the result has in common with the certified value. Higher LRE values are better. Because the certified values are given to 15 significant digits, the maximum LRE obtainable is 15.
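The LRE definition above can be sketched in a few lines of Python. This is an illustrative helper, not code from Analyse-it or NIST. The function name `lre`, the `max_digits` parameter, and the clamping of the result to the range 0 to 15 are assumptions made for this example. The relative error is divided by |c| so that the result is also defined when the certified value is negative.

```python
import math


def lre(x: float, c: float, max_digits: float = 15.0) -> float:
    """Log relative error of a computed value x against a certified value c.

    Assumption for this sketch: the result is clamped to [0, max_digits],
    so exact agreement reports max_digits and a result with no correct
    digits reports 0.
    """
    if x == c:
        # Exact agreement: the logarithm would be infinite, so cap it.
        return max_digits
    if c != 0:
        err = abs(x - c) / abs(c)   # relative error
    else:
        err = abs(x)                # absolute error when c is zero
    return max(0.0, min(-math.log10(err), max_digits))


# A value that agrees with pi to about three significant digits.
print(round(lre(3.14, 3.14159265358979), 1))
```

An LRE a little above 3 for this input matches the intuition that 3.14 shares about three significant digits with the reference value.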
Performance benchmarks against the NIST StRD
We tested version 4.00 of Analyse-it using the NIST StRD on an Intel Xeon dual processor PC. No statistical package achieves perfect accuracy for all the tests and no one package performs best for every test. In the tests:
- Analyse-it performed consistently and amongst the best on all tests.
- Analyse-it performed better than some of the more popular well-known statistical packages.
Some developers of popular statistical software packages have published their own benchmarks against the NIST StRD, and some independent authors have also published benchmarks. See:
- A comparative study of the reliability of nine statistical software packages,
Computational Statistics & Data Analysis, 51(8), 3811-3831,
Keeling, K. B., & Pavur, R. J.
- On the Accuracy of Statistical Procedures in Microsoft Excel 97,
Computational Statistics and Data Analysis, July 1999, Volume 31, Number 1, pp 27-37,
McCullough, B.D. and Wilson, B.
- Assessing the Reliability of Statistical Software: Part I,
The American Statistician, Volume 52, Number 4, pp 358-366,
McCullough, B.D.
- Assessing the Reliability of Statistical Software: Part II,
The American Statistician, May 1999, Volume 53, Number 2, pp 149-159,
McCullough, B.D.
To download the Excel workbooks containing the Analyse-it analyses used to run the NIST StRD tests, together with comparisons against the published results for other packages, see:
http://analyse-it.com/support/Analyse-it-NIST-StRD-Validation.zip
Summary of results
The LREs obtained when testing Analyse-it against the NIST StRD are summarised below.
Univariate summary statistics
The univariate tests consist of nine datasets classified by difficulty.
The mean and standard deviation were computed using the Distribution analysis and compared to the certified values.
The lag-1 autocorrelation is not computed by Analyse-it.
| Test | Difficulty | Mean (LRE) | SD (LRE) |
| PiDigits | Lower | 15.0 | 15.0 |
| Lottery | Lower | 15.0 | 15.0 |
| Lew | Lower | 15.0 | 15.0 |
| Mavro | Lower | 15.0 | 13.1 |
| Michelson | Lower | 15.0 | 13.8 |
| NumAcc-1 | Lower | 15.0 | 15.0 |
| NumAcc-2 | Average | 15.0 | 15.0 |
| NumAcc-3 | Average | 15.0 | 15.0 |
| NumAcc-4 | Average | 15.0 | 15.0 |
| NumAcc-5 | Average | 15.0 | 15.0 |
| NumAcc-6 | Average | 15.0 | 15.0 |
| NumAcc-7 | Average | 15.0 | 15.0 |
| NumAcc-8 | Average | 15.0 | 15.0 |
| NumAcc-9 | Average | 15.0 | 15.0 |
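To show why the standard deviation is the harder of the two univariate statistics, the sketch below compares the textbook one-pass formula with a two-pass calculation on values that have a large mean and a tiny spread. This is a generic illustration of round-off error in double precision. It is not Analyse-it's algorithm, and the three data values are synthetic, not one of the NIST datasets. The function names are hypothetical.

```python
import math


def sd_one_pass(data):
    """Sample SD via sum of squares minus squared sum.

    Subtracting two nearly equal large numbers loses digits
    (catastrophic cancellation) when the mean dwarfs the spread.
    """
    n = len(data)
    total = sum(data)
    total_sq = sum(v * v for v in data)
    variance = (total_sq - total * total / n) / (n - 1)
    return math.sqrt(max(variance, 0.0))  # guard against a negative variance


def sd_two_pass(data):
    """Sample SD via deviations from the mean, with exact summation (fsum)."""
    n = len(data)
    mean = math.fsum(data) / n
    return math.sqrt(math.fsum((v - mean) ** 2 for v in data) / (n - 1))


# Synthetic data: large offset, small spread. The exact sample SD is 1.
data = [1e9 + 1, 1e9 + 2, 1e9 + 3]

print("one-pass:", sd_one_pass(data))
print("two-pass:", sd_two_pass(data))
```

The two-pass version recovers the exact answer here, while the one-pass version loses essentially all significant digits, because the squared values exceed what a 64-bit double can represent exactly.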
Analysis of variance
The analysis of variance tests include three versions of three datasets representing increasing model complexity. The tests were run using the Fit Model analysis.
| Test | Dataset | F statistic (LRE) | SS treatment (LRE) | SS error (LRE) |
| SmLs01 | SmLs01 | 15.0 | 15.0 | 10.7 |
| | SmLs02 | 15.0 | 15.0 | 14.2 |
| | SmLs03 | 12.5 | 10.6 | 11.8 |
| AtmWtAg | SmLs04 | 15.0 | 15.0 | 15.0 |
| | SmLs05 | 14.7 | 12.7 | 12.6 |
| | SmLs06 | 10.5 | 10.0 | 11.5 |
| SiRstv | SmLs07 | 14.8 | 14.7 | 14.9 |
| | SmLs08 | 14.7 | 13.2 | 11.1 |
| | SmLs09 | 13.5 | 10.4 | 9.7 |
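For readers unfamiliar with the three quantities in the table, the following sketch computes the treatment sum of squares, the error sum of squares and the F statistic for a one-way layout. It assumes a simple one-way analysis of variance and uses a tiny made-up dataset. It is not the Fit Model implementation in Analyse-it, and the function name `one_way_anova` is hypothetical.

```python
import math


def one_way_anova(groups):
    """Return (F, SS treatment, SS error) for a one-way layout.

    groups: a list of lists, one inner list of observations per treatment.
    Sums are accumulated with math.fsum to limit round-off error.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = math.fsum(v for g in groups for v in g) / n_total
    means = [math.fsum(g) / len(g) for g in groups]

    # Between-group variation: group means around the grand mean.
    ss_treatment = math.fsum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Within-group variation: observations around their own group mean.
    ss_error = math.fsum(
        (v - m) ** 2 for g, m in zip(groups, means) for v in g
    )
    f_stat = (ss_treatment / (k - 1)) / (ss_error / (n_total - k))
    return f_stat, ss_treatment, ss_error


# Made-up example data, not a NIST dataset.
f_stat, ss_t, ss_e = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat, ss_t, ss_e)
```

Each of the three returned values could then be compared with its certified counterpart using the LRE formula to obtain figures like those in the table.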
Non-linear regression
The non-linear regression tests include five datasets representing increasing model complexity. The tests were run using the Fit Model analysis.
| Test | Dataset | Parameters (LRE) | Predictions (LRE) |
| Gauss1 | Gauss1 | 15.0 | 15.0 |
| | Gauss2 | 15.0 | 15.0 |
| | Gauss3 | 14.9 | 15.0 |
| Mavro | Mavro | 15.0 | 15.0 |
| Chwirut | Chwirut | 14.6 | 13.8 |
For further information, to obtain copies of the Analyse-it validation documents, or a copy of the Analyse-it analysis workbooks used to perform the validation against the NIST StRD, please contact support@analyse-it.com.