We are receiving a lot of questions about relevant analyses in the Analyse-it Method Validation edition to help in evaluating new diagnostic tests in the fight against COVID-19. Below are some quick links that will help, but contact us if you have questions - we are working as normal.
Also see our latest blog post: Sensitivity/Specificity and The Importance of Predictive Values for a COVID-19 test
There’s currently a lot of press attention surrounding the finger-prick antibody IgG/IgM strip test to detect whether a person has had COVID-19. Here in the UK, companies are buying them to test their staff, and some in the media are asking why the government hasn’t made millions of tests available to find out who has had the illness and could potentially get back to work.
We did a quick Google search, and there are many similar-looking test kits for sale. The performance claims on some were sketchy, with some using as few as 20 samples to determine their performance claim! However, we found a webpage for a COVID-19 IgG/IgM rapid antibody test evaluated on a total of 525 clinically confirmed cases: 397 positives and 128 negatives. We have no insight into the reliability of the claims made in the product information. The purpose of this blog post is not to promote or denigrate any test, but to illustrate how to look further than the headline figures.
We ran the data through the Analyse-it Method Validation Edition version 5.51. Here's the workbook containing the analysis: COVID-19 IgM-IgG Rapid Test.xlsx
We used Analyse-it to determine the sensitivity/specificity and confirmed the performance claims on the website. The sensitivity is listed as 88.66% and the specificity as 90.63%. Some websites also make an “accuracy” claim, usually computed as (TP+TN)/Total, but that’s a near-useless statistic: it depends on the mix of positive and negative cases in the study, not just on the test itself.
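For reference, here is how those headline figures fall out of the 2x2 table. A minimal sketch; the TP/FN/TN/FP split is inferred by us from the stated rates, not taken from the product page:

```python
# Confusion-matrix counts inferred from the published figures
# (525 cases: 397 clinically confirmed positives, 128 negatives).
# The split below reproduces the claimed 88.66% / 90.63% rates.
TP, FN = 352, 45   # of the 397 positives: flagged / missed by the test
TN, FP = 116, 12   # of the 128 negatives: cleared / wrongly flagged

sensitivity = TP / (TP + FN)   # P(test positive | condition present)
specificity = TN / (TN + FP)   # P(test negative | condition absent)
accuracy = (TP + TN) / (TP + FN + TN + FP)   # the headline "accuracy"

print(f"sensitivity = {sensitivity:.2%}")
print(f"specificity = {specificity:.2%}")
print(f"accuracy    = {accuracy:.2%}")
```

Note that "accuracy" here is dominated by whichever class the study happened to sample more of, which is why it tells you little about real-world performance.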
So, with reasonably impressive numbers around 90%, what’s the problem?
First, we need to look at the meaning of sensitivity and specificity:
These measures give the probability of a correct test result in subjects known to be with and without a condition, respectively. To understand how a test performs in the real world, we need to look at the predictive values. They tell us how useful the test would be when applied to a population. They indicate the probability of correctly identifying a subject's condition given the test result.
Here are the definitions of positive and negative predictive value: the positive predictive value (PPV) is the proportion of positive test results that are true positives, TP/(TP+FP), and the negative predictive value (NPV) is the proportion of negative test results that are true negatives, TN/(TN+FN).
Here’s where things get a little trickier. To calculate predictive values, the important numbers that we need to make decisions, we need to know the prevalence of COVID-19 in the population. And, at present, that’s an unknown.
We ran the numbers in Analyse-it using four scenarios for the prevalence of the illness: 1%, 5%, 10%, and 20%.
The positive predictive value, that is, the probability that someone with a positive result from this test has had the COVID-19 illness, is 8.7%, 33.2%, 51.2%, and 70.3%, respectively, while the negative predictive values are 99.9%, 99.3%, 98.6%, and 97.0%.
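The four scenarios can be reproduced directly from Bayes' theorem. A minimal sketch, using confusion-matrix counts inferred by us from the published rates (352/397 positives and 116/128 negatives detected; our inference, not figures taken from the product page):

```python
# Test characteristics inferred from the published claims.
SENS = 352 / 397   # sensitivity, approximately 88.66%
SPEC = 116 / 128   # specificity, approximately 90.63%

def predictive_values(prevalence, sens=SENS, spec=SPEC):
    """Bayes' theorem: combine test characteristics with prevalence
    to get the post-test probabilities (PPV and NPV)."""
    ppv = sens * prevalence / (
        sens * prevalence + (1 - spec) * (1 - prevalence))
    npv = spec * (1 - prevalence) / (
        spec * (1 - prevalence) + (1 - sens) * prevalence)
    return ppv, npv

for p in (0.01, 0.05, 0.10, 0.20):
    ppv, npv = predictive_values(p)
    print(f"prevalence {p:>4.0%}: PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Running this reproduces the four scenarios above: as prevalence falls, the PPV collapses even though the test itself is unchanged.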
To understand what’s going on here, we’ll use an example to show how the positive predictive value works.
Let’s assume you have a workforce of 100 staff to test (the population), and that the true unknown prevalence of COVID-19 illness in your workforce is only 5%. Therefore, 5 unknown staff have had the illness, and 95 have not. We apply the test to the population (all the staff) with the following results:
In total, 9 + 4 = 13 people tested positive as having had the illness. But, in truth, only 4 of the 13 people with a positive test, about 31%, have actually had the illness (in line with the 33.2% of Scenario 2 above; the small difference comes from rounding to whole people)!
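The counting argument above can be sketched in a few lines. The 88.66% and 90.63% rates are the claimed sensitivity and specificity, and rounding to whole people matches the worked example:

```python
n_staff, prevalence = 100, 0.05
ill = round(n_staff * prevalence)        # 5 staff have had the illness
well = n_staff - ill                     # 95 have not

true_pos = round(ill * 0.8866)           # ill staff correctly flagged
false_pos = round(well * (1 - 0.9063))   # healthy staff wrongly flagged
total_pos = true_pos + false_pos

print(f"{total_pos} positives, of which {true_pos} are true")
print(f"P(had illness | positive test) = {true_pos / total_pos:.0%}")
```

The false positives come from the large healthy group, so even a 9% false-positive rate produces more false alarms than there are genuine cases.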
As can be seen in the example above, things aren’t as simple as they appear when you read a headline quoting a test with “90% accuracy”.
When applied to a population with a low prevalence of illness, the false positives soon overwhelm the true positives and make the test less useful. If applied to populations with a higher prevalence of the illness, such as workers who have had symptoms and self-isolated, and at the right time when the antibodies become present, the usefulness of the positive test result increases (see, for example, Scenario 4 in the table above).
This post presents just one example of a single test; others with higher sensitivity/specificity will perform better. It highlights the importance of predictive values in decision making, and in evaluating whether a particular test is really that helpful.
Our focus at Analyse-it has always been on the development and improvement of our software. While we provide extensive help, tutorials, and technical support for Analyse-it, one area we do not cover is training and consultancy. As many of you will know, we are based in England in the United Kingdom, and training and consultancy are often better provided locally, in person.
Instead we partner with experts who can provide training and consultancy in various disciplines, in local language, and geographically near (or at least nearer) to our customers. You can always find a list of current consultant and training partners at
It’s been a long-requested feature, and today we’re happy to announce that Analyse-it version 5.10 now includes the ability to save the dataset filter with an analysis and re-apply it on recalculation.
Analyse-it has always allowed you to use Excel auto-filters to quickly limit an analysis to a subset of the data, but until now that filter wasn’t saved. Each time you recalculated the analysis, it was based on the currently active filter rather than the filter in effect when you created the analysis.
Update 19-Sep-2019: Unfortunately this continues to be an issue for some users, and there is currently no solution from Microsoft other than the use of compatibility mode as detailed below. We have requested this be fixed, so please up-vote it at
Update 27-Jun-2018: Although we have a fix for this issue in an internal build, it appears that Microsoft Office version 1807 (currently only available on the Office Insider track) also fixes it. The missing user-interface problem was caused by a bug in the Microsoft Office 1805/1806 updates. We will release our fix shortly, but the 1807 update will also roll out to everyone over the next month or so. If you want to get it immediately see .
Prediction intervals on Deming regression are a major new feature in the Analyse-it Method Validation Edition version 4.90, just released.
A prediction interval is an interval that has a given probability of including one or more future observations. Prediction intervals are very useful in method validation for testing the commutability of reference materials or processed samples with patient samples. Two CLSI protocols both use prediction intervals.
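Analyse-it computes prediction intervals for Deming regression, which allows for measurement error in both methods. As a simpler illustration of the concept, here is a sketch of the textbook prediction interval for ordinary least-squares regression, using made-up method-comparison data (not from any CLSI protocol):

```python
import numpy as np
from scipy import stats

# Toy method-comparison data: reference method (x) vs candidate (y).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([1.1, 2.1, 2.9, 4.2, 5.0, 5.9, 7.2, 7.9])

n = len(x)
b1, b0 = np.polyfit(x, y, 1)              # slope, intercept
resid = y - (b0 + b1 * x)
s = np.sqrt(np.sum(resid**2) / (n - 2))   # residual standard error

def prediction_interval(x0, alpha=0.05):
    """Interval with 95% probability of containing one new observation at x0."""
    t = stats.t.ppf(1 - alpha / 2, df=n - 2)
    se = s * np.sqrt(1 + 1/n + (x0 - x.mean())**2 / np.sum((x - x.mean())**2))
    yhat = b0 + b1 * x0
    return yhat - t * se, yhat + t * se

lo, hi = prediction_interval(4.5)
print(f"95% prediction interval at x=4.5: ({lo:.2f}, {hi:.2f})")
```

The `1` inside the square root is what distinguishes a prediction interval from a confidence interval for the fitted line: it accounts for the scatter of a single future observation, not just uncertainty in the line itself.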
Often we collect a sample of data not to make statements about that particular sample but to generalize our statements to say something about the population. Estimation is the process of making inferences about an unknown population parameter from a random sample drawn from the population of interest. An estimator is a method for arriving at an estimate of the value of an unknown parameter. Often there are many competing estimators for the population parameter that differ based on the underlying statistical theory.
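As a concrete example of competing estimators, the population variance can be estimated from a sample with either an n or an n-1 denominator, and the two are justified by different statistical arguments:

```python
import numpy as np

sample = np.array([4.2, 5.1, 3.8, 6.0, 5.5, 4.9])

# Two competing estimators of the same population parameter:
biased = np.var(sample)            # divides by n (maximum-likelihood estimator)
unbiased = np.var(sample, ddof=1)  # divides by n-1 (unbiased estimator)

print(f"n   denominator: {biased:.4f}")
print(f"n-1 denominator: {unbiased:.4f}")
```

Both are legitimate estimators; which is preferable depends on the criterion (unbiasedness, mean squared error, and so on), which is exactly why different theories lead to different formulas.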
As we mentioned last week, in this release we took the opportunity to revamp the documentation.
The revamp involved rewriting many topics to make the content clearer, adding new task-oriented topics, including refresher topics on common statistical concepts, and improving the indexing and links between topics so you can more easily navigate the help system.
The new task-oriented topics give you step-by-step instructions on completing common tasks. For example you will now find topics on how to , , , and even simple tasks like . We have also fully documented the supported dataset layouts for each type of analysis so you can see how to arrange your data for Analyse-it. The links in each topic help you more easily find related topics, for example links to topics on how to interpret the statistics, links to explain the pros and cons of the available statistical tests, links to topics for common tasks, and a link showing you how to arrange the dataset.
Last week we released version 4.80 of Analyse-it.
The new release includes multi-way , , and in the Standard edition, and since every licence includes the Standard edition, these features are available to all users. We also took the opportunity to revamp the and develop a . We’ll go into more details on the improvements in the next few weeks.
If you have active maintenance you can download and install the update now, see . If maintenance on your license has expired, you can renew it to get this update and forthcoming updates, see .
Today we released version 3.80 of the Analyse-it Standard edition.
The new release includes Principal Component Analysis (PCA), an extension to the multivariate analysis already available in Analyse-it. It also includes probably the most advanced implementation of biplots available in any commercial package.
New features include:
The tutorial walks you through a guided example looking at how to use correlation and principal component analysis to discover the underlying relationships in data about New York Neighbourhoods. It demonstrates the amazing new features and helps you understand how to use them. You can either follow the tutorial yourself, at your own pace, or .
If you follow us on Facebook, you will no doubt already know about the recent improvements in the Analyse-it Method Validation edition and the release of our first video tutorial. If not, now is a good time to start, since we post short announcements and feature previews on Facebook, and use the blog only for news about major releases.
The latest changes and improvements to the Analyse-it Method Validation edition include:
What is a sample quantile or percentile? Take the 0.25 quantile (also known as the 25th percentile, or 1st quartile) -- it defines the value (let’s call it x) for a random variable, such that the probability that a random observation of the variable is less than x is 0.25 (25% chance).
A simple question, with a simple definition? The problem is calculating quantiles. The formulas are simple enough, but take a quick look on Wikipedia and you’ll see there are at least 9 alternative methods. Consequently, statistical packages use different formulas to calculate quantiles, and we’re sometimes asked why the quantiles calculated by Analyse-it don’t agree with Excel, SAS, or R.
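You can see the disagreement for yourself. A sketch using NumPy, which implements the alternative definitions directly (the method names require NumPy 1.22 or later; the type numbers refer to Hyndman & Fan's classification of quantile estimators):

```python
import numpy as np

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# The same 0.25 quantile under several of the competing definitions:
for method in ("inverted_cdf",     # type 1: inverse of the empirical CDF
               "hazen",            # type 5
               "weibull",          # type 6: Excel PERCENTILE.EXC, Minitab
               "linear",           # type 7: Excel PERCENTILE.INC, R default
               "median_unbiased",  # type 8: recommended by Hyndman & Fan
               ):
    q = np.quantile(data, 0.25, method=method)
    print(f"{method:16s}: {q}")
```

Five methods, four different answers for the same data, none of them "wrong": each follows from a different way of interpolating between the order statistics.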
Yesterday we improved the help and added a statistical reference guide. The guide tells you about the statistical procedures in Analyse-it, with help on using and understanding the plots and statistics. It’s a work in progress, and we intend to improve it further with your comments and feedback, but it’s important to understand the role of the guide.
Firstly, the guide is not intended to be a statistics textbook. While it covers key concepts in statistical analysis, it is no substitute for learning statistics from a good teacher or textbook.
In clearly titling this blog post, we’ve probably already revealed the answer, but... Can you spot the difference between the two rows of values in the Excel spreadsheet shown below?
Sorry, it’s a trick question, because (visually) there is no difference. The difference is how the values are stored by Microsoft Excel. The value 57 in the cell on the second row is actually stored as a text string, not a number.
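If you pull such a column into Python, the problem and the fix are easy to demonstrate. A sketch using pandas, with the values typed in here rather than read from a workbook:

```python
import pandas as pd

# Mimic a column read from Excel where one "number" is stored as text.
values = pd.Series([56, "57", 58], dtype=object)

# The middle value is a string, not a number:
print(values.map(type).tolist())

# Coerce to numeric; anything unparseable would become NaN,
# making stray text values easy to find and fix.
numeric = pd.to_numeric(values, errors="coerce")
print(numeric.sum())
```

The same check is available inside Excel itself via the ISTEXT and ISNUMBER worksheet functions.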
Today we’re delighted to publish the second case study into the use of Analyse-it.
The case study features a national clinical laboratory in the USA that offers more than 2,000 tests and combinations to major commercial and government laboratories. They use Analyse-it to determine analytical performance of automated immunoassays for some of the industry’s leading in-vitro diagnostic device makers -- including Abbott Diagnostics, Bayer Diagnostics, Beckman Coulter and Roche Diagnostics.
In a previous post, we explained the tests provided in Analyse-it to determine whether a sample has a normal distribution. In that post, we mentioned that although hypothesis tests are useful, you should not rely on them alone. You should always look at the histogram and, perhaps more importantly, the normal plot.
The beauty of the normal plot is that it is designed specifically for judging normality. The plot is very easy to interpret and lets you see where the sample deviates from normality.
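If you want to see what a normal plot does under the hood, scipy exposes the same construction (a sketch of the general technique, not Analyse-it's implementation): the ordered sample is plotted against the quantiles expected under a normal distribution, and near-normal data falls close to a straight line.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=100, scale=15, size=80)

# probplot pairs the ordered sample (osr) with the theoretical
# normal quantiles (osm) and fits a straight line through them.
(osm, osr), (slope, intercept, r) = stats.probplot(sample, dist="norm")
print(f"straight-line fit: r = {r:.3f}")
```

For genuinely normal data the correlation r of the fitted line is close to 1; skewness or heavy tails show up as systematic curvature at the ends of the plot.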
A customer contacted us last week to ask how to refer to cells on an Analyse-it report worksheet, from a formula on another worksheet. The customer often used Analyse-it's refresh feature, to repeat the statistical analysis and update the statistics, and direct references to cells on the report were being lost on refresh.
As an example, suppose you have used Analyse-it linear regression to calculate the linear relationship between installation cost and the number of employees required, distance to the site, and the cost of machine being installed. Analyse-it would calculate the effect of each variable on the final cost, technically known as regression coefficients, which you can then use to predict installation costs for jobs in future.
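Here is a sketch of that regression in Python with made-up job records (illustrative numbers only; Analyse-it computes the coefficients, plus their standard errors and p-values, on the worksheet):

```python
import numpy as np

# Hypothetical job records: employees, distance (km), machine cost (k),
# and the observed installation cost (k). Illustrative numbers only.
X = np.array([[4, 10, 20],
              [6, 25, 35],
              [3,  5, 15],
              [8, 40, 50],
              [5, 15, 25]], dtype=float)
cost = np.array([18.0, 31.0, 12.0, 47.0, 24.0])

# Add an intercept column and solve for the regression coefficients.
A = np.column_stack([np.ones(len(X)), X])
coefs, *_ = np.linalg.lstsq(A, cost, rcond=None)

# Predict the cost of a future job: 5 employees, 20 km, 30k machine.
new_job = np.array([1.0, 5.0, 20.0, 30.0])
print(f"predicted cost: {new_job @ coefs:.1f}k")
```

Each coefficient estimates the change in cost for a one-unit change in that variable, holding the others fixed, which is what makes the fitted model usable for quoting future jobs.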
Today we’re delighted to publish the first case study into the use of Analyse-it.
Marco Balerna, Ph.D., a clinical chemist at the EOC in Switzerland, used Analyse-it when replacing the clinical chemistry and immunological analysers in the EOC’s laboratories.
Since the EOC provides clinical chemistry services to five large hospitals and three small clinics in the region, it was essential the transition to the new analysers went smoothly. Marco used Analyse-it to ensure the analyser’s performance met the manufacturer’s claims, to ensure the reporting of patient results was not affected, and to comply with the regulations of the EOC’s accreditation.
Although the charts in Analyse-it are large, so they’re easy to read when printed, sometimes you need a chart to fill a full page. You can do so easily, without resizing the chart, in just a few steps:
Chart size is only limited by the page size your printer supports.
Identifying what was analysed, when, and by whom, is the first step in understanding any Analyse-it report. The top rows of each Analyse-it report provide you with this information. The statistical test used, the dataset and variables analysed, the user who analysed them, and the date and time last analysed are all included (see below). When you print the report, the header is repeated at the top of each printed page.
In May this year, we surveyed users of the Analyse-it Method Evaluation edition to gain insight into how we can improve Analyse-it in future. Thank you to all those who responded.
In the responses, one issue became clear: the unfiled reports feature causes confusion.
When you run an analysis, Analyse-it creates a new worksheet containing the statistics and charts for that analysis (what we call a report). Analyse-it places the report in a temporary workbook of unfiled reports. From there you can decide what you want to do with the analysis: keep it, print it, e-mail it, or discard it. If you want to keep it, you click the button (see below), and Analyse-it moves the report into the same workbook as your dataset.
The most used distribution in statistical analysis is the normal distribution. Sometimes called the Gaussian distribution, after Carl Friedrich Gauss, the normal distribution is the basis of much parametric statistical analysis.
Parametric statistical tests often assume the sample under test is from a population with normal distribution. By making this assumption about the data, parametric tests are more powerful than their equivalent non-parametric counterparts and can detect differences with smaller sample sizes, or detect smaller differences with the same sample size.
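That power advantage is easy to demonstrate by simulation. A sketch comparing the two-sample t-test with its non-parametric counterpart, the Mann-Whitney test, on normally distributed data (the sample size, shift, and trial count are arbitrary choices for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, shift, trials = 20, 0.8, 500

# Count how often each test detects a true difference at the 5% level.
t_hits = u_hits = 0
for _ in range(trials):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(shift, 1.0, n)
    t_hits += stats.ttest_ind(a, b).pvalue < 0.05
    u_hits += stats.mannwhitneyu(a, b, alternative="two-sided").pvalue < 0.05

t_power = t_hits / trials
u_power = u_hits / trials
print(f"t-test power:       {t_power:.2f}")
print(f"Mann-Whitney power: {u_power:.2f}")
```

On normal data the t-test detects the shift slightly more often; the gap narrows or reverses when the normality assumption is violated, which is the trade-off the paragraph above describes.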
For new and occasional Analyse-it users, datasets can sometimes seem confusing. Today we’ll explain why we devised the 'dataset' concept, a concept now copied by some other Excel add-ins.
We introduced the dataset concept so Analyse-it could automatically pick up the data and variables from your Excel worksheet. As we found with the Analysis Toolpak and other Excel add-ins, forcing you to select the cells containing the data to be analysed can be problematic:
A few readers have e-mailed to ask for more information about the book by David J. Sheskin we alluded to in a comment reply last week.
The book is the Handbook of Parametric & Non-parametric Statistical procedures, by David J. Sheskin, ISBN: 1584888148.
We have the third edition of the book which runs to over 1,200 pages -- a phenomenal piece of work for a single (obviously very dedicated) author. While it’s not a book you would sit down and read cover-to-cover, it is a very readable reference guide, covering all the parametric and non-parametric statistical procedures included in Analyse-it.
Most of you know where to find the help and examples provided with Analyse-it, but if not, today we’d like to explain what’s available. If you're stuck we're always happy to help, and usually respond within a few hours, but it's always faster for you to check if the help answers your question first.
If you’re new to Analyse-it, or want a quick refresher, the best place to start is the Getting Started tutorial. It’s completely automated, no typing is required, so all you have to do is sit back and watch. In just 10 minutes it will demonstrate how to set up a dataset, how to filter the dataset, how to run a statistical test, and how to edit, refresh, and print the reports.