Recent improvements to the , in version 5.50 and later, include the addition of probit regression. Probit regression is useful when establishing the detection limit (LoD) for an RT-qPCR assay.
The protocol provides guidance for estimating LoD and is recognized by the FDA. In this blog post, we will look at how to perform the relevant part of the CLSI EP17-A2 protocol using Analyse-it.
For details on experimental design, see section 5.5 in the CLSI EP17-A2 guideline. In Analyse-it, you should arrange the data in 2 columns: the first should be the concentration, and the second should be the result, positive or negative. You should have a minimum of 20 replicates at each concentration. We have put together a hypothetical example in the workbook which you can use to follow the steps below:
The analysis task pane opens.
NOTE: If using Analyse-it pre-version 5.65, on the Fit panel, in the Predict X given Probability edit box, type 0.95.
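To make the idea behind the probit step concrete, here is a minimal Python sketch. It is not Analyse-it's implementation: the hit counts are made up, the use of log10 concentration as the predictor is our assumption, and the `fit_probit` helper is a hypothetical name for a plain Fisher-scoring fit. It estimates the concentration at which the fitted detection probability reaches 0.95.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def fit_probit(x_values, n_tested, n_detected, iterations=50):
    """Fit P(detect) = Phi(b0 + b1 * x) to grouped data by Fisher scoring."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for x, n, y in zip(x_values, n_tested, n_detected):
            eta = b0 + b1 * x
            # Clamp the fitted probability away from 0 and 1 for stability.
            p = min(max(STD_NORMAL.cdf(eta), 1e-10), 1 - 1e-10)
            dens = STD_NORMAL.pdf(eta)
            score = (y - n * p) * dens / (p * (1 - p))
            weight = n * dens * dens / (p * (1 - p))
            s0 += score
            s1 += score * x
            i00 += weight
            i01 += weight * x
            i11 += weight * x * x
        # Solve the 2x2 system (information matrix) * step = score.
        det = i00 * i11 - i01 * i01
        d0 = (i11 * s0 - i01 * s1) / det
        d1 = (i00 * s1 - i01 * s0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1

# Hypothetical data: 20 replicates at each concentration.
concentrations = [0.5, 1.0, 2.0, 4.0, 8.0]
n_tested = [20, 20, 20, 20, 20]
n_detected = [4, 9, 15, 19, 20]

log_conc = [math.log10(c) for c in concentrations]
b0, b1 = fit_probit(log_conc, n_tested, n_detected)

# Invert the fitted curve: the X value where the predicted probability is 0.95.
z95 = STD_NORMAL.inv_cdf(0.95)
lod = 10 ** ((z95 - b0) / b1)
p_at_lod = STD_NORMAL.cdf(b0 + b1 * math.log10(lod))

print(f"Estimated LoD (95% detection): {lod:.2f}")
```

This mirrors the "predict X given probability" idea: fit the curve, then read off the concentration for a probability of 0.95. A real analysis would also report a confidence interval for the estimate.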
Following our last blog post, today, we will show how to calculate binary agreement using the . The protocol is a useful companion resource for laboratories and diagnostic companies developing qualitative diagnostic tests.
In Analyse-it, you should arrange the data in frequency or case form, as discussed in the blog post: . You can find an example of both and follow the steps below, using the workbook .
NOTE: The Average method is useful when comparing two laboratories or observers where neither is considered a natural comparator. The reference method is asymmetric, and the result will depend on the assignment of the X and Y methods, whereas the average method is symmetric, and the result does not change when swapping the X and Y methods.
INFO: Older versions of Analyse-it do not support the Average method, and the Agreement by category checkbox is called Agreement.
The analysis report shows positive and negative agreement statistics.
Due to COVID-19, there is currently a lot of interest surrounding the sensitivity and specificity of a diagnostic test. These terms relate to the accuracy of a test in diagnosing an illness or condition. To calculate these statistics, the true state of the subject, whether or not the subject has the illness or condition, must be known.
In recent FDA guidance for laboratories and manufacturers, , the FDA state that users should use a clinical agreement study to establish performance characteristics (sensitivity/PPA, specificity/NPA). While the terms sensitivity/specificity are widely known and used, the terms PPA/NPA are not.
protocol describes the terms positive percent agreement (PPA) and negative percent agreement (NPA). When you have two binary diagnostic tests to compare, you can use an agreement study to calculate these statistics.
As you can see, these measures are asymmetric. That is, interchanging the test and comparative methods, and therefore the values of b and c, changes the statistics. They do, however, have a natural, simple, interpretation when one method is a reference/comparative method and the other a test method.
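As a quick illustration of the asymmetry, here is a small Python sketch. The cell labels are our assumption of the usual 2x2 layout (a = both methods positive, b = test positive and comparative negative, c = test negative and comparative positive, d = both negative), and the counts are invented.

```python
def percent_agreement(a, b, c, d):
    """Return (PPA, NPA) for a 2x2 table.

    Assumed layout: a = both positive, b = test+/comparative-,
    c = test-/comparative+, d = both negative.
    """
    ppa = a / (a + c)  # agreement among comparative-method positives
    npa = d / (b + d)  # agreement among comparative-method negatives
    return ppa, npa

# Hypothetical counts.
ppa, npa = percent_agreement(a=40, b=5, c=10, d=45)

# Interchanging the test and comparative methods swaps b and c.
ppa_swapped, npa_swapped = percent_agreement(a=40, b=10, c=5, d=45)

print(f"PPA = {ppa:.1%}, NPA = {npa:.1%}")
print(f"After swapping methods: PPA = {ppa_swapped:.1%}, NPA = {npa_swapped:.1%}")
```

The same table gives different PPA and NPA values depending on which method is treated as the comparator, which is why the assignment matters.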
It is important in diagnostic accuracy studies that the true clinical state of the patient is known. For example, in developing a SARS-CoV-2 antibody test, for the positive subgroup, you might enlist subjects who had a positive SARS-CoV-2 PCR test and clinically confirmed illness. Then, for the negative subgroup, you might use samples taken from subjects before the illness was in circulation. It is also essential to consider other factors, such as the severity of illness, as they can have a marked effect on the performance characteristics of the test. A test that shows high sensitivity/specificity in a hospital situation in very ill patients can be much less effective in population screening where the severity of the illness is less.
In cases where the true condition of the subject is not known, and only results from a comparative method and a new test method are available, an agreement measure is more suitable. We will cover that scenario in detail in a future blog post.
In our last post, we mentioned that the 'accuracy' statistic, also known as the probability of a correct result, was a useless measure for diagnostic test performance. Today we'll explain why.
Let's take a hypothetical test with a sensitivity of 86% and specificity of 98%.
As a first scenario we simulated test results on 200 subjects with, and 200 without, the condition. The accuracy statistic (TP+TN)/N is equal to (172+196)/400 = 92%. See below:
In a second scenario we again simulated test results on 400 subjects, but only 50 with, and 350 without, the condition. The accuracy statistic is (43+343)/400 = 96.5%. See below:
The accuracy statistic is effectively a weighted average of sensitivity and specificity, with weights equal to the sample prevalence P(D=1) and the complement of the prevalence (that is, P(D=0) = 1-P(D=1)).
Accuracy = P(TP or TN) = (TP+TN)/N = Sensitivity * P(D=1) + Specificity * P(D=0)
Therefore as the prevalence in the sample changes so does the statistic. The prevalence of the condition in the sample may vary due to the availability of subjects or it may be fixed during the design of the study. It's easy to see how to manipulate the accuracy statistic to weigh in favor of the measure that performs best.
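The two scenarios can be reproduced with a few lines of Python. This is only a sketch of the weighted-average formula above; the function name `accuracy` is ours.

```python
def accuracy(sensitivity, specificity, prevalence):
    """Accuracy as a prevalence-weighted average of sensitivity and specificity."""
    return sensitivity * prevalence + specificity * (1 - prevalence)

# Same test in both scenarios: sensitivity 86%, specificity 98%.
balanced = accuracy(0.86, 0.98, prevalence=200 / 400)  # 200 with, 200 without
skewed = accuracy(0.86, 0.98, prevalence=50 / 400)     # 50 with, 350 without

# Cross-check against the counts, (TP + TN) / N.
balanced_from_counts = (172 + 196) / 400
skewed_from_counts = (43 + 343) / 400

print(f"Scenario 1 accuracy: {balanced:.1%}")
print(f"Scenario 2 accuracy: {skewed:.1%}")
```

The test itself has not changed between the two calls; only the mix of subjects has.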
There’s currently a lot of press attention surrounding the finger-prick antibody IgG/IgM strip test to detect if a person has had COVID-19. Here in the UK companies are buying them to test their staff, and some in the media are asking why the government hasn’t made millions of tests available to find out who has had the illness and could potentially get back to work.
We did a quick Google search, and there are many similar-looking test kits for sale. The performance claims on some were sketchy, with some using as few as 20 samples to determine their performance claim! However, we found a webpage for a COVID-19 IgG/IgM Rapid antibody test that used a total of 525 cases, with 397 positives, 128 negatives, clinically confirmed. We have no insight as to the reliability of the claims made in the product information. The purpose of this blog post is not to promote or denigrate any test but to illustrate how to look further than headline figures.
We ran the data through the version 5.51. Here's the workbook containing the analysis:
Our focus at Analyse-it has always been on the development and improvement of our software. While we provide extensive help, tutorials, and technical support for Analyse-it, one area we do not cover is training and consultancy. As many of you will know we are based in England in the United Kingdom, and providing training and consultancy is often done better locally, in-person.
Instead we partner with experts who can provide training and consultancy in various disciplines, in local language, and geographically near (or at least nearer) to our customers. You can always find a list of current consultant and training partners at
One of the experts we have had a long relationship with is Dr. Thomas Keller. Dr. Keller is an independent statistician and has run for 15 years. One of his many areas of expertise is the planning and evaluation of experiments for method validation and he has been involved in international working groups (IFCC, CLSI) in the fields of clinical chemistry and laboratory medicine. Dr. Keller was actually a customer and started to provide training in Analyse-it shortly after. His reputation is second to none in the industry and he has provided consultancy and training to many companies using Analyse-it. See an example of a offered by Dr. Keller. He also provides for anything from simple questions to full courses for individuals and small groups.
Prediction intervals on Deming regression are a major new feature in the Analyse-it Method Validation Edition version 4.90, just released.
A prediction interval is an interval that has a given probability of including one or more future observations. Prediction intervals are very useful in method validation for testing the commutability of reference materials or processed samples with patient samples. Two CLSI protocols, and both use prediction intervals.
We will illustrate this new feature using an example from CLSI EP14-A3:
1) Open the workbook .
2) On the Analyse-it ribbon tab, in the Statistical Analysis group, click Method Comparison and then click Ordinary Deming regression.
3) In the X (Reference / Comparative) drop-down list, select Cholesterol: A.
4) In the Y (Test / New) drop-down list, select Cholesterol: B.
5) On the Analyse-it ribbon tab, in the Method Comparison group, click Restrict to Group.
Often we collect a sample of data not to make statements about that particular sample but to generalize our statements to say something about the population. Estimation is the process of making inferences about an unknown population parameter from a random sample drawn from the population of interest. An estimator is a method for arriving at an estimate of the value of an unknown parameter. Often there are many competing estimators for the population parameter that differ based on the underlying statistical theory.
A point estimate is the best estimate, in some sense, of the population parameter. The most well-known estimator is the sample mean which produces an estimate of the population mean.
It should be obvious that any point estimate is not absolutely accurate. It is an estimate based on only a single random sample. If repeated random samples were taken from the population the point estimate would be expected to vary from sample to sample. This leads to the definition of an interval estimator which provides a range of values defined by the limits [L, U].
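Here is a small Python sketch of the idea, using simulated data. The interval uses a normal (z) approximation rather than the t distribution, which is our simplification and is only reasonable for larger samples; the population values and the `mean_with_interval` name are made up for illustration.

```python
import random
from statistics import NormalDist, mean, stdev

def mean_with_interval(sample, confidence=0.95):
    """Point estimate of the mean with an approximate interval [L, U].

    Assumption: normal (z) approximation, adequate only for larger samples.
    """
    n = len(sample)
    estimate = mean(sample)
    std_error = stdev(sample) / n ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate, estimate - z * std_error, estimate + z * std_error

# Draw repeated random samples from the same hypothetical population.
rng = random.Random(1)
results = []
for _ in range(3):
    sample = [rng.gauss(120, 10) for _ in range(50)]
    results.append(mean_with_interval(sample))

for estimate, lower, upper in results:
    print(f"mean = {estimate:.2f}, interval = [{lower:.2f}, {upper:.2f}]")
```

Each sample gives a slightly different point estimate, and the interval gives a sense of how far that estimate might plausibly be from the population value.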
A critical feature of any analytical and statistical software is accuracy. You are making decisions based on the statistics obtained and you need to know you can rely on them.
We have documented our previously, but another good benchmark to test statistical software against is the NIST StRD. The Statistical Engineering and Mathematical and Computational Sciences Divisions of NIST’s Information Technology Laboratory developed datasets with certified values for a variety of statistical methods against which statistical software packages can be benchmarked. The certified values are computed using ultra-high precision floating point arithmetic and are accurate to 15 significant digits.
For more information about the NIST StRD see:
We tested version 4.00 of Analyse-it against the NIST StRD on an Intel Xeon dual processor PC.
No statistical package achieves perfect accuracy for all the tests and no one package performs best for every test. Most statistical packages use IEEE 754 double precision (64-bit) floating point arithmetic and, due to finite precision, round-off, and truncation errors in numerical operations, are unable to obtain the exact certified value.
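To show how round-off can bite even in a simple statistic, here is a Python sketch comparing two textbook ways of computing a sample variance in double precision. The data are a well-known illustrative example (small deviations around a very large mean), not a NIST StRD dataset, and neither function is taken from Analyse-it.

```python
def naive_variance(values):
    """One-pass formula: (sum of squares - (sum)^2 / n) / (n - 1)."""
    n = len(values)
    total = 0.0
    total_sq = 0.0
    for x in values:
        total += x
        total_sq += x * x  # squares near 1e18 lose their low-order digits
    return (total_sq - total * total / n) / (n - 1)

def two_pass_variance(values):
    """Two-pass formula: compute the mean first, then sum squared deviations."""
    n = len(values)
    total = 0.0
    for x in values:
        total += x
    centre = total / n
    sum_sq_dev = 0.0
    for x in values:
        sum_sq_dev += (x - centre) ** 2
    return sum_sq_dev / (n - 1)

# Deviations of -6, -3, 3, 6 around a mean of 1e9 + 10: the exact variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

print("naive   :", naive_variance(data))
print("two-pass:", two_pass_variance(data))
```

Both formulas are algebraically identical, but the one-pass version subtracts two huge, nearly equal numbers and goes badly wrong (it can even come out negative), while the two-pass version recovers the exact answer here. This is the kind of behaviour the certified values help to expose.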
The recent passing of Professor Rick Jones (see ) caused me to reflect on the past.
I was very fortunate to earn a work placement with Dr Rick Jones at The University of Leeds in the summer of 1990. Rick was enthusiastic about the role of IT in medicine, and after securing funding for a full-time position he employed me as a computer programmer. Early projects included software for automating the monitoring of various blood marker tests and software to diagnose Down’s syndrome. At the time many hospitals had in-house solutions for diagnosing Down’s syndrome, and although the project took many years and the help of many other people to complete, it eventually gained widespread adoption.
Around 1992, Rick came up with the idea of a statistics package that integrated into Microsoft Excel. Armed with a ring-bound folder containing the Excel SDK and a pile of medical statistics books, I set about the task of writing the software in C++. It wasn’t long before the first version of Astute was ready and commercially released.
Today we released version 3.80 of the Analyse-it Standard edition.
The new release includes Principal Component Analysis (PCA), an extension to the multivariate analysis already available in Analyse-it. It also includes probably the most advanced implementation of biplots available in any commercial package.
New features include:
The tutorial walks you through a guided example looking at how to use correlation and principal component analysis to discover the underlying relationships in data about New York Neighbourhoods. It demonstrates the amazing new features and helps you understand how to use them. You can either follow the tutorial yourself, at your own pace, or .
If you have you can download and install the update now, see . If maintenance on your licence has expired you can renew it to get this update and forthcoming updates, see .
What is a sample quantile or percentile? Take the 0.25 quantile (also known as the 25th percentile, or 1st quartile) -- it defines the value (let’s call it x) for a random variable, such that the probability that a random observation of the variable is less than x is 0.25 (25% chance).
A simple question, with a simple definition? The problem is calculating quantiles. The formulas are simple enough, but take a quick look on Wikipedia and you’ll see there are at least 9 alternative methods . Consequently, statistical packages use different formulas to calculate quantiles. And we're sometimes asked why the quantiles calculated by Analyse-it don’t always agree with Excel, SAS, or R.
Excel uses formula R-7 (in the Wikipedia article) to calculate the QUARTILE and PERCENTILE functions. Excel 2010 introduced two new functions: PERCENTILE.INC, which gives the same results as PERCENTILE, and PERCENTILE.EXC, which uses a slightly different formula with a different denominator.
SAS, R and some other packages let you choose which formula is used to calculate the quantiles. While this provides some flexibility, as it lets you reproduce statistics calculated using another package, the options can be confusing. Most non-statisticians don’t know when to use one method over another. When would you use the "Linear interpolation of the empirical distribution function" versus the "Linear interpolation of the modes for the order statistics for the uniform distribution on [0,1]" method?
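To see how much the choice of formula can matter, here is a short Python sketch using the standard library's `statistics.quantiles`, which offers two methods. As far as we are aware, `inclusive` corresponds to R-7 (the PERCENTILE.INC approach) and `exclusive` to R-6 (the PERCENTILE.EXC approach), but treat that mapping as our assumption rather than a guarantee. The data are made up.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Quartiles (n=4 cut points) by two different interpolation formulas.
inclusive = quantiles(data, n=4, method="inclusive")
exclusive = quantiles(data, n=4, method="exclusive")

print("inclusive:", inclusive)  # [3.25, 5.5, 7.75]
print("exclusive:", exclusive)  # [2.75, 5.5, 8.25]
```

Same data, same "25th percentile", two different answers. Neither is wrong; they simply use different definitions, which is why results from different packages may not match.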
In a previous post, , we explained the tests provided in Analyse-it to determine if a sample has a normal distribution. In that post, we mentioned that although hypothesis tests are useful you should not solely rely on them. You should always look at the histogram and, maybe more importantly, the normal plot.
The beauty of the normal plot is that it is designed specifically for judging normality. The plot is very easy to interpret and lets you see where the sample deviates from normality.
As an example, let’s look at the distribution of systolic blood pressure, for a random group of healthy patients. Analyse-it creates the histogram (left) and normal plot (right) below:
Looking at the histogram, you can see the sample is approximately normally distributed. The bar heights for 120-122 and 122-124 make the distribution look slightly skewed, so it’s not perfectly clear.
The normal plot is clearer. It shows the observations on the X axis plotted against the expected normal score (Z-score) on the Y axis. It’s not necessary to understand what an expected normal score is, nor how it’s calculated, to interpret the plot. All you need to do is check that the points roughly follow the red line. The red line shows the ideal normal distribution with the mean and standard deviation of the sample. If the points roughly follow the line, as they do in this case, the sample has a normal distribution.