Calculating the Implications of Test Accuracy - Choosing a Test 

For frog and iTEACH, the goal was to design a test kit that individuals could use appropriately, interpret accurately, and get help with as needed. The design was meant to be a model, not a specific solution for one vendor’s test, and it had to be flexible enough to adapt to a number of different off-the-shelf tests. For example, the partnership developed designs for tests using oral swabs as well as for tests using blood samples.

Although not a factor in the development of the Project M integrated design, one important choice for marketing a self-testing option would be deciding which clinical test kit to include in the package. Several types of rapid test kits for HIV or HIV antibodies had been approved for use in clinic settings, but none for home use. The one home test available for sale in the U.S. was a “home collection” kit, where samples were collected at home and sent to a lab for evaluation.

Even if administered and interpreted perfectly, medical tests were never 100% accurate. Manufacturers determined the percentage of failures before tests were licensed. Test inaccuracies created a burden on individuals and healthcare systems, and a more accurate test was better than a less accurate one. However, this difference could be offset by other considerations, such as cost in time or money, or the ease of administering a particular test. In a particular setting, a less accurate test might be more useful than a more accurate one.

Likely options for self-testing were the “quick tests” already on the market for screening in clinic settings. These tests gave results in 20 minutes. Positive results would be retested when the individual went for treatment. The kits could not detect infection immediately after exposure: they tested for antibodies to HIV, which only showed up several weeks after the individual was exposed to the virus. However, the screening tests were quite accurate and not as expensive as more definitive tests, which required several weeks in a laboratory setting to get results.

Some quick tests used either oral fluids or blood samples. Tests using oral fluids had more false positives than tests using blood samples, but collecting oral fluids would be much easier than getting blood samples outside a medical setting.

Tests can be inaccurate in two ways. They can give a “false positive” result – telling an individual that he has the disease when he does not. They can also give a false negative – telling an individual that she does not have the disease when she does. These errors are given in testing literature as probabilities of accuracy – sensitivity is the probability of positive test results when the individual has the disease, and specificity is the probability of negative test results for an individual with no disease.

For example, consider the OraQuick 20-minute screening test. It is approved for use in clinical settings and can be used with either blood samples or oral samples collected with a swab. The blood sample test is more accurate, but the sample might be more difficult for a home user to collect. The literature for the OraQuick test for HIV/AIDS using oral samples shows the following:

The sensitivity of the test is 99.1%, or a probability of 0.9910. So if 10,000 people with HIV antibodies took the test, 9910 would be expected to have a positive result and the remainder, 90 people, would get a negative result, a false negative.

The specificity of the test is 0.9960, meaning that if 10,000 people who were antibody-negative took the test, 9960 would test negative and 40 would test positive, a false positive. (Presumably those 40 people would learn that they did not have the virus when additional tests were run.)

The same test used with blood samples is more accurate. The sensitivity with blood samples is 99.7% or 0.9970 probability. The specificity with blood samples is 99.9%, or 0.9990 probability.

Consider the test from another perspective, the test results: Of those getting a positive result on the test, how many have the virus? How many have false positives? And on the other side, for those getting a negative result, how many are infection free and how many received a false negative? It turns out that those calculations depend on the prevalence of the disease in the population, how rare or common the disease is. (Prevalence should not be confused with incidence, new cases diagnosed in a time period.)

Consider first comparing the two versions of the test in a population where the virus is rare. For example, the U.S. population as a whole has a prevalence of HIV that is less than 1%.

In that population, testing 10,000 people with oral samples, the calculations give an expected result of one false negative and 40 false positives. (See explanation of calculations below.)

In the same population, if 10,000 people were tested using blood samples, researchers would expect no false negatives and about 10 false positives.
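The expected counts for the two versions of the test can be reproduced with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original materials: the function name `expected_errors` is hypothetical, and the 1% prevalence is the round figure this page uses for the US population.

```python
def expected_errors(population, prevalence, sensitivity, specificity):
    """Return (false_negatives, false_positives) expected in a tested population."""
    with_disease = population * prevalence
    without_disease = population - with_disease
    # People with the disease who test negative.
    false_negatives = with_disease * (1 - sensitivity)
    # People without the disease who test positive.
    false_positives = without_disease * (1 - specificity)
    return false_negatives, false_positives

# Sensitivity and specificity as quoted in the text; prevalence assumed to be 1%.
oral = expected_errors(10_000, 0.01, sensitivity=0.991, specificity=0.996)
blood = expected_errors(10_000, 0.01, sensitivity=0.997, specificity=0.999)

print("oral:  %.1f false negatives, %.1f false positives" % oral)
print("blood: %.1f false negatives, %.1f false positives" % blood)
```

Rounded to whole people, this gives roughly 1 false negative and 40 false positives for oral samples, and roughly 0 false negatives and 10 false positives for blood samples.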

The false positive would be extremely distressing to the individual, but the error would be recognized when the individual sought medical help, since treatment of HIV/AIDS in a clinical setting begins with more accurate testing to determine the severity of the infection and thus develop a plan for treatment. The individual with a false negative result would consider him or herself not a source of infection, which might lead to more risky behavior spreading the virus, but in the U.S. population the number of individuals misdiagnosed would be small.

These results would be quite different if the same test were used in KwaZulu-Natal, where 40% of the population is estimated to be HIV positive. The methodology described in text and video below can be used to calculate false positives and false negatives of the two tests in the US and in KwaZulu-Natal.

 

 resources

The U.S. Food and Drug Administration has approved one home test for U.S. use, which includes telephone education, counseling, and follow-up. The test provides for home "specimen collection" rather than testing. Users collect blood samples and send them to a laboratory. The user then calls for results, using the anonymous code number in the kit. The FDA website includes a copy of the package directions.

 definitions

False positive = a test showing the presence of the disease when it is not present

False negative = a test showing the disease not present when it is

Prevalence = prior (pre-test) probability of having the disease.

Sensitivity = probability of a “positive” test in people with the disease.

Specificity = probability of a “negative” test in people without the disease.

Incidence = new cases diagnosed in a time period.

The possibilities can be displayed on a chart, showing the probabilities of "true" positives and "false" negatives for those who have the disease, and the probabilities of "true" negatives and "false" positives for those who do not have the disease.

 


 statistical notebook: Calculating test accuracy in specific populations

A patient receives a "positive" test result. What is the probability he or she carries the disease?

With Bayesian analysis, we use the test accuracy data and population statistics to measure the impact of testing in the population. Here we use the sensitivity and specificity of the 20-minute OraQuick rapid HIV test with an oral swab, and calculate the results in two populations with different prevalence of the disease:

Sensitivity = 0.9910 = probability of positive test results for an individual WITH the disease

Specificity = 0.9960 = probability of negative results for an individual with NO disease

Prevalence of HIV/AIDS in the US population as a whole = 1%.

Prevalence of HIV/AIDS in KwaZulu-Natal = 40%.

 step 1

Draw a tree diagram of the possibilities, dividing the population into two possibilities – those with the disease and those with no disease.

The prevalence in this population is 1%, so the probabilities can be written as 0.010 with the disease and 0.990 (99%) with no disease. (The two probabilities must total 1.000 for the total population.)

The sensitivity is 0.9910, so 99.1% of the population with the disease will test positive. Again, the positive and negative probabilities have to add to 1.0, so the rest, 0.0090, will get a negative test (a false negative).

Fill in the tree for those with no disease in the same way: 0.9960 get a negative test and the rest, 0.0040, receive a positive test result (a false positive).

For each branch of the tree multiply the probabilities across the tree to get the joint probability of having the disease and a specific test result. (The results are in red.)
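The multiplication across each branch can be sketched in Python as follows. This is an illustration only: the dictionary layout and variable names are assumptions made here, and the inputs are the US figures given above.

```python
prevalence = 0.010     # P(disease), the US figure used in this notebook
sensitivity = 0.9910   # P(positive | disease)
specificity = 0.9960   # P(negative | no disease)

# One entry per branch of the tree: (disease status, test result) -> joint probability.
joint = {
    ("disease", "positive"): prevalence * sensitivity,
    ("disease", "negative"): prevalence * (1 - sensitivity),
    ("no disease", "positive"): (1 - prevalence) * (1 - specificity),
    ("no disease", "negative"): (1 - prevalence) * specificity,
}

for (status, result), p in joint.items():
    print(f"{status:>10} and {result}: {p:.5f}")
```

The four joint probabilities come to 0.00991, 0.00009, 0.00396 and 0.98604, and together they account for the whole population.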

 step 2

To calculate the probability of having the disease, given a particular test result, we need to “flip the tree.” We create a new tree and rearrange the options, putting those with positive results (the true positives and the false positives) at the top of the tree.

We put the two segments of the population getting negative results at the bottom of the tree.

We know the joint probabilities, so we fill them in on the new tree. The question marks show the calculations remaining.

 

 

 step 3

The calculation now focuses on the new tree.

In the new tree, the joint probability of a positive result WITH the disease, added to the joint probability of a positive result with NO disease, gives the overall probability of a positive result. The two negative branches are added in the same way to give the overall probability of a negative result.

Adding on each branch of the tree gives the probabilities of positive for the upper branch and then negative results on the lower branch, as shown in blue. As a check, the blue numbers add together to 1.000.
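As a rough check on this step, the additions can be written out in Python. The four joint probabilities below are the step 1 products for the US example (prevalence 0.010, sensitivity 0.9910, specificity 0.9960); the variable names are chosen here for illustration.

```python
# Joint probabilities from step 1 (US example).
disease_and_positive = 0.00991
no_disease_and_positive = 0.00396
disease_and_negative = 0.00009
no_disease_and_negative = 0.98604

# Upper branch of the flipped tree: everyone who tests positive.
p_positive = disease_and_positive + no_disease_and_positive
# Lower branch: everyone who tests negative.
p_negative = disease_and_negative + no_disease_and_negative

print(f"P(positive) = {p_positive:.5f}")
print(f"P(negative) = {p_negative:.5f}")
print(f"total       = {p_positive + p_negative:.3f}")
```

Rounded to four places, the probability of a positive result is 0.0139 and the probability of a negative result is 0.9861.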

 

 step 4

The final step is to fill in the last group of question marks.

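As an example, for the top branch (positive and with the disease), set up the equation and solve for X, the probability of having the disease given a positive result. The figures are the unrounded values from the earlier steps, 0.01387 and 0.00991 (0.0139 and 0.0099 when rounded to four places):

$$0.01387 \, X = 0.00991$$

$$X = \frac{0.00991}{0.01387}$$

$$X \approx 0.7145$$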

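From step 1, the joint probability of a false negative in this population is about 0.0001, or 0.01%. We have shown that out of a population of 10,000 people we would find 139 people with positive test results. In that group, 99 would have the disease (the true positives) and 40 would not have the disease, having received a false positive result. A person with a positive result therefore has about a 71% chance of carrying the disease, which answers the question posed at the start of the notebook.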
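The four steps are equivalent to applying Bayes' theorem directly. As a compact restatement, writing $D$ for "has the disease", $\bar{D}$ for "no disease", and $+$ for a positive test result (notation introduced here, not used in the original notebook), the US example can be written as:

```latex
\begin{align*}
P(D \mid +)
  &= \frac{P(+ \mid D)\,P(D)}
          {P(+ \mid D)\,P(D) + P(+ \mid \bar{D})\,P(\bar{D})} \\[4pt]
  &= \frac{0.9910 \times 0.010}
          {0.9910 \times 0.010 + 0.0040 \times 0.990} \\[4pt]
  &= \frac{0.00991}{0.01387} \approx 0.7145
\end{align*}
```

Here $P(+ \mid D)$ is the sensitivity, $P(+ \mid \bar{D})$ is one minus the specificity, and $P(D)$ is the prevalence.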

 additional analysis for KwaZulu-Natal and 2-by-2 matrix

The tree to the right shows the results of Bayesian analysis looking at a population like KwaZulu-Natal, where the prevalence of the disease is estimated at 40%. This calculation uses the same test specifications for sensitivity and specificity as in the example using the US prevalence of the disease.
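To see how much the answer depends on prevalence, the same calculation can be run for both populations. The sketch below is illustrative only: the function name `posteriors` is hypothetical, and the prevalence figures are the 1% and 40% estimates used in this notebook.

```python
def posteriors(prevalence, sensitivity, specificity):
    """Return (P(disease | positive), P(no disease | negative))."""
    true_pos = prevalence * sensitivity
    false_neg = prevalence * (1 - sensitivity)
    false_pos = (1 - prevalence) * (1 - specificity)
    true_neg = (1 - prevalence) * specificity
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)

# Prevalence estimates from the text; oral-swab sensitivity and specificity.
populations = {"US": 0.01, "KwaZulu-Natal": 0.40}
results = {name: posteriors(p, 0.9910, 0.9960) for name, p in populations.items()}

for name, (given_positive, given_negative) in results.items():
    print(f"{name}: P(disease | positive) = {given_positive:.4f}, "
          f"P(no disease | negative) = {given_negative:.4f}")
```

Under these assumptions, a positive result corresponds to the disease about 71% of the time in the US example but about 99% of the time at 40% prevalence, while a negative result becomes slightly less reassuring as prevalence rises.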

 

 

 

 

We can also present the overall test results for the population as a whole by creating a 2-by-2 matrix, putting the estimate of the overall population having the disease or no disease on the rows, and test results as positive or negative on the columns. Here, returning to the US example above, we can fill in the true and false positives, and the true and false negatives.
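One way to fill in such a matrix is sketched below for 10,000 people in the US example. The function name `two_by_two` and the nested-dictionary layout are assumptions made for illustration; the inputs are the prevalence, sensitivity, and specificity used throughout this notebook.

```python
def two_by_two(population, prevalence, sensitivity, specificity):
    """Expected counts as {row: {column: count}}; rows are disease status."""
    diseased = population * prevalence
    healthy = population - diseased
    return {
        "disease": {"positive": diseased * sensitivity,          # true positives
                    "negative": diseased * (1 - sensitivity)},   # false negatives
        "no disease": {"positive": healthy * (1 - specificity),  # false positives
                       "negative": healthy * specificity},       # true negatives
    }

matrix = two_by_two(10_000, 0.010, 0.9910, 0.9960)

print(f"{'':<12}{'positive':>10}{'negative':>10}{'total':>10}")
for status, row in matrix.items():
    total = row["positive"] + row["negative"]
    print(f"{status:<12}{row['positive']:>10.1f}{row['negative']:>10.1f}{total:>10.1f}")
```

The rows total 100 people with the disease and 9,900 without; the columns total about 139 positive results and about 9,861 negative results.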