Human challenge studies in the study of infectious diseases
What can deliberately infecting healthy people tell us about infectious diseases? How is this useful for developing treatments, and how do we manage the risks?
Testing people to see if they are currently infected or previously infected with SARS-CoV-2, the virus that causes COVID-19, is a key component of medical management, public health monitoring and research. Diagnosing people as having active infections is a fundamental part of any test and contact tracing system. Improving the speed and accuracy of tests that detect current infections is a research priority and the focus of recent UK Government investment and policy decisions. Antibody tests are also an important tool to understand how many people in the population have been infected and how their immune system responded.
DOI: https://doi.org/10.58248/RR45
Antibody test: detects antibodies to SARS-CoV-2 produced during a current or previous infection.
Antigen test: detects viral material indicating a current infection.
Diagnostic test: a test that can confirm if someone is currently infected with SARS-CoV-2.
False negative: an incorrect result when someone with a SARS-CoV-2 infection tests negative.
False positive: an incorrect result when someone who does not have a SARS-CoV-2 infection tests positive.
Mass screening/testing: using tests in a large sample of healthy people to detect those who are currently infected.
Molecular test: a test that detects viral genetic material through PCR or newer laboratory techniques.
PCR test: Polymerase Chain Reaction, a type of molecular test.
Sensitivity: how well a test reports a positive result for people who have COVID-19 or SARS-CoV-2 antibodies.
Specificity: how well a test reports a negative result for people who do not have COVID-19 or SARS-CoV-2 antibodies.
There are two main types of test used to identify COVID-19 caused by infections with the SARS-CoV-2 virus. They either detect the presence of the virus or an immune response to it.
As with any medical diagnostic test, data on how confident we can be in the accuracy and reliability of these tests are crucial. This is complex because it depends on several factors, including how a test is evaluated and how its performance may change when it is used in the real world rather than in a highly controlled laboratory environment. Depending on the context and purpose for which tests are used, some test characteristics may be more important than others. The accuracy of testing also depends on what proportion of the population has an infection (or antibodies) at any given time. This is explained later.
The accuracy of diagnostic tests is usually benchmarked against a highly reliable reference standard, sometimes called a ‘gold standard’. There is no gold standard reference test for COVID-19 and no generally accepted reference standard against which to measure a diagnostic test’s performance. Therefore, no test can claim 100% accuracy. This is also the case for antibody tests.
Most tests that detect SARS-CoV-2 infections are currently benchmarked against the type of test seen as the most accurate available. This is the RT-PCR (reverse transcription polymerase chain reaction) test, which is carried out in a laboratory. It uses specialised equipment to amplify the viral genetic material in a sample so that it can be detected. This test is the mainstay of COVID-19 testing in the UK. Test samples are sent to and processed in NHS Trust laboratories, national public health agency laboratories and the UK Lighthouse Labs Network (a network of diagnostic centres focused on COVID-19 testing).
Similarly, there is no agreed reference standard for antibody tests. The pragmatic solution is to compare a test with a composite standard based on samples containing antibodies taken from patients with confirmed disease at an appropriate stage of infection. The stage of infection is particularly important for antibody tests, because it takes time to build up levels of antibodies after becoming infected, typically about 2 weeks, although this can be longer in people with mild or no symptoms. The timing of the test is also important because levels of SARS-CoV-2 antibodies decrease over time. This means that someone who was infected 6 months ago could now test negative, either because they no longer have antibodies or because the antibodies are present at a level too low for the test to detect.
The National Institute for Health and Care Excellence and the Medicines and Healthcare products Regulatory Agency have published detailed guidance for test manufacturers about the essential test features and the standards that diagnostic and antibody tests should meet. This guidance sets out both the best approaches to evaluating test performance and the minimum reference standards to use, as well as more detailed information about the minimum levels for sensitivity and specificity of tests according to the context in which they are to be used. It also gives manufacturers a clear idea of the UK Government’s requirements on usability, safety and how quickly results need to be produced. This will help a manufacturer determine their capacity to supply tests at the volume required in order for them to be used nationally, in the NHS or screening programmes.
Many commercial tests to detect a current infection or antibodies are available. Comparing the relative performance of different commercial tests is difficult because manufacturers may measure their test's performance against different reference standards. For this reason, public health agencies in the UK designed their own reference standards for diagnostic tests and evaluated the performance of several commercial tests against them to work out which are best suited for use by government, such as in the NHS or in infection surveillance projects. Public Health England has also carried out head-to-head evaluations of several commercial antibody tests. Data on commercial tests help public health agencies develop communication materials explaining the limits of testing to the professionals using the tests and to the public (this is particularly relevant for people who may access testing privately).
Understanding the extent to which a test can detect even very small amounts of virus or antibodies is paramount. This is called the limit of detection and refers to the minimum amount of material that a test can detect. This is important because some samples may contain less viral material or antibodies than others and the amount of virus and antibodies in the body changes as an infection progresses. There may also be differences in the amount of viral material in different parts of the body, so where and how a sample is taken is very important. Diagnostic tests also need to be able to distinguish between SARS-CoV-2 and other viruses that may be present in a sample, especially other coronaviruses that can cause respiratory infections.
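When the accuracy of tests is discussed, two important terms are used: sensitivity and specificity. For a diagnostic test that checks whether someone has an infection, sensitivity is how well the test reports a positive result for people who are infected, and specificity is how well it reports a negative result for people who are not infected.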
Manufacturers report test sensitivity and specificity by checking how well their test confirms the results for a group of reference samples that are known to be either positive or negative.
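As a rough illustration of how these two figures are derived from reference samples, the sketch below counts correct and incorrect results on a panel of known positives and known negatives. The function name and the panel sizes (200 known-positive and 300 known-negative samples) are hypothetical and chosen only to give round numbers; they are not taken from any real evaluation.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from counts on reference samples."""
    # Sensitivity: share of known-positive samples that the test reports as positive.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of known-negative samples that the test reports as negative.
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical reference panel: 200 known positives, 300 known negatives.
sens, spec = sensitivity_specificity(true_pos=190, false_neg=10,
                                     true_neg=285, false_pos=15)
print(f"Sensitivity: {sens:.1%}")
print(f"Specificity: {spec:.1%}")
```

With these assumed counts both figures come out at 95%, the values used in the worked examples in this article.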
However, these numbers do not give a complete picture of a test's reliability. The value of a test in real-world use can be quite different, because it depends on how common the infection is in the population, and we usually do not know exactly what proportion of people are infected or have antibodies. As a result, the share of results that are false negatives or false positives can vary widely from one setting to another.
For example, if a group of 10,000 patients is hospitalised with suspected COVID-19 symptoms in an outbreak area, as many as 90% of them may actually be infected with SARS-CoV-2. Assuming that 90% figure, a diagnostic test with 95% sensitivity and 95% specificity would have the following predictive accuracy:
Test result | Infected with SARS-CoV-2 | Not infected with SARS-CoV-2 | Total | Predictive accuracy of test
Test positive | 8,550 (true positive) | 50 (false positive) | 8,600 | (8,550/8,600 x 100) = 99.4%
Test negative | 450 (false negative) | 950 (true negative) | 1,400 | (950/1,400 x 100) = 67.9%
Overall, the test is very good at identifying people with the infection. However, 50 people who are not infected will still test positive and 450 infected people test negative.
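The predictive accuracy column in the table above can be written as two short formulas. The abbreviations TP, FP, TN and FN (true positive, false positive, true negative, false negative) are shorthand introduced here, and PPV and NPV (positive and negative predictive value) are the usual epidemiological names for what the table calls predictive accuracy; neither set of labels appears in the original text.

```latex
\[
\mathrm{PPV} = \frac{TP}{TP + FP} = \frac{8{,}550}{8{,}550 + 50} \approx 99.4\%,
\qquad
\mathrm{NPV} = \frac{TN}{TN + FN} = \frac{950}{950 + 450} \approx 67.9\%
\]
```

Read this way, a positive result in this group is almost always correct, whereas roughly one in three negative results comes from someone who is in fact infected.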
If only 5% of people are infected and 10,000 people are tested, the predictive accuracy of results for a test with 95% sensitivity and 95% specificity looks quite different:
Test result | Infected with SARS-CoV-2 | Not infected with SARS-CoV-2 | Total | Predictive accuracy of test
Test positive | 475 (true positive) | 475 (false positive) | 950 | (475/950 x 100) = 50.0%
Test negative | 25 (false negative) | 9,025 (true negative) | 9,050 | (9,025/9,050 x 100) = 99.7%
This example shows that when disease (or antibody) prevalence is low in the population, a positive result is more likely to be a false positive, even with tests that have reasonably high levels of sensitivity and specificity. Decisions about balancing test sensitivity and specificity depend on the purpose for which testing is being used.
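The two worked tables above can be reproduced with a few lines of arithmetic. The following is a minimal sketch: the function name `two_by_two` and the dictionary keys are illustrative choices made here, and the only inputs are the figures already given in the text (10,000 people tested, 95% sensitivity, 95% specificity, and a prevalence of 90% or 5%).

```python
def two_by_two(n_tested, prevalence, sensitivity, specificity):
    """Build the counts of a 2x2 test table and its predictive accuracy."""
    n_infected = round(n_tested * prevalence)
    n_not_infected = n_tested - n_infected
    true_pos = round(n_infected * sensitivity)        # infected, test positive
    true_neg = round(n_not_infected * specificity)    # not infected, test negative
    false_neg = n_infected - true_pos                 # infected, test negative
    false_pos = n_not_infected - true_neg             # not infected, test positive
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        # Share of positive results that are correct.
        "ppv": true_pos / (true_pos + false_pos),
        # Share of negative results that are correct.
        "npv": true_neg / (true_neg + false_neg),
    }


for prevalence in (0.90, 0.05):
    table = two_by_two(10_000, prevalence, 0.95, 0.95)
    print(f"Prevalence {prevalence:.0%}: "
          f"{table['false_pos']} false positives, "
          f"{table['false_neg']} false negatives, "
          f"positive results correct {table['ppv']:.1%}, "
          f"negative results correct {table['npv']:.1%}")
```

Changing only the prevalence argument shows how the same test can give very different predictive accuracy in different populations.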
Test error is also amplified when tests are used in the “real world”. The results from tests performed under strict research laboratory conditions are not necessarily replicated when tests are used operationally, such as in large scale testing programmes. Errors can arise for several reasons. For example, a sample may be taken incorrectly or become contaminated, either of which can lead to more false results. There are no data yet on the extent to which operational use of tests in national COVID diagnostic testing programmes exacerbates the problem of false positive results, but one estimate of the false positive rate, based on tests using similar technologies for other viruses, is 2.3%. The rate of false negatives in large programmes is also unknown but will be influenced by the timing of the test: samples taken in the early and late stages of infection are more likely to be falsely negative.
The British Medical Journal has an interactive COVID-19 test calculator where you can explore how the features of a test influence the accuracy of results.
The characteristics of each test are important, but the way in which they will be used also has implications for the interpretation of the results. For example, the most important features of a test to be used for confirming a clinical diagnosis of a patient in hospital with COVID-19 are very different to those for a test intended for a mass screening programme. As the examples above show, tests with high sensitivity work well when there is a high chance that the person is infected. Specificity is much more important if tests are used to screen a very large population of people where most don’t have the infection.
The rate of false positives and negatives has significant implications when a test is to be performed at scale as a screening tool, such as in a large workforce or at a national population level. This is because even if a test has 99% sensitivity and specificity, large numbers of people would get an incorrect positive result and subsequently be required to self-isolate, or get a false negative result and could go on to spread the virus. False positive results would also lead to additional unnecessary effort to identify contacts who would then be required to self-isolate. When the overall prevalence of infection in the population is low, the mass screening of asymptomatic people also increases the proportion of positive results that are false. This then necessitates confirmatory testing using a second, different test so that people know whether to self-isolate or return to daily life.
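To give a sense of scale, the sketch below applies a test with 99% sensitivity and 99% specificity to a large screening programme and then retests everyone who tested positive. The population size (1,000,000) and prevalence (0.5%) are illustrative assumptions, not figures from this article. The second test is also assumed to have the same performance as the first and to make its errors independently of it, which is a strong simplification that real confirmatory tests may not satisfy.

```python
def screen(n_infected, n_not_infected, sensitivity, specificity):
    """Expected counts after one round of testing."""
    true_pos = n_infected * sensitivity
    false_pos = n_not_infected * (1 - specificity)
    return {
        "true_pos": true_pos,
        "false_neg": n_infected - true_pos,
        "false_pos": false_pos,
        "true_neg": n_not_infected - false_pos,
    }


def share_correct_positives(result):
    """Proportion of positive results that are true positives."""
    return result["true_pos"] / (result["true_pos"] + result["false_pos"])


# Illustrative assumptions only: none of these values come from the article.
POPULATION = 1_000_000
PREVALENCE = 0.005
SENSITIVITY = SPECIFICITY = 0.99

infected = POPULATION * PREVALENCE
first = screen(infected, POPULATION - infected, SENSITIVITY, SPECIFICITY)

# Confirmatory round: retest only the people who tested positive,
# assuming the second test's errors are independent of the first.
second = screen(first["true_pos"], first["false_pos"], SENSITIVITY, SPECIFICITY)

for label, result in (("Screening test", first), ("After confirmatory test", second)):
    print(f"{label}: {result['false_pos']:,.0f} false positives, "
          f"{share_correct_positives(result):.1%} of positives correct")
```

Under these assumptions most first-round positives are false, and the confirmatory round removes nearly all of them at the cost of missing a small number of additional true infections.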
The latest estimate from a national COVID-19 infection survey is that nearly 105,000 people had an infection between 13 and 19 September. In such a screening programme, the problems detailed above are present. However, such a survey still provides very useful information because statistical techniques can be used to control for false positives and false negatives. It also provides information over time and allows comparisons between different geographic areas – so the direction of the outbreak can be better understood. These findings can then be supported by other evidence, such as hospital admissions.
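The text does not say which statistical techniques the survey uses. As one illustration only, the sketch below applies the Rogan-Gladen estimator, a standard textbook correction that converts the observed share of positive tests into an estimate of true prevalence when sensitivity and specificity are known. It is swapped in here as an example and should not be read as the survey's actual method. The input reuses the low-prevalence worked example from earlier in this article, where 950 of 10,000 results were positive.

```python
def rogan_gladen(apparent_prevalence, sensitivity, specificity):
    """Estimate true prevalence from the observed share of positive tests."""
    denominator = sensitivity + specificity - 1
    if denominator <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (apparent_prevalence + specificity - 1) / denominator
    # Sampling noise can push the raw estimate outside 0..1, so clamp it.
    return min(1.0, max(0.0, estimate))


# 950 positive results out of 10,000 tests, with 95% sensitivity and specificity.
apparent = 950 / 10_000
corrected = rogan_gladen(apparent, sensitivity=0.95, specificity=0.95)
print(f"Observed positive rate: {apparent:.1%}")
print(f"Corrected prevalence estimate: {corrected:.1%}")
```

The observed positive rate of 9.5% is corrected back to the 5% prevalence that the worked example started from, which shows why survey-level estimates can remain informative even when many individual results are wrong.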
The scientific committee advising the Welsh Government on COVID-19 has published a framework for assessing the utility of tests in different scenarios as either a diagnostic or screening tool. A subgroup of the Scientific Advisory Group for Emergencies has also published recommendations on using tests in mass screening programmes.