Introduction: Although individual HIV rapid diagnostic tests (RDTs) show good performance in evaluations conducted by WHO, reports from several African countries highlight potentially significant performance issues. Despite widespread use of RDTs for HIV diagnosis in resource-constrained settings, there has been no systematic, head-to-head evaluation of their accuracy with specimens from diverse settings across sub-Saharan Africa. We conducted a standardized, centralized evaluation of eight HIV RDTs and two simple confirmatory assays at a WHO collaborating centre for evaluation of HIV diagnostics using specimens from six sites in five sub-Saharan African countries.
Methods: Specimens were transported to the Institute of Tropical Medicine (ITM), Antwerp, Belgium for testing. The tests were evaluated by comparing their results to a state-of-the-art reference algorithm to estimate sensitivity, specificity and predictive values.
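To illustrate the accuracy measures named above, the following is a minimal Python sketch. It assumes each index test result has been cross-tabulated against the reference algorithm into a 2x2 table. The names `TwoByTwo` and `accuracy_measures` and all counts are hypothetical and are not taken from the study; indeterminate results are assumed to have been excluded beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of index-test results versus the reference algorithm (hypothetical container)."""
    tp: int  # index test positive, reference positive
    fp: int  # index test positive, reference negative
    fn: int  # index test negative, reference positive
    tn: int  # index test negative, reference negative


def accuracy_measures(t: TwoByTwo) -> dict[str, float]:
    """Return sensitivity, specificity, PPV and NPV as proportions.

    Assumes every denominator is non-zero; a ZeroDivisionError is raised otherwise.
    """
    return {
        "sensitivity": t.tp / (t.tp + t.fn),  # true positives among reference positives
        "specificity": t.tn / (t.tn + t.fp),  # true negatives among reference negatives
        "ppv": t.tp / (t.tp + t.fp),          # reference positives among test positives
        "npv": t.tn / (t.tn + t.fn),          # reference negatives among test negatives
    }


# Illustrative counts only, NOT data from this evaluation.
example = TwoByTwo(tp=495, fp=10, fn=5, tn=990)
measures = accuracy_measures(example)

for name, value in measures.items():
    print(f"{name}: {value:.1%}")
```

Unlike sensitivity and specificity, the predictive values depend on the prevalence in the specimen set, so they would differ between sites even for a test with identical sensitivity and specificity.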
Results: A total of 2785 samples collected from August 2011 to January 2015 were tested at ITM. All RDTs showed very high sensitivity, ranging from 98.8% for First Response HIV Card Test 1–2.0 to 100% for Determine HIV 1/2, Genie Fast, SD Bioline HIV 1/2 3.0 and INSTI HIV-1/HIV-2 Antibody Test kit. Specificity ranged from 90.4% for First Response to 99.7% for HIV 1/2 STAT-PAK, with wide variation by geographical origin of specimens. Multivariate analysis showed that several factors were associated with false-positive results, including gender, provider-initiated testing and the geographical origin of specimens. For the simple confirmatory assays, overall sensitivity and specificity were 100% and 98.8% for ImmunoComb II HIV 1&2 CombFirm (ImmunoComb) and 99.7% and 98.4% for Geenius HIV 1/2, with indeterminate rates of 8.9% and 9.4%, respectively.
Conclusions: In this first systematic head-to-head evaluation of the most widely used RDTs, individual RDTs performed more poorly than in the WHO evaluations: only one test met the recommended thresholds for RDTs of ≥99% sensitivity and ≥98% specificity. By performing all tests in a centralized setting, we show that these differences in performance cannot be attributed to study procedure, end-user variation, storage conditions, or other methodological factors. These results highlight the existence of geographical and population differences in individual HIV RDT performance and underscore the challenges of designing locally validated algorithms that meet the latest WHO-recommended thresholds.