Successful incorporation of single reviewer assessments during systematic review screening: development and validation of sensitivity and work-saved of an algorithm that considers exclusion criteria and count Nama, Nassr; Hennawy, Mirna; Barrowman, Nick; O’Hearn, Katie; Sampson, Margaret; McNally, James D
Background: Accepted systematic review (SR) methodology requires citation screening by two reviewers to maximise retrieval of eligible studies. We hypothesized that records could be excluded by a single reviewer without loss of sensitivity in two conditions; the record was ineligible for multiple reasons, or the record was ineligible for one or more specific reasons that could be reliably assessed. Methods: Twenty-four SRs performed at CHEO, a pediatric health care and research centre in Ottawa, Canada, were divided into derivation and validation sets. Exclusion criteria during abstract screening were sorted into 11 specific categories, with loss in sensitivity determined by individual category and by number of exclusion criteria endorsed. Five single reviewer algorithms that combined individual categories and multiple exclusion criteria were then tested on the derivation and validation sets, with success defined a priori as less than 5% loss of sensitivity. Results: The 24 SRs included 930 eligible and 27390 ineligible citations. The reviews were mostly focused on pediatrics (70.8%, N=17/24), but covered various specialties. Using a single reviewer to exclude any citation led to an average loss of sensitivity of 8.6% (95%CI, 6.0–12.1%). Excluding citations with ≥2 exclusion criteria led to 1.2% average loss of sensitivity (95%CI, 0.5–3.1%). Five specific exclusion criteria performed with perfect sensitivity: conference abstract, ineligible age group, case report/series, not human research, and review article. In the derivation set, the five algorithms achieved a loss of sensitivity ranging from 0.0 to 1.9% and work-saved ranging from 14.8 to 39.1%. In the validation set, the loss of sensitivity for all 5 algorithms remained below 2.6%, with work-saved between 10.5% and 48.2%. Conclusions: Findings suggest that targeted application of single-reviewer screening, considering both type and number of exclusion criteria, could retain sensitivity and significantly decrease workload. Further research is required to investigate the potential for combining this approach with crowdsourcing or machine learning methodologies.
Item Citations and Data
Attribution 4.0 International (CC BY 4.0)