Accurate classification of carotid endarterectomy indication using physician claims and hospital discharge data

UBC Faculty Research and Publications

van Gaal, Stephen; Alimohammadi, Arshia; Yu, Amy Y. X.; Karim, Mohammad Ehsanul; Zhang, Wei (Health economist); Sutherland, Jason M.

Abstract

Background and purpose: Studies of carotid endarterectomy (CEA) require stratification by symptomatic vs asymptomatic status because of marked differences in benefits and harms. In administrative datasets, this classification has been done using hospital discharge diagnosis codes of uncertain accuracy. This study aims to develop and evaluate algorithms for classifying symptomatic status using hospital discharge and physician claims data.

Methods: A single center’s administrative database was used to assemble a retrospective cohort of participants with CEA. Symptomatic status was ascertained by chart review prior to linkage with physician claims and hospital discharge data. Accuracy of rule-based classification by discharge diagnosis codes was measured by sensitivity and specificity. Elastic net logistic regression and random forest models combining physician claims and discharge data were generated from the training set and assessed in a test set of final-year participants. Models were compared to rule-based classification using sensitivity at fixed specificity.

Results: We identified 971 participants undergoing CEA at the Vancouver General Hospital (Vancouver, Canada) between January 1, 2008 and December 31, 2016. Of these, 729 met inclusion/exclusion criteria (n = 615 training, n = 114 test). Classification of symptomatic status using hospital discharge diagnosis codes was 32.8% (95% CI 29–37%) sensitive and 98.6% (95% CI 96–100%) specific. At matched 98.6% specificity, models that incorporated physician claims data were significantly more sensitive: elastic net 69.4% (59–82%) and random forest 78.8% (69–88%).

Conclusion: Discharge diagnoses were specific but insensitive for the classification of CEA symptomatic status. Elastic net and random forest machine learning algorithms that included physician claims data were sensitive and specific, and are likely an improvement over the current state of classification by discharge diagnosis alone.
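The comparison metric named in the abstract, sensitivity at a fixed (matched) specificity, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels and the toy scores below are all hypothetical, and the sketch assumes that both classes are present and that a higher model score means "more likely symptomatic".

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = symptomatic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Assumes at least one positive and one negative case.
    return tp / (tp + fn), tn / (tn + fp)


def sensitivity_at_specificity(
    y_true: Sequence[int], scores: Sequence[float], target_spec: float
) -> Tuple[float, float, float]:
    """Find the lowest score threshold whose specificity reaches target_spec.

    A case is called symptomatic when its score is strictly above the
    threshold. Specificity can only rise as the threshold rises, so the
    first threshold that meets the target also gives the highest
    sensitivity available at that specificity.
    Returns (threshold, sensitivity, specificity).
    """
    for thr in sorted(set(scores)):
        y_pred: List[int] = [1 if s > thr else 0 for s in scores]
        sens, spec = sensitivity_specificity(y_true, y_pred)
        if spec >= target_spec:
            return thr, sens, spec
    raise ValueError("target specificity must be at most 1.0")


# Hypothetical toy data, not taken from the study.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.2, 0.7, 0.3, 0.1, 0.05]
threshold, sens, spec = sensitivity_at_specificity(toy_labels, toy_scores, 0.75)
print(threshold, sens, spec)
```

In a setting like the one described, the target specificity would be the value observed for the rule-based classifier, so that a model and the rule are compared on sensitivity alone.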

Item Metadata

Title	Accurate classification of carotid endarterectomy indication using physician claims and hospital discharge data
Creator	van Gaal, Stephen; Alimohammadi, Arshia; Yu, Amy Y. X.; Karim, Mohammad Ehsanul; Zhang, Wei (Health economist); Sutherland, Jason M.
Contributor	Centre for Health Evaluation & Outcome Sciences; University of British Columbia. Centre for Health Services and Policy Research
Publisher	BioMed Central
Date Issued	2022-03-22
Subject	Machine learning; Administrative data; Carotid endarterectomy; Statistical model
Genre	Article
Type	Text
Language	eng
Date Available	2022-04-14
Provider	Vancouver : University of British Columbia Library
Rights	Attribution 4.0 International (CC BY 4.0)
DOI	10.14288/1.0412852
URI	http://hdl.handle.net/2429/81198
Affiliation	Medicine, Faculty of; Non UBC; Population and Public Health (SPPH), School of
Citation	BMC Health Services Research. 2022 Mar 22;22(1):379
Publisher DOI	10.1186/s12913-022-07614-1
Peer Review Status	Reviewed
Scholarly Level	Faculty; Researcher
Copyright Holder	The Author(s)
Rights URI	http://creativecommons.org/licenses/by/4.0/
Aggregated Source Repository	DSpace