RESEARCH    Open Access

Extending and encoding existing biological terminologies and datasets for use in the reasoned semantic web

Soroush Samadian1, Bruce McManus1 and Mark D Wilkinson2,3*

Abstract

Background: Clinical phenotypes and disease-risk stratification are most often determined through the direct observations of clinicians in conjunction with published standards and guidelines, where the clinical expert is the final arbiter of the patient's classification. While this "human" approach is highly desirable in the context of personalized and optimal patient care, it is problematic in a healthcare research setting because the basis for the patient's classification is not transparent, and likely not reproducible from one clinical expert to another. This sits in opposition to the rigor required to execute, for example, genome-wide association analyses and other high-throughput studies where a large number of variables are being compared to a complex disease phenotype. Most clinical classification systems are not structured for automated classification, and similarly, clinical data are generally not represented in a form that lends itself to automated integration and interpretation. Here we apply Semantic Web technologies to the problem of automated, transparent interpretation of clinical data for use in high-throughput research environments, and explore migration paths for existing data and legacy semantic standards.

Results: Using a dataset from a cardiovascular cohort collected two decades ago, we present a migration path - both for the terminologies/classification systems and the data - that enables rich automated clinical classification using well-established standards. This is achieved by establishing a simple and flexible core data model, which is combined with a layered ontological framework utilizing both logical reasoning and analytical algorithms to iteratively "lift" clinical data through increasingly complex layers of interpretation and classification. We compare our automated analysis to that of the clinical expert, and discrepancies are used to refine the ontological models, finally arriving at ontologies that mirror the expert opinion of the individual clinical researcher. Other discrepancies, however, could not be as easily modeled, and we evaluate what information we are lacking that would allow these discrepancies to be resolved in an automated manner.

Conclusions: We demonstrate that the combination of semantically-explicit data, logically rigorous models of clinical guidelines, and publicly-accessible Semantic Web Services can be used to execute automated, rigorous and reproducible clinical classifications with an accuracy approaching that of an expert. Discrepancies between the manual and automatic approaches reveal, as expected, that clinicians do not always rigorously follow established guidelines for classification; however, we demonstrate that "personalized" ontologies may represent a re-usable and transparent approach to modeling individual clinical expertise, leading to more reproducible science.

* Correspondence: markw@illuminae.com
2 NCE CECR Center of Excellence for the Prevention of Organ Failure (PROOF Centre), St. Paul's Hospital, Room 166, Burrard Building, 1081 Burrard Street, Vancouver, BC, Canada V6Z 1Y6
3 Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid, Madrid, Spain
Full list of author information is available at the end of the article

Samadian et al. Journal of Biomedical Semantics 2012, 3:6
© 2012 Samadian et al.; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

Terminologies and nosologies have long been used by clinicians and clinical researchers as a means of more consistently annotating their observations. It is not surprising, then, that the emergence of the Semantic Web found fertile ground in the clinical and life science communities, and formal Semantic Web standards have been rapidly adopted by these communities to migrate existing annotation systems into these modern frameworks and syntaxes. While this largely syntactic migration is a useful exercise, in that it becomes possible to do simple reasoning over manual annotations, it does not enable the full power of modern semantic technologies to be applied to these important biomedical datasets. This is, in part, because these semantic resources continue to be used largely as controlled vocabularies rather than as rich descriptors for logical classification.

The Semantic Web languages Resource Description Framework (RDF) [1] and Web Ontology Language (OWL) [2] are the World Wide Web Consortium's recommended standards for semantically-explicit encoding of data and for knowledge representation on the Semantic Web (respectively), and as such, these were the languages chosen for this study. Given the ability of RDF and OWL to be used to interpret, rather than simply annotate, data, it is useful to examine the migration path - both for the terminologies and the data - that enables such rich interpretive reasoning to be applied. How do we alter and/or extend existing terminologies such that they can be used to classify clinical data? What modifications to traditional data capture and representation must be made in order to make these data amenable to such logical inferences? Can we replace (or at a minimum, guide) expert clinical annotators in their interpretation of clinical data, and with what level of accuracy can this be achieved? In this report, we explore one such migration path, and discuss our observations and results, as well as the barriers and resulting manual interventions that were employed to accomplish the goal of creating a reasoned environment for clinical data evaluation and interpretation. We base our exploration in a real-world use case, using clinical data collected and annotated 20 years ago in the context of a study of patient outcomes after various cardiovascular interventions.

Heart and blood vessel diseases have a high rate of mortality and morbidity, and pose a significant disease burden on healthcare systems worldwide. In such diseases, asymptomatic biological "disease" typically precedes the clinical manifestation of symptomatic disease. Most of the time, the development of biological disease into a symptomatic event can be significantly mitigated or prevented through a combination of medication and lifestyle changes.
It is widely accepted that several risk factors, including age, sex, high blood pressure, smoking, dyslipidemia, diabetes, obesity and inactivity, are major factors for developing a variety of heart and blood vessel diseases [3].

To assist with the comparison and interpretation of patient data, clinical researchers have developed guidelines for classifying patients phenotypically into various categories based on a wide variety of raw clinical measurements. For instance, Table 1 shows the American Heart Association (AHA) [4] guidelines for phenotypic classification of hypertension based on systolic and diastolic blood pressure observations.

Table 1. American Heart Association classification for systolic and diastolic blood pressure [4]

  Classification                    Systolic pressure           Diastolic pressure
                                    mmHg        kPa             mmHg       kPa
  Normal                            90-119      12-15.9         60-79      8.0-10.5
  Pre-hypertension                  120-139     16.0-18.5       80-89      10.7-11.9
  Stage 1                           140-159     18.7-21.2       90-99      12.0-13.2
  Stage 2                           ≥160        ≥21.3           ≥100       ≥13.3
  Isolated systolic hypertension    ≥140        ≥18.7           <90        <12.0

Although this classification system appears relatively straightforward, it is important to note that it represents only one of a number of different classification systems for the same phenotypic phenomenon (systemic hypertension), some of which include the informal expert opinion of the clinicians themselves. As such, the same patient clinical observations might be categorized as "hypertensive" using one standard but categorized as "normal" using a different standard. This leads to problems when attempting to compare and integrate patient data between studies, or even between different clinicians/centers in the same study, particularly when the annotation ("normal" versus "hypertensive") is published in the dataset in lieu of the primary clinical measurements. To complicate matters further, health sciences communities continuously modify and update their guidelines in the light of new biomedical knowledge. For example, the Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines were comprehensively updated in 2006, which led to different criteria for phenotypic classification with respect to previous years [5]. As such, even data from the same institution may be subject to slightly different interpretations over time. These interpretations become encoded in published datasets and, unfortunately, it is rare for the standards under which an interpretation was made to be rigorously recorded together with that interpretation. This issue leads to potentially erroneous re-interpretation of data, particularly when integrating data over long periods of time, or between disparate institutions. The emergence and uptake of Semantic Web technologies such as OWL and RDF by the life sciences, and the ability to use these technologies to enable dynamic classification of data, provides exciting opportunities for exploring novel ways to evaluate the feasibility of doing such clinical annotation dynamically.
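Guidelines such as those in Table 1 are, in effect, executable decision rules. The following is a minimal, hypothetical sketch (not part of the study's own code) of the AHA thresholds expressed as a reusable classifier; the function name and the assumption that inputs are in mmHg are ours, for illustration only.

    def aha_bp_category(systolic_mmhg: float, diastolic_mmhg: float) -> str:
        """Classify a blood pressure reading using the AHA ranges in Table 1 (mmHg)."""
        # Isolated systolic hypertension: elevated systolic with normal diastolic.
        if systolic_mmhg >= 140 and diastolic_mmhg < 90:
            return "Isolated systolic hypertension"
        if systolic_mmhg >= 160 or diastolic_mmhg >= 100:
            return "Stage 2"
        if systolic_mmhg >= 140 or diastolic_mmhg >= 90:
            return "Stage 1"
        if systolic_mmhg >= 120 or diastolic_mmhg >= 80:
            return "Pre-hypertension"
        return "Normal"

    print(aha_bp_category(128, 80.1))  # first record in Table 2 -> "Pre-hypertension"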
In this largely methodological study we undertook to create an environment in which "legacy" clinical data and annotation terminologies are modified such that they can be used together to automate the dynamic, "on-demand" analysis and logical classification of patients into various cardiovascular disease risk groups under a variety of clinical classification guidelines. Specifically, we undertook a data remodeling process, migrating data from traditional databases and spreadsheets into a graph-based data framework (RDF); we utilized OWL to extend the cardiovascular-specific portion of an existing clinical annotation system, namely GALEN [6], such that it can be utilized as an interpretation layer over this patient data; we then created a series of analytical Web Services used to execute the statistical analyses of patient data in cases where pure logical reasoning is insufficient for classification; and finally, we executed our automated analyses/classifications and compared them to the manual annotations done by an expert cardiovascular clinician two decades prior. Any differences were then examined in detail to determine the source of the discrepancy, and we evaluate and discuss our ability to modify the interpretive layers to account for differences between the clinician's manually annotated data and the automated annotations.

Methods

Datasets and data collection

The dataset used for this experiment consists of clinical observations of a cardiovascular patient cohort collected from a number of hospitals in Nebraska, USA, in the period from August 1986 to July 1989. A total of 636 unique patients with a total of 1723 encounters were recorded. The database was originally collected as part of a study comparing cardiovascular disease risk-profile changes over a period of one year post procedure/surgery for patients undergoing Coronary Artery Bypass Graft (CABG) surgery versus those undergoing percutaneous coronary intervention (PCI). An individual's risk can be assessed using a number of available risk-prediction tools such as the Framingham [7] and Reynolds [8] Risk Scores, which incorporate information on established risk factors such as blood lipids, blood pressure, body mass index, age, gender, and smoking status. In this dataset two risk-assessment schemes were used to annotate patient data: a binary risk score ("at risk", "not at risk") assigned to individual clinical observations such as blood pressure, and an overall cumulative risk score using the Framingham risk measurement (see the Results section). The clinical observations used in this analysis were as follows:

Age, Gender, Height, Weight, Body Mass Index (BMI), Systolic Blood Pressure (SBP), Diastolic Blood Pressure (DBP), Glucose, Cholesterol, Low Density Lipoprotein (LDL), High Density Lipoprotein (HDL), Triglyceride (TG)

As an exemplar, the first row of the dataset is shown in Table 2. The intended meaning of the acronym in each column header (e.g., SBP for Systolic Blood Pressure) was confirmed with the clinician who owned the dataset. The table contains two types of data: clinical observations, and the clinician-assessed binary risk - 1 or 0 for "at risk" or "not at risk", respectively (e.g., HDL GR for High Density Lipoprotein Risk Grade). The final column (RISK GR) indicates the ternary overall risk assessment - 1 for low, 2 for moderate, and 3 for high risk - which the clinician indicated to us was based on the Framingham Risk Score algorithms.

Table 2. Part of the first row of the dataset, as recorded in a Microsoft Excel sheet

  SBP   DBP    TOTAL CHOL   HDL   TG   AGE   GENDER   HEIGHT   WEIGHT    TG GR   HDL GR   LDL GR   CHOL GR   BMI GR   DBP GR   SBP GR   RISK GR
  128   80.1   227          55    84   77    M        1.8288   78.1818   0       0        0        1         0        0        0        1
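To make the distinction between the two kinds of columns in Table 2 explicit, the hypothetical sketch below (not the study's code; the column names follow Table 2) separates a row into the raw clinical observations and the clinician-assigned risk grades.

    RISK_GRADE_COLUMNS = {"TG GR", "HDL GR", "LDL GR", "CHOL GR", "BMI GR", "DBP GR", "SBP GR", "RISK GR"}

    def split_row(row: dict) -> tuple[dict, dict]:
        """Separate clinical observations from clinician-assessed risk grades."""
        observations = {k: v for k, v in row.items() if k not in RISK_GRADE_COLUMNS}
        risk_grades = {k: v for k, v in row.items() if k in RISK_GRADE_COLUMNS}
        return observations, risk_grades

    first_row = {"SBP": 128, "DBP": 80.1, "TOTAL CHOL": 227, "HDL": 55, "TG": 84,
                 "AGE": 77, "GENDER": "M", "HEIGHT": 1.8288, "WEIGHT": 78.1818,
                 "TG GR": 0, "HDL GR": 0, "LDL GR": 0, "CHOL GR": 1, "BMI GR": 0,
                 "DBP GR": 0, "SBP GR": 0, "RISK GR": 1}

    obs, grades = split_row(first_row)
    print(sorted(obs))        # the core/derived measurements that will be modeled in RDF
    print(grades["RISK GR"])  # 1 = low overall (Framingham) risk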
Overview of approach

In 2005 we proposed a semantic data classification architecture in which raw clinical measurements would be "lifted" through increasingly conceptual/interpretive layers of ontologies in order to complete an analysis, evaluation, or query [9]. This would be achieved through a combination of logical reasoning over the data and ontologies, in parallel with the discovery of Web Services that aggregated and analyzed the data, thereby dynamically identifying individuals logically compliant with the ontological classes at each layer. This hybrid approach is necessary because (useful) OWL reasoning is limited to a decidable fragment of first-order logic: effectively, it is possible to define the conditions under which an individual would be a member of a particular set/category, and it is possible to infer, through a series of logical statements about the data, whether those conditions exist for a particular data record. However, while it is possible to infer that particular data properties must exist as a logical consequence of the existence of other data properties, it is not possible to derive data through algorithmic calculations using OWL reasoning alone. For these cases, we have written and published a series of Semantic Web Services that consume clinical data, execute various algorithmic analyses on them, and then return the dataset with new, derived data properties attached. These derived properties can then be used by the OWL reasoner to further classify the clinical data and "lift" it into increasingly complex clinical phenotypic categories.

While our approach is not reliant on any additional technologies for its success, one of our secondary goals in undertaking this project was to demonstrate that certain frameworks and practices established by our group could be used, with very little effort, to automate this interaction between OWL models and analytical Web Services. This automation reduces the complexity of analysis and evaluation of clinical data for the end-user. While the iterative process of reasoning, identification of appropriate analytical algorithms, execution of those algorithms, re-integration of output data, and re-reasoning could be done entirely manually (as would be the current practice), automating the "semantic lifting" process is enabled by two recently published pieces of technology: Semantic Automated Discovery and Integration (SADI) and the Semantic Health And Research Environment (SHARE).

SADI is a set of best practices for modeling Semantic Web Services in the scientific domain [10]. It is designed to be used in conjunction with OWL ontologies to discover Web Services capable of generating the properties that comprise an OWL class definition. Those Services, once discovered, are invoked by a simple HTTP POST of RDF-formatted data.

SHARE is a SADI client application that allows SADI services to be discovered during the process of SPARQL-DL query evaluation [11]. Effectively, SHARE augments OWL reasoners and SPARQL query engines with the ability to retrieve data dynamically generated from remote data sources at the time of query execution and reasoning. When an ontological concept is present in a SHARE query, SHARE exhaustively "decomposes" that concept into its complete set of property restrictions, importing any additional ontological classes as necessary.
Once "decomposed", SHARE then utilizes SADI to discover and execute services capable of creating those properties based on any data SHARE already has in its database.

Figure 1 provides a diagrammatic representation of the "semantic lifting" process. By referring to an ontological concept in the SPARQL query (Layer 4 in the diagram), raw data is "lifted" through the ontological layers via an iterative process of reasoning, service discovery, and execution. This is our first attempt to deploy such an architecture over bona fide clinical data.

[Figure 1. CardioSHARE architecture: increasingly complex ontological layers organize data into more abstract concepts. Layer 1: Blood Pressure; Layer 2: High Risk Blood Pressure; Layer 3: High Risk Framingham; Layer 4: DL Query. Data flow from databases and analytical algorithms into the layers via SADI services.]
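For readers unfamiliar with this style of query answering, the following is a deliberately simplified, hypothetical sketch of the iterative "decompose, discover, invoke, re-classify" cycle described above. The in-memory registry, the property names, and the hard-coded membership test are stand-ins for the SADI registry, OWL property restrictions, and the OWL reasoner; this is not the SHARE implementation.

    # Each "class" is described by the set of properties its members must carry,
    # and each registered "service" declares the property it can attach.
    CLASS_DEFINITIONS = {
        "HighRiskBMIRecord": {"body_mass_index"},
    }

    def bmi_service(record: dict) -> None:
        """Toy analytical service: derives BMI from mass (kg) and height (m)."""
        record["body_mass_index"] = record["mass_kg"] / record["height_m"] ** 2

    SERVICE_REGISTRY = {"body_mass_index": bmi_service}

    def lift_and_classify(record: dict, target_class: str) -> bool:
        """Iteratively attach missing properties, then test class membership."""
        required = CLASS_DEFINITIONS[target_class]
        for prop in required - record.keys():      # "decompose" the class definition
            SERVICE_REGISTRY[prop](record)         # "discover" and invoke a service
        # A real system would now hand the enriched record back to an OWL reasoner;
        # here we simply apply a hard-coded threshold for illustration.
        return record["body_mass_index"] >= 25.0

    patient = {"mass_kg": 78.1818, "height_m": 1.8288}
    print(lift_and_classify(patient, "HighRiskBMIRecord"))  # False (BMI ~23.4)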
In order for this approach to be successful, we must first migrate the legacy data, and any legacy terminologies, into a more rigorous logical framework that is capable of being interpreted by OWL reasoners. We will now describe that process in detail.

Ontologies used

Measurement unit ontology (MUO)

While this study does not (conspicuously) take advantage of the semantic encoding of measurement units, we nevertheless wish to fully describe the process by which we transformed legacy data into a semantic framework. Since neither RDF nor OWL has a built-in method for representing units, it was necessary to select an approach to unit representation that would enable us, in future studies, to take advantage of their inherent semantics during query and reasoning.

MUO is a modular ontology specifically designed to represent units in a combinatorial fashion. MUO includes definitions of classes and properties conforming to the general design principles of the upper-level ontology DOLCE [12]. The ontology models mainly three disjoint entities: 1. units of measurement; 2. physical qualities that can be measured; and 3. common prefixes for units of measurement. MUO also defines URIs for the most common units of measurement, "physical qualities" [13], and prefixes, which can be shared and reused in different domain ontologies. Every unit of measurement is attributed to a physical quality. Two types of units are generally dealt with in MUO, Base Units and Derived Units, defined as follows:

Base Units are units that are not derived from any other unit; base units can be used to derive other units. The International System of Units (SI) defines a number of independent base units such as the meter (m) [14]. (It should be noted that even though "kilogram" is considered a base unit in SI, it is composed of the prefix kilo plus the base unit "gram"; in this sense, kilogram is an exception among the units considered base units in the SI system. However, to stick to our design schema, we considered "gram" as the base and defined kilogram as an extension of it.)

Derived Units are units obtained from combinations of base units, representing "derived physical quantities" as defined by DOLCE [12]. In the formal representation of physical qualities and associated units, MUO defines the property muo:derivesFrom to express the relationship between a derived unit and the units it is derived from. Derived Units are further divided into simple and complex derived units [14].

Simple Derived Units are units derived from exactly one base unit [14]. For instance, the millimeter (mm) can be derived from the meter (m); these are units that can be defined by attaching a Prefix to a base Unit. MUO also recognizes a type of unit that, although derived from exactly one base unit, has a different dimension - for instance, SquareMeter (m2). For such cases another property, muo:dimensionalSize, is added to account for the dimensionality difference.

Complex Derived Units are units derived from more than one base unit [14]. For instance, consider the Body Mass Index (BMI), a statistical measure that compares a person's weight and height and is used to estimate a healthy body weight based on a person's height. BMI in the International System of Units (SI) [15] is defined as follows [16]:

  BMI = mass (kg) / (height (m))^2    (1)

BMI defined in this way has the unit kg/m2, and this unit can be defined as follows using MUO:

  :kilogram-per-meter-square rdf:type muo:ComplexDerivedUnit;
      muo:derivesFrom ucum:kilogram;
      muo:derivesFrom :meter-squared.
  :meter-squared rdf:type muo:SimpleDerivedUnit;
      muo:derivesFrom ucum:meter;
      muo:dimensionalSize "2"^^xsd:float.

As shown above, MUO proposes a clear and convenient framework for defining new units of measurement in terms of existing ones, and this was used to derive any units required by our investigation that are not explicitly defined by the current version of the MUO.

GALEN

The GALEN Common Reference Model (CRM) is a rich compositional ontology of the medical domain, covering anatomy, function, pathology, diseases, symptoms, drugs, and procedures [6]. It was developed by the Department of Computer Science at the University of Manchester [17]. It is available in both GRAIL [18] and OWL formalisms. The version used in this study is the latest OWL version available, dated August 2011, consisting of 2749 classes and 500 object properties. Several groups have investigated various aspects of the GALEN ontology, including its expressivity, representation, and suitability for specific applications (e.g., [19]). Based on these studies, and our own investigation of the suitability of its terminological domain, we selected GALEN as our core ontology describing cardiovascular concepts. In this paper we primarily focus on concepts in GALEN that are relevant to cardiovascular risk monitoring, and describe an approach for re-factoring and extending the cardiovascular-relevant classes of GALEN such that they can be used to automatically classify clinical data.

Semanticscience integrated ontology (SIO)

The Semanticscience Integrated Ontology (SIO) is an effort to create a coherent formal ontology with rigorous attention to concrete and clearly-stated design patterns [20]. SIO takes the "realist" position in which things exist independently of conceptual or linguistic schemes, and firmly acknowledges that terms used in a discourse denote one or more individuals or classes, of which the latter may have zero or more instances [21].

The choice of properties in the development of any ontology is crucial and non-trivial [22]. The use of a minimal set of re-usable relations is essential in building consistent, interoperable and well-formed knowledge bases [23]. For instance, the following two OWL property constraints might be considered to describe the same data feature:

  1. Patient hasAttribute someValuesFrom SystolicBloodPressure
  2. Patient hasSystolicBloodPressure someValuesFrom Attribute

With respect to re-usability these two representations are considerably different.
When designing ontologies to support logical reasoning, it is considered good practice to encode the complexity of data in class definitions (statement #1) rather than through a proliferation of properties (statement #2) [23]. The relationships defined by SIO are highly generic (e.g., "has Attribute", defined by SIO's property SIO_000008), and this forces us, as the data modelers, to follow these good design patterns and formalize data types through elaboration of ontological classes which are, whenever possible, distinct in their properties from all other ontological classes. We adhered to this design principle as closely as possible in this study.

Finally, SIO is extensively used by analytical tools exposed as SADI Semantic Web Services, and thus our adoption of SIO also allows us to more easily take advantage of existing analytical tools published through the SADI framework, as well as to rapidly publish and integrate new tools as needed for our study.

Unit representation in OWL-RDF

When extracting datasets from disparate sites, particularly across international boundaries, it is not uncommon for the de facto unit of measurement to differ for any given clinical observation. Therefore, we must define a practical approach that allows clinical measurements to carry different units while not sacrificing interoperability. The lack of a standard approach to representing measurable quantities in RDF has led to a number of different configurations being used in different ontologies and RDF data repositories (see [14] for more information).

In our introduction of the MUO, we alluded to the fact that, in the context of the Semantic Web, representing physical quantities using ontologies is a non-trivial problem. RDF does not have any internal support for representing a literal value together with its unit of measure [14]. RDF literal nodes can represent numeric values, such as "120" or "141.5", without units, or value-unit pairs can be represented as strings of characters (e.g., "120 mmHg"); however, the "semantics" of the unit of measure is lost in both of these approaches, compromising our ability to accurately integrate datasets with heterogeneous measurements.

GALEN itself has rather limited coverage of measurement units, and lacks a systematic framework to define new ones or create composite units from basic ones. For instance, the GALEN concept MilligramPerDeciLitre is defined as a subclass of the concept ConcentrationUnit, but lacks any indication that this unit is composed of a combination of two base units (gram and liter) and two prefixes (milli and deci). Similarly, SIO incorporates units from the Unit Ontology (UO) [24] in parallel with qualities from the Phenotypic Quality Ontology (PATO) [25] for representing quantifiable measurements. However, like GALEN, SIO, UO, and PATO lack any formal framework for describing the relationships between related units, or for defining new ones. Thus, in order to make use of such rich semantics in our analyses, we avoided the use of GALEN measurement units, preferring those defined by, or defined using, the MUO.

Nevertheless, though MUO does provide a method for defining the relationships between units and their derivatives, it does not provide a semantic framework for representing conversions between different units of the same "type" (e.g., metric and imperial weights). Since this was a potential problem in our analysis, and is a significant problem in science generally, we created a series of publicly-accessible Semantic Web Services capable of automatically detecting when unit conflicts exist in an aggregated dataset, and of automatically resolving those conflicts to whichever canonical measurement unit is desired.
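The following is a minimal, hypothetical sketch of the kind of unit harmonization those services perform: detecting that records carry the same quality in different units and converting to a chosen canonical unit. The unit names and linear conversion factors shown (mmHg/kPa for pressure, mg/dl to mmol/l for cholesterol) are illustrative; the actual services operate on RDF data annotated with MUO unit URIs rather than on plain strings.

    # Approximate conversion factors to a canonical unit for each physical quality.
    CANONICAL = {
        "blood pressure": ("mmHg",   {"mmHg": 1.0, "kPa": 7.50062}),      # kPa -> mmHg
        "cholesterol":    ("mmol/l", {"mmol/l": 1.0, "mg/dl": 0.02586}),  # mg/dl -> mmol/l
    }

    def harmonize(quality: str, value: float, unit: str) -> tuple[float, str]:
        """Convert a (value, unit) pair to the canonical unit for its quality."""
        canonical_unit, factors = CANONICAL[quality]
        if unit not in factors:
            raise ValueError(f"no conversion registered for {unit}")
        return value * factors[unit], canonical_unit

    print(harmonize("blood pressure", 18.7, "kPa"))   # ~140 mmHg (the Stage 1 threshold)
    print(harmonize("cholesterol", 227, "mg/dl"))     # ~5.87 mmol/l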
Ontological mapping, extensions, and algorithmic services

The set of OWL classes required to describe our dataset is as follows:

Age, Sex, Mass, Height, BodyMassIndex, SystolicBloodPressure, DiastolicBloodPressure, BloodSugarConcentration, SerumCholesterolConcentration, SerumLDLCholesterolConcentration, SerumHDLCholesterolConcentration, SerumTriglycerideConcentration.

We explored GALEN to search for the cardiovascular concepts listed above, and found it to be sufficiently comprehensive in terms of coverage of these concepts; however, there were some minor differences in terminology between the labels in our dataset and the GALEN terms. For example, the term SerumHDLCholesterol appears in GALEN, while the acronym HDL was used in our clinical dataset. Similarly, the concept Glucose exists in GALEN, while BloodSugarConcentration was the label applied to the (semantically) equivalent measurement in our clinical dataset. Such discrepancies were manually mapped based on consultation with expert clinicians, using their preferred labels. Our intent was to select the label/class name that best semantically described the intended meaning of the concept; while we admit that this approach is somewhat arbitrary, we could think of no way to reliably automate these mappings.

The concept Height did not exist in GALEN, though the class Length did; to avoid overloading the semantics of the existing GALEN class, we defined a new class Height, and made this a subclass (owl:subClassOf) of GALEN's Length.

Our proposed "layered" semantic framework requires us to identify concepts which are "core" (based on direct observations - Layer 1 of Figure 1), and concepts which are "derived" (based on calculations over core observations - Layer 2 and higher). For instance, current lipid measurement protocols do not generally measure LDL particles directly but instead estimate them using the Friedewald equation [26]:

  L ≈ C - H - kT    (2)

where H is HDL cholesterol, L is LDL cholesterol, C is total cholesterol, T is triglycerides, and k is 0.20 if the quantities are measured in mg/dl and 0.45 if in mmol/l [27]. Thus, Triglycerides and Cholesterol are "core" measurements, while LDL is a "derived" measurement. Similarly, BMI is calculated from a relationship between height and mass, and would be considered "derived".
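To make the "derived" notion concrete, here is a small, hypothetical sketch of Equation 2; in the study itself this computation is exposed as a SADI service rather than a local function, and the unit flag below stands in for the explicit MUO unit declarations carried by the data.

    def ldl_friedewald(total_chol: float, hdl: float, triglycerides: float,
                       unit: str = "mg/dl") -> float:
        """Equation (2): estimate LDL as C - H - k*T, with k depending on the unit."""
        k = 0.20 if unit == "mg/dl" else 0.45   # 0.45 when the quantities are in mmol/l
        return total_chol - hdl - k * triglycerides

    # Values from the first record of Table 2 (mg/dl): 227 - 55 - 0.20*84 = 155.2
    print(ldl_friedewald(227, 55, 84))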
We manually examined the protocols for obtaining the measurements in our dataset and consider the following GALEN classes to represent "core" measurements:

Age, Sex, Mass, Height, SystolicBloodPressure, DiastolicBloodPressure, BloodSugarConcentration, SerumCholesterolConcentration, SerumHDLCholesterolConcentration, SerumTriglycerideConcentration.

We will henceforth refer to these classes as the "Grounding Classes" - classes whose members will be directly tied to the dataset through explicit declaration of a piece of data as being a member of that class.

The OWL definition of each Grounding Class was created by extending the corresponding GALEN class definition to include the defining features of the "Attribute" OWL class from SIO; thus all members of these classes will (logically) be both GALEN individuals and SIO Attributes. This involved adding axioms for the SIO properties hasMeasurement, hasUnit and hasValue to the GALEN class definitions. The example below shows how the GALEN class for Systolic Blood Pressure is extended using external classes and properties (the prefix before ":" shows the ontological namespace of each entity; the prefix "cardio" is the namespace used to indicate the ontological classes we have defined):

  cardio:SystolicBloodPressure:
    Galen:SystolicBloodPressure and
    (sio:hasMeasurement some cardio:pressuremeasurement)

  cardio:pressuremeasurement:
    sio:measurement and
    (sio:hasUnit some "muo:unit of pressure" and
     hasValue some Literal)

The remainder of the measurements in our clinical dataset are "derived", based on calculations performed over the core measurements, and their corresponding "Derivative Classes" in GALEN are:

SerumLDLCholesterolConcentration, BodyMassIndex

The class definitions for these were generated using the same approach as for the Grounding Classes; however, since members of these Derivative Classes can only be determined through algorithmic analysis of "core" measurements, we also created a set of SADI Semantic Web Services that expose the necessary algorithms, consuming members of the relevant Grounding Classes and generating members of the Derivative Classes in response. Thus, data from Layer 1 can be "raised" into Layer 2 classes (and above) through invocation of these algorithmic services.

Refactoring the legacy dataset

Assumptions about data collection and measurements

Since the exact protocols describing how the clinical observations were made were not available, we made the assumption that they were derived from the most common measurement protocols. For example, for blood pressure measurements, we assumed that the measurements were made in a clinical setting (as opposed to casual home monitoring), using conventional mercury manometers applied to the left arm. The units used for each measurement were not explicitly stated in the dataset (Table 2) itself, so we made a best guess based on the range of the measurement values and confirmed those with clinical experts. The units used to represent measurements are shown below.

  Height: meter
  Weight: kilogram
  BodyMassIndex: kilogram per square meter
  SystolicBloodPressure, DiastolicBloodPressure: millimeter of mercury column
  SerumHDLCholesterolConcentration, SerumLDLCholesterolConcentration, SerumTriglycerideConcentration, SerumCholesterolConcentration, BloodSugarConcentration: milligram per deciliter
Data schema

Our primary objective in designing an ontological model to represent the clinical data was to support dynamic re-interpretation of that data under a variety of different hypothetical scenarios (e.g., re-interpretation as analytical or classification standards change over time). Importantly, it was not our intention to design a data model with sufficient complexity to represent every aspect of a clinical record; rather, we were focused on modeling individual clinical measurements in a way that would allow them to be automatically analyzed and interpreted. Constraining ourselves to modeling only this small aspect of the clinical record should, we believe, allow existing comprehensive clinical record models to be easily adapted to the framework we propose here. Figure 2 shows the schematic view of the data model, described as follows:

1. We defined a class "PatientRecord" as a subclass of the SIO "record" class. A PatientRecord includes all of the observations about a patient, keeping in mind that we considered each patient encounter to be a different patient record for the purposes of this study (i.e., the longitudinality of the data was not considered).
2. Patient clinical observations were divided into Grounding Classes and Derived Classes as described above, and were modeled as owl:Individuals of these classes, with the corresponding unit and value attached by the SIO hasUnit and hasValue properties.
3. Each resulting Grounding Class member was attached as an attribute of the PatientRecord using the hasAttribute property from SIO.

[Figure 2. Data schema using concepts in legacy ontologies. A PatientRecord is linked by sio:hasAttribute to individuals of the Grounding Classes (cardio:Mass, cardio:Height, cardio:SystolicBloodPressure, cardio:DiastolicBloodPressure, cardio:BloodSugarConcentration, cardio:SerumCholesterolConcentration, cardio:SerumTriglycerideConcentration, cardio:SerumHDLCholesterolConcentration), each of which carries a sio:Measurement with a muo:Unit (sio:hasUnit) and a literal value (sio:hasValue). The additional features shown on the Mass class are present on all classes in that row, but are hidden to improve readability.]

Finally, using MUO methodologies (described above), we defined the units kilogram, kilogram-per-meter-squared, milli-meter-of-mercury-column, milli-gram-per-deci-liter, and milli-mole-per-liter, which were used for the various individual studies described in the Results section.
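A concrete (and hypothetical) rendering of this schema for the first record of Table 2, using Python and rdflib, is shown below; the namespace URIs are placeholders and the class/property local names simply follow the figure, so the exact URIs differ from those used in the study.

    from rdflib import Graph, Namespace, Literal, BNode
    from rdflib.namespace import RDF, XSD

    # Placeholder namespaces standing in for the study's cardio, SIO and MUO vocabularies.
    CARDIO = Namespace("http://example.org/cardio#")
    SIO = Namespace("http://example.org/sio#")
    MUO = Namespace("http://example.org/muo#")

    g = Graph()
    record = CARDIO["patient_record_1"]
    g.add((record, RDF.type, CARDIO.PatientRecord))

    def add_observation(graph, patient_record, obs_class, value, unit):
        """Attach one Grounding Class individual (attribute -> measurement -> unit/value)."""
        attribute, measurement = BNode(), BNode()
        graph.add((patient_record, SIO.hasAttribute, attribute))
        graph.add((attribute, RDF.type, obs_class))
        graph.add((attribute, SIO.hasMeasurement, measurement))
        graph.add((measurement, RDF.type, SIO.Measurement))
        graph.add((measurement, SIO.hasUnit, unit))
        graph.add((measurement, SIO.hasValue, Literal(value, datatype=XSD.double)))

    add_observation(g, record, CARDIO.SystolicBloodPressure, 128.0,
                    CARDIO["milli-meter-of-mercury-column"])
    add_observation(g, record, CARDIO.Mass, 78.1818, MUO.kilogram)
    add_observation(g, record, CARDIO.Height, 1.8288, MUO.meter)

    print(g.serialize(format="turtle"))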
Approach to binary patient classification ("at risk" versus "not at risk")

In our dataset the clinical researchers used a binary system to classify patients as being "at risk" or "not at risk" based on each of the following measures: Blood Pressure, Cholesterol, HDL, LDL, Triglycerides, and BMI. Thus, we created OWL classes representing each of these categories - for example, "HighRiskSBPRecord" to represent patient records reflecting a high-risk score with respect to Systolic Blood Pressure, and "LowRiskSBPRecord" to represent patient records reflecting a low-risk score with respect to Systolic Blood Pressure. These categories would then be used in SHARE SPARQL queries to trigger data "lifting", and to compare the resulting automated categorization of patient records with the expert annotation made by the clinical researchers two decades ago.

Working through one example in detail: Table 3 shows the American Heart Association's classification of systolic and diastolic blood pressure values.

Table 3. American Heart Association classification for systolic and diastolic blood pressure [4]

  Classification                    Systolic pressure           Diastolic pressure
                                    mmHg        kPa             mmHg       kPa
  Normal                            90-119      12-15.9         60-79      8.0-10.5
  Pre-hypertension                  120-139     16.0-18.5       80-89      10.7-11.9
  Stage 1                           140-159     18.7-21.2       90-99      12.0-13.2
  Stage 2                           ≥160        ≥21.3           ≥100       ≥13.3
  Isolated systolic hypertension    ≥140        ≥18.7           <90        <12.0

Although the guidelines indicate five different ranges (Normal, Pre-hypertension, Stage 1 hypertension, etc.), the clinical researchers who generated our dataset had only two categories: "at risk" and "not at risk". Through discussions with the researchers, they indicated that they considered Normal and Pre-hypertension to be "not at risk" (in Table 3) and all other categories to be "at risk". (In the original Tables 3 through 6, light shading marks the "low risk" groups and dark shading the "high risk" groups as defined by the guidelines.) As such, we modeled an ontological class "HighRiskSBPRecord" in OWL as follows:

  HighRiskSBPRecord =
    cardio:PatientRecord and
    (sio:hasAttribute some
      (cardio:SystolicBloodPressure and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:milli-meter-of-mercury-column) and
         (sio:hasValue some double[>= "140.0"^^double]))))

In a somewhat different scenario, Table 4 shows the American Heart Association risk stratification for cholesterol, HDL and triglycerides, each of which has three categories - high, medium, and low - compared to our clinician's binary categorization of high and low.

Table 4. American Heart Association classification for cholesterol, HDL, and triglycerides [28,29]

                 Level (mg/dl)                 Level (mmol/L)   Interpretation
  Cholesterol    <200                          <5               Desirable level corresponding to lower risk
                 200-240                       5.2-6.2          Borderline high risk
                 >240                          >6.2             High risk
  HDL            <40 for men, <50 for women    <1.03            Low HDL cholesterol, heightened risk
                 40-59                         1.03-1.55        Medium HDL level
                 >60                           >1.55            High HDL level, optimal condition
  Triglyceride   <150                          <1.69            Normal range: low risk
                 150-199                       1.70-2.25        Borderline high
                 200-499                       2.26-5.65        High
                 >500                          >5.65            Very high: high risk

As above, we attempted to create OWL classes to model these risks; however, in this case we had no guidance from the clinician as to what to do with intermediate measurements, as their original policy had not been recorded. As such, in our initial (somewhat trivial) analysis, we defined "high risk" and "low risk" records as being congruent with the high and low risk categories of the official guidelines, and ignored all data in the intermediate category. We describe how we modified these models, and our ability to determine the actual clinician's risk threshold, in the Results section.

Modeling the BMI and LDL risks was slightly more complex, since these two measurements are derived by algorithmic analysis of one or more "core" measurements. BMI is calculated using a person's weight and height (Equation 1) [15], and the guidelines were modeled in OWL following the American Heart Association guidelines in Table 5.

Table 5. American Heart Association classification for BMI [30]

  BMI (kg/m2)     Category
  Below 18.5      Underweight
  18.5 to 24.9    Healthy weight
  25.0 to 29.9    Overweight
  30 to 39.9      Obese
  40 and above    Morbidly obese
The resulting OWL classes representing BMI and HighRiskBMI measurements, respectively, were as follows:

  cardio:BodyMassIndex =
    galen:BodyMassIndex and
    (sio:hasMeasurement some cardio:measurement)

  cardio:measurement =
    sio:measurement and
    (sio:hasUnit some cardio:UnitOfAreaDensity and
     sio:hasValue some Literal)

  HighRiskBMI =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:BodyMassIndex and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:kilogram-per-meter-squared) and
         (sio:hasValue some double[>= 25.0]))))

The schematic diagram of the SADI Web Service interface for BMI calculation is shown in Figure 3. The input and output of the Service are as follows (sample data, and instructions on how to send these data to the SADI service, are provided in the Supplementary Information [27]):

  Input:
    (sio:hasAttribute some cardio:Height) and
    (sio:hasAttribute some cardio:Mass)

  Output:
    sio:hasAttribute some
      (cardio:BodyMassIndex and
       (sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:kilogram-per-meter-squared) and
         (sio:hasValue some Literal))))

[Figure 3. Schematic diagram of the SADI Web Service interface to the BMI calculation service. The input carries cardio:Mass and cardio:Height attributes, each with a sio:Measurement; the output attaches a cardio:BMI attribute. The property restriction imposed on the output, when detected by SHARE, triggers the discovery and invocation of the Service that attaches the BMI class with appropriate unit and value properties.]

Subsequently, we calculated LDL in a similar fashion using SADI-compliant Semantic Web Services. LDL is calculated from the HDL, total cholesterol and triglyceride measurements via the Friedewald equation (Equation 2) [26], and the guidelines were modeled in OWL following the guidelines in Table 6.

Table 6. LDL guidelines [29]

  Level (mg/dl)   Level (mmol/L)   Interpretation
  <129            <3.3             Desirable level
  130-159         3.3-4.1          Borderline high risk
  >160            >4.1             High risk

Note that the Friedewald equation includes a constant that is sensitive to the units in which HDL is measured; however, since we are explicitly declaring and automatically converting units, this service is able to automatically determine the correct constant to use in every case.
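Continuing the hypothetical rdflib example above, the core of such a BMI service could look like the sketch below: it reads the Mass and Height attributes from an input graph shaped like the data schema and attaches a derived BodyMassIndex attribute, mirroring the BMI service input/output classes described above. The namespaces and helper names remain illustrative, and unit handling is reduced to a comment.

    from rdflib import Graph, Namespace, Literal, BNode
    from rdflib.namespace import RDF, XSD

    CARDIO = Namespace("http://example.org/cardio#")   # placeholder namespaces, as before
    SIO = Namespace("http://example.org/sio#")

    def measurement_value(g: Graph, record, attr_class) -> float:
        """Find the numeric value of the record's attribute of the given class."""
        for attr in g.objects(record, SIO.hasAttribute):
            if (attr, RDF.type, attr_class) in g:
                measurement = g.value(attr, SIO.hasMeasurement)
                return float(g.value(measurement, SIO.hasValue).toPython())
        raise LookupError(f"no {attr_class} attribute on {record}")

    def attach_bmi(g: Graph, record) -> None:
        """Consume Mass and Height, attach a derived BodyMassIndex attribute."""
        mass = measurement_value(g, record, CARDIO.Mass)       # assumed already in kilograms
        height = measurement_value(g, record, CARDIO.Height)   # assumed already in meters
        attribute, measurement = BNode(), BNode()
        g.add((record, SIO.hasAttribute, attribute))
        g.add((attribute, RDF.type, CARDIO.BodyMassIndex))
        g.add((attribute, SIO.hasMeasurement, measurement))
        g.add((measurement, SIO.hasUnit, CARDIO["kilogram-per-meter-squared"]))
        g.add((measurement, SIO.hasValue, Literal(mass / height ** 2, datatype=XSD.double)))

    # attach_bmi(g, record) would add the derived attribute to the graph built in the earlier sketch.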
Approach to ternary risk assessments

In addition to the somewhat trivial binary risk assessments described above, we wished to determine whether more complex clinical phenotype and risk classifications can be automated using the same infrastructure. For example, some clinicians are more interested in estimating the probability of a patient developing a certain type of cardiovascular disease within a specific period of time. Researchers have developed a variety of algorithms for estimating a patient's statistical probability of death (from cardiovascular disease) or of developing a variety of cardiovascular diseases, with one of the most widely adopted being the Framingham Risk Scores [8]. There are a number of different Framingham Risk Scores, which vary in the cardiovascular disease they address (e.g., Congestive Heart Failure versus Atrial Fibrillation), the period of time over which the risk is calculated (e.g., 5-year versus 10-year risk), and the precise Framingham standard used. For instance, the same patient clinical observations might be categorized as "high risk" using Canadian standards, but categorized as "medium-high risk" using American or European standards.

To test our ability to automatically classify patients into complex risk-stratification models such as Framingham, we created OWL models of the Framingham Risk Scores for General Cardiovascular Disease in Men [31]. Table 7 shows the scoring framework proposed by the Framingham study to calculate the estimated risk score for General Cardiovascular Disease in men based on the mean values of the clinical observations. Similar tables exist for women, and for other cardiovascular diseases such as Atrial Fibrillation, Congestive Heart Failure, Coronary Heart Disease, General Cardiovascular Disease, Hard Coronary Heart Disease, Intermittent Claudication, Recurring Coronary Heart Disease, and Stroke After Atrial Fibrillation.

Table 7. Estimated risk of general cardiovascular disease in men [31]

  Points   Age, y   HDL     Total cholesterol   SBP not treated   SBP treated   Smoker   Diabetic
  -2                60+                         <120
  -1                50-59
   0       30-34    45-49   <160                120-129           <120          No       No
   1                35-44   160-199             130-139
   2       35-39    <35     200-239             140-159           120-129
   3                        240-279             160+              130-139                Yes
   4                        280+                                  140-159       Yes
   5       40-44                                                  160+
   6       45-49
   8       50-54
  10       55-59
  11       60-64
  12       65-69
  14       70-74
  15       75+

In our dataset, the clinician had annotated the records with three scores: "high risk", "low risk" and "moderate risk". For this study, we only considered the records of male patients, and records with no missing values for the various observations required to make a risk evaluation. The conventional classification used in the Canadian healthcare system is based on three levels of quantization (0-9: low risk, 10-19: medium risk, ≥20: high risk) over the accumulated individual risk score (Table 8).

Table 8. 10-year risk of general CVD by total Framingham Risk Score [31]

  Total points   10-year risk
  <9             <1%
  9              1%
  10             1%
  11             1%
  12             1%
  13             2%
  14             2%
  15             3%
  16             4%
  17             5%
  18             6%
  19             8%
  20             11%
  21             14%
  22             17%
  23             22%
  24             27%
  25 or more     ≥30%

The input and output classes for the SADI Web Service that calculates the Framingham Risk Score are defined as follows:

  Input:
    PatientRecord and
    (sio:hasAttribute some cardio:Age) and
    (sio:hasAttribute some cardio:SerumCholesterolConcentration) and
    (sio:hasAttribute some cardio:SerumHDLCholesterolConcentration) and
    (sio:hasAttribute some cardio:SystolicBloodPressure)

  Output:
    sio:hasAttribute some
      (GeneralCVDFraminghamRiskScore and
       (sio:hasValue some Literal))

Since an OWL class representing a risk score did not exist in any of the ontologies we were using, we defined a class named RiskScore and a second class, GeneralCVDFraminghamRiskScore, which is a subclass of the former.
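As a rough illustration of how Tables 7 and 8 translate into the kind of algorithm wrapped by that service, the hypothetical sketch below scores a male patient and bins the total using the Canadian three-level quantization described above (0-9 low, 10-19 medium, ≥20 high). It covers only the factors listed in Table 7, assumes untreated blood pressure and non-smoking, non-diabetic status unless told otherwise (matching the observations available in our dataset), and is not the study's service code.

    def framingham_points_men(age, hdl_mgdl, total_chol_mgdl, sbp_mmhg,
                              treated_bp=False, smoker=False, diabetic=False):
        """Sum the point contributions from Table 7 (general CVD, men)."""
        def bucket(value, cutoffs_points):
            # cutoffs_points: (lower_bound, points) pairs scanned from highest bound down.
            for lower, pts in cutoffs_points:
                if value >= lower:
                    return pts
            return cutoffs_points[-1][1]

        points = bucket(age, [(75, 15), (70, 14), (65, 12), (60, 11), (55, 10),
                              (50, 8), (45, 6), (40, 5), (35, 2), (30, 0)])
        points += bucket(hdl_mgdl, [(60, -2), (50, -1), (45, 0), (35, 1), (0, 2)])
        points += bucket(total_chol_mgdl, [(280, 4), (240, 3), (200, 2), (160, 1), (0, 0)])
        if treated_bp:
            points += bucket(sbp_mmhg, [(160, 5), (140, 4), (130, 3), (120, 2), (0, 0)])
        else:
            points += bucket(sbp_mmhg, [(160, 3), (140, 2), (130, 1), (120, 0), (0, -2)])
        points += 4 if smoker else 0
        points += 3 if diabetic else 0
        return points

    def canadian_risk_grade(points):
        """Ternary quantization used in the dataset: 1 = low, 2 = medium, 3 = high."""
        return 1 if points < 10 else 2 if points < 20 else 3

    score = framingham_points_men(age=77, hdl_mgdl=55, total_chol_mgdl=227, sbp_mmhg=128)
    print(score, canadian_risk_grade(score))   # 16 points -> grade 2 (medium)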
Results

Evaluation of automated binary risk classification

Evaluation of our ability to dynamically reproduce the original clinical classifications, using the approaches described above, was undertaken as follows. In the dataset, when the clinician had indicated the patient was "at risk" for a given type of observation, this was represented as a numeric "1"; if they indicated the patient was not at risk, this was represented as a numeric "0". We then used our "HighRisk" and "LowRisk" OWL classes in SPARQL queries, calling up the clinician-annotated numerical score in the same query. For each HighRisk query, we would expect the clinician's score to be "1" in all cases if our automated analysis is functioning correctly, and it should be "0" in all cases for the LowRisk queries.

Figure 4 shows two queries for SBP measurements and their clinician-assigned risk grade, together with a screenshot of the abbreviated output for each query. If the system is calculating risk correctly, then all results of the query for high risk (Figure 4A) should have been assigned a score of "1" by the clinician, and similarly the results of the query for low risk (Figure 4B) should have been assigned a score of "0". Similar queries were issued for the DBP, Chol, HDL, TG, and BMI attributes.

[Figure 4. SPARQL queries (prefixes not shown), followed by a small snapshot of the results, for automatic classification of patients into "high risk" (A) and "low risk" (B) for Systolic Blood Pressure. Note that, because of the unit-conversion layer, the units used to model the guideline may or may not be the same as the units used to model the clinical data.]

Table 9 shows the comparison between manual and automatic risk classification for all attributes in the dataset.

Table 9. Comparison between manual and automatic binary risk classifications

  Attribute   True positive rate "at risk" (%)   False positive rate "at risk" (%)
  SBP         100                                0
  DBP         100                                0
  CHOL        92.6                               0
  HDL         100                                56.5
  TG          100                                8.5
  BMI         100                                18.8
  LDL         100                                0

  Correctness represents the degree of fidelity of the automatic classification to that of the expert.

In most cases, our automated analysis of the data was entirely concordant with the expert annotations of the clinician; however, there were several cases of discrepancy, as discussed in the next section. More detailed query/result pairs, plus before/after categorization data for all clinical observations, can be found as supplementary material at [27].
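The per-attribute agreement figures in Table 9 are ordinary confusion-matrix rates. A hypothetical sketch of how such rates can be computed from paired labels is shown below; in the study the automatic labels come from the SPARQL/OWL classification and the expert labels from the risk-grade columns of the dataset.

    def at_risk_rates(automatic: list[int], expert: list[int]) -> tuple[float, float]:
        """True/false positive rates for the automatic "at risk" (1) call, treating the expert as truth."""
        tp = sum(1 for a, e in zip(automatic, expert) if a == 1 and e == 1)
        fp = sum(1 for a, e in zip(automatic, expert) if a == 1 and e == 0)
        fn = sum(1 for a, e in zip(automatic, expert) if a == 0 and e == 1)
        tn = sum(1 for a, e in zip(automatic, expert) if a == 0 and e == 0)
        tpr = tp / (tp + fn) if tp + fn else 0.0
        fpr = fp / (fp + tn) if fp + tn else 0.0
        return tpr, fpr

    # Toy example: perfect recall of the expert's "at risk" calls, with one false positive.
    print(at_risk_rates(automatic=[1, 1, 0, 1], expert=[1, 1, 0, 0]))  # (1.0, 0.5)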
Discrepancies between automated and expert binary classifications

Systolic and diastolic blood pressure risks

Classifying patients as "high risk" or "low risk" based on blood pressure was consistent with the manual curation of the expert in every case.

LDL

As for SBP and DBP, the manual and automatic classifications for LDL were consistent.

Total cholesterol risk

Some patient risk classifications differed between our automated analysis and the expert annotations. In each case, the risk score fell between 5 and 5.2 mmol/L. Interestingly, in the American Heart Association guidelines (Table 4) there is a gap in the measurement continuum, resulting in a lack of any interpretation guidance for measurements between 5 and 5.2. Our automated analysis therefore revealed that the clinical expert had compensated for this gap by assigning these measurements to the "low risk" category. By modifying our OWL model to change the low-risk cut-off level from 5 to 5.2, we were then able to achieve perfect correspondence with the clinical expert; moreover, this correspondence shows that the clinician had used this 5.2 boundary as their upper limit for low risk when undertaking their binary classification.

Original AHA guideline in OWL:

  HighRiskCholesterolRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:SerumCholesterolConcentration and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:mili-mole-per-liter) and
         (sio:hasValue some double[>= 5.0]))))

  LowRiskCholesterolRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:SerumCholesterolConcentration and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:mili-mole-per-liter) and
         (sio:hasValue some double[< 5.0]))))

Modified model:

  HighRiskCholesterolRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:SerumCholesterolConcentration and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:mili-mole-per-liter) and
         (sio:hasValue some double[>= 5.2]))))

  LowRiskCholesterolRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:SerumCholesterolConcentration and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:mili-mole-per-liter) and
         (sio:hasValue some double[< 5.2]))))

HDL and Triglyceride risk

Having no guidance on how to build the model in these cases, where the clinical (binary) classification had no correspondence to the three- or four-level categorization system of the official guidelines, we first modeled the extreme cases (high/low, ignoring the borderline/medium categories), expecting to find complete congruence with the expert annotation at least for these patients. Surprisingly, in neither case did our automated categorization match the expert clinical categorization. We determined (by manual inspection) that in these cases the clinician did not follow any of the guideline category boundaries for their binary classification; rather, they "invented" boundaries reflecting their personal opinion of risk. In the case of HDL, the boundary was well under the official lower limit (0.89 mmol/L compared to the official boundary of 1.03 mmol/L), whereas for Triglyceride measurements the clinician chose a cutoff within the guidelines' range for "High" risk (2.26-5.65 mmol/L). The original OWL models and the adjusted OWL models are shown below. The adjusted models provided perfect correspondence with the expert clinical classification when used in our automated framework.
HDL

Original AHA guideline in OWL:

  HighRiskHDLCholesterolRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:SerumHDLCholesterolConcentration and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:mili-mole-per-liter) and
         (sio:hasValue some double[<= 1.03]))))

  LowRiskHDLCholesterolRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:SerumHDLCholesterolConcentration and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:mili-mole-per-liter) and
         (sio:hasValue some double[> 1.55]))))

Modified model:

  HighRiskHDLCholesterolRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:SerumHDLCholesterolConcentration and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:mili-mole-per-liter) and
         (sio:hasValue some double[<= 0.89]))))

  LowRiskHDLCholesterolRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:SerumHDLCholesterolConcentration and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:mili-mole-per-liter) and
         (sio:hasValue some double[> 0.89]))))

Triglyceride

Original AHA guideline in OWL:

  HighRiskTriglycerideRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:SerumTriglycerideConcentration and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:mili-mole-per-liter) and
         (sio:hasValue some double[>= 2.26]))))

  LowRiskTriglycerideRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:SerumTriglycerideConcentration and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:mili-mole-per-liter) and
         (sio:hasValue some double[< 1.69]))))

Modified model:

  HighRiskTriglycerideRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:SerumTriglycerideConcentration and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:mili-mole-per-liter) and
         (sio:hasValue some double[>= 2.63]))))

  LowRiskTriglycerideRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:SerumTriglycerideConcentration and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:mili-mole-per-liter) and
         (sio:hasValue some double[< 2.63]))))
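In this study the clinician's effective cutoffs (0.89 mmol/L for HDL, 2.63 mmol/L for triglycerides, and the BMI threshold of 26 kg/m2 below) were recovered by manual inspection of the mismatched records; as noted in the Discussion, follow-up work aims to detect such boundaries automatically. A minimal, hypothetical sketch of that idea for a single threshold-style measurement:

    def infer_cutoff(values: list[float], at_risk: list[int]) -> float:
        """Estimate the clinician's implicit 'at risk' threshold as the midpoint between
        the largest value labelled low-risk and the smallest value labelled high-risk.
        Assumes higher values mean higher risk (invert the labels for HDL)."""
        low = max(v for v, r in zip(values, at_risk) if r == 0)
        high = min(v for v, r in zip(values, at_risk) if r == 1)
        return (low + high) / 2

    # Toy triglyceride values (mmol/L) with clinician grades: the boundary falls near 2.6.
    print(infer_cutoff([1.2, 2.1, 2.5, 2.7, 3.4], [0, 0, 0, 1, 1]))  # 2.6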
Body Mass Index risk

Similarly, we determined from our results that the guideline used by the expert in their classification was more relaxed than the AHA guidelines. By changing the threshold in our OWL class definition from 25 to 26, we were able to achieve perfect correspondence with the expert's annotations. It is important to point out, with respect to this measurement, that we did not need to modify the analytical Web Service in order to achieve this correspondence - only the OWL model needed to be adapted to match the interpretation of the clinical expert. The significance of this observation will be discussed later.

Original AHA guideline in OWL:

  HighRiskBMIRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:BodyMassIndex and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:kilogram-per-meter-squared) and
         (sio:hasValue some double[>= 25.0]))))

  LowRiskBMIRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:BodyMassIndex and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:kilogram-per-meter-squared) and
         (sio:hasValue some double[< 25.0]))))

Modified model:

  HighRiskBMIRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:BodyMassIndex and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:kilogram-per-meter-squared) and
         (sio:hasValue some double[>= 26.0]))))

  LowRiskBMIRecord =
    PatientRecord and
    (sio:hasAttribute some
      (cardio:BodyMassIndex and
       sio:hasMeasurement some (sio:Measurement and
         (sio:hasUnit value cardio:kilogram-per-meter-squared) and
         (sio:hasValue some double[< 26.0]))))

Evaluation of automated ternary risk classification

Figure 5 shows the SPARQL query which automatically classifies patient records into the "moderate risk" Framingham guidelines OWL model, compared with the annotations done manually by the expert (see the supplementary material [27] for the other Framingham guideline query/result pairs). Below the query is an abbreviated table of exemplar query output, specifically showing rows of discrepancy which are of particular interest for discussion.

[Figure 5. SPARQL query, and a small snapshot of the results, for automatic classification of patients into "high risk", "medium risk" and "low risk".]

  SELECT ?patientrecord ?calculatedrisk ?riskgrade
  FROM <http://cardio-soroush.rhcloud.com/framingham/patients.rdf>
  WHERE {
    ?patientrecord rdf:type cardio:MediumRiskFraminghamScoreRecord .
    ?patientrecord cardio:ExpertFraminghamGrade ?riskgrade .
    ?patientrecord cardio:hasAttribute ?attr .
    ?attr rdf:type cardio:GeneralCVD10YearFraminghamRiskScore .
    ?attr cardio:hasValue ?calculatedrisk
  }

  Calculated Risk Score   Calculated Risk Grade   Expert-assigned Grade
  15                      2                       1
  15                      2                       2
  19                      2                       3

Discrepancies between automated and expert ternary classifications

Framingham risk

No expert-annotated "high risk" record was classified as a "low risk" record by the automatic classification, or vice versa; only the "moderate risk" records were differentially classified by our automated approach compared to the clinical expert classification. Interestingly, however, the automated interpretations included both higher- and lower-risk classifications compared to the expert annotations. As can be seen in the first two rows of the Figure 5 results table, the same calculated Framingham risk score of 15 was classified as "low risk" and "medium risk", respectively, by the expert clinician, while another "medium risk" score (19) was classified as "high risk" by the expert. After troubleshooting the code and the ontological definitions, we examined the scores to determine if, as with the binary classifications above, it would be possible to improve our performance by relaxing or tightening certain constraints in the OWL class definitions; however, we determined that this was not possible. This suggests that other factors, not captured by the guidelines, led the clinical expert to select one or the other risk category for any given patient. In discussions with the clinician, we learned that the patients were under varying regimes of pharmaceutical blood pressure treatment, and that this would have affected their risk assessment. We are undertaking a follow-up study in which we attempt to semantically model drug treatment regimes and add them to the patients' profiles and our risk models, to determine whether this is sufficient to resolve all cases of mis-classification by our automated system, or whether there remain yet additional factors that are being used by clinicians to make their risk assessments.
Discussion

Interpreting discrepancies between automated and manual risk classification

It is first important to note that the data in our study - in particular, the risk classifications of the patients - were not used for the purpose of selecting an intervention in the course of the patient's clinical care. We presumed that clinical researchers would use existing published guidelines for categorization in the course of their clinical research, but this was not necessarily a valid premise.

In our results, we note a variety of discrepancies between our initial OWL models' rigorous adherence to published clinical standards, and the evaluation and phenotypic classification by the expert clinical researcher. Some of these were due to missing data in the guidelines themselves, where we were able to, with reasonable confidence, guess what the clinician's interpretation of the guideline was and model that interpretation. Others were due to the researcher "bending" the guidelines, either to match their personal beliefs, or because it was more appropriate for the research question they were asking. We were similarly able (as far as we can tell) to accurately modify the OWL models to match the clinician's expert opinion in many of these cases. Some cases, however, have so far eluded our ability to capture, in OWL, what the intent or rationale of the researcher was. Nevertheless, assuming that the decisions are not "arbitrary", we are confident that with further study we will be able to construct OWL models that correspond to these complex clinical interpretations. Moreover, while in this pilot study we manually modified the cutoff levels of the OWL models after visual inspection of the data, our subsequent studies undertake to determine these boundaries using data-mining and pattern-detection approaches; thus, this should not be considered an insurmountable weakness of the current work.
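Purely as a hypothetical illustration of the kind of data-driven boundary detection proposed for that follow-up work (this is not the authors' method, and the values below are invented), the sketch below scans candidate thresholds and keeps the one that best reproduces a set of expert binary labels.

# Hypothetical sketch: choose a cutoff that best matches expert binary labels.
# The measurements and labels are invented for illustration only.

def best_threshold(values, expert_at_risk):
    """Return the cutoff (drawn from the observed values) that maximizes agreement
    with the expert labels, calling a record 'at risk' when value >= cutoff."""
    best_cut, best_agreement = None, -1
    for cut in sorted(set(values)):
        calls = [v >= cut for v in values]
        agreement = sum(c == e for c, e in zip(calls, expert_at_risk))
        if agreement > best_agreement:
            best_cut, best_agreement = cut, agreement
    return best_cut, best_agreement / len(values)

# Toy BMI example: an expert who, in practice, treats >= 26 kg/m^2 as "at risk".
bmi = [22.5, 24.9, 25.3, 25.8, 26.1, 27.4, 30.2]
expert = [False, False, False, False, True, True, True]
cut, agreement = best_threshold(bmi, expert)
print("best cutoff:", cut, "agreement:", agreement)  # best cutoff: 26.1, agreement: 1.0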
That experts (at least in our case, but we believe it is likely to be true for many cases) do not strictly follow published guidelines when classifying patients in their clinical research is, in itself, not surprising; however, it does have implications both for the reproducibility of clinical studies and for the accuracy and interpretation of statistically sensitive high-throughput studies such as GWAS. Potential factors that influence experts to deviate from guidelines may include clinical observations outside of those that make up the guideline, or other non-clinical yet measurable/detectable features. Regardless, it is important for reproducibility and rigor that experimental methods be fully explained and detailed, yet at the same time it is undesirable (in fact, likely impossible) to force clinical researchers to follow guidelines which go against their expert beliefs. As such, a middle ground is needed where experts retain their "personalized" classification system, and yet have this system formally encoded in a transparent, publishable, and re-usable manner.

In this study, we demonstrate that the semantic modeling approach we advocate here provides re-usable, rigorous models which are nevertheless flexible, allowing individual, personalized expert knowledge to be encoded, published, and shared. Moreover, these rigorous yet personalized ontological models can be used to drive the automated analysis of data, removing the individual from the analytical process. This is important because "analysis tweaking", based on human intervention in the analytical or interpretative process, has historically gone unrecorded and thus led to non-reproducible science. Our approach, while not preventing the expert from imposing their own interpretation on the data (in fact, encouraging it!), ensures that in order to "tweak" the analysis, such an intervention must be made explicit in their ontological model; moreover, the resulting ontology can be published together with the study results to ensure transparency. Not only does this facilitate reproducibility of the study by making the personal expert opinion/interpretation accessible to other researchers, but it also allows explicit and accurate comparison between the formally-encoded expert opinions of a diverse community of clinical researchers, and the ability to use a third-party interpretation to investigate your own data - i.e., the ability to "see your data through the eyes of another". We think this is a powerful new approach to transparent and reproducible clinical research, where ideas and interpretation-regimes are explicitly recorded, shared, and compared.

Broader implications of "personalizing" OWL ontologies

Our use of OWL in this study differs markedly from the norm in the biomedical community, where ontologies are used primarily to compel harmonization around a particular world view, thus facilitating cross-study comparisons by disallowing individual opinion or interpretation. In contrast, we began from the perspective that individual clinical researchers would insist upon their authority, as experts, to classify patients in whatever way they thought was correct for a particular study, and would resist forced adherence to guidelines (in fact, in some cases it is the guidelines themselves that are the topics of investigation and evaluation). Indeed, we demonstrate in this study that clinicians frequently deviate from established clinical guidelines, yet we also demonstrate that OWL classes can be constructed to model an individual clinician's expert perspective, thereby making their interpretation transparent, and re-usable in a rigorous manner. Most importantly, however, our inability in some cases to accurately reproduce the interpretation of the expert post facto, even after manually re-modeling the guidelines, shows the danger of not capturing these personal, expert perspectives in some formal framework such as OWL at the time the experiment is being run.
These ontologies, representing individual perspectives on how data should be interpreted, resemble in silico hypotheses - the belief system of the individual undertaking the study, which may or may not be correct and/or shared by any other researcher. In this study, we demonstrate that these clinical hypotheses can be automatically evaluated over real patient data using existing Semantic Web tools and frameworks.

Conclusions

This study had several, largely methodological, objectives. First, there are a large number of "legacy" datasets that would be of benefit to researchers if they were published on the Semantic Web. We demonstrated a workable path for conversion and publication of these datasets that provides advantages beyond simply making the data available as "triples": it also makes the data semantically transparent, such that it can be easily re-analyzed by third-party researchers using their own classification frameworks. Second, the majority of ontologies available in the life sciences to date are class hierarchies, where the labels of each class are largely used to standardize annotations. The ability to logically reason over these labels is quite limited, thus inhibiting their use for automated annotation and classification of data. Nevertheless, these ontologies are increasingly comprehensive and reflect expert consensus on what concepts are relevant in a given domain. Here, we proposed and demonstrated a path for extending an existing ontology such that it could be utilized by DL reasoners to dynamically classify and interpret datasets - a process that is currently done largely by experts. Third, we demonstrated that clinical phenotype classification systems could be modeled in the OWL language by taking advantage of the rich, axiomatic structure of OWL-DL ontologies, and a variety of analytical Web Services. We showed how this combination of ontologies and Services can be used to make clinical data analyses both more transparent and more automated. Finally, we showed that individual clinicians deviate from established clinical guidelines at every layer of an analysis, and this demonstrates the need for a formal, yet personalized, clinical interpretation framework to ensure transparency and reproducibility. We demonstrate that this can be achieved by creating and publishing "personalized" OWL ontologies.

Endnotes

a. It should be noted that even though "kilogram" is considered a base unit in SI, it is composed of the prefix "kilo" plus the base unit "gram". In this sense, kilogram is an exception: a prefixed unit that is nevertheless treated as a base unit in the SI system. However, to remain consistent with our design schema, we considered "gram" as the base unit and defined kilogram as an extension of it.
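The following is a purely hypothetical sketch of the design choice described in endnote a; every IRI and property name below is invented for illustration and does not come from the published cardio ontology. It simply shows one way a prefixed unit such as kilogram could be expressed as an extension of the base unit gram.

# Hypothetical illustration of "gram as the base unit, kilogram as an extension of it".
# All IRIs and property names are invented for this sketch.
from rdflib import Graph, Namespace, Literal, RDF
from rdflib.namespace import XSD

UNIT = Namespace("http://example.org/units#")  # placeholder namespace

g = Graph()
g.add((UNIT["gram"], RDF.type, UNIT.BaseUnit))
g.add((UNIT["kilogram"], RDF.type, UNIT.PrefixedUnit))
g.add((UNIT["kilogram"], UNIT.hasBaseUnit, UNIT["gram"]))
g.add((UNIT["kilogram"], UNIT.hasPrefix, UNIT["kilo"]))
g.add((UNIT["kilo"], UNIT.multiplier, Literal(1000.0, datatype=XSD.double)))

print(g.serialize(format="turtle"))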
Abbreviations

AHA: American Heart Association; BMI: Body mass index; CABG: Coronary artery bypass graft; DBP: Diastolic blood pressure; g: Gram; HDL: High-density lipoprotein; kg: Kilogram; kg/m2: Kilogram per meter squared; kPa: Kilopascal; LDL: Low-density lipoprotein; m: Meter; mmHg: Millimeter of mercury column; OWL-DL: Web Ontology Language - Description Logic; Pa: Pascal; PATO: Phenotypic Quality Ontology; PCI: Percutaneous coronary intervention; RDF: Resource Description Framework; SADI: Semantic Automated Discovery and Integration; SBP: Systolic blood pressure; SIO: Semanticscience Integrated Ontology; TG: Triglyceride.

Competing interests

The authors have no competing interests to declare.

Authors' contributions

SS and MW planned this work, and jointly wrote this manuscript. SS executed the data migration, ontology design and extension, web service deployment, and overall analysis. BM generated and initially analyzed the source clinical data-set, and discussed and validated our approach and choice of clinical standards. All authors have read, revised and approved this manuscript.

Acknowledgements

This work is part of the CardioSHARE initiative, founded through a special initiatives award from the Heart and Stroke Foundation of British Columbia and Yukon, with subsequent funding from Microsoft Research and an operating grant from the Canadian Institutes for Health Research (CIHR). Core laboratory funding is derived through an award from the Natural Sciences and Engineering Research Council of Canada. The authors recognize the fiscal, operational and scientific support of the NCE CECR PROOF Centre of Excellence.

Author details

1 UBC James Hogg Research Center, Institute for Heart Lung Health, St. Paul's Hospital, Room 166, Burrard Building, 1081 Burrard Street, Vancouver, BC, Canada V6Z 1Y6. 2 NCE CECR Center of Excellence for the Prevention of Organ Failure (PROOF Centre), St. Paul's Hospital, Room 166, Burrard Building, 1081 Burrard Street, Vancouver, BC, Canada V6Z 1Y6. 3 Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid, Madrid, Spain.

Received: 21 October 2011 Accepted: 13 May 2012 Published: 20 July 2012

References

1. RDF Semantic Web Standards. [http://www.w3.org/RDF/]
2. OWL Web Ontology Language Overview. [http://www.w3.org/TR/owl-features/]
3. Zakkar M, Hornick P: Surgery for coronary artery disease. Surgery (Oxford) 2007, 25(5):231–37.
4. Chobanian AV, Bakris GL, Black HR, Cushman WC, Green LA, Izzo JL Jr, Jones DW, Materson BJ, Oparil S, Wright JT Jr, Roccella EJ: Seventh Report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure. JAMA 2003, 290(2):197.
5. Global Initiative for Chronic Obstructive Lung Disease. [http://www.who.int/respiratory/copd/GOLD_WR_06.pdf]
6. OpenGALEN Mission Statement. [http://www.opengalen.org/index.html]
7. Dawber TR, Moore FE, Mann GV: Coronary heart disease in the Framingham study. Am J Public Health 1957, 47(3):4–24.
8. Ridker PM, Buring JE, Rifai N, Cook NR: Development and validation of improved algorithms for the assessment of global cardiovascular risk in women: the Reynolds Risk Score. JAMA 2007, 297(6):611–619.
9. BioMOBY Interoperability Today, Integration Tomorrow. [http://sadiframework.org/documentation/MOBY_IMB_UQ2005.ppt]
10. Wilkinson MD, Vandervalk B, McCarthy L: The Semantic Automated Discovery and Integration (SADI) Web service Design-Pattern, API and Reference Implementation. J Biomed Semantics 2011, 2:8.
11. Vandervalk B, McCarthy L, Wilkinson MD: SHARE: A Semantic Web Query Engine for Bioinformatics. The Semantic Web, Lecture Notes in Computer Science, proceedings of the ASWC 2009, 5926:367–369.
12. DOLCE units of measurements. [http://www.w3.org/2001/sw/BestPractices/WNET/DLP3941_daml.html#measurement-unit%20#5]
13. Masolo C, Borgo S: Qualities in Formal Ontology. In Foundational Aspects of Ontologies (FOnt 2005) Workshop. Trento, Italy; 2005:2–16.
14. Measurement Unit Ontology. [http://forge.morfeoproject.org/wiki_en/index.php/Units_of_measurement_ontology#Measurement_Units_Ontology_.28MUO.29]
15. International System of Units. 8th edition. [http://www.bipm.org/utils/common/pdf/si_brochure_8_en.pdf]
16. Eknoyan G: Adolphe Quetelet (1796–1874)—the average man and indices of obesity. Nephrol Dial Transplant 2008, 23(1):47–51.
17. School of Computer Science. [http://www.cs.manchester.ac.uk/]
18. A brief history. [http://www.opengalen.org/tutorials/grail/tutorial21.html]
19. Cruz I: The Semantic Web - ISWC 2006. 5th International Semantic Web Conference, Athens, GA, USA; 2006:664.
20. Dumontier M: Semanticscience Integrated Ontology (SIO). [http://code.google.com/p/semanticscience/wiki/SIO]
21. Why and how SIO differs from OBO Foundry effort. [http://groups.google.com/group/sio-ontology/browse_thread/thread/e5aa67843aaad402?pli=1]
22. Gruber TR: A translation approach to portable ontology specifications. Knowledge Acquisition 1993, 5(2):199–220.
23. Dumontier M, Villanueva-Rosales N: Modeling Life Science Knowledge with OWL 1.1. In OWL Experiences and Directions (OWLED-DC 2008), Washington DC, USA.
24. Units of Measurements. [http://bioportal.bioontology.org/ontologies/45500?p=terms]
25. Phenotypic Quality Ontology. [http://obofoundry.org/cgi-bin/detail.cgi?quality]
26. Friedewald WT, Levy RI, Fredrickson DS: Estimation of the concentration of low-density lipoprotein cholesterol in plasma, without use of the preparative ultracentrifuge. Clinical Chemistry 1972, 18(6):499–502.
27. Supplementary Data. [http://cardio-soroush.rhcloud.com/framingham/supplementary.pptx]
28. What cholesterol levels mean. [http://www.heart.org/HEARTORG/Conditions/Cholesterol/AboutCholesterol/What-Your-Cholesterol-Levels-Mean_UCM_305562_Article.jsp]
29. Cholesterol Levels. [http://lightup-ym.com/2011/04/10/the-cholesterol-levels/]
30. Body Mass Index (BMI). [http://www.heart.org/HEARTORG/GettingHealthy/WeightManagement/BodyMassIndex/Body-Mass-Index-BMI-Calculator_UCM_307849_Article.jsp]
31. Framingham Heart Study. [http://www.framinghamheartstudy.org/risk/index.html]

doi:10.1186/2041-1480-3-6
Cite this article as: Samadian et al.: Extending and encoding existing biological terminologies and datasets for use in the reasoned semantic web. Journal of Biomedical Semantics 2012, 3:6.