UBC Theses and Dissertations

A deep learning framework for wall motion abnormality detection in echocardiograms Asgharzadeh, Parisa 2020

A Deep Learning Framework for Wall Motion Abnormality Detection in Echocardiograms

by

Parisa Asgharzadeh

B.A.Sc, Isfahan University of Technology, 2017

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

Master of Applied Science

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES
(Electrical and Computer Engineering)

The University of British Columbia
(Vancouver)

March 2020

© Parisa Asgharzadeh, 2020

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the thesis entitled:

A Deep Learning Framework for Wall Motion Abnormality Detection in Echocardiograms

submitted by Parisa Asgharzadeh in partial fulfillment of the requirements for the degree of Master of Applied Science in Electrical and Computer Engineering.

Examining Committee:

Purang Abolmaesumi, Electrical and Computer Engineering
Supervisor

Robert Rohling, Electrical and Computer Engineering
Supervisory Committee Member

Jane Wang, Electrical and Computer Engineering
Additional Examiner

Abstract

Coronary Artery Disease (CAD) is the leading cause of morbidity and mortality in developed nations. In patients with acute or chronic obstructive CAD, Echocardiography (ECHO) is the standard of care for visualizing abnormal ventricular wall thickening or motion, which is reported as Regional Wall Motion Abnormality (RWMA). The accurate identification of regional wall motion abnormalities is essential for cardiovascular assessment and for the diagnosis of myocardial ischemia, coronary artery disease and myocardial infarction. Given the variability and challenges of scoring regional wall motion abnormalities, we propose the development of a platform that can quickly and accurately identify regional and global wall motion abnormalities on echo images.

This thesis describes a deep learning-based framework that can help physicians use ultrasound for wall motion abnormality detection. The framework jointly combines image data and patient diagnostic information to determine both the global label and the clinically standard 16 regional wall motion labels. We validate the approach on a large cohort of echo studies obtained from 953 patients. We then report the performance of the proposed framework in the detection of wall motion abnormality. An average accuracy of 69.2% for the 16 regions and an average accuracy of 69.5% for global wall motion abnormality were achieved.

To the best of our knowledge, our proposed framework is the first to analyze left ventricle wall motion for both global and regional abnormality detection in echocardiography data.

Lay Summary

Identification of patients with regional wall motion abnormalities is beneficial for early detection of coronary artery disease that is not evident from symptoms. Conventional methods for assessment of RWMAs, which are based on visual interpretation of endocardial excursion and myocardial thickening, vary between observers and depend on the experience level of the echocardiographer. Considering the variability and challenge of coding RWMAs, an effective model for reducing the misreading of RWMAs is required. Thus, we propose the development of a machine learning platform that can quickly and accurately identify regional wall motion abnormalities on echo images. Such a tool would have several applications in improving the accuracy and consistency of RWMA reporting with bedside echo at the point of care.

Preface

This thesis is predominantly based on a pending journal submission. The presented work involves collaboration among multiple students, professors, sonographers and cardiologists at the University of British Columbia, the Department of Electrical and Computer Engineering and Vancouver General Hospital.
The study is conducted under the approval of the University of British Columbia (UBC) Research Ethics Board, certificate number H16-02624, provided by the Vancouver Coastal Health Research Ethics Board.

The author developed and implemented the proposed framework for wall motion abnormality detection, and evaluated the solution on the local database of corresponding data created for this work. Professor Purang Abolmaesumi and Drs. Teresa Tsang and Christina Luong helped with technical guidance and insight into the problem being addressed.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
Glossary
Acknowledgments
1 Introduction and Background
  1.1 Clinical Background
    1.1.1 Heart
    1.1.2 Echocardiography
    1.1.3 Wall Motion Abnormality (WMA)
  1.2 Machine Learning in WMA Detection
    1.2.1 Deep Neural Networks
    1.2.2 Convolutional Neural Networks
    1.2.3 Recurrent Neural Networks
    1.2.4 Hyper-Parameter Optimization
    1.2.5 Applications of ML in WMA detection
  1.3 Thesis Objective
  1.4 Contributions
  1.5 Thesis Outline
2 Materials
  2.1 Ethics
  2.2 Echocardiography Data
    2.2.1 Echocardiography Retrospective Data
    2.2.2 Echo Data Download and Processing
  2.3 Analytical Measurements
    2.3.1 Filemaker Measurements
    2.3.2 Analytical Data Download and Processing
  2.4 Summary
3 Methods
  3.1 Relevant Cardiac Echo View Selection for WMA
  3.2 Automatic WMA Detection Deep Convolutional Network
    3.2.1 Spatial Feature Extractors
    3.2.2 Temporal Feature Aggregators
    3.2.3 Spatio-temporal Framework
    3.2.4 Regularization and Data Augmentation
  3.3 Summary
4 Experiments and Results
  4.1 Experimental Setup
    4.1.1 Dataset
    4.1.2 Wall Motion Abnormality labels
    4.1.3 Network Architecture
    4.1.4 Hyper-parameters
    4.1.5 Regularization and Data Augmentation
    4.1.6 Training
    4.1.7 Implementation
  4.2 Results
  4.3 Discussion and Summary
5 Conclusion
  5.1 Contributions
  5.2 Future Work
Bibliography
A Supporting Materials

List of Tables

Table 1.1 The complete list of all 17 segments of the left ventricle wall.

Table 2.1 The Wall Motion Abnormality data frequency in each region.

Table 3.1 The dataset composition in terms of number of studies from each type of the 14 standard echocardiography views (A#C: apical #-chamber view, PLAX: parasternal long-axis view, RVIF: right ventricular inflow view, S#C: subcostal #-chamber view, IVC: subcostal inferior vena cava view, PSAX-A: parasternal short-axis view at aortic valve, PSAX-M: PSAX view at mitral annulus valve level, PSAX-PM: PSAX view at mitral valve papillary muscle level, PSAX-APEX: PSAX view at apex level, and SUPRA: suprasternal view).

Table 4.1 The covariance matrix of the 16-segment labels and WMSI.

Table 4.2 The Global WMA accuracy per class comparison of experimented methods.

Table 4.3 The A2C relevant RWMA accuracy per class comparison of experimented methods.

Table 4.4 The A4C relevant RWMA accuracy per class comparison of experimented methods.

Table 4.5 The PLAX relevant RWMA accuracy per class comparison of experimented methods.

List of Figures

Figure 1.1 Heart position in the thoracic cavity, located in the mediastinum between the lungs. (This image is available under a Creative Commons Attribution License 2.0.)

Figure 1.2 Human blood circulation in the heart. (This image is available under a Creative Commons Attribution Licence 2.0.)

Figure 1.3 Demonstration of different echo modalities.

Figure 1.4 Four main cardiac views. (Left) The transthoracic echocardiogram displaying different structures in the heart. (Right) An anatomical diagram of the corresponding view.

Figure 1.5 LV regional wall motion analysis. Subfigure (a) shows the effect of WMA severity on motion dysfunction of the wall. It has been advised to assess the wall motion of individual LV segments visually and score them [29].

Figure 1.6 ACC/AHA recommended "bull's-eye" plot of the 17-segment model. The outer ring represents the basal segments, the middle ring represents the segments at mid-papillary muscle level, and the inner ring represents the distal level. The anterior insertion of the right ventricular wall into the left ventricle defines the border between the anteroseptal and anterior segments. Starting from this point, the myocardium is subdivided into six equal segments of 60 degrees. The apical myocardium is instead divided into four equal segments of 90 degrees. The apical cap is added in the center of the bull's-eye.

Figure 1.7 (a) Demonstration of colour kinesis, where lack of colour change depicts lack of wall thickening [37]. (b) Three-dimensional echocardiographic images of the heart (apex view) [38]. (c) The main directions of deformation and strain imposed on the LV myocardium. (d) Example of measurements of the main 2D strain variables. Each of the coloured lines in the left panels denotes one of the six regions measured from the apical 4-chamber window [12].

Figure 1.8 Demonstration of a deep convolutional neural network with five hidden layers including convolutional, pooling and fully-connected layers.

Figure 1.9 Demonstration of an LSTM block. m_{t-1} stands for the input from a memory cell at time point t-1; x_t is an input at time point t; h_t is an output at time point t that goes to both the output layer and the hidden layer at the next time point.

Figure 1.10 A systematic diagram of the workflow for automated RWMA prediction. After data are acquired in the clinic, they are stored on the Vancouver Coastal Health (VCH) servers along with the cardiologist's measurements. The machine learning model then uses these data to make predictions on RWMA.

Figure 2.1 A diagram showing the relationship between the Echo, Xcelera and Filemaker databases in a routine cardiology study.

Figure 2.2 Snapshot of the wall motion abnormality section of FileMaker. After clicking on a region, the cardiologist types a score from 1 to 5 to label the abnormality of that region.

Figure 2.3 A block diagram of the relationship between the Echo database tables.

Figure 2.4 The Wall Motion Abnormality data distribution in different regions of the heart: (a) data distribution in the LCX region, (b) data distribution in the LAD region, and (c) data distribution in the RCA region.

Figure 2.5 Global WMA data distribution.

Figure 3.1 The cardiac view classifier network architecture. Relevant embeddings are extracted from the individual frames by the spatial embedding extractor (DenseNet blocks). The embeddings are then fed into the Long Short-Term Memory (LSTM) blocks to extract the temporal information across 10 sequential echo cine frames.

Figure 3.2 The proposed WMA network architecture. Spatio-temporal embeddings are extracted from the individual cines. The embeddings are then fed into the FC blocks to connect the reasoning between the information across echo cine frames.

Figure 3.3 The network architectures for WMA classification considered in this work. (a) C2D are 2D convolutions; (b) C2D+LSTM are time-distributed 2D convolutions followed by LSTM units; (c) C2D+GRU are time-distributed 2D convolutions followed by GRU units; (d) D2D+LSTM are 2D DenseBlocks followed by LSTM units; (e) D2D+GRU are 2D DenseBlocks followed by GRU units; and (f) C3D are 3D convolutions. For interpretability, the connections are omitted.

Figure 4.1 The value of loss for training and validation over each epoch.

Figure A.1 A snapshot of different tabs available in the Filemaker software for cardiologists to record the analytical measurements.

Glossary

2D      two-dimensional
3D      three-dimensional
A2C     Apical Two-chamber
A4C     Apical Four-chamber
CAD     Coronary Artery Disease
CCT     Cardiac Computed Tomography
CMR     Cardiac Magnetic Resonance (Imaging)
CNN     Convolutional Neural Networks
CSV     Comma-Separated Values
DCNN    Deep Convolutional Neural Networks
DICOM   Digital Imaging and Communications in Medicine
ECG     Electrocardiogram
ECHO    Echocardiography
FC      Fully-Connected Layers
GRU     Gated Recurrent Unit
LAD     Left Anterior Descending Artery
LCX     Left Circumflex Artery
LSTM    Long Short Term Memory
LV      Left Ventricle
MRN     Medical Record Number
ONEIROS Open-ended Neuro-Electronic Intelligent Robot Operating System
PACS    Picture Archiving and Communication System
PLAX    Parasternal Long-axis
POCUS   Point-Of-Care Ultrasound
PSAX    Parasternal Short-axis
RCA     Right Coronary Artery
RCL     Robotics and Control Lab
RF      Radio Frequency
ROI     Region of Interest
RWMA    Regional Wall Motion Abnormality
UBC     University of British Columbia
VCH     Vancouver Coastal Health
WMA     Wall Motion Abnormality
WMSI    Wall Motion Score Index

Acknowledgments

I would like to
give my enduring gratitude to my supervisor, Professor Purang Abolmaesumi, for his dedicated role. I have been fortunate to have him not only as a person with deep insight into our field but also as a person whose persistent patience and endless support helped me overcome many crises and finish this stage of my life. I would additionally like to thank all of the instructors that I have been lucky to learn from over the last two years. Thank you, Christina Luong and Hany Girgis, for your precious hours teaching me different aspects of cardiology and helping me with my questions.

It has been my pleasure to be part of the Robotics and Control Lab. Thank you all for making it a fantastic place to work. I am really lucky to have worked with you.

To my friend Delaram Behnami, who took time out of her busy life to help edit this thesis, thank you again. Thank you for all your endless guidance and support through the unknown aspects of grad school. Wholehearted thanks to my family for their countless love and support.

Finally, I would like to acknowledge the Natural Sciences and Engineering Research Council (NSERC) and the Canadian Institutes of Health Research (CIHR), who funded this project.

Chapter 1

Introduction and Background

1.1 Clinical Background

1.1.1 Heart

Anatomy

The heart is a muscular organ that pumps blood around the body. It is located within the thoracic cavity, medially between the lungs, in the space known as the mediastinum. Figure 1.1 shows the position of the heart in the thoracic cavity. The heart sits in its own space, the pericardial cavity, and is separated from the other mediastinal structures by the pericardium, or pericardial sac, which is a tough membrane.

The heart is shaped like a pinecone. A person's heart is typically around the size of his/her fist: normally 12 cm × 8 cm × 6 cm.
Allowing for differences between the sexes, an average female heart weighs approximately 250-300 grams and an average male heart approximately 300-350 grams. Exercise is an important factor in increasing the size of the heart. During exercise, cardiac muscle behaves similarly to skeletal muscle. Exercise increases the protein myofilaments, which increases the size of individual cells without increasing their number, i.e. hypertrophy. Thus, athletes' hearts can pump blood more effectively (the same amount of blood pumped at a lower heart rate) than non-athletes' hearts [43].

Figure 1.1: Heart position in the thoracic cavity, located in the mediastinum between the lungs. (This image is available under a Creative Commons Attribution License 2.0.)

There are four chambers in the human heart: one atrium and one ventricle on each side of the heart. The upper chambers, the atria, serve as receiving chambers and contract to push the blood into the ventricles. The ventricles, on the other hand, function as the principal pumping section, pushing the blood to the lungs or to the other organs in the body.

The wall of the heart consists of three layers of tissue of unequal thickness: epicardium, myocardium, and endocardium. These layers are covered with a thin protective layer named the pericardium. The epicardium is mainly made of connective tissue. The myocardium consists of the muscles of the heart, and the endocardium lines the inside of the heart and protects the valves and chambers [43].

In addition, the heart contains four valves that keep the blood flowing in one direction only. The right atrioventricular valve, or tricuspid valve, is located between the right atrium and the right ventricle. Moving forward from the right ventricle, we reach the pulmonary valve at the base of the pulmonary trunk. The mitral valve, also known as the bicuspid valve, is located at the opening between the left atrium and left ventricle.
The last valve is the aortic valve, located at the base of the aorta, which prevents back-flow from the aorta.

Function

The heart, blood, and blood vessels together are called the circulatory system. An average human body has around 5 litres of blood, which the heart, beating around 100,000 times a day, constantly pumps throughout the body.

There are two blood circulations in the human body, called the pulmonary and systemic circuits. The pulmonary circuit transports blood between the heart and the lungs, where it receives fresh oxygen and delivers carbon dioxide for exhalation. The systemic circuit, on the other hand, transports oxygenated blood to all other tissues of the body and returns deoxygenated blood and carbon dioxide to the heart to be sent back to the pulmonary circulation.

The cardiac rhythm, or heartbeat, is the process of pumping blood through the four chambers. The heartbeat can be split into two phases: systole and diastole. In diastole, the atria and ventricles relax and fill with blood. In systole, on the other hand, the atria contract and pump blood into the ventricles; after that, as the atria start to relax, the ventricles contract (ventricular systole) and pump blood out of the heart.

In summary, as shown in Figure 1.2, blood flows from the right atrium to the right ventricle, from which it is pumped into the pulmonary circuit. The blood in the pulmonary artery branches is low in oxygen but relatively high in carbon dioxide. In the pulmonary capillaries, gas exchange occurs (oxygen enters the blood and carbon dioxide leaves it). Subsequently, the blood, now high in oxygen and low in carbon dioxide, returns to the left atrium. It then enters the left ventricle, from which it is pumped into the systemic circuit. Following the exchange in the systemic capillaries, the deoxygenated blood returns to the right atrium and the cycle is repeated.

Figure 1.2: Human blood circulation in the heart.
(This image is available under a Creative Commons Attribution Licence 2.0.)

Modalities

The visual assessment of ventricular function, cardiac chamber dimensions, and ventricular mass is essential for clinical diagnosis, risk assessment, therapeutic decisions, and prognosis in patients with any kind of cardiac disease. Many imaging techniques are applicable to the assessment of left ventricular function, each with its own limitations.

Cardiac Magnetic Resonance (Imaging) (CMR) has been considered the gold standard for Left Ventricle (LV) assessment. Despite its good performance, CMR is expensive and time-consuming, and it is not available in most medical centers. Moreover, it cannot be used for all patients, due to the presence of metal devices or clinical conditions such as claustrophobia and the inability to lie flat in some patients [17].

Cardiac Computed Tomography (CCT) is a non-invasive imaging technique that can be used to obtain information about left ventricular function and morphology, in addition to its main application, which is the assessment of coronary artery disease. It serves as a decent alternative when other imaging modalities such as echocardiography cannot deliver acceptable images, or where CMR cannot be used due to the patient's contraindications [50].

With the compelling developments in ultrasound technology, harmonic imaging has been routinely incorporated and used clinically for the assessment of LV segmental function [21].

A detailed assessment of global and regional myocardial function can be obtained using other echocardiographic imaging modalities, including automated endocardial border detection using integrated backscatter; tissue Doppler and two-dimensional (2D) speckle tracking imaging of myocardial displacement, velocity, strain and strain rate; and real-time three-dimensional (3D) echo. These will be discussed in the next section.

1.1.2 Echocardiography

Echocardiography or cardiac echo, commonly known as echo, is ultrasound imaging of the heart. The conventional ultrasound image is created by an ultrasound transducer transmitting and then receiving Radio Frequency (RF) signals. The received RF signals are converted to a digital RF signal and filtered to produce an envelope-detected signal. After some post-processing, the resulting signal produces the final B-mode image. Moreover, colour flow Doppler is interpreted for some measurements.

Echocardiography is routinely used in the diagnosis, management, and follow-up of most patients with heart disease. It is one of the most common diagnostic tests used in cardiology. It provides rich information about the size and shape of the heart (internal chamber size quantification), its pumping capacity, and the location and extent of any tissue damage. Moreover, echo videos help cardiologists to obtain a good estimate of heart function, such as the calculation of the cardiac output, ejection fraction, and diastolic function (i.e. how well the heart relaxes).
It is also used as a tool for assessing the severity of wall motion abnormality in patients with suspected cardiac diseases.

Figure 1.3: Demonstration of different echo modalities: (a) B-mode; (b) Doppler.

B-mode

B-mode echo, which produces the visual representation of the examined anatomy in both 2D and 3D echo, is the most common method of heart imaging. The brightness of each pixel in the images depends on the amplitude of the returned echo signal. The difference in brightness between tissues allows visualization and quantification of anatomical structures, as well as visualization of diagnostic and therapeutic procedures. This is a real-time method of image acquisition, allowing for up to 50-70 images per second in 2D echo.

Colour Doppler

Colour Doppler ultrasound allows cardiologists to clearly observe the blood flow through the heart and the blood vessels. It also allows them to measure obstructions in arteries and the degree of narrowing or leakage of heart valves (regurgitation). It is mainly done by encoding colour Doppler information and overlaying it on 2D echo images. Each colour represents the speed of blood flow within a Region of Interest (ROI).

Cardiac views

Transthoracic Echocardiography (ECHO) is the most common imaging modality used for cardiac assessment. The ultrasound data is acquired from standard cross-sections of the heart for measurement and examination of multiple variables in cardiac structure and function. Each of these cross-sections results in a different echo view that distinctly highlights specific regions of the heart in detail. The cardiac echo data is acquired by manually moving the imaging probe over the chest acoustic windows. A good interpretation of the heart is achieved through correct acquisition, with the probe held on the best cross-section. This requires years of experience and expertise.

2D echo incorporates a recording of 2D cardiac images, often referred to as cine series.
Each cine is a 2D cross-sectional video of the heart and may contain several cardiac cycles. There are 14 standard cardiac views, each with its own set of signature features. The main standard cardiac views are:

• Apical views (Apical Two-chamber (A2C), Apical Four-chamber (A4C), etc.)
• Parasternal Short-axis (PSAX) views
• Parasternal Long-axis (PLAX) view.

Figure 1.4 shows the main cardiac views, in which each transthoracic echo view is shown alongside its anatomical diagram.

The four views most frequently acquired by clinicians are apical four-chamber, parasternal long axis (PLAX), parasternal short axis at the papillary muscle level (PSAX-PM), and subcostal four-chamber (SUBC4). For the purpose of this thesis, we only need three views of the heart: apical two-chamber, apical four-chamber, and parasternal long axis (PLAX).

Figure 1.4: Four main cardiac views: (a) A2C view; (b) A4C view; (c) PLAX view; (d) PSAX view. (Left) The transthoracic echocardiogram displaying different structures in the heart. (Right) An anatomical diagram of the corresponding view.

1.1.3 Wall Motion Abnormality (WMA)

Clinical Definition

Blood is supplied to the different regions of the heart through three main epicardial coronary arteries. The left main coronary artery splits into the Left Anterior Descending Artery (LAD) and the Left Circumflex Artery (LCX). There may be some variations in the blood supply, but the overall pattern is the same: the LAD goes down the interventricular groove and provides blood to the anterior wall (with the help of its diagonal branches), the anterior septum and the apex of the heart. The lateral wall, on the other hand, is supplied by the left circumflex artery with its marginal branches. The Right Coronary Artery (RCA) arises from the right sinus of Valsalva and runs infero-medially down to the atrioventricular groove.
The posterior descending artery (PDA), which is a branch of the RCA, supplies the inferior wall and the inferior septum.

In echocardiography, regional myocardial function is mainly assessed by observing the wall thickening and endocardial motion of the myocardial segment. However, it should be remembered that deformation can also be passive and therefore may not always accurately reflect myocardial contraction.

Figure 1.5: LV regional wall motion analysis. Subfigure (a) shows the effect of WMA severity on motion dysfunction of the wall. It has been advised to assess the wall motion of individual LV segments visually and score them [29].

Many models have been used to divide the left ventricular myocardium into segments so that regional wall motion can be accurately described and quantified. The LV myocardium is divided into three sections as follows:

base (six segments, each encompassing 60 degrees of the left ventricular short axis, described as: basal anterior, basal anteroseptal, basal inferoseptal, basal inferior, basal inferolateral, basal anterolateral)

mid-section (divided into six segments in a similar manner to the base)

apex (divided into four 90-degree segments: apical anterior, apical lateral, apical inferior, apical septal).

Each of these sections is thus divided into segments that correspond to regions of the LV wall. The most recent recommendation from the American College of Cardiology (ACC) and American Heart Association (AHA) is a 17-segment model [42]. Since the apical cap in the 17-segment model is acontractile, that model is more appropriate for perfusion imaging, and the 16-segment model of myocardial segmentation is used extensively instead.
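The segment layout described above can be written down as a small data structure. The following is a minimal Python sketch, not code from this thesis: the names `WALLS_SIX`, `WALLS_APICAL`, `build_segments`, `SEGMENTS_17` and `SEGMENTS_16` are hypothetical, and the only facts it encodes are the ones stated in the text (six 60-degree segments at the base and mid-section, four 90-degree segments at the apex, and an optional apical cap).

```python
# Illustrative sketch only: all names here are hypothetical, not from the thesis.
# Wall names for the basal and mid-section rings (six segments of 60 degrees).
WALLS_SIX = ["anterior", "anteroseptal", "inferoseptal",
             "inferior", "inferolateral", "anterolateral"]
# Wall names for the apical ring (four segments of 90 degrees).
WALLS_APICAL = ["anterior", "septal", "inferior", "lateral"]


def build_segments(include_apical_cap):
    """Return (name, level, arc_degrees) tuples for the LV segment model."""
    segments = []
    for level in ("basal", "mid"):
        arc = 360 // len(WALLS_SIX)  # 60 degrees per segment
        for wall in WALLS_SIX:
            segments.append((f"{level} {wall}", level, arc))
    arc = 360 // len(WALLS_APICAL)  # 90 degrees per segment
    for wall in WALLS_APICAL:
        segments.append((f"apical {wall}", "apical", arc))
    if include_apical_cap:
        # The apical cap has no angular extent in this description.
        segments.append(("apex", "apical cap", None))
    return segments


SEGMENTS_17 = build_segments(include_apical_cap=True)   # ACC/AHA 17-segment model
SEGMENTS_16 = build_segments(include_apical_cap=False)  # apical cap omitted

if __name__ == "__main__":
    for name, level, arc in SEGMENTS_17:
        print(f"{name:22s} {level:10s} {arc}")
```

Under these assumptions, the 16-segment list is simply the 17-segment list without the apical cap, which is why a model that predicts one label per segment needs 16 outputs rather than 17.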
Table 1.1 shows a complete list of all segments highlighted in Figure 1.6.

Table 1.1: The complete list of all 17 segments of the left ventricle wall.

Basal segments             Mid segments               Apical segments
1. Basal anterior          7. Mid anterior            13. Apical anterior
2. Basal antero-septal     8. Mid antero-septal       14. Apical septal
3. Basal infero-septal     9. Mid infero-septal       15. Apical inferior
4. Basal inferior          10. Mid inferior           16. Apical lateral
5. Basal infero-lateral    11. Mid infero-lateral     17. Apex
6. Basal antero-lateral    12. Mid antero-lateral

Each segment is then assigned a score from 1 to 5, using the following criteria:

1 = normal or normokinetic (normal wall thickening and endocardial excursion)
2 = hypokinetic (myocardial thickening ≤ 30-40% in systole)
3 = severely hypokinetic or akinetic (myocardial thickening ≤ 10% in systole)
4 = dyskinetic (segment moves outward in systole)
5 = aneurysmal (segment pouches out in both systole and diastole)

There is a correspondence between the location of regional wall motion abnormalities and the coronary artery territories. The anterior septum, anterior wall and anterior apex are mainly affected by the LAD. Any disease in the LCX artery affects the lateral and posterior walls of the left ventricle. In addition, detecting regional wall motion abnormality in this region improves CAD detection, since such abnormality is mostly due to pathology of the LCX artery. The RCA, through its posterior descending branch, supplies the inferior septum and inferior wall of the left ventricle. Any wall motion abnormality in these areas corresponds strongly with the RCA. Since the location of regional wall motion abnormalities correlates reasonably well with the location of disease in the coronary arteries, it can be used as a valid guide for further management.

Figure 1.6: ACC/AHA recommended "bull's-eye" plot of the 17-segment model.
The outer ring represents the basal segments, the middle ring represents the segments at midpapillary muscle level, and the inner ring represents the distal level. The anterior insertion of the right ventricular wall into the left ventricle defines the border between the anteroseptal and anterior segments. Starting from this point, the myocardium is subdivided into six equal segments of 60 degrees. The apical myocardium is instead divided into four equal segments of 90 degrees. The apical cap is added in the center of the bull’s-eye.

Clinical Importance

Regional wall motion abnormality refers to the motion of a region of the heart muscle being abnormal. Myocardial infarction or severe ischemia is the most common cause of left ventricular (LV) wall motion abnormalities [21].

Regional wall motion abnormalities mainly occur early in the ischemic cascade, followed by Electrocardiogram (ECG) changes. When myocardial oxygen demand is reduced and subsequently returns to baseline, myocardial ischemia resolves and wall motion returns to normal [42]. Wall motion analysis has been a paramount factor in clinical decision-making situations such as in patients with chest pain in the emergency department and patients with congestive heart failure [33]. The accurate evaluation of left ventricular (LV) regional function is essential for general cardiac assessment, specifically for evaluation of CAD and acute myocardial infarction [27, 29, 33, 58].

Identification of regional wall motion abnormalities (RWMA) can help in the diagnosis of acute myocardial infarction, multi-vessel CAD, and coronary syndromes or chronic CAD, point to the ischemic territory, and ultimately influence patient treatment [8].

Automated LV Wall Motion Assessment Using Echocardiography

Echocardiography, with its high spatial and temporal resolution, is the best choice as a non-invasive method for assessing changes in wall motion.
In acute situations, including the ECG is useful for the early detection of myocardial ischemia. Equally, in patients with acute chest pain, normal regional wall motion, together with the ECG, may help to exclude underlying myocardial ischemia [21].

Despite the impressive advances in echo technology, detection and quantification of regional left ventricular wall motion abnormalities on echocardiography images depend heavily on observer skill, require substantial experience [8, 33], and are prone to significant inter-observer variability [8].

To alleviate the subjectivity of wall motion assessment, other echocardiographic imaging modalities have been developed to automate the process and reduce inter- and intra-observer variability. These methods conduct a more comprehensive assessment of global and regional left ventricle function. In the following, we discuss some of these methods.

• Border Tracking
The automated border tracking method uses the difference between the ultrasound backscatter emitted from the endocardium and from the blood in the LV cavity. After image acquisition, the backscatter information along the scan line is analyzed and the pixels are classified. The pixels are colour coded and superimposed onto a 2D image. This leads to real-time tracking of the endocardial border. The figure shows an echo image with and without automated border tracking. The main drawback of automated endocardial border tracking is that it depends on good image quality. Poor quality images lead to poor tracking of the endocardial border, which in turn results in poor colour-coded tracking images [37, 46].

• Tissue Doppler imaging
In Tissue Doppler imaging (TDI), high amplitude, low-frequency Doppler signals coming from the myocardium and mitral annulus are measured.
In Figure 1.7, a tissue Doppler sample volume is placed in the myocardium or the annulus area, and the systolic and diastolic velocities at that point are then displayed. Essentially, any area of the myocardium can be studied in this manner. Thus, the quantitative assessment of regional systolic function is achieved by measuring the S wave peak velocity. Any translational movement and tethering affect the myocardial velocity measurements, which in turn makes it difficult to discriminate between segments that are akinetic and those that are actively contracting. Besides, the distribution of the velocities is not uniform through the myocardium from base to apex, which makes it harder to set a reference value [18].

• Strain Rate Imaging
Imaging myocardial deformation is a good alternative to overcome the limitations of velocity measurements. Strain and strain rate measurements are a representation of the magnitude and rate of change in length of myocardial fibre, which is the energy required in both systole and diastole [12, 14]. Two-dimensional speckle tracking echo is a method for assessment of myocardial motion by tracking natural acoustic markers, known as speckles, generated from interactions between ultrasound and myocardium. By tracking the motion of speckles, deformation is measured. Limitations of speckle tracking in echo include the need for good image quality and the assumption that a given speckle can be tracked from one frame to the next, which may not hold under excessive cardiac motion.

• 3D echocardiography
The development of 3D echo has allowed assessing LV volumes and movement without any dependency on LV geometry and any assumption about LV shape. Thus, the truncation in apical visualization in 2D echo is resolved in 3D echo. 3D echo captures the entire volume of the left ventricle during image acquisition. The acquired images are displayed as a rendered volume or surface, wire-framed, or as 2D tomographic slices [38]. Figure 1.7-b shows an example of a 3D echo. While 3D echo has become more popular than before, it is still limited by the need for good image quality and operator skill and experience. Improvements in temporal and spatial resolution and in data manipulation are also needed to enhance the application of this method.

(a) Border tracking (b) 3D echo demonstration (c) Directions of LV Strain (d) Longitudinal Strain Measurement
Figure 1.7: (a) Demonstration of colour kinesis where lack of colour changing depicts lack of wall thickening [37]. (b) Three-dimensional echocardiographic images of the heart (apex view) [38]. (c) The main directions of deformation and strain imposed on the LV myocardium. (d) Example of measurements of the main 2D Strain variables. Each of the coloured lines in the left panels denotes one of the six regions measured from the apical 4-chamber window [12].

Although echo imaging has had remarkable enhancements, the echo data are still nontrivial. This results in an adverse effect on echo-based diagnosis due to high inter- and intra-observer variability. These limitations stem from intrinsic ultrasound constraints such as noise, the frequency vs. depth trade-off, probe tethering, and the dependency of image quality on correct probe positioning in the planes. However, these limitations are addressed partially by automatic machine learning methods that benefit from the high spatio-temporal resolution of echo images [2, 57, 59, 60].

1.2 Machine Learning in WMA Detection

Recently, applications of machine learning in medical imaging have become very prominent. Machine learning can be used in different ways to achieve diverse objectives. Thus, many research groups in the field have focused on semi-automatic and automatic techniques for cardiac assessment (e.g., wall motion abnormality analysis).
Deep learning, as one powerful branch of machine learning, is an effective method for detection and classification of several diseases.

1.2.1 Deep Neural Networks

Neural networks are data processing structures (i.e. functions). They map an input $x \in \mathbb{R}^n$ to the output $\hat{y}_m \in \mathbb{R}^m$. In classification, $\hat{y}_m$ is the likelihood of each of the $M$ classes. If we treat each neuron as a function of $x$, with $W$ as the parameter (weight) matrix and $b$ as the bias vector, then the function of each neuron can be written as:

$$\hat{y}_m = f(Wx + b). \tag{1.1}$$

A typical neural network consists of multiple layers as follows:

• Input Layer
• Hidden Layer(s)
• Output Layer

Deep neural networks have been successfully applied to medical imaging tasks such as image classification, object detection, and image segmentation thanks to the development of Convolutional Neural Networks (CNN). These neural networks utilize parameterized, sparsely connected kernels which preserve the spatial characteristics of images. Convolutional layers sequentially downsample the spatial resolution of images while expanding the depth of their feature maps. This series of convolutional transformations can create much lower-dimensional and more useful representations of images than what could possibly be hand-crafted. The success of CNNs has spiked interest and optimism in applying deep learning to computer vision tasks. There are many branches of study that hope to improve current benchmarks by applying deep convolutional networks to computer vision tasks. Improving the generalization ability of these models is one of the most difficult challenges. Generalizability refers to the performance difference of a model when evaluated on previously seen data (training data) versus data it has never seen before (testing data). Models with poor generalizability have overfitted the training data. One way to discover overfitting is to plot the training and validation accuracy at each epoch during training.

Training neural networks is an optimization problem.
The parameters of $f$, $W$ and $b$, are optimized with respect to a set of data $X$, with the labels $Y$, using a defined loss function. The most common method of optimization takes the derivatives and minimizes the loss function with respect to the weights via iterative back-propagation of gradients. This is a non-convex optimization, which means that training may encounter local optima. Besides, a trained neural network is prone to over-fitting on the training data, meaning it may perform well on the training set but worse on the unseen test set.

There are some solutions to the over-fitting problem. One is regularization, which penalizes the parameters $W$ to keep them from becoming too specific to the training data. Limiting the size of the network (number of hidden units), limiting the size of the weights (weight decay), and early stopping before over-fitting are other possible solutions for this issue.

Deep Neural Networks consist of a number of hidden layers, extracting low- to high-level features from images. Convolutional neural networks and recurrent neural networks are two examples of deep neural networks.

1.2.2 Convolutional Neural Networks

Convolutional Neural Networks were first inspired by visual cortex research done by Hubel et al. [23]. To date, CNNs are the most powerful tool for image classification and regression problems [34]. With CNNs, the number of network parameters is reduced due to parameter sharing through convolutional kernels.

In CNNs, the input image is a matrix of pixel values that represent the brightness at the given pixel in the image. While traditional neural networks treat the whole image as a one-dimensional array, CNNs take the location of pixels and their neighbours into consideration.

In a convolutional layer, a weight matrix (kernel) is used for the extraction of low-level features. The kernel with its corresponding weights slides over the image matrix to obtain the convolved output.
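To make the sliding-kernel operation concrete, the following is a minimal sketch in plain Python. It assumes a single-channel image, no padding, and a stride of one, and it computes a cross-correlation (the kernel is not flipped), which is the convention commonly used in deep learning libraries. The function name `conv2d_valid` and the toy image and kernel are illustrative assumptions, not taken from the thesis.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists) with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of the image patch under the kernel.
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Toy image: dark left half, bright right half.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
# Hand-picked kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]
feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

In a trained CNN the kernel weights are not hand-picked as they are here; they are learnt by minimizing the loss function.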
The kernel acts as a filter that extracts particular information from the input image. The weights of the kernel are learnt by minimizing the loss function.

Figure 1.8: Demonstration of a deep convolutional neural network with five hidden layers including convolutional, pooling and fully-connected layers.

Figure 1.8 is a typical configuration of a CNN architecture. The configuration mainly consists of an input image, followed by a sequence of convolutional layers combined with a non-linear function and a pooling function. The result is then fed to the last layers, containing sequences of fully-connected blocks with the output size being equal to the number of classes in the dataset.

1.2.3 Recurrent Neural Networks

Recurrent neural networks are a branch of neural networks that extract features in the temporal dimension and have been used widely in time-sequence modelling [5]. RNNs process sequential data points through a recurrent hidden state whose activation at each step depends on that of the previous step [19]. An RNN updates its hidden state $h_t$ by the following equation, where $X = (x_1, \ldots, x_N)$ is a data sequence:

$$h_t = \begin{cases} \Theta(h_{t-1}, x_t), & \text{if } t \neq 0 \\ 0, & \text{if } t = 0 \end{cases} \tag{1.2}$$

Here $x_t$ is the data value and $h_t$ is the recurrent hidden state at time step $t$, and $\Theta$ denotes the activation function of a hidden layer, which is nonlinear, such as the sigmoid or hyperbolic tangent. Having $y = (y_1, \ldots, y_T)$ as the output is optional for RNNs. In the conventional RNN model, known as vanilla, the update of the recurrent hidden state in Equation 1.2 is implemented as:

$$h_t = \Theta(W x_t + Z h_{t-1}). \tag{1.3}$$

In the equation, $W$ and $Z$ are the coefficient matrices of the input at the current step and of the recurrent hidden unit activations at the previous step, respectively.
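As an illustration of Equations 1.2 and 1.3, the vanilla update can be unrolled over a short sequence as follows. This is a minimal sketch that assumes the hyperbolic tangent as the activation and a zero initial hidden state; the helper names and the toy weights are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def rnn_forward(W, Z, xs):
    """Apply h_t = tanh(W x_t + Z h_{t-1}) over the sequence xs, with h_0 = 0."""
    h = [0.0] * len(Z)
    states = []
    for x_t in xs:
        pre_activation = [a + b for a, b in zip(matvec(W, x_t), matvec(Z, h))]
        h = [math.tanh(p) for p in pre_activation]
        states.append(h)
    return states


# Toy example: 2 inputs, 2 hidden units, sequence of length 2.
W = [[0.5, -0.5], [1.0, 0.0]]   # input-to-hidden weights (assumed values)
Z = [[0.0, 0.1], [0.1, 0.0]]    # hidden-to-hidden weights (assumed values)
states = rnn_forward(W, Z, [[1.0, 0.0], [0.0, 1.0]])
print(states)
```

Because the same `W` and `Z` are reused at every step, the number of parameters does not grow with the sequence length.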
By expanding Equation 1.2, the hidden vector sequence $h_t$ is calculated as follows:

$$h_t = \Theta(W_{ih} x_t + W_{hh} h_{t-1} + b_h), \tag{1.4}$$

where $W_{ih}$ denotes the input-hidden weight matrix, $W_{hh}$ is the weight matrix of the hidden layer, and $b_h$ is the bias vector in the hidden layer.

While the traditional RNN implementation suffers from the vanishing gradient problem, meaning that gradients decrease significantly for a deeper temporal model, newer types of recurrent hidden units such as Long Short Term Memory (LSTM) and Gated Recurrent Unit (GRU) have addressed the problem. While a traditional RNN applies a transformation to a weighted sum of inputs in Equations 1.3 and 1.4, an LSTM-based recurrent layer creates a memory cell $m$ at each time step whose activation is computed as:

$$h_t = p_t \, \Theta(m_t), \tag{1.5}$$

where $p_t$ denotes the output gate, which determines the portion of the memory cell content at time step $t$ ($m_t$) to be exposed at the next time step [5, 16]. The expanded, recursive version of updating $p_t$ is as follows:

$$p_t = \sigma(W_{oi} x_t + W_{oh} h_{t-1} + W_{oc} m_{t-1} + b_o). \tag{1.6}$$

In the above equation, $\sigma(\cdot)$ is the logistic sigmoid function, $W_{oi}$ is the input-output weight matrix, $W_{oh}$ is the hidden layer-output weight matrix, and $W_{oc}$ is the memory-output weight matrix. Each memory cell, $m_t$, is updated by adding new content, $\tilde{m}_t$, and discarding part of the present memory:

$$m_t = i_t \odot \tilde{m}_t + f_t \odot m_{t-1}, \tag{1.7}$$

where $\odot$ is an element-wise multiplication and $\tilde{m}_t$ is calculated as:

$$\tilde{m}_t = \Theta(W_{mi} x_t + W_{ch} h_{t-1} + b_c). \tag{1.8}$$

In Equation 1.8, the $W$ terms represent weight matrices; $W_{mi}$ is the input-memory weight matrix. The input gate $i$ denotes the degree to which new information is added, and the forget gate $f$ determines the degree to which current information is dismissed, as follows:

$$i_t = \sigma(W_{ix} x_t + W_{ih} h_{t-1} + W_{ic} m_{t-1} + b_i),$$
$$f_t = \sigma(W_{fx} x_t + W_{fh} h_{t-1} + W_{fm} m_{t-1} + b_f).$$

All weight matrices, $W$, and biases, $b$, are shared between cells across time. A graphical model of an LSTM cell is shown in Figure 1.9.
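A single LSTM step following Equations 1.5 to 1.8 and the two gate equations can be sketched as below. This is an illustration only: it assumes the hyperbolic tangent for the activation, follows the text in feeding the previous memory cell into all three gates, and uses a hypothetical parameter dictionary whose keys (`i_x`, `f_h`, `o_m`, and so on) are naming assumptions rather than anything defined in the thesis.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def affine(params, gate, x, h_prev, m_prev=None):
    """Compute W_x x + W_h h_prev (+ W_m m_prev) + b for one gate."""
    total = [a + b + c for a, b, c in zip(matvec(params[gate + "_x"], x),
                                           matvec(params[gate + "_h"], h_prev),
                                           params[gate + "_b"])]
    if m_prev is not None:
        total = [t + p for t, p in zip(total, matvec(params[gate + "_m"], m_prev))]
    return total


def lstm_step(params, x, h_prev, m_prev):
    """One LSTM update; returns the new hidden state and memory cell."""
    i = [sigmoid(z) for z in affine(params, "i", x, h_prev, m_prev)]   # input gate
    f = [sigmoid(z) for z in affine(params, "f", x, h_prev, m_prev)]   # forget gate
    p = [sigmoid(z) for z in affine(params, "o", x, h_prev, m_prev)]   # output gate (1.6)
    cand = [math.tanh(z) for z in affine(params, "c", x, h_prev)]      # new content (1.8)
    m = [ik * ck + fk * mk for ik, ck, fk, mk in zip(i, cand, f, m_prev)]  # (1.7)
    h = [pk * math.tanh(mk) for pk, mk in zip(p, m)]                   # (1.5)
    return h, m


def zero_params(n_in, n_hid):
    """Hypothetical all-zero parameters, used only to make the example checkable."""
    params = {}
    for gate in ("i", "f", "o", "c"):
        params[gate + "_x"] = [[0.0] * n_in for _ in range(n_hid)]
        params[gate + "_h"] = [[0.0] * n_hid for _ in range(n_hid)]
        params[gate + "_b"] = [0.0] * n_hid
        if gate != "c":
            params[gate + "_m"] = [[0.0] * n_hid for _ in range(n_hid)]
    return params


h, m = lstm_step(zero_params(1, 1), x=[0.0], h_prev=[0.0], m_prev=[1.0])
print(h, m)
```

With all-zero weights every gate evaluates to 0.5 and the new content is zero, so the memory cell simply halves; this makes the role of the forget gate easy to see in isolation.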
The GRU is a slightly different structure from the LSTM, with fewer parameters to avoid over-fitting in models with a low number of training samples [10]. The forget and input gates are combined into a single update gate, known as $u$, and the cell memory and hidden state are merged through a reset gate, $r$. Also, the activation of the hidden layer in the GRU is an interpolation between the updated activation, $\tilde{h}_t$, and the previous activation, $h_{t-1}$:

$$h_t = (1 - u_t) h_{t-1} + \tilde{h}_t u_t, \tag{1.9}$$

where $u_t$ denotes the amount of update for the unit content. The update gate formula is:

$$u_t = \sigma(W_{ui} x_t + W_{uh} h_{t-1}). \tag{1.10}$$

Given that $W_{ui}$ is the input-update weight matrix and $W_{uh}$ is the update-hidden weight matrix, the updated activation, $\tilde{h}_t$, is computed like the traditional RNN in Equation 1.3 as follows:

$$\tilde{h}_t = \Theta(W_{oi} x_t + W_{rh} (r_t \odot h_{t-1})). \tag{1.11}$$

Finally, the reset gate, $r_t$, is computed as:

$$r_t = \sigma(W_{ri} x_t + W_{rh} h_{t-1}). \tag{1.12}$$

Figure 1.9: Demonstration of an LSTM block. $m_{t-1}$ stands for the input from a memory cell in time point $t-1$; $x_t$ is an input in time point $t$; $h_t$ is an output in time point $t$ that goes to both the output layer and the hidden layer in the next time point.

1.2.4 Hyper-Parameter Optimization

Most learning algorithms are trained based on a set of hyper-parameters that affect the performance of the model. Generally, hyper-parameters are selected to minimize the generalization error. This is essentially done by running different trials with diverse sets of hyper-parameters, comparing the resulting performances, and selecting the best setting. Many approaches have been suggested for hyper-parameter optimization. The most straightforward one is grid search. Grid search is an exhaustive search in a specified subset of the hyper-parameter space of a learning algorithm [6]. On the other hand, random search replaces the exhaustive enumeration of all combinations by a random selection of them.
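The difference between the two search strategies can be sketched in a few lines. The example below is a toy illustration: the search space, the `toy_validation_error` function (a stand-in for training a model and measuring its validation error), and all hyper-parameter names are assumptions made for the sketch, not the settings used in this thesis.

```python
import itertools
import random


def grid_search(space, evaluate):
    """Evaluate every combination in the grid and keep the lowest error."""
    best = None
    for values in itertools.product(*space.values()):
        config = dict(zip(space.keys(), values))
        error = evaluate(config)
        if best is None or error < best[1]:
            best = (config, error)
    return best


def random_search(space, evaluate, n_trials, seed=0):
    """Evaluate n_trials randomly sampled combinations and keep the lowest error."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        config = {name: rng.choice(options) for name, options in space.items()}
        error = evaluate(config)
        if best is None or error < best[1]:
            best = (config, error)
    return best


def toy_validation_error(config):
    # Hypothetical stand-in for "train the model, then measure validation error".
    return abs(config["learning_rate"] - 1e-3) * 100 + abs(config["hidden_units"] - 64) / 100


space = {"learning_rate": [1e-4, 1e-3, 1e-2], "hidden_units": [32, 64, 128]}
grid_best = grid_search(space, toy_validation_error)
random_best = random_search(space, toy_validation_error, n_trials=4)
print(grid_best, random_best)
```

Grid search here costs nine evaluations, while random search uses a fixed budget of four and may or may not land on the best combination; that trade-off between cost and coverage is the practical difference between the two.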
While Bayesian optimization [53] builds a probabilistic model that maps from hyper-parameter values to the evaluated measurements on a validation set, evolutionary hyper-parameter optimization follows a procedure inspired by the biological concept of evolution [7]. Also, gradient-based optimization [35] and population-based training (PBT) [32] are other methods for hyper-parameter optimization.

1.2.5 Applications of ML in WMA detection

Machine learning, with its powerful capabilities, has been used consistently for wall motion abnormality detection. The ML methods for wall motion abnormality classification include radial basis functions [11], random forest [13], unsupervised multiple kernel learning [51], dictionary learning [44], and support vector machine (SVM)-based wall motion classification in CMR images [36]. Recently, deep learning has shown a remarkable role in the classification of several diseases in many medical fields [15, 20, 26]. Conventional machine learning methods mainly require pre-determined features and measurements to identify relevant hidden information in the images [25]. However, deep learning extracts useful features automatically [4, 30]. Moreover, the capability of the deep convolutional layers in extracting low-level features from the original input image is useful in pathology detection in echocardiographic images. Recent research has shown that Deep Convolutional Neural Networks (DCNN) can be useful for RWMA detection in the clinical setting [28]. The study provides predictions only on coronary infarction territories, using conventional two-dimensional echocardiographic images.

In a work on 3D stress echo [41], Omar et al. proposed a CNN to distinguish between normal and abnormal wall motion. Some works have developed statistical spatio-temporal cardiac atlases, mainly relying on cardiac motion and shape priors [3, 47, 48, 55, 56, 61]. Peressutti et al. extract clinically relevant features by using a motion atlas with non-motion information [45].
Oktay et al. proposed learning cardiac image representations using anatomically constrained neural networks [40]. Also, in CMR, wall motion assessment is done by first segmenting and then feature tracking and strain estimation [49]. Currently available DCNNs trained on 2D echo images only predict the overall wall motion score as normal or abnormal, and none of the above methods predicts the severity of abnormality of each region separately. We hypothesize that a deep convolutional neural network trained with echocardiographic images may provide improved detection of RWMAs in addition to prediction of the abnormality for all 16 segments in the LV. Figure 1.10 shows a systematic architecture of the proposed method.

Figure 1.10: A systematic diagram of the workflow for automated RWMA prediction. After data are acquired in the clinic, they are stored on the Vancouver Coastal Health (VCH) servers along with the cardiologist’s measurements. The machine learning model then uses these data to make predictions on RWMA.

1.3 Thesis Objective

It is clinically crucial to have an automatic system for evaluation of regional wall motion abnormality that is more objective, consistent, broadly accessible to less trained echocardiographers, and as precise as an expert. Also, this system could be used when an experienced echocardiographer is not available, or as a second opinion by experts. Hence, this system is a practical tool for automated regional wall motion evaluation that scores left ventricular segments as precisely as visual determinations by expert echocardiographers. It can be utilized as an extra, broadly used qualitative method for the detection of wall motion abnormalities and as a screening tool for novice echocardiographers.
Moreover, this tool can be used for teaching, unified regional scoring and routine objective evaluation of regional wall motion.

1.4 Contributions

Our research goal has been to develop a framework for regional wall motion abnormality analysis on echo imaging information. To reach this goal, the following contributions were made:

Initially, a thorough study of the heart and relevant cardiac diseases was performed and the diagnostic imaging techniques were reviewed. In order to improve the diagnosis of wall motion abnormalities and help cardiologists on this matter, we decided to develop an artificially intelligent model to assist in the identification of wall motion abnormalities through echo images. This is of great value given the variability and challenges of scoring such abnormalities.

Subsequently, I developed a machine learning framework for training a neural network, with three views of echocardiography data as input and abnormality classification for 16 wall segments as output. The network directly analyzes echo data without any need for prior segmentation of the cardiac LV wall.

I trained the network with the data described above from 489 patients and optimized its parameters and hyper-parameters. The resultant network identifies regional wall motion abnormalities on echo as compared to labels from expert humans with advanced echo training. On an independent test dataset, I demonstrated that the neural network can produce accuracy as high as 69.2% for detection of abnormal wall motion.

The developed model, to the best of our knowledge, is the first to analyze the left ventricle wall motion for both global and regional abnormality detection in echocardiography data.
This can be considered a great contribution to enhancing the diagnostic process of wall motion abnormalities, by expediting the process for physicians.

1.5 Thesis Outline

This thesis covers the background of left ventricle wall motion abnormality, the relevant technologies for the proposed problem, the details of the system developed, and the evaluation of the proposed system. The outline of the thesis is as follows:

• Chapter 1: Introduction and background
In this chapter, the basic anatomy of the heart is reviewed to gain an understanding of its global function and the importance of the LV. Moreover, a detailed walk-through of the current wall motion abnormality detection system is provided. Furthermore, relevant machine learning techniques used in ultrasound are demonstrated.

• Chapter 2: Materials
In this chapter, the dataset obtained from the Philips Xcelera™ and FileMaker™ systems is explained. This dataset is used to train models for the proposed framework. The echo data are mainly acquired from Xcelera™, extracted from routine studies since 2005 at Vancouver Coastal Health clinics by different ultrasound machines. In addition, the patients’ clinical measurements are acquired from the FileMaker™ database; this dataset contains over 200,000 records and is used to label diseases available in the corresponding echo files.

• Chapter 3: Methods
In this chapter, a deep learning model is proposed to extract features and then classify the 16 segments of the LV. The model is used to extract spatio-temporal features from the echo cine loops. Hence, the proposed model can identify regional wall motion abnormalities on echo images as compared to the reference standard, experienced human interpreters with advanced echo training.

• Chapter 4: Experiments and Results
To evaluate the presented framework on 489 unique patients, acquired from routine cardiology care, three streams for tri-plane assessment are fed to the network.
The inputs to the network are cine loops captured in the A2C, A4C and PLAX views. The cine loops consist of one full cardiac cycle and are synchronized based on the cardiac phase. The expert-annotated RWMA labels are provided in a supervised learning framework, and the network is hence trained to map the LV regions to a motion score. Then the evaluation results of the proposed framework are reported. Using the proposed model, we investigate the advantages of using a joint-feature model over a single-view information model. Finally, we investigate the correlation of occurrence of the abnormality in the 16 segments of the LV.

• Chapter 5: Conclusion
This chapter summarizes the objectives of the research and the contributions made, and describes potential applications and directions for continuing this research.

Chapter 2

Materials

The wall motion abnormality detection framework uses the echocardiography dataset obtained from Vancouver Coastal Health (VCH). The datasets and the software applications used for interfacing with them are briefly described below.

2.1 Ethics

Ethics approval for evaluation of 3000 studies acquired from routine cardiology care was obtained from the Clinical Medical Research Ethics Board of Vancouver Coastal Health (VCH) and in consultation with the VCH Information Privacy Office. A meticulous process of data anonymization, de-identification and data encryption was carried out based on guidelines recommended by the VCH Privacy Office. The patient information such as age, sex, height, weight and related health issues, and the echo images, were assigned an alpha-numeric code to eliminate any risk of re-identification of participants. The data are encrypted using on-the-fly encryption software named TrueCrypt™ (TrueCrypt.Org, Czech Republic).
While the development of TrueCrypt has been discontinued, an independent audit of the software (published in March 2015) has confirmed that there are no significant flaws in the software [1].

2.2 Echocardiography Data

The presented framework is part of the Information Fusion for Echocardiography (INFUSE) project. The data consist of 992 echo studies from 953 unique patients. They consist of two parts: an echo image database and the corresponding measurements and pathology reports. The echo image dataset consists of echo cine videos, measurement screenshots, Doppler cine files and images acquired in Vancouver from 2005 to 2015.

To access the ultrasound studies, we interface with the cardiology department’s Xcelera™ database. It allows for downloading information associated with patient follow-ups, emergency, and investigational echo studies. Each study contains an echo cine that has a variable number of frames, where the mean number of frames is 48. Besides, the collected studies are generated from seven different ultrasound machine models: Philips iE33, GE Vivid 7, Vivid i, Vivid E9, Sequoia, and Sonosite.

The Digital Imaging and Communications in Medicine (DICOM) studies coming from these devices are uploaded to the Xcelera™ server of VGH’s cardiology department. Once the upload of echo data is completed, it can be accessed through an Xcelera™ workstation terminal.

2.2.1 Echocardiography Retrospective Data

The Xcelera™ ultrasound software allows users to access all saved echo studies through a Graphical User Interface (GUI). The Xcelera™ software also works with DICOM images and can be used as both a server and a viewer for them. In addition, it incorporates some advanced features for both cardiologists and sonographers. These include the image measurement module in the software suite, which allows the cardiologist to draw or measure the cardiac parameters directly on medical images and save them for future examination.
The measurement module has no automatic section, and every measurement input to the FileMaker™ database is assessed manually.

The Xcelera™ software contains two separate databases, Echo and Xcelera. The echo information downloaded from the ultrasound machine is written to the Echo MySQL instance. Moreover, the manual segmentation information made on an image by a cardiologist is stored in the MySQL instance named Xcelera. Xcelera stores patient information keys, which connect both SQL instances. Hence, whenever measurements are evaluated, they are stored with a unique identifier.

Figure 2.1: A diagram designed to show the relationship between the Echo, Xcelera and FileMaker databases in a routine cardiology study.

The key identifier is then used to link each particular ultrasound, allowing each study to be reopened at any time with the correct measurements attached. The two databases are connected using a key matching method for this large-scale data. For the purpose of this thesis, we only use the Xcelera™ database to obtain the echo data. Figure 2.1 shows an echo saving routine using the Xcelera™ (Philips Healthcare, Netherlands) database.

2.2.2 Echo Data Download and Processing

Data acquisition is done with the help of the Information Technology team of VGH. A MySQL instance of Xcelera™ is copied and anonymized, and then installed on a computer located at Vancouver Coastal Health’s IT department. A subset of the available data is then queried. All available echo studies from VGH’s Picture Archiving and Communication System (PACS) are copied to Robotics and Control Lab (RCL) servers, encrypted and secured with a password as required by our ethics application.

Once all of the data are acquired, a new MySQL database on University of British Columbia (UBC) RCL servers is created.
This database contains all the data that have been downloaded from the Xcelera™ software at VGH for different projects at RCL.

Each study file located on the RCL servers contains the echo volume information along with the manufacturer name, file name, date of the study, Medical Record Number (MRN), DICOM header information and, if available, segmentation coordinates.

Each study is an echo cine series that contains a variable number of frames, where the mean number of frames is 48. Besides, the collected studies were generated from seven different ultrasound machine models: Philips iE33, GE Vivid 7, Vivid i, Vivid E9, Sequoia, and Sonosite. Therefore, the resolution, size of the ultrasound visual area, the probe specification, and imaging settings vary across the machine models. Given these differences, the echo volumes are processed by applying a semi-automatically cropped, ultrasound-beam-shaped black mask, and the frame size is reduced to 128×128 pixels.

2.3 Analytical Measurements

The measurements associated with each study of all patients are stored in VGH’s FileMaker™ Pro 6 database. All the findings, including diagnostic information for standard measurements, comments, etc., along with patient information (e.g., name, age) and exam information (e.g., date, examiner), are entered manually by the sonographer into FileMaker. Besides, the exams in FileMaker can be linked to the downloaded echo files using a hospital-assigned patient number and the date on which the study was recorded.

Figure 2.2: Snapshot of the wall motion abnormality section of FileMaker. By clicking on each region, the cardiologist types a score from 1 to 5 for that region to label its abnormality.

2.3.1 FileMaker Measurements

FileMaker™ is a relational database program that integrates a database engine and a graphical user interface (GUI) while keeping security features.
It allows users with a minimal level of technical knowledge to modify the database by dragging new elements into layouts, screens, or forms.

Considering that the cardiology suite of Xcelera™ (Philips Healthcare, Netherlands) provides searching by MRN only, a custom version of FileMaker™ was created to allow advanced searching by patient physiology in addition to MRN. Thus, this software has a remarkable role within the cardiology department. Moreover, the FileMaker™ software is a valuable teaching tool for cardiologists because the studies can be retrieved based on certain keywords.

Within the software, there are four main tabs for each echo study. After each echo, the sonographer fills out the front page, Valves/PA, Aorta/Atria/Shunts Pericardium pages. As seen in Figure 2.2, there are check boxes and empty values for their report. The cardiologist then fills out the last tab (LV/RV/Conclusions) after reviewing the first three tabs and the echo images.

Figure 2.3: A block diagram of the relationship between the Patient, Study and Filepath tables (RCL ECHO database).

2.3.2 Analytical Data Download and Processing

For this thesis, FileMaker™ data were acquired solely by exporting the FileMaker™ database into a Comma-Separated Values (CSV) file, where each row represented a unique study and each column represented a unique field. This CSV file was then imported into an internally created RCL Echo database. Figure 2.3 shows the connection of each table in that database.

As depicted, the patient table contains all the patients available in the CSV and is linked to the study table in a one-to-many relationship. The study table is then linked to the filepath table in a one-to-many relationship as well.
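The one-to-many links described above can be sketched with an in-memory SQLite database. This is only an illustration of the relationships: the table and column names, the toy rows, and the use of SQLite (rather than the MySQL database mentioned in this chapter) are all assumptions made for the sketch.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    patient_code TEXT UNIQUE            -- anonymized identifier (hypothetical column)
);
CREATE TABLE study (
    study_id INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
    study_date TEXT
);
CREATE TABLE filepath (
    file_id INTEGER PRIMARY KEY,
    study_id INTEGER NOT NULL REFERENCES study(study_id),
    path TEXT
);
""")

# Toy rows: one patient with two studies, holding two files and one file.
conn.execute("INSERT INTO patient VALUES (1, 'P1')")
conn.executemany("INSERT INTO study VALUES (?, ?, ?)",
                 [(1, 1, "2010-01-05"), (2, 1, "2011-03-09")])
conn.executemany("INSERT INTO filepath VALUES (?, ?, ?)",
                 [(1, 1, "Fp1"), (2, 1, "Fp2"), (3, 2, "Fp1")])

# Count the files attached to each study of each patient.
rows = conn.execute("""
    SELECT p.patient_code, s.study_id, COUNT(f.file_id)
    FROM patient AS p
    JOIN study AS s ON s.patient_id = p.patient_id
    JOIN filepath AS f ON f.study_id = s.study_id
    GROUP BY s.study_id
    ORDER BY s.study_id
""").fetchall()
print(rows)
```

Keeping file paths in their own table lets a study hold any number of recorded files without changing the study record itself.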
The study table contains all the studies available in FileMaker for which we have ethics approval. Moreover, many files may be recorded during each study; these files are saved with their file paths in the filepath table. The decisive components for connecting each file to the correct study are the patient ID and the date of the study. If the patient's MRN and the date of study match in both the echo study file and the study table, the file is linked to that matched study. In addition, once the view classification label of each file is ready, it is linked to the filepath table in a one-to-one relationship.

The specific measurements in the FileMaker™ database that were needed for wall motion detection, namely:

• Global wall motion
• Regional wall motion
  – A2Cbasalant
  – LAXbasalant
  – A4Cbasosept
  – A2Cbasalinf
  – LAXbasalinf
  – A4Cbasolat
  – A2Cmidant
  – LAXmidant
  – A4Cmidsept
  – A2Cmidinf
  – LAXmidinf
  – A4Cmidlat
  – A2Cdistalant
  – A4Cdistalsept
  – A2Cdistalinf
  – A4Cdistallat

were extracted by querying the database.

The distribution of regional and global wall motion abnormality in the available studies is plotted in Figure 2.4, and the frequency of the Wall Motion Score Index (WMSI) is reported in Table 2.1.

Table 2.1: The Wall Motion Abnormality data frequency in each region.

Region    WMSI 1   WMSI 2   WMSI 3   WMSI 4   WMSI 5
Global       541      231       13        0      251
1            974       54        8        0        0
2            918       68       50        0        0
3            909       74       51        1        1
4            817      109       99        2        9
5            886       97       45        3        5
6            982       45        9        0        0
7            828      123       85        0        0
8            749       81      205        0        1
9            817      104      110        4        1
10           783      144      107        0        2
11           826      131       77        1        1
12           918       90       28        0        0
13           711       78      228       11        8
14           645      103      264       14       10
15           715       92      214        6        9
16           640      118      253       16        9

2.4 Summary

A full description of the dataset obtained from Philips' Xcelera™ and FileMaker™ systems was presented. This dataset is used for the training stage of all models proposed in the next chapter. The echo data are mainly acquired from Xcelera™ and were extracted from routine studies performed from 2005 to 2015 at Vancouver Coastal Health clinics on different ultrasound machines.
Moreover, the patients' clinical measurements are downloaded from the FileMaker™ database; this dataset contains over 200,000 records and is used to label the diseases present in the corresponding echo files.

[Figure 2.4 panels: bar charts of WMA frequency for scores 1 to 5. (a) LCX regions 2, 8, 9, 13, 14; (b) LAD regions 1, 5, 6, 7, 11, 12, 16; (c) RCA regions 3, 4, 10, 15.]

Figure 2.4: The Wall Motion Abnormality data distribution in different regions of the heart: (a) data distribution in the LCX region, (b) data distribution in the LAD region, and (c) data distribution in the RCA region.

[Figure 2.5 panel: bar chart of global WMA frequency for scores 1 to 5.]

Figure 2.5: Global WMA data distribution.

Chapter 3
Methods

The proposed framework for echo analysis to identify RWMAs consists of three phases: view processing, classification, and localization. The objective of the view processing phase is to automatically identify the view in which the study has been recorded. In the classification phase, after the key echo views required for the visualization of RWMAs (parasternal long-axis, apical 2-chamber, and apical 4-chamber) are retrieved, these clips are coded for regional and global wall motion abnormalities for the training and validation of a convolutional neural network.

3.1 Relevant Cardiac Echo View Selection for WMA

In Point-Of-Care Ultrasound (POCUS), apical four-chamber (AP4), parasternal long axis (PLAX), parasternal short axis at the papillary muscle level (PSAX-PM), and subcostal four-chamber (SUBC4) are the four views most frequently acquired by clinicians. Since RWMA is visible in both parasternal short-axis and apical views, we decided to continue with the apical views because of their popularity.
For the determination of relevant cine loops, a pre-trained deep learning network is used to predict any of the 14 views. The model was trained on a dataset of 3,151 unique patients who were diagnosed with various heart conditions and diseases during the period of 2005 to 2015. Overall, the dataset contains 16,612 echo cines (with a total of 807,908 frames) from standard cardiac views taken from the four standard imaging windows, namely, Apical, Parasternal, Subcostal, and Suprasternal. The distribution of the data per class is shown in Table 3.1, and an example of dataset images (one per view class) can be found in the table.

View         Training set   Validation set   Test set
A2C                   335              126        106
A3C                   283              101         95
A4C                   359              105         93
A5C                   128               42         44
PLAX                  390              131        137
RVIF                  131               49         29
S4C                   172               77         49
S5C                    29                5         15
IVC                   218               67         56
PSAX-A                401              135        108
PSAX-M                388              137        147
PSAX-PM               187               60         73
PSAX-APEX              63               19         13
SUPRA                  46               13         11

Table 3.1: The dataset composition in terms of the number of studies from each of the 14 standard echocardiography views (A#C: apical #-chamber view, PLAX: parasternal long-axis view, RVIF: right ventricular inflow view, S#C: subcostal #-chamber view, IVC: subcostal inferior vena cava view, PSAX-A: parasternal short-axis view at aortic valve, PSAX-M: PSAX view at mitral annulus valve level, PSAX-PM: PSAX view at mitral valve papillary muscle level, PSAX-APEX: PSAX view at apex level, and SUPRA: suprasternal view).

[Figure 3.1 diagram: A4C cine inputs pass through DenseBlocks (spatial embeddings), LSTM units (temporal embeddings), and FC layers to give the averaged view class prediction and the averaged quality score prediction.]

Figure 3.1: The cardiac view classifier network architecture. Related embeddings are extracted from the individual frames by the spatial embedding extractor (DenseNet blocks). The embeddings are then fed into the Long Short-Term Memory (LSTM) blocks to extract the temporal information across 10 sequential echo cine frames.

The network architecture is shown in Figure 3.1. The input is a 10-frame sequence randomly extracted from an echo cine array, where each frame is a 120×120 pixel, gray-scale image. As shown in Figure 3.1, the model consists of four components:

• a seven-layer DenseNet model that extracts per-frame features from the input;
• an LSTM layer with 128 units to capture the temporal dependencies from the generated DenseNet features;
• a regression layer that produces the quality score from the output feature of the LSTM layer for each frame;
• a softmax classifier that predicts the content view from the LSTM features for each frame.

The best accuracy of 89% is achieved by an ensemble of the three very deep models. It is noteworthy that the average performance of the model for the A2C, A4C and PLAX views is 89%, 93%, and 96%, respectively [60]. For the purpose of this thesis, we only input the samples to the trained model and obtain the predicted view per study.

3.2 Automatic WMA Detection Deep Convolutional Network

We propose a regression model for WMA detection. The network architecture can be seen in Figure 3.2. The input to the network is sampled from synchronous A2C, A4C and PLAX echo cines. The network consists of two main stages: a convolutional stage and a fully-connected stage. The first stage is composed of convolutional layers (Conv) and pooling layers (Pool); the second stage contains only Fully-Connected Layers (FC).

3.2.1 Spatial Feature Extractors

The convolutional layer mainly consists of 2D or 3D kernels which are convolved with the input images and return the spatial feature-maps (cine representations). The kernels perform a discrete convolution (i.e. a weighted sum of inputs).
For instance, a 2D kernel of size $(2p+1) \times (2p+1)$ computes the following:

\[
x^{l}_{i,jk} = \sum_{m=-p}^{p} \sum_{n=-p}^{p} w^{l}_{i,mn} \, x^{l-1}_{(j+m)(k+n)}. \tag{3.1}
\]

[Figure 3.2 diagram: A2C, A4C and PLAX cine inputs each pass through a stack of Conv and Max-Pooling layers and a temporal feature extractor (spatio-temporal embeddings), followed by FC (ReLU) layers that output the WM abnormality prediction for each region.]

Figure 3.2: The proposed WMA network architecture. Spatio-temporal embeddings are extracted from the individual cines. The embeddings are then fed into the FC blocks to connect the reasoning between the information across echo cine frames.

In the above equation, $w^{l}_{i}$ is the weight matrix and $x^{l}_{i}$ is the output feature-map of the $i$th kernel of Conv layer $l$; $x^{l-1}$ denotes the input feature-map of the layer. The feature-maps of the previous layer are convolved with the kernel and result in a 2D output. The outputs from all kernels are stacked to generate a 3D output feature-map.

The total number of parameters in the convolutional layer is calculated by multiplying the number of kernels by the size of each kernel. Compared to an FC layer, a convolutional layer has fewer parameters because the same kernel weights are reused at every spatial position of a feature-map. Thus, the convolutional layer has considerably fewer parameters to be optimized.

The output feature-maps are passed to a non-linear activation function. The Rectified Linear Unit (ReLU) is used on all the feature-maps. While having the same performance, the ReLU function is much faster than its rivals such as the sigmoid and hyperbolic tangent functions [39]. This function is described below:

\[
f(x) = x^{+} = \max(0, x) \tag{3.2}
\]

Pooling is a non-linear form of down-sampling.
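As a concrete reading of Eq. 3.1 and Eq. 3.2, the following minimal sketch applies one kernel to one 2D feature-map and then applies ReLU. It uses plain Python lists so that it runs without any library. The zero padding at the borders and the function names `conv2d_same` and `relu` are assumptions made for illustration; a real network would rely on a deep learning framework instead.

```python
def conv2d_same(x, w):
    """Apply Eq. 3.1 for a single kernel: out[j][k] = sum_{m,n} w[m][n] * x[j+m][k+n].

    x is an H x W list of lists and w is a (2p+1) x (2p+1) kernel.
    Positions outside x are treated as zero (assumed padding), so the
    output keeps the H x W size of the input.
    """
    p = len(w) // 2
    height, width = len(x), len(x[0])
    out = [[0.0] * width for _ in range(height)]
    for j in range(height):
        for k in range(width):
            acc = 0.0
            for m in range(-p, p + 1):
                for n in range(-p, p + 1):
                    jj, kk = j + m, k + n
                    if 0 <= jj < height and 0 <= kk < width:
                        acc += w[m + p][n + p] * x[jj][kk]
            out[j][k] = acc
    return out


def relu(feature_map):
    """Apply Eq. 3.2, f(x) = max(0, x), to every element."""
    return [[max(0.0, value) for value in row] for row in feature_map]
```

With a kernel whose centre weight is 1 and whose lower-right weight is -1, each output element is the difference between a pixel and its diagonal neighbour, and `relu` then clips the negative differences to zero.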
The spatial variance of feature-maps is reduced using pooling layers. This allows faster convergence in addition to picking the most relevant features [52]. Since pooling layers reduce the size of feature-maps, the computation time is reduced as well. In addition, pooling layers improve generalization because they make the model robust to small translations. Among the several pooling methods, max-pooling has shown remarkable performance in comparison to its alternatives [9]. Max-pooling layers have no weights; thus, the number of parameters in this layer is zero.

3.2.2 Temporal Feature Aggregators

Fully connected layers perform the high-level reasoning in neural networks, mainly by representing the connections between the spatial feature-maps. In an FC layer, each neuron is connected to every neuron (or activation) in the previous layer. Mathematically, the FC layer is a matrix multiplication followed by the addition of an offset value:

\[
f^{l}_{i}(x^{l-1}) = \sum_{j=1}^{n} w^{l}_{ij} \, x^{l-1}_{j} + b^{l}_{i}, \tag{3.3}
\]

where $w^{l}_{ij}$ denotes the $j$-th weight in neuron $i$ of layer $l$, and $b^{l}_{i}$ is the bias value.

As with the output of convolutional layers, the output of an FC layer is passed to the activation function. However, the output of the last FC layer is not filtered by an activation function since it is the final prediction of the network.

3.2.3 Spatio-temporal Framework

There are several spatio-temporal convolutional variants within this framework. The input consists of gray-scale cine files of size $3 \times f \times H \times W$, where 3 is the number of view channels, $f$ is the number of frames, and $H$ and $W$ are the frame height and width, both equal to 128. In the following, the different convolutional blocks that are used for feature extraction are discussed in detail.

2D convolutions over the cine

One approach would be using 2D convolutions over the entire cine.
2D CNNs disregard the temporal nature of cine files and behave as if the order in which the frames occur made no difference. Thus, this model reshapes the 4D input to a 3D input with a size of $3f \times H \times W$.

The output of the convolutional block is also a 3D tensor. Its size is $N_i \times H_i \times W_i$, where $N_i$ is the number of convolutional filters used in the $i$th block, and $H_i$, $W_i$ are the spatial dimensions, which may be smaller than the original input frame due to pooling or striding. The filters are 3D and have the size of $N_{i-1} \times d \times d$, where $d$ is the spatial width and height of the filter. It is worth mentioning that although the filter is 3D, it is convolved spatially in only two dimensions. Consequently, the output of each filter is 2D, meaning that the temporal information vanishes in the first layers of convolution. This may result in a lack of temporal reasoning in the subsequent layers.

2D convolutions per frame

Another 2D CNN approach is processing all $f$ frames via a series of 2D convolutional blocks. The same filters are applied to all $f$ frames. There is no temporal modelling in the convolutions; however, the global spatiotemporal pooling layer combines the individual information from each frame. This architecture is illustrated in Figure 3.3 as C2D.

3D convolutions over the cine

3D CNNs extract features in the spatial and temporal dimensions by performing 3D convolutions, so they capture the motion information within multiple adjacent frames. The output is a 4D tensor with the size of $N_i \times L \times H_i \times W_i$, where $N_i$ is the number of filters in the $i$-th layer and $L$ is the length of the temporal dimension. The filter size is $N_{i-1} \times t \times d \times d$, where $d$ and $t$ denote the spatial and temporal extent of the filter, respectively. Moreover, the filters are convolved in both time and space. This architecture is depicted in Figure 3.3.

2D convolutions + temporal feature extractor over the cine

The more promising approach is using 2D convolutions followed by LSTM or GRU units.
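For the purely convolutional variants above, the bookkeeping of input shapes and filter sizes can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the example values (10 frames, 16 filters, 3×3 spatial extent, temporal extent 3) are assumptions rather than the settings used in this work. Biases are left out of the count.

```python
def fold_frames_into_channels(shape):
    """Reshape rule for 2D convolutions over the cine: (3, f, H, W) -> (3f, H, W)."""
    views, frames, height, width = shape
    return (views * frames, height, width)


def conv_weight_count(n_in, n_out, d, t=None):
    """Number of weights in one convolutional layer, biases excluded.

    2D case: n_out filters, each of size n_in x d x d.
    3D case: n_out filters, each of size n_in x t x d x d.
    """
    temporal_extent = 1 if t is None else t
    return n_out * n_in * temporal_extent * d * d


# Assumed example: 10 frames of 128 x 128 pixels from three views.
folded = fold_frames_into_channels((3, 10, 128, 128))   # (30, 128, 128)
weights_2d = conv_weight_count(n_in=folded[0], n_out=16, d=3)
weights_3d = conv_weight_count(n_in=3, n_out=16, d=3, t=3)
```

In this assumed setting the first 2D layer over the folded cine has more weights than the first 3D layer, because folding turns every frame into an input channel while the 3D filter only spans `t` adjacent frames.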
This 2D-convolution-plus-recurrent design consists of both spatial and temporal feature extraction elements in separate stages. The convolutional part can be a plain stack of convolutions or a sequence of DenseBlocks [22]. The feature-space output of the convolutional blocks is then flattened and fed to the temporal feature extractor part of the network (i.e. LSTM or GRU units). Thus, the output of the feature extractor is $X^{View_i}_m$ of length $M \times 1$, $m = 1:M$, where $View_i$ is one of the three views used for WMA detection. The outputs of all view channels (i.e. $X^{A2C}_m$, $X^{A4C}_m$, $X^{PLAX}_m$) are fed to the FC layers. The different structures that result are illustrated in Figure 3.3, panels (b) to (e).

3.2.4 Regularization and Data Augmentation

Improving the generalizability of machine learning models has been one of the most challenging tasks. Generalizability denotes the difference in the evaluated performance of the model on training data versus test data. Models that are overfitted on training data have low generalizability. Monitoring the loss and accuracy plots for the training and validation sets at the end of each epoch is one approach for detecting over-fitting. Hence, we monitor these plots during training.

Figure 3.3: The network architectures for WMA classification considered in this work. (a) C2D are 2D convolutions; (b) C2D+LSTM are time-distributed 2D convolutions followed by LSTM units; (c) C2D+GRU are time-distributed 2D convolutions followed by GRU units; (d) D2D+LSTM are 2D DenseBlocks followed by LSTM units; (e) D2D+GRU are 2D DenseBlocks followed by GRU units; and (f) C3D are 3D convolutions. For readability, the connections are omitted.

There are many strategies to stabilize learning and prevent the model from over-fitting while training. By adding a penalty term to the loss function, it is possible to prevent the coefficients or weights from getting too large. This method is called regularization.
In all experiments, we used an $\ell_2$ regularizer term in the loss function in the form of $\lambda \|w\|_2^2$, where $\lambda$ ranges from 0.00001 to 0.001. Moreover, $\lambda$ is a hyper-parameter whose optimal value we investigate.

Another method to prevent over-fitting is dropout. Dropout layers limit the co-adaptation of the feature-extracting blocks and force the neurons to follow the overall behaviour. In each step, the dropout layer removes random units from the previous layer of the network according to its probability parameter, which is another hyper-parameter. Thus, by removing the units, the network architecture is changed in each training step. This implies that dropout integrates diverse architectures in the model [54]. In other words, dropout acts like adding random noise to the hidden layers of the model. In our design, a dropout layer was deployed after each FC layer. The dropout probability can be another hyper-parameter.

3.3 Summary

The model exemplifies the ability of a supervised training algorithm to be applied to ultrasound (US) in order to address the clinical need for WMA detection. Different possible structures for extracting features and then classifying the 16 segments of the LV were discussed. Convolutional layers are used to extract feature-maps from the echo cine loops, and FC layers are used to model the relationship between the feature-maps. Having both spatial and temporal feature extraction blocks, the proposed model can outperform baseline methods in identifying regional wall motion abnormalities on echo images.

Chapter 4
Experiments and Results

In this chapter, a study of wall motion abnormality detection performance using the different spatio-temporal convolutions described in the previous chapter is presented.
It should be pointed out that the echo dataset used in this study is large enough to enable the training of deep models from scratch over a sufficient number of iterations.

4.1 Experimental Setup

4.1.1 Dataset

The proposed model has a large number of parameters to be trained. Thus, it requires a large annotated dataset. In this research, a dataset of 1037 patients collected from the Picture Archiving and Communication System at Vancouver General Hospital is used. The data include combinations of A2C, A4C and PLAX echo views of each patient. The echo studies are mainly acquired by echo-technicians during routine cardiac exams. However, the ground-truth WMA labels are annotated by expert cardiologists.

In every routine echo exam, the heart is imaged from different standard imaging views, mainly parasternal long and short axes, apical 2-, 3-, and 4-chamber, subcostal, and suprasternal, in which the transducer is placed on the patient's chest to acquire ultrasound cine files. In this research, the apical two-chamber (A2C), four-chamber (A4C), and parasternal long-axis (PLAX) views are used.

The data were divided into a training set and a test set (80:20 split, respectively) so that a total of 953 patients with 992 studies were split into 667 cases (704 studies) as the training set, 95 cases (96 studies) as the validation set, and 191 cases (192 studies) as the test set. The dataset is shuffled randomly and then split into five non-overlapping groups based on the patients. The experiment is done five times, where in each run one of the five sets is set aside as unseen test data while training with the other four subsets. Moreover, in each run, the validation portion of the training set is used for searching for the optimal hyper-parameters.

4.1.2 Wall Motion Abnormality Labels

All the labels corresponding to the studies of the echo dataset are available in FileMaker. These labels were extracted, and an integer score of 1 (normal) to 5 (aneurysmal) was assigned to show the abnormality level of each segment.
As discussed in Chapter 1, scores 3, 4 and 5 all correspond to a totally abnormally functioning region (akinetic, dyskinetic and aneurysmal). Thus, we consider them all together as abnormal. The distribution of data among the five abnormality levels is shown in Figure 2.4 for each of the segments.

4.1.3 Network Architecture

To compensate for the differences among patients' frame rates and heart rates, images from only one heart cycle were used. All DICOM (Digital Imaging and Communications in Medicine) images were resized to 128×128 pixels and stored as MAT-files. The distribution of WMA labels among the training and test sets was examined using Pearson's $\chi^2$ goodness-of-fit test to verify that the training and test sets are reasonable representations of the original data (i.e. p-value > 0.05).

Three parallel streams of the network, with similar architecture in the feature extraction part, are trained for the three corresponding cardiac views. This disconnects the views from one another, enabling the full use of the available information in the cines. While this architecture causes a large increase in the number of parameters in the network, we will show that this structure is successful for WMA detection based on our experiments so far. This is mainly due to having denser and richer feature vectors from each view, which allows more effective temporal learning.

The network architectures are illustrated in Figure 3.3. Each has 5 to 6 convolutional layers, each followed by a ReLU activation function. All convolutional kernels were convolved with a stride of one on padded inputs to preserve dimensions. Moreover, 2D and 3D pooling layers were used afterwards. The features are then fed to FC layers to predict the WMA for each of the regions. In addition, within each structure, another similar channel carrying the optical flow transform of the images can be added to provide more information about the movement in the videos. The results compare each method with and without optical flow.
In the results section, if the input data to a method consist of both the echo cardiac view and optical flow, the method name has "OF" appended to it.

4.1.4 Hyper-parameters

The hyper-parameters are optimized using a grid search. The grid search starts by dividing the training set into training and validation sets (80% and 20%, respectively). The loss and accuracy on both training and validation sets are used to find the best combination of the hyper-parameters. Loss is defined as the mean squared error between the predicted label and the true label, while accuracy is the mean absolute error between the ground truth label and the prediction.

We perform a grid search over the initial learning rate, lr ∈ {0.001−0.0001}, and the batch size, using two different optimization algorithms, SGD and Adam. We also experiment with different levels of dropout rate, dr ∈ {0, 0.5}, and L2 regularization term (λ), lreg ∈ {0.00001, 0.001}. These parameters result in 48 different hyper-parameter settings for the proposed model. All models are trained with the same number of iterations, and training is stopped after 1000 epochs.

4.1.5 Regularization and Data Augmentation

There are many strategies to stabilize learning and prevent the model from over-fitting while training. By adding a penalty term (regularizer) to the loss function, it is possible to prevent the coefficients or weights from getting too large. In this research, we used an $\ell_2$ regularizer term in the loss function in the form of $\lambda \|w\|_2^2$, where $\lambda = 0.001$. Moreover, $\lambda$ is a hyper-parameter whose optimal value we investigate.

Another method to prevent over-fitting is dropout. Dropout layers limit the co-adaptation of the feature-extracting blocks and force the neurons to follow the overall behaviour. In each step, the dropout layer removes random units from the previous layer of the network according to its probability parameter, which is another hyper-parameter.
Thus, by removing the units, the network architecture is changed in each training step. This implies that dropout integrates diverse architectures in the model [54]. In other words, dropout acts like adding random noise to the hidden layers of the model. In our design, a dropout layer was deployed after each FC layer. The dropout probability is another hyper-parameter.

Data augmentation is the next approach to prevent over-fitting and add translational invariance to the model. Therefore, the training samples were augmented on-the-fly while training. In every mini-batch, each sample was translated horizontally and rotated. The number of pixels for the translation of the image was generated randomly from a zero-mean Gaussian distribution with a standard deviation of 15% of the image width. Likewise, rotational invariance is added to the images, on-the-fly, with a random angle generated from a zero-mean Gaussian distribution with σ = 25 degrees and capped at 2σ.

Since the WMA labels are for the original non-augmented files, the maximum for translational and rotational augmentation is limited. Thus, the upper limit for both translational and rotational augmentation was estimated by an expert cardiologist on the research team. This is to make sure that data augmentation does not affect the clinical value of the cine files.

4.1.6 Training

Once the hyper-parameters are selected and the architecture of the model is finalized, the proposed network is trained on the entire training set containing both the training and validation data subsets. The training was repeated four times, not only to emphasize the robustness of the results but also to account for the random initialization and different training paradigms. The final performance of the models was evaluated based on the predictions of the model for the test set. The test data were used neither in hyper-parameter selection nor in final training, and were not analyzed in the design of the networks.
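The augmentation parameters described in Section 4.1.5 can be drawn as in the following minimal sketch, which only samples the horizontal shift and the rotation angle and leaves the actual image warping to an imaging library. The function name `sample_augmentation` and the rounding of the shift to whole pixels are assumptions. The text states a 2σ cap for the rotation only, so no cap is applied to the shift here.

```python
import random


def sample_augmentation(width=128, rng=random):
    """Draw one (horizontal shift in pixels, rotation in degrees) pair.

    The shift follows a zero-mean Gaussian with a standard deviation of
    15% of the image width; the rotation follows a zero-mean Gaussian
    with sigma = 25 degrees, capped at +/- 2 sigma.
    """
    shift_sigma = 0.15 * width
    rotation_sigma = 25.0
    shift = rng.gauss(0.0, shift_sigma)
    angle = rng.gauss(0.0, rotation_sigma)
    # Cap the rotation angle at two standard deviations.
    angle = max(-2 * rotation_sigma, min(2 * rotation_sigma, angle))
    # Rounding to whole pixels is an assumption of this sketch.
    return round(shift), angle
```

Passing a seeded generator, for example `sample_augmentation(rng=random.Random(0))`, makes the draws reproducible, which is convenient when an augmented batch needs to be inspected.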
The Adam optimizer is used for training the networks.

Adam is an optimization algorithm that can be used instead of the classical stochastic gradient descent procedure to update network weights iteratively. Adam is a gradient-based optimization algorithm that uses adaptive estimates of lower-order moments for stochastic objective functions. While stochastic gradient descent maintains a single and constant learning rate for all parameter updates during training, Adam maintains a learning rate for each network parameter and adapts each one separately as learning unfolds.

The suggested parameters for training deep learning models in the paper are:

\[
\text{learning-rate} = 0.001, \quad \beta_1 = 0.9, \quad \beta_2 = 0.999, \quad \text{and} \quad \varepsilon = 10^{-8}. \tag{4.1}
\]

A relatively high momentum of 0.9 is sufficient to reward the persistent reduction in the loss. The optimal initial value for the learning rate in our framework was 1e−5. After 3000 epochs of training, diagnostic accuracy was calculated using the test set. During training, a small batch size of 8 triple-cines was favoured.

Early stopping is also deployed to prevent over-fitting. The cost function (loss) is used as its performance measure. Training is stopped if the loss does not decrease, or increases, over 100 epochs. Moreover, an absolute difference in the loss of less than γ = 0.0001 is considered as no improvement. All the network parameters were randomly initialized using a zero-mean Gaussian distribution. This paradigm, as described, ensures substantial convergence of the network. In all trials, training is stopped once the network has converged. Convergence was defined as the state in which no progress was observed in the loss decay. Moreover, since the class distribution is imbalanced, all the training batch data are sub-sampled using stratified sampling.

4.1.7 Implementation

Keras was first developed as part of the project Open-ended Neuro-Electronic Intelligent Robot Operating System (ONEIROS), and its main author is François Chollet, a Google engineer.
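Independently of any framework, the Adam update with the constants of Eq. 4.1 and the early-stopping rule of Section 4.1.6 (patience of 100 epochs, γ = 0.0001) can be written out as in the sketch below. It updates a single scalar parameter for clarity. The names `adam_step` and `EarlyStopping` are hypothetical and are not the Keras API; in practice the framework's built-in optimizer and callback would be used.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


class EarlyStopping:
    """Stop when the loss has not improved by at least min_delta for `patience` epochs."""

    def __init__(self, patience=100, min_delta=1e-4):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.wait = 0

    def should_stop(self, loss):
        if self.best - loss >= self.min_delta:
            # The loss decreased by at least min_delta: reset the counter.
            self.best = loss
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience
```

On the first step the bias-corrected moments reduce to the gradient and its square, so the parameter moves by roughly the learning rate in the direction opposite to the gradient, regardless of the gradient's magnitude.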
Keras is an open-source neural network library in Python that is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit, Theano, or PlaidML. Keras was used for training and testing the proposed network on the TensorFlow backend with Python version 2.7 (Python Software Foundation, Beaverton, Oregon). The experiments were run on an Nvidia GeForce GTX 980 Ti GPU with 2816 CUDA cores and a GPU clock of 1 GHz, using CUDA runtime platform version 8.

4.2 Results

The study population consisted of 953 patients with at least one malfunctioning LV region. Since the frequency of abnormal cases is roughly 20% of the whole data, reporting only the overall accuracy would not be sufficient for evaluating the network performance. Thus, all the numbers reported in the tables are the accuracy per class. Given that we only have the two classes normal and abnormal, the reported accuracy per class corresponds to the specificity and sensitivity, respectively.

Table 4.1 shows the co-variance matrix of the 16-segment labels and WMSI. As shown, there is a high correlation between WMSI and regions 7, 8, 9, 13, 14, 15 and 16. That is, the wall motion score index had a higher co-variance with the apical regions than with the other regions. Figure 4.1 illustrates the value of the loss function on the training and validation sets while training the proposed model, where the horizontal axis is the number of epochs and the vertical axis is the value of the loss function. As shown in this figure, the model converges near the 1000th epoch of training. SGD and Adam were observed to have similar performance on our data, while SGD learns faster than the Adam optimizer.

Table 4.3 shows that the DenseNet network with LSTM cells and an optical flow channel (D2D+LSTM+OF) leads to a lower loss value and a higher accuracy in most of the regions visible in the A2C view.
The same behaviour has been observed for the other two views with their corresponding regions in Tables 4.4 and 4.5.

Quantitative results for the detection of global WMA obtained in this study are shown in Table 4.2. The highest performance for the detection of regional wall motion abnormalities is achieved using DenseNets and LSTMs.

Figure 4.1: The value of loss for training and validation over each epoch.

As seen in the results tables, C3D performs worse than the D2D+LSTM or D2D+GRU networks. This difference suggests that the spatiotemporal decomposition of D2D+LSTM makes the optimization easier compared to C3D. The other noteworthy observation is the effect of adding the optical flow transform of the video as a separate channel. Although this doubles the number of parameters, it significantly increases the accuracy per class in all models. This indicates that the network is analyzing the motion through the cines.

By comparing the results of the C2D and D2D structures, it is observed that Dense blocks have been more successful in retaining the relevant spatial features of the images. A key pattern recognized in our observations is the link between model performance and the quality of images.
The performance increases as the quality of the cines increases.

             R1    R2    R3    R4    R5    R6    R7    R8    R9   R10   R11   R12   R13   R14   R15   R16  WMSI GlobalWM
Region1    0.08
Region2    0.05  0.23
Region3    0.04  0.08  0.25
Region4    0.03  0.03  0.12  0.43
Region5    0.03  0.02  0.08  0.18  0.26
Region6    0.03  0.02  0.03  0.05  0.06  0.08
Region7    0.08  0.13  0.07  0.02  0.00  0.03  0.37
Region8    0.06  0.21  0.09 -0.02 -0.03  0.01  0.28  0.65
Region9    0.05  0.16  0.16  0.06  0.02  0.02  0.18  0.30  0.44
Region10   0.04  0.05  0.12  0.34  0.16  0.06  0.05  0.04  0.11  0.44
Region11   0.04  0.02  0.07  0.20  0.22  0.07  0.02 -0.02  0.04  0.22  0.35
Region12   0.05  0.05  0.05  0.07  0.06  0.07  0.11  0.07  0.08  0.09  0.10  0.18
Region13   0.07  0.17  0.09  0.02  0.00  0.04  0.33  0.48  0.29  0.08  0.03  0.11  0.72
Region14   0.05  0.15  0.09 -0.02 -0.02  0.02  0.27  0.49  0.33  0.05  0.00  0.08  0.55  0.78
Region15   0.06  0.12  0.11  0.12  0.07  0.05  0.26  0.34  0.25  0.18  0.13  0.13  0.55  0.44  0.69
Region16   0.06  0.14  0.08 -0.01 -0.01  0.03  0.30  0.48  0.29  0.06  0.02  0.12  0.55  0.70  0.46  0.76
WMSI       0.05  0.10  0.09  0.10  0.07  0.04  0.16  0.21  0.17  0.13  0.10  0.09  0.25  0.25  0.25  0.25  0.14
GlobalWM  -0.02 -0.06 -0.07 -0.15 -0.09 -0.03 -0.16 -0.26 -0.17 -0.18 -0.14 -0.07 -0.31 -0.38 -0.31 -0.36 -0.17     0.71

Table 4.1: The co-variance matrix of the 16-segment labels and WMSI (R# denotes Region #).

4.3 Discussion and Summary

Several methods have been developed for measuring cardiac wall motion, strain, and strain rate. However, a quantitative automatic method is needed to overcome the inter-observer and intra-observer variability and the reproducibility limitations of those methods. Although strain and 3D echo have superior sensitivity and reproducibility compared to routine echo, they require more expertise, and several technical difficulties and standardization issues remain. Having portable and handheld echocardiography in emergency departments and intensive care units to evaluate patient hemodynamics results in easier LV wall motion assessment.
Moreover, it is more generalizable to a wider range of physicians with various levels of training, expertise and echocardiographic equipment. Some groups have proposed automated algorithms for assessment of LV function [31]. However, the majority of them remain semiautomatic, since an observer is needed to input the important landmarks. Thus, a fully automated algorithm is needed to perform the WMA assessment without any user interaction. The results reported in the previous section suggest that deep learning, with its powerful classification capabilities, can be used to develop a fully automated system for wall motion abnormality detection.

The most significant limitation is the difficulty of obtaining good images that show motion abnormalities clearly. The wall motion scores were assessed by an experienced echocardiographer. While the estimation of wall motion abnormalities can be subjective, we categorized the patients as normal or abnormal to reduce the observer variability to less than 5%. Although short-axis parasternal views offer a real 3D (360°) analysis of cardiac dynamics, as opposed to the limited degrees of evaluation obtained with the thin sagittal cuts from the apical views, the acquisition process is much harder for PSAX views. Thus, the measurements from apical views are studied. Given the linear relationship between model performance and image quality in the cines, misclassified cines generally have unclear LV boundaries, which cause a great deal of variance in the appearance of the heart and its motion. Besides, despite the automatic view classification done for these experiments, confusion between the apical views (A2C, three-chamber, A4C and five-chamber) appears to remain a challenge and a potential source of error. Thus, a bottom-up approach to improving WMA accuracy is to improve the quality of the input data.

| Method | Normal | Abnormal |
|---|---|---|
| C2D | 53.4% | 54.2% |
| C2D-OF | 55.1% | 56.2% |
| D2D [28] | 54.5% | 54.9% |
| D2D-OF | 56.7% | 56.4% |
| C2D+LSTM | 56.9% | 55.6% |
| C2D+LSTM-OF | 59.3% | 57.1% |
| C2D+GRU | 57.3% | 56.0% |
| C2D+GRU-OF | 58.7% | 59.1% |
| D2D+LSTM | 64.5% | 67.8% |
| D2D+LSTM-OF | 68.8% | 70.3% |
| D2D+GRU | 63.3% | 66.1% |
| D2D+GRU-OF | 69.4% | 69.6% |
| C3D | 61.5% | 62.9% |
| C3D-OF | 65.2% | 66.1% |

Table 4.2: Per-class accuracy comparison of the evaluated methods for global WMA.

The benefit of deep learning over other machine learning methods is that deep learning automatically creates the features needed for classification via its intermediate layers, and different deep learning structures extract different features. This accounts for the differences in performance among the deep learning architectures and indicates why a deep learning model is superior to other methods.

Since LV localization has been a key step in some ejection fraction estimation approaches proposed for CMR, another approach worth exploring is whether LV localization helps with WMA detection in echo. Excluding the motion of the atria and right ventricle decreases variance from the neighbouring chambers. Going through the failed cases, a number of studies might have been misclassified due to wrong localization of the LV. Some approaches localize the LV automatically with current segmentation networks [24]. These methods can be used to localize, track and accordingly crop the LV throughout the cine.

| Method | Region 1 Normal | Region 1 Abnormal | Region 4 Normal | Region 4 Abnormal | Region 7 Normal | Region 7 Abnormal | Region 10 Normal | Region 10 Abnormal | Region 13 Normal | Region 13 Abnormal | Region 15 Normal | Region 15 Abnormal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C2D | 53.0% | 53.4% | 53.5% | 53.9% | 54.0% | 54.5% | 54.4% | 54.3% | 54.1% | 54.6% | 53.1% | 53.5% |
| C2D+OF | 54.8% | 55.1% | 55.3% | 54.4% | 55.1% | 55.9% | 56.0% | 56.5% | 55.5% | 55.8% | 54.5% | 56.0% |
| D2D | 54.2% | 54.2% | 53.0% | 54.8% | 55.1% | 55.9% | 54.0% | 54.9% | 55.2% | 55.0% | 54.9% | 54.8% |
| D2D+OF | 55.7% | 55.4% | 56.0% | 55.1% | 56.2% | 55.0% | 56.7% | 56.8% | 56.4% | 56.6% | 55.9% | 57.3% |
| C2D+LSTM | 56.4% | 54.6% | 56.5% | 55.5% | 57.2% | 55.8% | 57.3% | 56.2% | 56.9% | 56.0% | 56.3% | 55.1% |
| C2D+LSTM+OF | 58.5% | 56.3% | 58.9% | 57.1% | 59.4% | 57.1% | 59.9% | 57.9% | 59.4% | 57.7% | 59.0% | 56.9% |
| C2D+GRU | 56.7% | 55.1% | 56.9% | 55.4% | 57.7% | 56.1% | 57.8% | 56.6% | 57.4% | 56.3% | 56.8% | 55.4% |
| C2D+GRU+OF | 58.1% | 58.2% | 58.5% | 58.0% | 58.9% | 59.0% | 59.1% | 59.5% | 59.0% | 59.3% | 58.0% | 58.7% |
| D2D+LSTM | 63.8% | 66.8% | 64.1% | 66.6% | 64.6% | 67.5% | 64.8% | 68.9% | 65.1% | 67.0% | 63.5% | 65.5% |
| D2D+LSTM+OF | 67.5% | 69.4% | 68.0% | 69.9% | 68.5% | 70.4% | 69.3% | 71.4% | 69.3% | 69.6% | 67.8% | 68.2% |
| D2D+GRU | 61.7% | 65.0% | 62.5% | 65.1% | 63.6% | 65.4% | 63.3% | 67.1% | 64.0% | 65.8% | 62.0% | 63.5% |
| D2D+GRU+OF | 67.7% | 68.4% | 68.3% | 68.5% | 69.9% | 68.8% | 69.8% | 70.6% | 70.3% | 68.9% | 68.2% | 67.1% |
| C3D | 61.5% | 62.5% | 61.7% | 62.9% | 62.2% | 62.3% | 62.7% | 63.9% | 62.4% | 63.7% | 61.4% | 62.5% |
| C3D+OF | 65.5% | 65.4% | 66.4% | 65.9% | 66.7% | 65.8% | 65.9% | 67.2% | 67.4% | 67.1% | 66.6% | 65.9% |

Table 4.3: Per-class accuracy comparison of the evaluated methods for the RWMA regions relevant to the A2C view.

| Method | Region 3 Normal | Region 3 Abnormal | Region 6 Normal | Region 6 Abnormal | Region 9 Normal | Region 9 Abnormal | Region 12 Normal | Region 12 Abnormal | Region 14 Normal | Region 14 Abnormal | Region 16 Normal | Region 16 Abnormal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C2D | 53.4% | 53.2% | 53.6% | 53.4% | 55.1% | 55.5% | 54.0% | 54.6% | 54.1% | 54.6% | 53.1% | 53.5% |
| C2D+OF | 54.6% | 54.5% | 55.9% | 54.0% | 55.8% | 56.4% | 55.6% | 56.9% | 55.9% | 55.7% | 54.0% | 56.8% |
| D2D | 54.5% | 54.1% | 54.2% | 54.9% | 56.0% | 56.4% | 55.7% | 55.3% | 55.0% | 55.5% | 54.9% | 54.4% |
| D2D+OF | 55.4% | 55.8% | 56.5% | 55.8% | 56.2% | 56.9% | 56.5% | 57.1% | 56.4% | 56.5% | 55.3% | 57.3% |
| C2D+LSTM | 56.3% | 55.6% | 56.1% | 56.1% | 56.9% | 56.2% | 57.2% | 55.8% | 57.5% | 56.6% | 55.9% | 56.5% |
| C2D+LSTM+OF | 58.4% | 56.5% | 58.8% | 57.3% | 59.6% | 57.5% | 60.5% | 58.2% | 58.3% | 57.0% | 59.5% | 56.7% |
| C2D+GRU | 56.1% | 55.5% | 57.3% | 55.7% | 57.2% | 56.5% | 57.2% | 56.3% | 58.0% | 56.9% | 56.0% | 55.2% |
| C2D+GRU+OF | 58.5% | 58.3% | 58.2% | 58.9% | 58.3% | 58.7% | 59.5% | 59.9% | 60.5% | 61.3% | 59.5% | 58.0% |
| D2D+LSTM | 64.2% | 66.5% | 64.9% | 66.0% | 65.7% | 67.2% | 64.1% | 67.7% | 65.4% | 68.3% | 64.2% | 64.9% |
| D2D+LSTM+OF | 66.8% | 69.9% | 69.4% | 68.6% | 69.7% | 71.5% | 70.8% | 70.9% | 69.0% | 69.8% | 68.1% | 68.9% |
| D2D+GRU | 61.2% | 65.9% | 63.1% | 66.4% | 63.2% | 66.1% | 63.9% | 66.9% | 63.5% | 66.7% | 63.9% | 64.7% |
| D2D+GRU+OF | 67.1% | 68.9% | 68.0% | 67.9% | 68.8% | 67.3% | 69.9% | 71.3% | 70.4% | 69.2% | 68.4% | 66.9% |
| C3D | 60.9% | 61.8% | 62.9% | 63.4% | 62.8% | 61.7% | 61.9% | 63.2% | 62.9% | 63.2% | 61.9% | 63.0% |
| C3D+OF | 66.4% | 65.9% | 67.2% | 66.6% | 66.2% | 64.3% | 65.1% | 67.9% | 66.9% | 67.8% | 67.3% | 65.3% |

Table 4.4: Per-class accuracy comparison of the evaluated methods for the RWMA regions relevant to the A4C view.

| Method | Region 2 Normal | Region 2 Abnormal | Region 5 Normal | Region 5 Abnormal | Region 8 Normal | Region 8 Abnormal | Region 11 Normal | Region 11 Abnormal | Region 14 Normal | Region 14 Abnormal | Region 16 Normal | Region 16 Abnormal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C2D | 52.8% | 53.2% | 53.3% | 53.7% | 54.2% | 54.1% | 54.4% | 54.3% | 54.0% | 55.1% | 53.6% | 53.2% |
| C2D+OF | 54.5% | 55.0% | 54.7% | 54.2% | 54.9% | 56.2% | 55.5% | 56.1% | 55.9% | 55.8% | 54.5% | 56.6% |
| D2D | 53.9% | 53.8% | 54.5% | 54.6% | 54.9% | 55.0% | 55.8% | 58.1% | 55.2% | 55.7% | 54.3% | 54.8% |
| D2D+OF | 55.3% | 55.8% | 55.4% | 55.9% | 56.0% | 56.8% | 56.3% | 57.2% | 56.5% | 56.6% | 55.9% | 58.0% |
| C2D+LSTM | 56.9% | 54.8% | 56.9% | 54.9% | 57.5% | 56.5% | 57.1% | 55.9% | 56.5% | 56.7% | 56.0% | 55.5% |
| C2D+LSTM+OF | 58.0% | 57.5% | 58.2% | 57.9% | 59.7% | 57.0% | 56.3% | 57.4% | 59.8% | 57.5% | 58.8% | 56.1% |
| C2D+GRU | 56.5% | 55.8% | 56.4% | 55.9% | 57.5% | 56.9% | 59.9% | 56.8% | 57.9% | 56.1% | 55.9% | 56.3% |
| C2D+GRU+OF | 58.6% | 58.9% | 58.1% | 59.2% | 57.5% | 59.4% | 58.9% | 59.3% | 60.2% | 59.1% | 57.5% | 57.5% |
| D2D+LSTM | 63.5% | 62.1% | 63.9% | 65.3% | 64.5% | 67.9% | 64.3% | 68.2% | 65.8% | 67.5% | 62.9% | 66.0% |
| D2D+LSTM+OF | 67.0% | 69.9% | 68.5% | 69.3% | 68.8% | 71.9% | 69.5% | 71.0% | 69.5% | 68.5% | 68.9% | 67.1% |
| D2D+GRU | 62.9% | 63.4% | 63.5% | 64.9% | 63.8% | 65.2% | 64.7% | 65.9% | 63.9% | 66.6% | 63.5% | 62.4% |
| D2D+GRU+OF | 67.5% | 68.0% | 68.9% | 67.5% | 69.8% | 68.6% | 67.7% | 70.2% | 71.1% | 68.8% | 68.7% | 68.8% |
| C3D | 61.2% | 61.7% | 61.8% | 61.6% | 62.8% | 61.9% | 62.2% | 64.2% | 66.9% | 62.5% | 61.1% | 62.9% |
| C3D+OF | 65.9% | 64.9% | 66.1% | 65.5% | 66.8% | 64.9% | 65.6% | 67.7% | 67.5% | 66.6% | 65.9% | 66.0% |

Table 4.5: Per-class accuracy comparison of the evaluated methods for the RWMA regions relevant to the PLAX view.

Our results suggest that the estimation of WMS is an accurate method. Any echocardiography machine can then easily translate this routine information into a robust estimate of WMS without the use of any strain measurement.

Chapter 5

Conclusion

In this thesis, we proposed an automatic system for the evaluation of regional wall motion abnormality.
Given the variability and challenge of coding RWMAs, the development of a platform that can quickly and accurately identify regional wall motion abnormalities on echo images will be a significant asset. Such a tool has several applications to improve the accuracy and consistency of RWMA reporting with bedside echo at the point of care, in stress echo, and in the workflow of the lab.

The required data were collected from the VGH in two streams covering 489 unique patients. The first part of the data consisted of echo images from these patients, and the second part contained the corresponding measurements and pathology reports. The relevant measurements included diagnostic information for standard measurements, comments, etc., along with patient information (e.g., name, age) and exam information. This information was linked to the echo images using the patients' unique IDs and dates. Finally, the data were transformed into a local database to give them a proper format as input for the model.

Furthermore, we recognize that the categorization of RWMAs is imprecise and that there are different degrees of hypokinesis that may correlate with differing severity of flow-limiting lesions. These subtle differences in echo appearance may be difficult for human interpreters to perceive but can be detected with a robust machine learning model.

The overarching objective of our research program is the development of integrated tools to assist in and automate the acquisition and interpretation of echo. For this dissertation, a novel deep learning model is implemented to: identify RWMAs, thus reducing the time to accurate interpretation; and predict the degree of obstructive coronary disease through multi-view data analysis. The created database allows us to leverage the extensive echo dataset of patients' images, coded RWMAs, and complete clinical reports.

Automated identification of regional wall motion abnormalities was implemented through the following phases. Using our own previously developed machine learning platform for view classification, we identified and retrieved the key echo views required for the visualization of RWMAs (parasternal long-axis, apical 2-chamber, and apical 4-chamber). These clips were coded for regional wall motion abnormalities for training and validation of a machine learning model. As described in Chapter 4, a model that can predict regional and global wall motion abnormality was trained and its hyper-parameters were tuned. In the evaluation phase, the model predicted the abnormality of all regions for the test cases.

5.1 Contributions

In this work, a framework for regional wall motion abnormality analysis on echo imaging information has been investigated. The contributions made to reach this goal are summarized below:

A study of the heart and relevant cardiac diseases was performed, and the diagnostic imaging techniques were reviewed. In order to improve the diagnosis of wall motion abnormalities, we decided to develop an artificially intelligent model to assist in the identification of wall motion abnormalities in echo images. This is of great value given the variability and challenges of scoring such abnormalities.

Then, I developed a deep learning framework that takes 3 views of echocardiography data as input and outputs an abnormality prediction for 16 wall segments. The network directly analyzes echo data without any need for prior segmentation of the cardiac LV wall.

The models were trained using the data from 489 patients. The hyper-parameters of the model were then optimized. The resultant model identifies regional wall motion abnormalities as well as global WMA on echo, as compared to the labels of expert readers with advanced echo training. On an independent test dataset, I demonstrated that the neural network can produce accuracy as high as 69.2% for detection of abnormal wall motion.

The developed model, to the best of our knowledge, is the first to analyze left ventricle wall motion for both global and regional abnormality detection in echocardiography data. This can be considered a contribution to enhancing the diagnostic process for wall motion abnormalities by expediting the process for physicians.

5.2 Future Work

There is room for future work to improve the accuracy of the current framework. Moreover, the current framework only predicts normal versus abnormal regional and global function; it can be extended to predict all severity levels. Besides, we hypothesize that a machine learning approach is able to produce a model that can accurately predict the severity of obstructive coronary disease based on echo images, as compared to the reference standard, the degree of epicardial stenosis on invasive coronary angiograms or coronary computed tomography angiograms. Also, since only the apical views are used for predictions, it might be worth comparing the results with the parasternal views as well.

To conclude, while there are many approaches available for WMA detection using speckle tracking, we believe we are the first to propose a model that predicts the abnormality of all 16 segments using apical views of echocardiography. The flexibility and true power of the framework are seen by combining global wall motion abnormality predictions, which results in better performance using only 2DE image information.
The benefits of this unified framework are: 1) it can predict both regional and global wall motion abnormality of the LV, and 2) due to its low computational complexity, it can be applied to portable echo machines that are frequently used in the emergency room and in rural clinics.

Bibliography

[1] Truecrypt. Phase II NCC OCAP final.pdf, 2015. Accessed: 2010-08-30.

[2] A. H. Abdi, C. Luong, T. Tsang, G. Allan, S. Nouranian, J. Jue, D. Hawley, S. Fleming, K. Gin, J. Swift, et al. Automatic quality assessment of echocardiograms using convolutional neural networks: feasibility on the apical four-chamber view. IEEE Transactions on Medical Imaging, 36(6):1221–1230, 2017.

[3] M. Afshin, I. B. Ayed, K. Punithakumar, M. Law, A. Islam, A. Goela, T. Peters, and S. Li. Regional assessment of cardiac left ventricular myocardial function via MRI statistical features. IEEE Transactions on Medical Imaging, 33(2):481–494, 2013.

[4] S. Amari et al. The Handbook of Brain Theory and Neural Networks. MIT Press, 2003.

[5] S. Azizi, S. Bayat, P. Yan, A. Tahmasebi, J. T. Kwak, S. Xu, B. Turkbey, P. Choyke, P. Pinto, B. Wood, et al. Deep recurrent neural networks for prostate cancer detection: analysis of temporal enhanced ultrasound. IEEE Transactions on Medical Imaging, 37(12):2695–2703, 2018.

[6] J. Bergstra and Y. Bengio. Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13(Feb):281–305, 2012.

[7] J. S. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl. Algorithms for hyper-parameter optimization. In Advances in Neural Information Processing Systems, pages 2546–2554, 2011.

[8] D. S. Blondheim, R. Beeri, M. S. Feinberg, M. Vaturi, S. Shimoni, W. Fehske, A. Sagie, D. Rosenmann, P. Lysyansky, L. Deutsch, et al. Reliability of visual assessment of global and segmental left ventricular function: a multicenter study by the Israeli Echocardiography Research Group. Journal of the American Society of Echocardiography, 23(3):258–264, 2010.

[9] Y.-L. Boureau, F. Bach, Y. LeCun, and J. Ponce. Learning mid-level features for recognition. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 2559–2566. IEEE, 2010.

[10] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio. Empirical evaluation of gated recurrent neural networks on sequence modeling. In Neural Information Processing Systems 2014 Workshop on Deep Learning, 2014.

[11] C. B. Compas, E. Y. Wong, X. Huang, S. Sampath, B. A. Lin, P. Pal, X. Papademetris, K. Thiele, D. P. Dione, M. Stacy, et al. Radial basis functions for combining shape and speckle tracking in 4D echocardiography. IEEE Transactions on Medical Imaging, 33(6):1275–1289, 2014.

[12] T. Edvardsen, B. L. Gerber, J. Garot, D. A. Bluemke, J. A. Lima, and O. A. Smiseth. Quantitative assessment of intrinsic regional myocardial deformation by Doppler strain rate echocardiography in humans: validation against three-dimensional tagged magnetic resonance imaging. Circulation, 106(1):50–56, 2002.

[13] J. Ehrhardt, M. Wilms, H. Handels, and D. Säring. Automatic detection of cardiac remodeling using global and local clinical measures and random forest classification. In Statistical Atlases and Computational Models of the Heart, pages 199–207. Springer, 2015.

[14] A. Eitan, I. Kehat, D. Mutlak, G. Lichtenberg, D. Amar, and Y. Agmon. Longitudinal two-dimensional strain for the diagnosis of left ventricular segmental dysfunction in patients with acute myocardial infarction. The International Journal of Cardiovascular Imaging, 34(2):237–249, 2018.

[15] A. Esteva, B. Kuprel, R. A. Novoa, J. Ko, S. M. Swetter, H. M. Blau, and S. Thrun. Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639):115, 2017.

[16] F. A. Gers, J. Schmidhuber, and F. Cummins. Learning to forget: continual prediction with LSTM. In 1999 Ninth International Conference on Artificial Neural Networks ICANN 99. (Conf. Publ. No. 470), volume 2, pages 850–855, 1999. doi:10.1049/cp:19991218.

[17] J. Goebel, F. Nensa, H. P. Schemuth, S. Maderwald, H. H. Quick, T. W. Schlosser, and K. Naßenstein. Detection of regional wall motion abnormalities in compressed sensing cardiac cine imaging. World Journal of Cardiovascular Diseases, 8(06):277, 2018.

[18] J. Gorcsan III, D. P. Strum, W. A. Mandarino, V. K. Gulati, and M. R. Pinsky. Quantitative assessment of alterations in regional left ventricular contractility with color-coded tissue Doppler echocardiography: comparison with sonomicrometry and pressure-volume relations. Circulation, 95(10):2423–2433, 1997.

[19] A. Graves, A.-r. Mohamed, and G. Hinton. Speech recognition with deep recurrent neural networks. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 6645–6649. IEEE, 2013.

[20] V. Gulshan, L. Peng, M. Coram, M. C. Stumpe, D. Wu, A. Narayanaswamy, S. Venugopalan, K. Widner, T. Madams, J. Cuadros, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. The Journal of the American Medical Association, 316(22):2402–2410, 2016.

[21] R. S. Horowitz, J. Morganroth, C. Parrotto, C. C. Chen, J. Soffer, and F. J. Pauletto. Immediate diagnosis of acute myocardial infarction by two-dimensional echocardiography. Circulation, 65(2):323–329, 1982.

[22] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4700–4708, 2017.

[23] D. H. Hubel and T. N. Wiesel. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. The Journal of Physiology, 160(1):106–154, 1962.

[24] M. H. Jafari, H. Girgis, N. Van Woudenberg, Z. Liao, R. Rohling, K. Gin, P. Abolmaesumi, and T. Tsang. Automatic biplane left ventricular ejection fraction estimation with mobile point-of-care ultrasound using multi-task learning and adversarial training. International Journal of Computer Assisted Radiology and Surgery, 14(6):1027–1037, 2019.

[25] M. I. Jordan and T. M. Mitchell. Machine learning: Trends, perspectives, and prospects. Science, 349(6245):255–260, 2015.

[26] S. Kida, T. Nakamoto, M. Nakano, K. Nawa, A. Haga, J. Kotoku, H. Yamashita, and K. Nakagawa. Cone beam computed tomography image quality improvement using a deep convolutional neural network. Cureus Journal of Medical Science, 10(4), 2018.

[27] M. Kinno, P. Nagpal, S. Horgan, and A. H. Waller. Comparison of echocardiography, cardiac magnetic resonance, and computed tomographic imaging for the evaluation of left ventricular myocardial function: Part 1 (global assessment). Current Cardiology Reports, 19(1):9, Feb 2017. ISSN 1534-3170. doi:10.1007/s11886-017-0815-4.

[28] K. Kusunose, T. Abe, A. Haga, D. Fukuda, H. Yamada, M. Harada, and M. Sata. A deep learning approach for assessment of regional wall motion abnormality from echocardiographic images. JACC: Cardiovascular Imaging, 2019.

[29] R. M. Lang, L. P. Badano, V. Mor-Avi, J. Afilalo, A. Armstrong, L. Ernande, F. A. Flachskampf, E. Foster, S. A. Goldstein, T. Kuznetsova, et al. Recommendations for cardiac chamber quantification by echocardiography in adults: an update from the American Society of Echocardiography and the European Association of Cardiovascular Imaging. European Heart Journal-Cardiovascular Imaging, 16(3):233–271, 2015.

[30] Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521(7553):436–444, 2015.

[31] K. E. Leung and J. G. Bosch. Automated border detection in three-dimensional echocardiography: principles and promises. European Journal of Echocardiography, 11(2):97–108, 2010.

[32] A. Li, O. Spyra, S. Perel, V. Dalibard, M. Jaderberg, C. Gu, D. Budden, T. Harley, and P. Gupta. A generalized framework for population based training. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 1791–1799, 2019.

[33] N. Liel-Cohen, Y. Tsadok, R. Beeri, P. Lysyansky, Y. Agmon, M. S. Feinberg, W. Fehske, D. Gilon, I. Hay, R. Kuperstein, et al. A new tool for automatic assessment of segmental wall motion based on longitudinal 2D strain: a multicenter study by the Israeli Echocardiography Research Group. Circulation: Cardiovascular Imaging, 3(1):47–53, 2010.

[34] G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. Van Der Laak, B. Van Ginneken, and C. I. Sánchez. A survey on deep learning in medical image analysis. Medical Image Analysis, 42:60–88, 2017.

[35] D. Maclaurin, D. Duvenaud, and R. Adams. Gradient-based hyperparameter optimization through reversible learning. In International Conference on Machine Learning, pages 2113–2122, 2015.

[36] J. Mantilla, M. Garreau, J.-J. Bellanger, and J. L. Paredes. SVM-based classification of LV wall motion in cardiac MRI with the assessment of STE. In 10th International Symposium on Medical Information Processing and Analysis, volume 9287, page 92870N. International Society for Optics and Photonics, 2015.

[37] V. Mor-Avi, P. Vignon, R. Koch, L. Weinert, M. J. Garcia, K. T. Spencer, and R. M. Lang. Segmental analysis of color kinesis images: new method for quantification of the magnitude and timing of endocardial motion during left ventricular systole and diastole. Circulation, 95(8):2082–2097, 1997.

[38] V. Mor-Avi, L. Sugeng, and R. M. Lang. Real-time 3-dimensional echocardiography: an integral component of the routine echocardiographic examination in adult patients. Circulation, 119(2):314–329, 2009.

[39] V. Nair and G. E. Hinton. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pages 807–814, 2010.

[40] O. Oktay, E. Ferrante, K. Kamnitsas, M. Heinrich, W. Bai, J. Caballero, S. A. Cook, A. De Marvao, T. Dawes, D. P. O'Regan, et al. Anatomically constrained neural networks (ACNNs): application to cardiac image enhancement and segmentation. IEEE Transactions on Medical Imaging, 37(2):384–395, 2017.

[41] H. A. Omar, J. S. Domingos, A. Patra, R. Upton, P. Leeson, and J. A. Noble. Quantification of cardiac bull's-eye map based on principal strain analysis for myocardial wall motion assessment in stress echocardiography. In 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), pages 1195–1198. IEEE, 2018.

[42] American Heart Association Writing Group on Myocardial Segmentation and Registration for Cardiac Imaging: M. D. Cerqueira, N. J. Weissman, V. Dilsizian, A. K. Jacobs, S. Kaul, W. K. Laskey, D. J. Pennell, J. A. Rumberger, T. Ryan, et al. Standardized myocardial segmentation and nomenclature for tomographic imaging of the heart: a statement for healthcare professionals from the Cardiac Imaging Committee of the Council on Clinical Cardiology of the American Heart Association. Circulation, 105(4):539–542, 2002.

[43] OpenStax. Anatomy Physiology. OpenStax CNX, Feb 26, 2016.

[44] N. Ouzir, O. Lairez, A. Basarab, and J.-Y. Tourneret. Tissue motion estimation using dictionary learning: Application to cardiac amyloidosis. In 2017 IEEE International Ultrasonics Symposium (IUS), pages 1–4. IEEE, 2017.

[45] D. Peressutti, M. Sinclair, W. Bai, T. Jackson, J. Ruijsink, D. Nordsletten, L. Asner, M. Hadjicharalambous, C. A. Rinaldi, D. Rueckert, et al. A framework for combining a motion atlas with non-motion information to learn clinically useful biomarkers: application to cardiac resynchronisation therapy response prediction. Medical Image Analysis, 35:669–684, 2017.

[46] D. Platts and M. Monaghan. Colour encoded endocardial tracking: the current state of play. European Journal of Echocardiography, 4(1):6–16, 2003.

[47] E. Puyol-Antón, M. Sinclair, B. Gerber, M. S. Amzulescu, H. Langet, M. De Craene, P. Aljabar, P. Piro, and A. P. King. A multimodal spatiotemporal cardiac motion atlas from MR and ultrasound data. Medical Image Analysis, 40:96–110, 2017.

[48] E. Puyol-Antón, M. Sinclair, B. Gerber, M. S. Amzulescu, H. Langet, M. De Craene, P. Aljabar, J. A. Schnabel, P. Piro, and A. P. King. Multiview machine learning using an atlas of cardiac cycle motion. In International Workshop on Statistical Atlases and Computational Models of the Heart, pages 3–11. Springer, 2017.

[49] E. Puyol-Antón, B. Ruijsink, W. Bai, H. Langet, M. De Craene, J. A. Schnabel, P. Piro, A. P. King, and M. Sinclair. Fully automated myocardial strain estimation from cine MRI using convolutional neural networks. In 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), pages 1139–1143. IEEE, 2018.

[50] A. Rizvi, R. C. Deaño, D. P. Bachman, G. Xiong, J. K. Min, and Q. A. Truong. Analysis of ventricular function by CT. Journal of Cardiovascular Computed Tomography, 9(1):1–12, 2015.

[51] S. Sanchez-Martinez, N. Duchateau, T. Erdei, A. G. Fraser, B. H. Bijnens, and G. Piella. Characterization of myocardial motion patterns by unsupervised multiple kernel learning. Medical Image Analysis, 35:70–82, 2017.

[52] D. Scherer, A. Müller, and S. Behnke. Evaluation of pooling operations in convolutional architectures for object recognition. In International Conference on Artificial Neural Networks, pages 92–101. Springer, 2010.

[53] J. Snoek, H. Larochelle, and R. P. Adams. Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems, pages 2951–2959, 2012.

[54] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958, 2014.

[55] A. Suinesiaputra, A. F. Frangi, T. A. Kaandorp, H. J. Lamb, J. J. Bax, J. H. Reiber, and B. P. Lelieveldt. Automated detection of regional wall motion abnormalities based on a statistical model applied to multislice short-axis cardiac MR images. IEEE Transactions on Medical Imaging, 28(4):595–607, 2009.

[56] A. Suinesiaputra, P. Ablin, X. Alba, M. Alessandrini, J. Allen, W. Bai, S. Cimen, P. Claes, B. R. Cowan, J. D'hooge, et al. Statistical shape modeling of the left ventricle: myocardial infarct classification challenge. IEEE Journal of Biomedical and Health Informatics, 22(2):503–515, 2017.

[57] A. J. Tajik. Machine learning for echocardiographic imaging: embarking on another incredible journey, 2016.

[58] K. Thygesen, J. S. Alpert, A. S. Jaffe, B. R. Chaitman, J. J. Bax, D. A. Morrow, H. D. White, H. Mickley, F. Crea, F. Van de Werf, et al. Fourth universal definition of myocardial infarction (2018). European Heart Journal, 40(3):237–269, 2019.

[59] N. Van Woudenberg, Z. Liao, A. H. Abdi, H. Girgis, C. Luong, H. Vaseli, D. Behnami, H. Zhang, K. Gin, R. Rohling, et al. Quantitative echocardiography: real-time quality estimation and view classification implemented on a mobile Android device. In Simulation, Image Processing, and Ultrasound Systems for Assisted Diagnosis and Navigation, pages 74–81. Springer, 2018.

[60] H. Vaseli, Z. Liao, A. H. Abdi, H. Girgis, D. Behnami, C. Luong, F. T. Dezaki, N. Dhungel, R. Rohling, K. Gin, et al. Designing lightweight deep learning models for echocardiography view classification. In Medical Imaging 2019: Image-Guided Procedures, Robotic Interventions, and Modeling, volume 10951, page 109510F. International Society for Optics and Photonics, 2019.

[61] D. M. Vigneault, W. Xie, D. A. Bluemke, and J. A. Noble. Feature tracking cardiac magnetic resonance via deep learning and spline optimization. In International Conference on Functional Imaging and Modeling of the Heart, pages 183–194. Springer, 2017.

Appendix A

Supporting Materials

(a) The Front Page
(b) The Valves Page
(c) The Aorta, Atria, Shunts, Pericardium Page
(d) The LV/RV Assessment Page

Figure A.1: A snapshot of the different tabs available in the FileMaker software for cardiologists to record the analytical measurements.

