UBC Theses and Dissertations

Machine learning systems for obstetric ultrasonography Porto, Lucas Resque 2020



Full Text

Machine learning systems for obstetric ultrasonography

by

Lucas Resque Porto

BASc., The University of British Columbia, 2016

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

Master of Applied Science

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES
(Electrical and Computer Engineering)

The University of British Columbia
(Vancouver)

October 2020

© Lucas Resque Porto, 2020

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the thesis entitled:

Machine learning systems for obstetric ultrasonography

submitted by Lucas Resque Porto in partial fulfillment of the requirements for the degree of Master of Applied Science in Electrical and Computer Engineering.

Examining Committee:

Robert Rohling, Electrical and Computer Engineering (Supervisor)

Septimiu Salcudean, Electrical and Computer Engineering (Supervisory Committee Member)

Shahriar Mirabbasi, Electrical and Computer Engineering (Supervisory Committee Member)

Abstract

Prenatal screening and ultrasound-guided epidurals are two common applications of ultrasound imaging in obstetrics that help with disease prevention and pain relief. While urban settings typically provide the expertise to perform these procedures, rural and underserved settings suffer from a lack of ultrasound equipment, training, and expertise that precludes a similar quality of care. This thesis seeks to address gaps in equipment and expertise by presenting machine learning systems for automating the analysis of conventional ultrasound images without the use of specialized equipment.

The first system is a method for automatic segmentation of the placenta from 2D ultrasound sweeps acquired during first-trimester prenatal screening. We analyzed the performance and speed of four different deep learning architectures for spatiotemporal segmentation applied to 133 ultrasound sweeps from a diverse patient population.
Compared to manual segmentations, the top-performing architecture achieved a Dice coefficient of 92.11±7.5% and was able to segment at a rate of 100 frames per second.

The second system is a method for 2D ultrasound image augmentation for improved interpretability during ultrasound-guided epidurals. This system relies on registering a 3D statistical shape model of the lumbar vertebrae, constructed from computerized tomography scans of the lumbar spine, to automatically classified and segmented 2D ultrasound images. The classification and segmentation of ultrasound images achieved an accuracy of 90% and a mean Dice coefficient of 74.9±4.9%, respectively. The registration to the segmented regions was evaluated on 43 ultrasound images, and achieved a root mean squared error of 1.4±0.3 mm when compared to the ground truth.

We showcase the ability of machine learning systems to automate ultrasound image analysis in common obstetric applications. The results show the potential for these systems to be further developed in a translational research setting.

Lay Summary

Ultrasound imaging plays a central role in managing the health of the mother and the fetus during pregnancy. In comparison with urban settings, rural and underserved areas are limited in terms of the expertise and equipment necessary to deliver the same level of care. This thesis attempts to address these limitations by developing tools for the automatic analysis of ultrasound images in two common procedures: prenatal screening and epidurals. We developed a tool that automatically highlights the placenta in ultrasound images, which are difficult to interpret by non-experts. We also developed a tool that improves the visualization of the spine in ultrasound images to assist physicians in guiding epidural needles.
These tools have the potential to allow physicians without ultrasound expertise to perform these procedures.

Preface

This thesis is derived from a published manuscript [1] and additional work pending publication.

Chapter 3 is based on work by the author in collaboration with clinical investigators from BC Women's Hospital. The author, Ryan Yan, and Ricky Hu contributed to the data collection, including requisition of ethics approval, data retrieval from BC Women's Hospital and the Perinatal Services of British Columbia, and manual processing of the data. Access to patient data was approved by the Clinical Research Ethics Board (certificate number H18-0119). The author further contributed to the design, development and validation of the software tool presented in the chapter. Professor Robert Rohling contributed with the formulation of the problem and with technical guidance. Dr. Chantal Mayer contributed with clinical interpretation of the data.

Chapter 4 is based on a published conference paper: Porto, L., & Rohling, R. (2020, April). Improving Interpretability of 2-D Ultrasound of the Lumbar Spine. In 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI) (pp. 1-5). The data used in this manuscript was taken from an earlier study approved by the Clinical Research Ethics Board (certificate number H07-0691). The author contributed with the design, development and validation of the software tool presented in the chapter. Professor Robert Rohling contributed with the formulation of the problem and with technical guidance.

Table of Contents

Abstract . . . iii
Lay Summary . . . v
Preface . . . vi
Table of Contents . . . vii
List of Tables . . . x
List of Figures . . . xi
Acknowledgments . . . xviii

1 Introduction . . . 1
  1.1 Prenatal screening and placental ultrasound . . . 3
  1.2 Ultrasound-guided epidural anesthesia . . . 3
  1.3 The case for 2D ultrasonography . . . 5
  1.4 Thesis objectives . . . 6
  1.5 Contributions . . . 6
  1.6 Thesis outline . . . 7

2 Background . . . 8
  2.1 Ultrasound imaging . . . 8
    2.1.1 Artifacts . . . 9
  2.2 Ultrasound examinations and placental diseases . . . 11
  2.3 Ultrasound-guided epidurals . . . 15
  2.4 Machine learning and medical image analysis . . . 17
    2.4.1 Supervised learning fundamentals . . . 18
    2.4.2 Convolutional networks for image analysis . . . 20

3 Segmentation of placental ultrasound sweeps . . . 23
  3.1 Introduction . . . 23
  3.2 Materials and Methods . . . 24
    3.2.1 2D fully convolutional networks with concatenated outputs . . . 25
    3.2.2 2D fully convolutional networks with recurrent layers . . . 26
    3.2.3 3D fully convolutional networks . . . 28
    3.2.4 Dataset preparation and model validation . . . 31
    3.2.5 Implementation . . . 32
  3.3 Results . . . 32
  3.4 Discussion . . . 33
  3.5 Conclusion . . . 35

4 Lumbar spine ultrasound augmentation . . . 39
  4.1 Introduction . . . 39
  4.2 Methods . . . 42
    4.2.1 Classification of ultrasound images in the paramedian plane . . . 42
    4.2.2 Semantic segmentation of lamina in the paramedian plane . . . 43
    4.2.3 CT statistical shape model of the lumbar spine . . . 43
    4.2.4 Registration of 3D statistical shape model to 2D ultrasound . . . 44
  4.3 Experiments and Results . . . 47
    4.3.1 Model training and evaluation . . . 47
    4.3.2 Registration and visualization . . . 48
  4.4 Discussion . . . 48
  4.5 Conclusion . . . 52

5 Conclusion . . . 54
  5.1 Limitations and future work . . . 55

Bibliography . . . 57

A Supporting Materials . . . 68

List of Tables

Table 3.1  Performance results for semantic segmentation of ultrasound sweeps. . . . 33

Table 3.2  P-values for pairwise Wilcoxon signed rank tests between the Dice coefficient values of each model. The Bonferroni p-value correction was applied at the 1% significance level for 6 comparisons (p < 0.01/6 ≈ 0.0016). Please refer to Table 3.1 for model numbers. . . . 33

Table 4.1  Performance of lamina classification, segmentation and statistical shape model registration. . . . 48

List of Figures

Figure 1.1  Comparison of medical imaging modalities for lumbar spine imaging (top row) and fetal imaging (bottom row).
CT (a, d) and MRI (b, e) scans provide detailed three-dimensional representations of most organs and tissues, but are expensive and, in the case of CT, rely on harmful ionizing radiation. Ultrasound (c, f) provides a two-dimensional representation of the spine and fetus which is less intuitive, but is significantly less expensive and is generally safe for use in obstetrics. Image (d) is licensed to Mikael Häggström [11], used with permission. Image (e) is licensed to Alamo et al. [12], used with permission. . . . 2

Figure 1.2  Comparison between puncture site localization with a traditional palpation approach (left) and pre-procedure ultrasound (right). The traditional palpation technique relies entirely on tactile feedback of the protruding structures of the spine (e.g., spinous process and iliac crest as shown), whereas ultrasound provides visual feedback of the internal anatomy. Images reproduced with permission from Kim et al. [22], Copyright Massachusetts Medical Society. . . . 4

Figure 2.1  Speckle noise as illustrated by the application of a speckle reduction algorithm by Zhu et al. [37]. The original ultrasound images (a, top row; b, top row) present a grainy and fuzzy appearance. The despeckled images after application of the algorithm are shown below the original images. . . . 10

Figure 2.2  Shadow artifact illustrated on a fetal ultrasound image. The bright region indicates a highly reflective boundary and, consequently, a high degree of attenuation. The attenuated wave quickly decays as it travels past the boundary, creating faint to zero reflections as the tissue completely absorbs the acoustic energy of the wave, resulting in dark regions called shadows. . . . 12

Figure 2.3  Labeled B-mode transabdominal fetal ultrasound scan performed in the second trimester. The spine and skull of the fetus are visualized, where higher density bone structures exhibit bright echoes.
A cross-section of the placenta is illustrated attached to the uterine wall, where it typically presents as an elliptical region of similar texture. . . . 13

Figure 2.4  Placental positioning and attachment disorders diagnosed in second and third trimester ultrasound examinations. Placenta previa (a) relates to an abnormal positioning of the placenta where it partially or completely obstructs the cervical opening. Placenta accreta, increta and percreta (b) relate to the abnormal invasion of the placenta into the uterine wall from inner (decidua basalis) to outer (serosa) layers. Images (a) and (b) are licensed to Baird et al. [43], used with permission. . . . 14

Figure 2.5  Comparison of imaging modalities for epidural guidance: a detailed and labeled diagram of the lumbar spine (a) for reference; a fluoroscopy image of the lumbar spine (b) where the vertebrae and the needle are visible; and an ultrasound image (c) of the lumbar spine showing the surfaces of the lamina. Image (a) is licensed to Yu et al. [47], used with permission. Image (b) is licensed to Ahn et al. [48], used with permission. . . . 16

Figure 2.6  Outline of model-based registration: a statistical shape model of the lumbar vertebrae (L1 through L5, shown in A) is registered to a volume consisting of stitched 2D ultrasound images of the lumbar spine (parasagittal cross-section shown in B). The resulting registered model is a patient-specific model of the lumbar vertebrae that augments the anatomy seen in ultrasound. . . . 18

Figure 2.7  An illustration of the hierarchical features learned by a convolutional network. The network architecture consists of a composition of convolutional layers and subsampling operations, where the first layer in the network learns low-level features such as edges.
The subsequent layers form compositions of the earlier features into more complex features such as object parts and, in the final layers, entire objects. . . . 21

Figure 3.1  Sequence of frames extracted from two ultrasound sweeps of the placenta. In the first sweep (top row) the lower part of the placenta is revealed when comparing the leftmost to the rightmost frame, which represent two sections of the placenta similar to a tomographic scan. The same occurs in the second sweep (bottom row) as fetal structures become more visible when comparing the leftmost to the rightmost frame. . . . 26

Figure 3.2  Ultrasound sweep segmentation model based on the concatenation of predictions from a U-Net applied to each frame in the sequence. Here, the ultrasound sweep x, predicted segmentation map ŷ and ground truth segmentation map y all consist of N frames with dimension 128-by-128 pixels. . . . 27

Figure 3.3  Ultrasound sweep segmentation model based on the spatiotemporal fusion of the outputs from a U-Net using a ConvLSTM layer. Here, the U-Net outputs x̂t are passed through a ConvLSTM layer, where the output ht depends on all preceding inputs {x̂1, ..., x̂t−1}, outputs {h1, ..., ht−1}, and internal layer states {c1, ..., ct−1} via a recurrence relation defined in Equation 3.1. The outputs of the ConvLSTM are then passed through a 2D convolution with sigmoid activations to generate the output segmentation map ŷ. . . . 29

Figure 3.4  Ultrasound sweep segmentation model based on an FCN with stacked ConvLSTM features. In this architecture, the higher level features are modelled by stacked ConvLSTM layers, which form a composition of several instances of the recurrence relation defined in Equation 3.1. This composition of recurrence relations increases the complexity of temporal dependences that can be represented by the model, particularly when compared to the models illustrated in Figures 3.2 and 3.3. . . .
30

Figure 3.5  Example segmentations from the model with highest performance (U-Net with output concatenation). The top row (a-d) illustrates the ground truth segmentation maps, and the bottom row (e-h) illustrates the output predictions from the model. Panels d and h illustrate a failure mode, where the model failed to identify a boundary between the placenta and the uterine wall. . . . 34

Figure 3.6  Example success and failure cases for images with partial attenuation and complete attenuation (shadow) artifacts. Using the same ultrasound sweep, we illustrate the ground truth segmentation and predictions on a frame without artifacts (a), a frame with partial attenuation (b), and a frame with complete attenuation (c). While all models were robust to partial attenuation, variable performance was seen with shadow artifacts. This variation in performance is likely due to uncertainty in the manual labeling of attenuated regions, where no fixed brightness threshold was stipulated to distinguish partial versus complete attenuation. . . . 36

Figure 3.7  Example prediction cases for different speckle noise and texture appearances. The ultrasound frames on the left were captured with a Philips iU22 ultrasound machine and exhibit a higher amount of speckle and a visually grainier texture than the ultrasound frames on the right, which were captured with a GE Voluson E8. All models exhibited a high degree of invariance to speckle noise and visual differences in texture without the inclusion of any additional features or prior information. Fourier spectra for the above examples are included in Appendix A, Figure A.1. . . . 37

Figure 4.1  Overview of proposed registration approach.
An input B-mode ultrasound image is fed into a classifier that determines whether the image belongs to the paramedian view of the lumbar spine. Then a segmentation model extracts the lamina bone surfaces, which serve as registration targets for a CT statistical shape model, where only the corresponding anatomy is selected for registration. The registered model is overlaid onto the original ultrasound image. . . . 41

Figure 4.2  Statistical shape model of the lumbar spine, constructed from a training set of 64 CT scans. The mean shape of the statistical shape model (center diagram on top and bottom rows) can be transformed (Equation 4.1) according to shape variations statistically determined from the training set (mode 1, top row; mode 3, bottom row). . . . 45

Figure 4.3  Example visualization of the registered model to an ultrasound image of the L2-3 and L3-4 intervertebral levels in the paramedian view (a), and comparisons with the ground truth lamina segmentations from an expert sonographer (b, c, d). . . . 49

Figure 4.4  Example wireframe visualization of the registered model onto ultrasound images of the L2-3 and L3-4 intervertebral levels in the paramedian view. The statistical shape model depicts anatomical features that are not present in the plane of ultrasound acquisition, where the complete visualization of the model provides an approximate idea of where the ultrasound plane intersects the lumbar spine. . . . 50

Figure 4.5  Example of cases with poor segmentation performance. The registration algorithm was able to compensate for partially segmented lamina, as the correctly segmented portions appear to provide sufficient information. Regions of the lamina that were missed by the segmentation algorithm are highlighted in red. . . . 52

Figure A.1  Fourier spectrum for the texture comparison examples in Figure 3.7.
The figures are positioned in the same order, where the spectra on the left are from ultrasound images captured with the Philips iU22, and those on the right are from images captured with the GE Voluson E8. Note the difference in the distribution of regions with higher magnitude in the spectrum (brighter yellow regions), which indicate differences in the directionality and frequency content of the ultrasound images. . . . 69

Figure A.2  Dice coefficient distribution for the cross-validation predictions (N=133) for all placenta segmentation models: (A) U-Net with output concatenation; (B) U-Net with ConvLSTM features; (C) Stacked ConvLSTM; and (D) V-Net. . . . 70

Figure A.3  Descriptive statistics plots for the results of lumbar spine augmentation (N=43): (a) Dice coefficient distribution for test examples for the lumbar segmentation model; and (b) distribution of root mean squared registration errors between the registered model and ground truth segmentation maps. . . . 70

Acknowledgments

First and foremost, I would like to thank my supervisor, Professor Robert Rohling, for his mentorship and support. Thank you for giving me the opportunity to explore my academic interests, to travel and share my research with others, and to grow professionally.

I would also like to thank Dr. Chantal Mayer and Ricky Hu for their expertise in obstetric ultrasound and ultrasound image analysis. Special thanks to Ariadna Fernandez for all her support and patience in obtaining access to the data.

Thank you to all my colleagues at the Robotics and Control Lab for their support.
My very special thanks to Mehran Pesteie for sharing his expertise in machine learning, computer vision, and all things coffee.

I would also like to thank the Natural Sciences and Engineering Research Council (NSERC) and the Canadian Institutes of Health Research (CIHR) for funding this research, and for their generous funding through the Canada Graduate Scholarship.

Lastly, I would like to thank my parents, Silvio and Kathia, and my sister, Paula, for providing comfort and laughter throughout this journey.

Chapter 1

Introduction

Ultrasonography is an important medical imaging modality for diagnostics and interventions in obstetrics, with applications that include fetal biometry, Doppler imaging, pregnancy dating, and many others [2, 3]. The ubiquity of ultrasound in obstetrics and maternal care is due to an unparalleled combination of safety, cost, and portability, as well as the ability to perform functional imaging with Doppler ultrasound. However, these advantages are balanced by limitations on image detail and resolution, as well as an increase in the number of image artifacts arising from the interaction between acoustic waves and human tissue. Consequently, ultrasound may provide limited information in comparison with imaging modalities such as CT and MRI (Figure 1.1), which produce images with higher resolution and are less prone to artifacts [4–6]. In the vast majority of clinical contexts, however, the cost of MRI and exposure to ionizing radiation from CT scans do not justify the acquisition of better and more detailed images, and ultrasound remains the principal candidate for imaging in pregnancy.

In the last twenty years, advances in computer vision, especially in machine learning methods, have enabled the development of new tools for medical image analysis [7].
More recently, these methods and tools have been applied to ultrasound images, where certain visual tasks that are typically performed by a human expert can be fully automated, even in images that contain a high amount of noise and artifacts and are generally considered difficult to interpret [8]. The automation of visual tasks has the potential to improve clinical care by providing reproducible expertise at scale, which is especially important for ultrasonography as obstetric care is lacking in underserved communities around the world [9, 10].

Figure 1.1: Comparison of medical imaging modalities for lumbar spine imaging (top row) and fetal imaging (bottom row). CT (a, d) and MRI (b, e) scans provide detailed three-dimensional representations of most organs and tissues, but are expensive and, in the case of CT, rely on harmful ionizing radiation. Ultrasound (c, f) provides a two-dimensional representation of the spine and fetus which is less intuitive, but is significantly less expensive and is generally safe for use in obstetrics. Image (d) is licensed to Mikael Häggström [11], used with permission. Image (e) is licensed to Alamo et al. [12], used with permission.

1.1 Prenatal screening and placental ultrasound

Prenatal screening is the principal application of ultrasound imaging in obstetrics, and it takes the form of routine ultrasound examinations that visualize the fetus in the first and second trimester, enabling physicians to check for malformations of the fetus and surrounding organs. The role of ultrasound imaging in prenatal screening is also an active topic of research, as it is still difficult to estimate risk and predict several disorders affecting pregnancies [13, 14]. Among these disorders, placenta-mediated disorders such as pre-eclampsia and intrauterine growth restriction stand out particularly due to their high maternal morbidity and mortality rate [15].
Recent guidelines from the International Society of Ultrasound in Obstetrics and Gynecology (ISUOG) recommend the analysis of maternal factors and biomarkers for the screening of pre-eclampsia, as well as the use of uterine artery Doppler ultrasound to characterize placental function [16]. The same guidelines also recognize the size of the placenta as a predictor of pre-eclampsia, but stop short of recommending placental measurements in practice, citing limited reproducibility and the time cost of performing the measurement, despite the existence of computer-aided tools in the market [17].

These limitations motivate the development of new computer-aided tools for measuring placental dimensions. The ideal approach to overcome reproducibility issues and time constraints is a tool that is fast, accurate, and fully automatic. In the context of computer vision, this tool takes the form of a semantic segmentation algorithm, which automatically highlights the placenta in an ultrasound scan and further allows measurements such as area, thickness, and length to be taken automatically. As mentioned earlier, advances in machine learning methods have allowed for the development of such tools, and we motivate our research using machine learning approaches for placental ultrasound analysis.

1.2 Ultrasound-guided epidural anesthesia

Ultrasound also plays an important role in obstetric anesthesia. The most common indication for anesthesia in pregnancies is the epidural, which is a form of regional anesthesia delivered through lumbar spine injections for relief of pain during labor and delivery. The traditional approach for performing an epidural involves the use of tactile feedback to locate and drive a needle and catheter into the patient's lower back (Figure 1.2). This approach is associated with complications arising from poor needle placement [18], and ultrasound imaging has been proposed as a tool to provide visual feedback to reduce the rate of these complications [19, 20]. With the increasing adoption of ultrasound guidance, a number of existing challenges prevent its deployment in a clinical setting at scale. For example, anesthesiologists require additional training or the presence of a radiologist/sonographer to administer ultrasound-guided epidurals, and conventional needles used in epidural injections are difficult to see as they are not fully visible in the ultrasound image. Several studies have proposed tools for visualizing the lumbar spine and other relevant structures in ultrasound images as a means of improving the interpretability of these images, yet none of these approaches have been established in a clinical setting due to feasibility issues, including requirements for specialized equipment and poor workflow [21]. As such, we were motivated to overcome these limitations by developing tools that can be used with conventional equipment without disrupting the clinical workflow.

Figure 1.2: Comparison between puncture site localization with a traditional palpation approach (left) and pre-procedure ultrasound (right). The traditional palpation technique relies entirely on tactile feedback of the protruding structures of the spine (e.g., spinous process and iliac crest as shown), whereas ultrasound provides visual feedback of the internal anatomy. Images reproduced with permission from Kim et al. [22], Copyright Massachusetts Medical Society.

Recently, machine learning techniques have been applied to automate tasks such as midline detection, vertebra localization, and landmark detection from ultrasound images to aid in ultrasound-guided anesthesia [23–25].
More importantly, these studies have demonstrated the ability of data-driven techniques to extract complex information from conventional 2D ultrasound images, a task that typically requires expertise from radiologists and sonographers. Therefore, we were further motivated to explore machine learning approaches to develop tools for improving the interpretability of conventional ultrasound images by automatically extracting information of similar complexity to that seen in these studies.

1.3 The case for 2D ultrasonography

While studies have demonstrated the benefits of advanced ultrasound imaging technologies such as 3D and 4D, conventional 2D ultrasound remains the principal tool for ultrasonography in obstetrics because most ultrasound machines are only capable of 2D imaging. Moreover, recent advances in portable and low-cost devices have resulted in new devices on the market, but these devices mainly focus on providing 2D imaging capabilities. This recent trend towards the production of low-cost machines is driven by a demand for point-of-care medical imaging in rural and underserved communities [26]. While efforts have been made to address the technological gap to meet this demand, there remains the challenge of delivering the expertise required to operate these devices and provide proper patient care [27, 28]. Some of the proposed ways of addressing this issue include the establishment of training programs for current physicians in rural areas and the introduction of point-of-care ultrasound curricula for new graduates in family medicine. The lack of funding for these programs, however, has been cited as a barrier to their deployment at scale [28]. In this context, we wanted to investigate whether technology could also be used to fill the expertise gap and deliver patient care in a scalable way.

As mentioned in earlier sections, machine learning has been successful at automating visual tasks in medical imaging.
Recent studies also report that the ap-plication of machine learning methods has resulted in an improvement of clinicaloutcomes, particularly in settings where these methods are used to assist cliniciansand supplement clinical expertise [29–31]. These reports motivate an investigationof machine learning methods applied to 2D ultrasonography as a way to fill theaforementioned expertise gap in the deployment of point-of-care ultrasound.51.4 Thesis objectivesThis thesis aims to develop machine learning methods to improve two areas inobstetrics where ultrasound imaging is relevant: (1) ultrasound screening of theplacenta in early pregnancy; and (2) ultrasound guidance for epidural anesthesia.As motivated earlier, placental dimensions are correlated with a set of adverse out-comes in pregnancy, but the time cost and reproducibility of current methods pre-clude the measurement of such dimensions in practice. The first objective of thisthesis is to address these problems by fully automating the extraction of placen-tal dimensions. We sought to achieve this by using machine learning to developa tool for semantic segmentation of the placenta, which subsequently enables theautomatic measurement of placental dimensions.Our second objective was to make lumbar spine ultrasound images more inter-pretable for ultrasound-guided epidurals. As discussed later in Chapter 2, a signif-icant portion of the lumbar spine is invisible in ultrasound, making images diffi-cult to interpret for anesthesiologists without ultrasound expertise. We sought toaddress this problem by improving on existing tools to augment ultrasound imag-ing with a patient-specific statistical shape model that depicts the lumbar verte-brae, where we use machine learning methods to eliminate the need for specializedequipment that is used in existing methods.1.5 ContributionsWe present two main contributions,1. 
1. For the task of placenta segmentation, we studied four modern deep learning approaches for spatiotemporal semantic segmentation applied to a diverse dataset of 133 ultrasound sweeps of the placenta acquired with 2D ultrasound transducers. This study is the first to apply deep learning-based semantic segmentation to ultrasound sweeps of the placenta. We estimated the prediction performance of these models and compared the speed of this approach with other similar approaches in the literature.

2. For ultrasound image augmentation, we leveraged machine learning methods for image classification and segmentation to develop a tool that does not require position-tracked ultrasound transducers for statistical shape model registration to 2D ultrasound.

1.6 Thesis outline

• Chapter 2: Background - A discussion of the fundamental concepts of ultrasound imaging and machine learning that are used to motivate and develop the methods presented in the following chapters.

• Chapter 3: Segmentation of placental ultrasound sweeps - The development of deep learning methods for placental segmentation. We provide a detailed background of placental imaging and a statistical comparison of modern deep learning segmentation methods applied to placental ultrasound sweeps.

• Chapter 4: Lumbar spine ultrasound augmentation - A machine learning system for augmenting 2D lumbar spine ultrasound images with a statistical shape model.

• Chapter 5: Conclusion - A summary of contributions and recommendations for future work.

Chapter 2
Background

In this chapter, we present the concepts used to develop tools for image analysis in placental ultrasound and ultrasound-guided epidurals. We first discuss the fundamental concepts in ultrasound imaging, followed by details and limitations surrounding its practice in fetal and spinal imaging.
We then give an overview of machine learning and its applications to medical image analysis while focusing on recurring topics that appear in the following chapters.

2.1 Ultrasound imaging

Ultrasound imaging is a technique that uses acoustic waves with frequencies in the MHz range (typically 1-18 MHz), which are referred to as ultrasound waves. On most ultrasound machines, ultrasound waves are generated through the excitation of piezoelectric crystals located in the transducer/probe, where these crystals vibrate mechanically in response to an alternating current (AC) voltage supplied by the ultrasound machine. In turn, piezoelectric crystals are also ultrasound wave receivers, as they generate an AC voltage in response to mechanical vibrations caused by incoming waves. These piezoelectric crystals are placed in an array at the head of the transducer, which may have different curvature profiles depending on the imaging application. The array of piezoelectric crystals provides two important functions: (1) during ultrasound transmission, groups of crystals are excited to form focused ultrasound beams with a particular direction and focal length via wave interference; and (2) during ultrasound reception, crystals in the array generate signals in response to reflected waves, and the time difference between transmission and reception is used to locate the source of reflection. Individually, these two functions are known as transmit and receive beamforming, respectively. Together, the process of intermittent transmission and reception of ultrasound waves constitutes what is called the pulse-echo operation. To construct an ultrasound image, the pulse-echo operation is performed by transmitting and receiving ultrasound waves along a two-dimensional plane, where sources of wave reflection (echo) are located and visualized according to the intensity of the reflected waves.
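The localization step of the pulse-echo operation can be illustrated with a toy calculation (not from the thesis): the depth of a reflector follows from the transmit-to-receive time difference, assuming an average sound speed in soft tissue. The function name and the 1540 m/s constant below are illustrative conventions, not values stated in this chapter.

```python
# Illustrative sketch: reflector depth from pulse-echo time of flight.
SPEED_OF_SOUND = 1540.0  # m/s, a conventional average for soft tissue (assumption)

def echo_depth_mm(round_trip_time_s: float, c: float = SPEED_OF_SOUND) -> float:
    """Depth of a reflector given the transmit-to-receive time difference.

    The pulse travels to the reflector and back, so the one-way
    distance is c * t / 2.
    """
    return 1000.0 * c * round_trip_time_s / 2.0

# An echo received 65 microseconds after transmission originates ~5 cm deep.
print(round(echo_depth_mm(65e-6), 2))  # 50.05 (mm)
```

Sweeping this calculation along the beam directions of the transducer array is, schematically, how the 2D image plane is populated.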
This particular application of ultrasound is known as brightness mode, or B-mode ultrasound, where brighter pixels correspond to a higher echo intensity. Since the intensity of the reflected waves is related to changes in density in the medium of propagation, the resulting image visualizes regions of similar density. This is a useful property for medical imaging, as different types of tissue like bone, muscle, and fat have different densities (see Figure 2.3).

2.1.1 Artifacts

The physical and operating principles of ultrasound imaging described above give rise to a number of artifacts that can be difficult to model explicitly. One of the most common types of artifacts is speckle, a type of noise that arises from the scattering of reflected ultrasound waves by irregular microstructures within tissue. These scattered waves interfere to form a granular pattern in the ultrasound image, and the characteristics of this pattern depend on the properties of the imaging system (e.g., frequency, transducer type, processing algorithms) as well as tissue properties (e.g., acoustic impedance, scatter distribution) [32]. This type of noise creates the characteristic grainy appearance of ultrasound images, and methods for speckle reduction continue to be studied extensively [33–36]. Speckle tends to reduce image contrast and create fuzzy boundaries, which complicates the interpretation of images by non-experts and makes it hard to perform measurements on the image (Figure 2.1).

Figure 2.1: Speckle noise as illustrated by the application of a speckle reduction algorithm by Zhu et al. [37]. The original ultrasound images (a, top row; b, top row) present a grainy and fuzzy appearance. The despeckled images after application of the algorithm are shown below the original images.

Another common type of artifact is shadowing: black regions caused by the complete attenuation of an ultrasound wave by strongly reflective surfaces like bone (Figure 2.2).
This is a common artifact in fetal and spine ultrasound, where ultrasound waves can be severely or completely attenuated by the bone of the developing fetus and the vertebrae. In the context of placental ultrasound, attenuation of ultrasound waves at the bones of the developing fetus can create shadows on the placenta that hide part of the organ. In spine ultrasound, the ultrasound waves are completely attenuated by the posterior surfaces of the vertebrae, and the vertebral bodies, which are typically seen in CT and MRI, are completely invisible.

Several other artifacts exist in ultrasound imaging, and those presented above exemplify the variations that must be taken into account when interpreting and analyzing ultrasound images [32]. Consequently, the interpretation of ultrasound images requires the expertise of radiologists, sonographers, or physicians with ultrasound training in most clinical scenarios. Furthermore, artifacts and the variability of their presentation across different patients and image acquisition systems are difficult to model explicitly, which makes it difficult to formulate tools for automated analysis of ultrasound images.

2.2 Ultrasound examinations and placental diseases

Ultrasound examination is a key component in prenatal management and screening of the fetus and surrounding organs such as the placenta (Figure 2.3).
While certain types of anomalies are well characterized by ultrasound imaging, the role of ultrasound imaging in risk estimation and prediction of placenta-mediated conditions, such as pre-eclampsia and intrauterine growth restriction, is limited and remains an active area of research.

Figure 2.2: Shadow artifact illustrated on a fetal ultrasound image. The bright region indicates a highly reflective boundary and, consequently, a high degree of attenuation. The attenuated wave quickly decays as it travels past the boundary, creating faint to zero reflections as the tissue completely absorbs the acoustic energy of the wave, resulting in dark regions called shadows.

Ultrasound examinations typically occur twice during pregnancy: one in the first trimester at a gestational age of approximately 13 weeks, and another in the second trimester at a gestational age between 18 and 22 weeks, with optional follow-up in the third trimester. Each of these examinations has a different purpose due to the visibility of anatomical structures at each stage of fetal development. In the first trimester, B-mode ultrasound imaging is typically performed to determine the gestational age of the fetus more accurately, to confirm the presence of cardiac activity, and to screen for ultrasound abnormalities associated with congenital anomalies such as Down's syndrome [38]. In terms of placenta-mediated conditions, Doppler scanning of the maternal uterine arteries is recommended for the screening of pre-eclampsia but predicts fewer than 50% of pregnancies that will develop the condition [16]. Doppler imaging of the first-trimester fetus is not recommended due to safety concerns over the thermal excitation of tissue [39, 40].
Current guidelines do not recommend B-mode ultrasound for the screening of placenta-mediated disease in the first trimester.

Figure 2.3: Labeled B-mode transabdominal fetal ultrasound scan performed in the second trimester. The spine and skull of the fetus are visualized, where higher-density bone structures exhibit bright echoes. A cross-section of the placenta is illustrated attached to the uterine wall, where it typically presents as an elliptical region of similar texture.

Second-trimester ultrasound examinations remain the standard of care for fetal anatomical evaluation, as many abnormalities do not present in the first-trimester ultrasound [16]. Current guidelines recommend that disorders of the placenta related to its position and attachment should be reported at this stage, as the placenta is not yet fully developed in the first trimester [16]. Important disorders of position and attachment include placenta previa, which relates to the abnormal position of the placenta relative to the cervical opening, and placenta accreta, which relates to abnormal attachment to the uterine wall (Figure 2.4). These disorders are diagnosed via a combination of B-mode and Doppler ultrasound and are associated with a number of negative maternal and fetal outcomes, including antepartum hemorrhage and premature birth. Additionally, uterine artery Doppler is also recommended in the second trimester for the screening of pre-eclampsia, where it predicts up to 85% of cases depending on the onset of the condition [16]. As mentioned in Chapter 1, placental size is also associated with pre-eclampsia. However, current clinical guidelines do not recommend the measurement of placental size due to time cost and reproducibility issues. Moreover, the same guidelines particularly note that the time cost and reproducibility issues arise from a lack of automation, and that more studies are needed to assess the association between placental volume and pre-eclampsia [16].
We also note that these recommendations have been published in the last two years, despite the previous existence of several tools addressing this problem in the literature [41, 42].

Figure 2.4: Placental positioning and attachment disorders diagnosed in second- and third-trimester ultrasound examinations. Placenta previa (a) relates to an abnormal positioning of the placenta where it partially or completely obstructs the cervical opening. Placenta accreta, increta, and percreta (b) relate to the abnormal invasion of the placenta into the uterine wall from inner (decidua basalis) to outer (serosa) layers. Images (a) and (b) are licensed to Baird et al. [43], used with permission.

Lastly, ultrasound also plays a role in the management of placental diseases after diagnosis. In patients diagnosed with pre-eclampsia, the placenta is at increased risk for certain complications that are visible in B-mode ultrasound. These include swelling due to fluid retention, the presence of lesions such as cysts, infarctions, and hematomas, as well as the development of attachment disorders such as placental abruption, which is the partial or complete early separation of the placenta from the uterine wall. Patients with pre-eclampsia will typically undergo several follow-up ultrasound examinations, where the placenta is monitored for these conditions. In severe cases, the recommendation is induced labor and premature delivery of the fetus.

We have demonstrated the importance of ultrasound imaging in ensuring and managing maternal and fetal health, as well as its limitations regarding predictive power, reproducibility, and the time cost of performing placental measurements. In Chapter 3, we develop a machine learning approach for the automatic segmentation of the placenta from transabdominal ultrasound sweeps acquired in the second trimester.
In particular, our primary motivation is to develop a tool for the automatic measurement of placental size, which is directly limited by reproducibility and cost when performed manually.

2.3 Ultrasound-guided epidurals

Epidural anesthesia is a form of anesthesia that involves the insertion of a needle into the epidural space of the lumbar spine. Traditionally, epidural injections are performed blindly by way of palpation of surface landmarks from the protruding spinal structures on the patient's back, where the physician looks for a space between the vertebrae that gives access to the epidural space. Based on the palpation technique, a puncture site and angle of entry are specified for subsequent needle insertion. The depth of injection is determined via a technique called loss of resistance, where the pressure between a syringe attached to the needle (typically filled with a saline solution) and the tissue is released as the needle enters the epidural space. The combined palpation and loss-of-resistance approach is associated with a number of complications arising from technical errors, including difficulty and failure to properly insert the needle (up to 8% of cases), and negative patient outcomes ranging from headaches (up to 1% of cases) to respiratory depression and convulsions (up to 0.3%) [18, 44].

In the last two decades, image guidance for epidurals has been proposed as a way to improve the above complication rates. In non-obstetric procedures, fluoroscopy may be indicated as an image guidance procedure for epidural administration, where it reportedly reduces the number of complications in comparison with the traditional loss-of-resistance approach [45]. Fluoroscopy provides a detailed and real-time visualization of the spine and the needle throughout the procedure (Figure 2.5), but the emission of harmful ionizing radiation precludes its application in obstetrics.
Consequently, ultrasound imaging has been the primary candidate for image guidance in epidural anesthesia in obstetrics. A recent review article reports that ultrasound guidance reduces the rate of complications when used as a pre-procedure assistance tool [46], but limited evidence exists to support ultrasound imaging as a real-time guidance tool.

Figure 2.5: Comparison of imaging modalities for epidural guidance: a detailed and labeled diagram of the lumbar spine (a) for reference; a fluoroscopy image of the lumbar spine (b) where the vertebrae and the needle are visible; and an ultrasound image (c) of the lumbar spine showing the surfaces of the lamina. Image (a) is licensed to Yu et al. [47], used with permission. Image (b) is licensed to Ahn et al. [48], used with permission.

Despite the current lack of clinical evidence for real-time ultrasound guidance, the existing benefit of ultrasound assistance has motivated recent research and development efforts in real-time guidance systems. State-of-the-art methods in real-time ultrasound guidance are largely based on the registration of preoperative images of the spine acquired with a detailed modality such as CT or MRI to real-time intraoperative ultrasound images [21]. In the context of obstetric epidurals, a subset of these methods recognize the infeasibility and cost of acquiring pre-procedure CT or MRI images from pregnant women and instead focus on model-based registration to intraoperative ultrasound (Figure 2.6). Model-based registration typically involves the construction of a statistical shape model from a pre-existing database of CT or MRI images, and subsequent registration of this model to intraoperative ultrasound volumes acquired with 3D ultrasound or stitched from 2D ultrasound images. This generates a patient-specific model of the vertebrae and circumvents patient exposure to ionizing radiation [49–51].
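The pose-estimation component of such a registration can be illustrated with a simplified sketch (hypothetical code, not the cited methods, which additionally fit statistical shape parameters): a closed-form least-squares rigid alignment of corresponding 3D points in the style of the Kabsch algorithm.

```python
import numpy as np

def rigid_register(source: np.ndarray, target: np.ndarray):
    """Least-squares rigid alignment (Kabsch) of corresponding 3D point sets.

    A simplified stand-in for model-to-ultrasound registration: deformable
    model-based methods solve for shape coefficients as well, but the rigid
    pose is commonly obtained with this kind of closed-form step.
    """
    mu_s, mu_t = source.mean(axis=0), target.mean(axis=0)
    H = (source - mu_s).T @ (target - mu_t)          # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    t = mu_t - R @ mu_s
    return R, t

# Recover a known rotation/translation from noiseless correspondences.
rng = np.random.default_rng(0)
pts = rng.standard_normal((10, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, -2.0, 0.5])
R_est, t_est = rigid_register(pts, pts @ R_true.T + t_true)
print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))  # True True
```

In practice the correspondences come from features extracted from the ultrasound data, which is precisely where the methods below differ.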
While these methods show promise, their clinical applicability is limited by the need for specialized equipment, such as 3D ultrasound transducers and 2D transducers with position tracking, as well as by cumbersome preparation, such as the acquisition of stitched ultrasound volumes and manual landmark selection for registration [49, 50].

Motivated by the limitations described above, we sought to improve the clinical applicability of model-based registration methods by developing a system that depends only on conventional 2D ultrasound equipment. In Chapter 4, we propose a machine learning system that registers a statistical shape model directly to 2D ultrasound images of the lumbar spine. In the above methods, an ultrasound volume is acquired to provide a sufficient number of corresponding features for registration of the statistical shape model. Our proposed method leverages machine learning methods to constrain the registration problem and enable the registration of the statistical shape model to a small set of features from a single 2D ultrasound image.

2.4 Machine learning and medical image analysis

Machine learning algorithms for computer vision have enabled the development of tools for medical image analysis with unprecedented performance. In this thesis, we focus on the use of supervised machine learning algorithms to develop systems that are robust to the noise and artifacts described in Section 2.1.

Figure 2.6: Outline of model-based registration: a statistical shape model of the lumbar vertebrae (L1 through L5, shown in A) is registered to a volume consisting of stitched 2D ultrasound images of the lumbar spine (parasagittal cross-section shown in B).
The resulting registered model is a patient-specific model of the lumbar vertebrae that augments the anatomy seen in ultrasound.

2.4.1 Supervised learning fundamentals

Supervised learning is defined as the task of learning a function f : X → Y given a finite sample (x_i, y_i), also known as the training set, where x ∈ X and y ∈ Y are the independent and dependent variables, respectively. This task can be formulated as an optimization problem where, given a set of candidate functions F, and assuming the training set is an i.i.d. sample from an unknown joint distribution P(X, Y), one seeks to learn a function f̂,

    \hat{f} = \arg\min_{f \in F} E[L(Y, f(X))] = \arg\min_{f \in F} \sum_i L(y_i, f(x_i))    (2.1)

where L is a loss function that measures the amount of error for a particular prediction task such as regression or classification. The assumption is that, given enough training samples from P(X, Y) and an appropriate set of candidate functions F, f̂ will also have low prediction errors for subsequent samples drawn from the same distribution, which we denote the test error, thereby solving the supervised learning task. In practice, however, solving this optimization problem does not always result in low test error due to overfitting, which occurs due to a combination of factors such as training set size and poor choices for the set of candidate functions F and the loss function L. Therefore, it is important to estimate the test error for new samples from the distribution of interest P(X, Y) to assess the actual performance of f̂ as a candidate solution f : X → Y for the supervised learning task.
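These ideas can be made concrete with a toy sketch (illustrative only, not from the thesis): a trivial "predict the training mean" model fit by minimizing an empirical squared-error loss in the spirit of Equation 2.1, with its test error then estimated by a hold-out split and by k-fold cross-validation.

```python
import numpy as np

def fit(train_y):
    """Training as in Eq. 2.1: the mean minimizes empirical squared error."""
    return float(np.mean(train_y))

def loss(y, y_hat):
    """Empirical mean of the squared-error loss L."""
    return float(np.mean((y - y_hat) ** 2))

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Hold-out: conditional test error for one particular training set tau.
train, test = y[:4], y[4:]
err_tau = loss(test, fit(train))

# k-fold: average the conditional error over k different training sets.
k = 3
folds = np.array_split(y, k)
errs = []
for i in range(k):
    test_fold = folds[i]
    train_fold = np.concatenate([folds[j] for j in range(k) if j != i])
    errs.append(loss(test_fold, fit(train_fold)))
err_cv = float(np.mean(errs))
print(err_tau, err_cv)  # 9.25 6.25
```

The two printed numbers correspond, respectively, to a single conditional test error and to an average over k conditional errors, which is the distinction formalized next.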
The test error is typically measured in two ways: (1) the conditional test error, which is the test error given a function/model trained on a particular finite training set τ,

    \mathrm{Err}_\tau = E[L(Y, \hat{f}(X)) \mid \tau]    (2.2)

and (2) the expected test error, which is the expected conditional test error over all possible training sets,

    \mathrm{Err} = E[\mathrm{Err}_\tau]    (2.3)

The primary difference between these two quantities is that the expected test error accounts for variability in performance when using different training sets. The conditional test error is typically estimated with the hold-out method, more commonly known as a train-test split, where a sample is divided into a training set used to train the model as in Equation 2.1 and a test set used to calculate the conditional test error, with the expected value in Equation 2.2 approximated by an empirical mean. In turn, the expected test error is typically estimated with k-fold cross-validation, where a sample is divided into k equal parts, with k−1 parts constituting a training set and 1 part constituting a test set; the conditional test error is estimated for all (k choose k−1) = k unique training sets and then averaged to approximate the expectation in Equation 2.3.

Throughout this thesis, the process of training and evaluating a set of candidate functions in a supervised learning framework is applied to a class of models called convolutional networks.

2.4.2 Convolutional networks for image analysis

The approach presented in Section 2.4.1 above is widely used to train and evaluate functions called convolutional networks, a class of deep learning models used in image analysis. A convolutional network is a function composition of 2D convolution operations, non-linear functions, and resampling operations that learns a hierarchy of 2D convolutional filters/spatial features (Figure 2.7).
This hierarchy is described in terms of low-level features, which are the first convolution operations in the composition and extract features such as edges and texture primitives; and high-level features, which are the subsequent compositions of preceding lower-level features into more complex geometric shapes and patterns. This composition (termed the network architecture) is typically defined as a stacking of building blocks called layers, and the primary building block for convolutional networks is the convolutional layer: a convolution operation followed by a non-linear activation function,

    (W \ast I)(i, j, k) = \sum_m \sum_n \sum_c I(i-m, j-n, c)\, W_k(m, n, c)
    S_{ijk}(I, W) = \sigma[(W \ast I)(i, j, k)]    (2.4)

where (W ∗ I) is the discrete convolution of K learnable filters W = W_1, ..., W_k, ..., W_K with an image or matrix I, and the activation function σ is typically a rectified linear unit,

    \sigma(a) = \max(0, a)

or a sigmoid function,

    \sigma(a) = \frac{1}{1 + e^{-a}}

whose choice depends on empirical performance.

Figure 2.7: An illustration of the hierarchical features learned by a convolutional network. The network architecture consists of a composition of convolutional layers and subsampling operations, where the first layer in the network learns low-level features such as edges. The subsequent layers form compositions of the earlier features into more complex features such as object parts and, in the final layers, entire objects.

The parameter space of these learnable filters defines the set of candidate functions in the supervised learning problem of Equation 2.1, and an equivalent training algorithm can be defined as the task of finding the set of parameters that minimizes the optimization problem,

    W^{*} = \arg\min_{W} \sum_i L(y_i, f(x_i; W))    (2.5)

where f is a convolutional network with parameters W.
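A direct, unoptimized sketch of the convolutional layer in Equation 2.4 is given below (illustrative code; we use the cross-correlation convention and "valid" border handling common in deep learning frameworks, which differ from textbook convolution only by a filter flip).

```python
import numpy as np

def relu(a):
    """Rectified linear unit, sigma(a) = max(0, a)."""
    return np.maximum(0.0, a)

def conv_layer(I, W, sigma=relu):
    """Eq. 2.4 by brute force.

    I: (H, W, C) multi-channel image; W: (K, kh, kw, C) bank of K filters.
    Returns S with shape (H - kh + 1, W - kw + 1, K).
    """
    H, Wd, C = I.shape
    K, kh, kw, _ = W.shape
    out = np.zeros((H - kh + 1, Wd - kw + 1, K))
    for k in range(K):
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                # sum over m, n, c as in Eq. 2.4 (cross-correlation form)
                out[i, j, k] = np.sum(I[i:i + kh, j:j + kw, :] * W[k])
    return sigma(out)

I = np.ones((4, 4, 1))              # a constant single-channel image
W = np.ones((1, 3, 3, 1))           # a single 3x3 summing filter
S = conv_layer(I, W)
print(S.shape, S[0, 0, 0])  # (2, 2, 1) 9.0
```

Production implementations vectorize these loops and add padding and stride options, but the shape arithmetic and the per-output-pixel sum are exactly those of Equation 2.4.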
While convolutional networks are differentiable, a closed-form solution to the above optimization problem does not exist, and we rely on iterative gradient descent methods such as RMSprop and Adam [52].

The framework described in this subsection has been used to solve various computer vision and image analysis tasks such as classification, regression, object detection, and several others. In this thesis, we rely heavily on the task of semantic segmentation, which is the problem of partitioning an image into different regions with similar properties. In medical image analysis, semantic segmentation typically translates to highlighting an organ or tissue, which is useful for related tasks such as image-guided interventions and morphometry for screening and diagnosis [53, 54].

In the context of supervised learning and medical imaging, the semantic segmentation task fits into the optimization problem of Equation 2.5, where we acquire a dataset of medical images X and corresponding ground truth segmentation masks Y, typically generated via manual segmentation. Then, for an appropriate choice of the convolutional network architecture f, we seek to learn a set of parameters that minimizes a loss function describing the dissimilarity between the output of the function ŷ and the ground truth segmentation mask y. The two most common loss functions for this task are the average pixel-wise binary cross-entropy,

    L(y, \hat{y}) = -\frac{1}{N} \sum_i \left[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \right]

which treats the segmentation problem as a pixel-wise classification problem; and the Dice loss,

    L(y, \hat{y}) = 1 - \frac{2 \sum_i y_i \hat{y}_i}{\sum_i y_i^2 + \sum_i \hat{y}_i^2}

which measures the intersection between two sets.

Note that for this formulation of the semantic segmentation problem, the only requirements are a labeled dataset and an appropriate choice for the functional form of the convolutional network.
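The two loss functions above can be sketched directly on flattened masks (illustrative NumPy code; the small epsilon guard for the logarithm and denominator is our addition, and frameworks such as PyTorch and TensorFlow handle this stabilization internally).

```python
import numpy as np

EPS = 1e-7  # numerical guard (assumption, not part of the formulas)

def bce_loss(y, y_hat):
    """Average pixel-wise binary cross-entropy over a flattened mask."""
    y_hat = np.clip(y_hat, EPS, 1.0 - EPS)
    return float(-np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)))

def dice_loss(y, y_hat):
    """1 minus the (soft) Dice similarity coefficient."""
    num = 2.0 * np.sum(y * y_hat)
    den = np.sum(y ** 2) + np.sum(y_hat ** 2) + EPS
    return float(1.0 - num / den)

y = np.array([1.0, 1.0, 0.0, 0.0])       # ground truth mask (flattened)
print(dice_loss(y, y) < 1e-6)            # perfect overlap: ~0 loss
print(dice_loss(y, 1.0 - y))             # disjoint prediction: 1.0
print(bce_loss(y, y) < 1e-5)             # near-zero cross-entropy
```

Note the complementary behaviors: cross-entropy penalizes every pixel equally, while the Dice loss is driven by the overlap of the foreground region, which makes it popular when the organ occupies a small fraction of the image.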
In the following chapters, we sought to apply this supervised learning approach to develop robust methods for ultrasound image analysis without the need to explicitly account for the variations and artifacts presented in Section 2.1.

Chapter 3
Segmentation of placental ultrasound sweeps

3.1 Introduction

The placenta is an important organ that develops during pregnancy and mediates the exchange of nutrients between the mother and the fetus. Consequently, disorders of the placenta are associated with poor fetal growth and, perhaps less intuitively, with hypertensive disorders such as pre-eclampsia, eclampsia, and gestational hypertension. However, the severity and onset of these conditions remain largely unpredictable, resulting in incidence rates of up to 8% worldwide [13–15]. A recent review article reports that predictive models that include ultrasound imaging markers, including biomarkers from Doppler and B-mode ultrasound, show an increase in classification performance [55].

Manual extraction of imaging biomarkers for the development of predictive models requires the expertise of radiologists and/or ultrasound technicians and is typically time-consuming for large databases [17]. In this chapter, we focus on automating the task of segmenting the placenta from ultrasound sweeps. Placental segmentation is important for the measurement of placental size, volume, and thickness, which are reportedly associated with fetal disorders such as intrauterine growth restriction and pre-eclampsia [56, 57]. We sought to develop a system that is automatic, fast, and robust to variations arising from ultrasound parameters, such as acquisition with different machines, as well as variations in placentation, maternal demographics, and pregnancy outcomes. Unlike manual segmentation, fully automatic systems are not susceptible to intra- and inter-observer variability [17, 58].
Moreover, it is possible to deploy an automatic system without the need for experienced ultrasound technicians and/or radiologists, who are lacking in rural areas and regions with limited healthcare resources.

Related Work

Earlier computer-aided tools for volumetric analysis of the placenta include the VOCAL (GE Healthcare, United States) and XI VOCAL systems, which rely on the manual drawing of organ contours in different planes [41]. These systems are susceptible to inter- and intra-observer variability and require several minutes to process a single organ [17]. Later, a semi-automatic method for segmenting the placenta from 3D ultrasound scans was developed using the random walker algorithm [42]. This method requires only a manual initialization of the algorithm and is faster than manual outlining, but the time taken to process a single organ is still on the order of minutes. Moreover, manual initialization resulted in inter- and intra-observer variability. More recently, deep learning methods were applied to segmentation of the placenta in 2D and 3D ultrasound, achieving high Dice similarity coefficient values [59–61]. These methods, however, have been applied to single-frame 2D ultrasound and to ultrasound volumes acquired with 3D transducers. In this chapter, we focus on deep learning approaches to segmenting the placenta from a sweep acquired with 2D transducers by incorporating a combination of the techniques used in these studies.

3.2 Materials and Methods

Following approval by the University of British Columbia Clinical Research Ethics Board (Certificate No. H18-0119), 133 placental ultrasound sweeps acquired in the second trimester were retrieved from the British Columbia Women's Hospital PACS, and the corresponding delivery outcomes were retrieved from the Perinatal Services of British Columbia. All sweeps were acquired with two machines, namely the Voluson E8 with a C1-5-D probe (GE Healthcare, United States) and the iU22 with a C5-1 probe (Philips, Netherlands).
The maternal age of the sample ranged from 22 to 51 years, with a median and interquartile range (IQR) of 36 (7) years. The gestational ages at the time of acquisition ranged from 18 to 27 weeks, with a median (IQR) of 23 (2) weeks. In terms of disorders arising during pregnancy, 33 of the 133 pregnancies were diagnosed with pre-eclampsia, and 18 were diagnosed with gestational diabetes. Moreover, in terms of delivery outcomes, 31 of the 133 pregnancies delivered prematurely (between 28 and 37 weeks), and low birth weights (between 1000 and 2499 grams) were observed in 23 deliveries.

A placental sweep can be described as a time series of 2D ultrasound frames streamed from a freehand abdominal sweep. In different pregnancies, natural variations in placental formation and fetal growth result in different ultrasound appearances among patients. Moreover, the textural appearance of the placenta and other organs varies between different ultrasound machines under different acquisition settings. The difficulty of modeling these variations motivates the use of a supervised learning approach to automatically learn relevant features for segmentation. Therefore, we chose to explore existing deep learning-based segmentation models applied to 2D and 3D medical images as well as to spatiotemporal data such as video recordings.
In particular, we sought to apply and compare architectures that model temporal features in different ways and to observe their performance in segmenting the placenta from ultrasound sweeps.

3.2.1 2D fully convolutional networks with concatenated outputs

As a baseline model, we consider a segmentation model that consists of a single 2D fully convolutional network (FCN) applied to each frame of the sequence, where the outputs of the network at each frame are concatenated into an output sequence. The resulting model is one that learns a common set of spatial features extracted from each frame by maximizing the similarity between the output sequence and a ground truth sequence of segmentation maps. This is similar to a 2D image segmentation model trained on batches of images but differs in that the features are learned by maximizing the average similarity between two sets of frames. We chose this architecture as a baseline because it makes the simplifying assumptions that spatial features are common to all frames in a sequence and are not shared along the temporal dimension.

Figure 3.1: Sequence of frames extracted from two ultrasound sweeps of the placenta. In the first sweep (top row), the lower part of the placenta is revealed when comparing the leftmost to the rightmost frame, which represent two sections of the placenta similar to a tomographic scan. The same occurs in the second sweep (bottom row) as fetal structures become more visible when comparing the leftmost to the rightmost frame.

We chose the U-Net as the FCN architecture for the model, and a diagram of the segmentation model is provided in Figure 3.2. The U-Net was chosen because it has continued to achieve state-of-the-art performance on a wide variety of segmentation tasks despite proposed improvements to the original architecture [62, 63].
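The frame-wise structure of this baseline can be schematized as follows (illustrative code with a hypothetical thresholding function standing in for the U-Net): a single per-frame segmenter is applied independently to each frame, the outputs are concatenated along the temporal dimension, and the sequence loss is the mean of the frame-wise Dice losses against the ground-truth sequence.

```python
import numpy as np

def segment_frame(frame):
    """Hypothetical per-frame model: threshold at the frame mean.

    Stand-in for the shared 2D FCN; only the wiring around it matters here.
    """
    return (frame > frame.mean()).astype(float)

def segment_sweep(sweep):
    """Apply the same per-frame model to every frame, then concatenate."""
    return np.stack([segment_frame(f) for f in sweep])

def sweep_dice_loss(y, y_hat, eps=1e-7):
    """Average frame-wise Dice loss over the two sequences of masks."""
    losses = [1.0 - 2.0 * np.sum(a * b) / (np.sum(a ** 2) + np.sum(b ** 2) + eps)
              for a, b in zip(y, y_hat)]
    return float(np.mean(losses))

sweep = np.zeros((5, 8, 8))          # N = 5 frames of 8x8 "pixels"
sweep[:, 2:6, 2:6] = 1.0             # a bright region in every frame
pred = segment_sweep(sweep)
print(pred.shape)                    # (5, 8, 8)
print(sweep_dice_loss(sweep, pred) < 1e-6)  # thresholding recovers the mask
```

Because `segment_frame` has no state shared across frames, this sketch exhibits exactly the simplifying assumption of the baseline: no information flows along the temporal dimension.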
In our instance of the U-Net, we decreased the input image size from 572-by-572 to 128-by-128 pixels due to GPU memory limitations while still preserving boundaries between the different fetal anatomies in ultrasound (see Section 3.2.5 for machine specifications).

3.2.2 2D fully convolutional networks with recurrent layers

To improve on our baseline segmentation model, we sought to model the temporal dependence between each frame in the sequence. This dependence can be modelled using recurrent neural networks, which are a class of neural networks that incorporate recurrent states that connect latent variables along the temporal dimension of a sequence.

Figure 3.2: Ultrasound sweep segmentation model based on the concatenation of predictions from a U-Net applied to each frame in the sequence. Here, the ultrasound sweep x, predicted segmentation map ŷ, and ground truth segmentation map y all consist of N frames with dimension 128-by-128 pixels.

To preserve the spatial structure of features learned with a convolutional neural network along a sequence, we consider convolutional long short-term memory (ConvLSTM) layers, which learn convolutional features along with internal memory states that depend on the previous input of the sequence.
The mathematical formulation of ConvLSTMs is as follows,

i_t = σ(W_xi ∗ X_t + W_hi ∗ H_{t−1} + W_ci ◦ C_{t−1} + b_i)
f_t = σ(W_xf ∗ X_t + W_hf ∗ H_{t−1} + W_cf ◦ C_{t−1} + b_f)
C_t = f_t ◦ C_{t−1} + i_t ◦ tanh(W_xc ∗ X_t + W_hc ∗ H_{t−1} + b_c)
o_t = σ(W_xo ∗ X_t + W_ho ∗ H_{t−1} + W_co ◦ C_t + b_o)
H_t = o_t ◦ tanh(C_t)     (3.1)

where ∗ and ◦ are the convolution and Hadamard product operations, respectively; X_t denotes an element at position t of a sequence; H_t and C_t are, namely, the hidden state and cell state of the layer, which are internal states that establish a recurrence relation along the elements of an input sequence; and i_t, f_t, and o_t are denoted the input gate, forget gate, and output gate, which are logistic functions that control/scale the magnitude of internal states at each step of the sequence. The layer has a set of adjustable parameters {W_jk, b_k}, where j refers to the variable that is convolved/multiplied with the parameter, and k refers to the layer's internal state or gate associated with the operation.

We considered two ConvLSTM architectures to model the temporal dependence between the frames of an ultrasound sweep. The first architecture consists of a 2D FCN that takes each frame of the sequence as an input and feeds the output into a ConvLSTM (Figure 3.3). This architecture is similar to that presented in Section 3.2.1 with the difference that, here, the output prediction of each frame in a sequence is conditioned on the predictions of the preceding frames. The second architecture is similar to one presented by Anas et al. [64], where the temporal dependence is modeled in deeper layers of the network (Figure 3.4). This architecture differs significantly from the ones presented so far in that ConvLSTM layers are stacked to add complexity to the temporal representations (e.g., compositions of recurrence relations), and in that ConvLSTM layers are placed deeper in the network, where the dimensionality of the latent variables is smallest.
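One step of Equation 3.1 can be made concrete with a minimal single-channel numpy sketch. This is a toy illustration only (our models use `tf.keras.layers.ConvLSTM2D`): the 3-by-3 kernels are random, and the peephole weights W_ci, W_cf, W_co and the biases are collapsed to scalar zeros for brevity.

```python
import numpy as np
from scipy.signal import convolve2d

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv(w, a):
    # 'same'-padded 2D convolution, standing in for the * operation in Eq. 3.1
    return convolve2d(a, w, mode="same")

def convlstm_step(x_t, h_prev, c_prev, p):
    """One step of Equation 3.1 for a single-channel ConvLSTM. `p` holds
    the kernels W_jk (3x3 arrays) and, as a simplification, scalar
    peephole weights and biases b_k."""
    i_t = sigmoid(conv(p["wxi"], x_t) + conv(p["whi"], h_prev) + p["wci"] * c_prev + p["bi"])
    f_t = sigmoid(conv(p["wxf"], x_t) + conv(p["whf"], h_prev) + p["wcf"] * c_prev + p["bf"])
    c_t = f_t * c_prev + i_t * np.tanh(conv(p["wxc"], x_t) + conv(p["whc"], h_prev) + p["bc"])
    o_t = sigmoid(conv(p["wxo"], x_t) + conv(p["who"], h_prev) + p["wco"] * c_t + p["bo"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

rng = np.random.default_rng(0)
p = {k: 0.1 * rng.standard_normal((3, 3))
     for k in ("wxi", "whi", "wxf", "whf", "wxc", "whc", "wxo", "who")}
p.update({k: 0.0 for k in ("wci", "wcf", "wco", "bi", "bf", "bc", "bo")})
h = c = np.zeros((8, 8))
for x_t in rng.standard_normal((5, 8, 8)):  # a 5-frame toy sequence
    h, c = convlstm_step(x_t, h, c, p)
assert h.shape == (8, 8) and np.all(np.abs(h) < 1.0)
```

Note how H_t at the final step depends on every earlier frame through the recursion on (H, C), which is precisely the temporal conditioning the architectures below exploit.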
In convolutional networks, features with lower dimensionality are constructed via compositions of primitive convolutional features such as edges and local intensity variations and, therefore, this particular architecture models complex temporal dependencies among high-level features (i.e., curves, shapes, and global patterns) present in the frames of the sequence.

Figure 3.3: Ultrasound sweep segmentation model based on the spatiotemporal fusion of the outputs from a U-Net using a ConvLSTM layer. Here, the U-Net outputs x̂_t are passed through a ConvLSTM layer, where the output h_t depends on all preceding inputs {x̂_1, ..., x̂_{t−1}}, outputs {h_1, ..., h_{t−1}}, and internal layer states {c_1, ..., c_{t−1}} via the recurrence relation defined in Equation 3.1. The outputs of the ConvLSTM are then passed through a 2D convolution with sigmoid activations to generate the output segmentation map ŷ.

Figure 3.4: Ultrasound sweep segmentation model based on an FCN with stacked ConvLSTM features. In this architecture, the higher-level features are modelled by stacked ConvLSTM layers, which form a composition of several instances of the recurrence relation defined in Equation 3.1. This composition of recurrence relations increases the complexity of temporal dependences that can be represented by the model, particularly when compared to the models illustrated in Figures 3.2 and 3.3.

3.2.3 3D fully convolutional networks

Another way of modeling the dependence between frames in a sequence is to use the three-dimensional convolution operation. 3D convolutional networks are versatile and have found several applications in classification and segmentation tasks on three-dimensional and time-series data [65, 66]. Unlike LSTMs that capture temporal dependence through recurrence relations, 3D convolutional networks model temporal dependence through the receptive field of latent variables in the network. In other words, as three-dimensional convolutions and activation functions are composed through the network, the value of each output voxel becomes a function that depends on a larger spatiotemporal region of the original sequence.

To model the temporal dependence with three-dimensional convolutions, we employ a 3D FCN based on the V-Net architecture [67]. The V-Net was originally developed for segmentation on volumetric medical imaging modalities such as CT and MRI, and we adapt it to accept variable-length sequences as inputs by splitting a sweep into subparts of equal length with zero-padding and concatenating the outputs into a segmentation map with the same number of frames as the ground truth.

3.2.4 Dataset preparation and model validation

The frames of all ultrasound sweeps were resized to 128-by-128 pixels to overcome memory constraints. Standard data augmentation procedures such as random rotations, zooming, and mirroring were performed. The magnitude of rotations, mirroring, and zooming was limited to preserve the direction of ultrasound beam propagation in typical ultrasound images. Additionally, the augmentation parameters were shared for all frames of a given ultrasound sweep to preserve the temporal relationship between frames.

The performance of each model was measured with the Dice coefficient and 5-fold cross-validation. For each cross-validation step, the training folds were split into training and validation sets consisting of 80% and 20% of the data, respectively.
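The split-with-zero-padding adaptation for variable-length sweeps can be sketched as follows. This is an illustrative numpy mock-up under assumed names: `chunk_len=16` is an arbitrary value for illustration, and the identity lambda stands in for a trained fixed-length V-Net.

```python
import numpy as np

def split_pad(sweep, chunk_len):
    """Split a variable-length sweep of shape (N, H, W) into equal-length
    chunks, zero-padding the trailing chunk to `chunk_len` frames."""
    n = sweep.shape[0]
    n_chunks = -(-n // chunk_len)  # ceiling division
    padded = np.zeros((n_chunks * chunk_len,) + sweep.shape[1:], sweep.dtype)
    padded[:n] = sweep
    return padded.reshape((n_chunks, chunk_len) + sweep.shape[1:])

def segment_variable_length(sweep, net, chunk_len=16):
    """Run a fixed-length 3D network on each chunk, concatenate the
    outputs, and trim the padding so the result matches the ground
    truth frame count."""
    chunks = split_pad(sweep, chunk_len)
    out = np.concatenate([net(c) for c in chunks], axis=0)
    return out[: sweep.shape[0]]

identity_net = lambda c: c  # hypothetical placeholder for a trained V-Net
sweep = np.random.rand(34, 128, 128)
pred = segment_variable_length(sweep, identity_net, chunk_len=16)
assert pred.shape == sweep.shape
```

With the identity placeholder the round trip is lossless, which verifies that splitting, padding, and trimming preserve the original frame ordering.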
During model training, the Dice similarity coefficient was calculated on the held-out validation set at every epoch, and this value was used for early stopping to prevent model overfitting. All models were trained with the Adam optimizer [52] by minimizing the Dice coefficient loss,

L(Y, Ŷ) = 1 − DSC(Y, Ŷ) = 1 − (2 Σ_i y_i ŷ_i) / (Σ_i y_i² + Σ_i ŷ_i²)     (3.2)

where DSC refers to the Dice similarity coefficient, and Y and Ŷ are 3D tensors with entries y_i, ŷ_i ∈ [0, 1]. The test Dice coefficient values obtained with cross-validation were compared (difference of means) between all models using the Wilcoxon signed-rank test with Bonferroni p-value correction for multiple comparisons.

3.2.5 Implementation

All models were implemented in the Python programming language using the TensorFlow library version 2.1.0 with GPU support. All experiments were performed on a desktop workstation equipped with a 3.70GHz Core i7-8700K processor (Intel Corporation, United States) and a 16GB TITAN V GPU (Nvidia Corporation, United States).

3.3 Results

The cross-validation estimates of the Dice coefficient for all models are presented in Table 3.1, and the statistical comparison results between each model are given in Table 3.2. The U-Net with output concatenation performed significantly better than the stacked ConvLSTM model (p < 0.001). No significant differences were found between predictions from the U-Net with output concatenation, U-Net with ConvLSTM, and V-Net models. All models attained an average inference time below 0.02 seconds per frame (Table 3.1). Example segmentations are illustrated in Figure 3.5.
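Equation 3.2 translates directly into code. The sketch below is a numpy version for illustration (the training code would use TensorFlow ops instead); the small `eps` constant is our addition, guarding against division by zero on empty masks, and is not part of Equation 3.2.

```python
import numpy as np

def dice_loss(y_true, y_pred, eps=1e-7):
    """Dice coefficient loss of Equation 3.2 for soft predictions in [0, 1].
    `eps` is an implementation detail (not in the thesis) that avoids a
    zero denominator when both masks are empty."""
    intersection = np.sum(y_true * y_pred)
    dsc = 2.0 * intersection / (np.sum(y_true**2) + np.sum(y_pred**2) + eps)
    return 1.0 - dsc

y = np.zeros((4, 8, 8))
y[:, 2:6, 2:6] = 1.0                      # toy ground-truth sweep mask
assert dice_loss(y, y) < 1e-6             # perfect overlap -> loss near 0
assert abs(dice_loss(y, np.zeros_like(y)) - 1.0) < 1e-6  # no overlap -> loss 1
```

Because the loss is differentiable in ŷ, minimizing it with Adam directly maximizes the evaluation metric reported in the results below.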
Histograms for the Dice coefficient values are included in Appendix A, Figure A.2.

Table 3.1: Performance results for semantic segmentation of ultrasound sweeps.

Model                                     Dice coefficient (% mean ± stdev)   Inference time (sec/frame)
Model 1: U-Net w/ output concatenation    92.1 ± 7.5                          0.01
Model 2: U-Net + ConvLSTM                 92.1 ± 7.8                          0.01
Model 3: Stacked ConvLSTM                 90.0 ± 9.9                          0.02
Model 4: V-Net                            91.2 ± 9.0                          0.001

Table 3.2: P-values for pairwise Wilcoxon signed-rank tests between the Dice coefficient values of each model. The Bonferroni p-value correction was applied at the 1% significance level for 6 comparisons (p < 0.01/6 ≈ 0.0016). Please refer to Table 3.1 for model numbers.

           Model 2   Model 3   Model 4
Model 1    0.557     < 0.001   0.085
Model 2    -         < 0.001   0.127
Model 3    -         -         < 0.001

3.4 Discussion

The results indicate that models with recurrent layers performed similarly to or worse than models without recurrent layers for the task of segmenting the placenta from ultrasound sweeps. Given that recurrent models are more susceptible to vanishing and exploding gradients during training, the results suggest that FCNs should be chosen over recurrent models.

All models showed satisfactory predictions for ultrasound frames with partial attenuation artifacts but exhibited variable performance in frames with complete attenuation (shadow) artifacts (Figure 3.6). We hypothesize that this variance may be caused by uncertainty and lack of consensus when manually segmenting frames with shadow artifacts. For example, in Figure 3.6c, the shadow artifact is wide and we chose not to label that region as placenta, but in Figure 3.7b, the shadow artifact was narrow enough that we chose to interpolate the region into a single placenta segmentation label by looking at the adjacent frames. Without a deterministic criterion on how to label shadows, models may learn to predict different thresholds due to observer variability in the labeling process. One potential way to address this limitation is to model the uncertainty with probabilistic methods such as Bayesian deep learning and Monte Carlo dropout to provide confidence estimates alongside predictions [68, 69].

Figure 3.5: Example segmentations from the model with the highest performance (U-Net with output concatenation). The top row (a-d) illustrates the ground truth segmentation maps, and the bottom row (e-h) illustrates the output predictions from the model. Panels d and h illustrate a failure mode, where the model failed to identify a boundary between the placenta and the uterine wall.

All models exhibited robustness to changes in texture and speckle patterns without any explicit modeling, prior information, or preprocessing (Figure 3.7). This is particularly important for segmentation in ultrasound images because the texture appearance and speckle noise profile may change as different ultrasound machines and manufacturers employ different image processing techniques to improve image quality. We illustrate this difference in Figure 3.7 for sweeps acquired with the two different machines used in the acquisition of our dataset: the Philips iU22 and the GE Voluson E8. We note that models trained on fully featured cart-based ultrasound machines, such as the ones in this study, may not generalize well to sweeps acquired with portable ultrasound machines, which tend to produce images with poorer image quality [70].

All models showed millisecond segmentation speeds, which are faster than the non-machine learning approaches proposed in the literature [17, 42]. The models presented in this study show average inference below 0.02 seconds per frame, which indicates a potential for real-time segmentation systems at conventional B-mode frame rates (∼15 Hz).

3.5 Conclusion

We presented an analysis of deep learning approaches for the automatic segmentation of the placenta from ultrasound sweeps.
This is the first comparative study of spatiotemporal segmentation on placental ultrasound sweeps, as earlier studies have focused on semantic segmentation of single 2D placental ultrasound images and 3D placental ultrasound volumes. In this chapter, we proposed several deep learning models for spatiotemporal segmentation of medical images that were trained on a diverse dataset of 133 scans comprising 4531 frames. The model with the highest performance (U-Net with output concatenation) achieves a Dice coefficient comparable to manual segmentation and shows potential for real-time segmentation. The proposed model for automatic segmentation of the placenta can be used to automatically extract regions of interest for further analysis of placental morphology in ultrasound, which may correlate with placental disorders such as pre-eclampsia and intrauterine growth restriction. In terms of future work, the performance of these models is likely to improve with a larger and more diverse dataset in terms of demographics, pregnancy outcomes, acquisition parameters, and ultrasound machines; this is especially true considering that ultrasound sweeps are high dimensional, and more examples are needed to capture a dataset that is representative of a population. Additionally, memory limitations in our GPU required us to downsample the spatial dimensions of the ultrasound sweeps for training, resulting in a loss of spatial resolution. Further analysis should be carried out to determine whether training models on ultrasound sweeps with higher spatial resolution increases the performance in spatiotemporal segmentation, but this may require a significantly larger dataset due to the curse of dimensionality. Furthermore, due to observer variability associated with the manual labeling of ultrasound images, the proposed segmentation models would benefit from uncertainty estimation, which can help users exclude regions based on the confidence of predictions.

Figure 3.6: Example success and failure cases for images with partial attenuation and complete attenuation (shadow) artifacts. Using the same ultrasound sweep, we illustrate the ground truth segmentation and predictions on a frame without artifacts (a), a frame with partial attenuation (b), and a frame with complete attenuation (c). While all models were robust to partial attenuation, variable performance was seen with shadow artifacts. This variation in performance is likely due to uncertainty in the manual labeling of attenuated regions, where no fixed brightness threshold was stipulated to distinguish partial versus complete attenuation.

Figure 3.7: Example prediction cases for different speckle noise and texture appearances. The ultrasound frames on the left were captured with a Philips iU22 ultrasound machine and exhibit a higher amount of speckle and visually grainier texture than the ultrasound frames on the right, which were captured with a GE Voluson E8. All models exhibited a high degree of invariance to speckle noise and visual differences in texture without the inclusion of any additional features or prior information. Fourier spectra for the above examples are included in Appendix A, Figure A.1.

Chapter 4

Lumbar spine ultrasound augmentation

4.1 Introduction

Epidurals are a common form of regional anesthesia in obstetrics. Currently, the administration of epidurals is typically done through tactile feedback, where a puncture site is located via palpation of the patient's lumbar spine, and the needle is injected using a technique called loss of resistance. In this technique, the anesthesiologist drives the needle through until a change in pressure is felt, which indicates that the needle has successfully reached the region where the anesthetic should be delivered. The reported complication rates arising from needle misplacement are 1% for low-severity outcomes (e.g., post-procedure headaches) and up to 0.3% for moderate- to high-severity outcomes (e.g., seizures, cardiac arrest, paraplegia) [18].
Given that 57% to 71% of pregnancies in North America report the use of epidurals, it is important to reduce the percentage of complications [71]. Ultrasound guidance systems have been proposed as a way to reduce complications by providing visual feedback to the anesthesiologist before and during needle insertion [72]. The safety and portability of these systems make them feasible candidates for epidural guidance in obstetrics, but the interpretation of ultrasound images typically requires the expertise of radiologists or ultrasound technicians, as anesthesiologists are not traditionally trained to read ultrasound images. Moreover, with the increasing adoption of ultrasound guidance technology, studies report an increased demand to train new anesthesiologists in ultrasound-guided anesthesia [73].

As discussed in Chapter 2, several proposed ultrasound guidance systems have sought to improve the interpretability of ultrasound scans of the lumbar spine. A subset of these models rely on model-based registration, where a statistical shape model constructed from a pre-existing database of CT or MRI scans is registered to intraoperative ultrasound volumes, resulting in the augmentation of ultrasound anatomy with a patient-specific model that provides a complete depiction of the lumbar vertebrae [21]. Model-based registration methods, however, typically rely on registration to volumetric ultrasound data acquired with 3D ultrasound or position-tracked 2D ultrasound transducers, which are not commonly available in most clinical settings.

In order to overcome the above limitations, we sought to improve on model-based registration approaches by registering directly to 2D ultrasound. We propose a three-part approach (Figure 4.1): (1) classification of an incoming ultrasound image; (2) semantic segmentation of the lumbar spine surfaces (lamina); and (3) registration of a statistical shape model of the lumbar spine, constructed from CT scans, to the segmented lamina.
The statistical shape model is then overlayed onto the ultrasound image, providing additional information and context to the ultrasound anatomy.

Related Work

Our work is based on an earlier ultrasound guidance system that relies on the registration of a statistical shape model constructed from CT scans of the lumbar spine [50, 74]. In essence, this system relies on the acquisition of a three-dimensional scan of the patient's lumbar spine by stitching 2D ultrasound images using position-tracked transducers. Limited observational studies report that this model-based registration approach shows a high registration accuracy and clinical feasibility for ultrasound guidance in epidural and facet joint injections [74, 75]. In this chapter, we attempted to show that by leveraging machine learning and computer vision techniques, we can extract information that allows us to circumvent the need to acquire volumetric ultrasound data and register the statistical shape model directly to 2D ultrasound images.

Figure 4.1: Overview of the proposed registration approach. An input B-mode ultrasound image is fed into a classifier that determines whether the image belongs to the paramedian view of the lumbar spine. Then a segmentation model extracts the lamina bone surfaces, which serve as registration targets for a CT statistical shape model, where only the corresponding anatomy is selected for registration. The registered model is overlayed onto the original ultrasound image.

For the classification and segmentation of ultrasound images, we opted for machine learning and deep learning approaches for their ability to learn models that are robust to variations that are difficult to model explicitly [8]. These approaches have been successfully applied in common applications of ultrasound imaging including breast, liver, and thyroid ultrasound [76].
Similar approaches have also been applied specifically to lumbar spine ultrasound, with applications in scoliosis visualization and guidance for surgery [77, 78]. Moreover, machine learning and deep learning approaches to segmentation offer a direct improvement over the methods used in previous ultrasound guidance systems, where bone surfaces are extracted through hand-crafted features [79, 80].

4.2 Methods

Our proposed method consists of registering a statistical shape model of the lumbar spine, constructed from CT scans, onto a single 2D ultrasound image. This is achieved by applying classification and segmentation techniques to extract information from the ultrasound image, which allows us to determine the correspondence between the image and the statistical shape model for subsequent registration. Specifically, we classify an incoming ultrasound image as belonging to the paramedian plane, which is reported as the plane that provides optimal visibility of the needle target for epidurals [81].

4.2.1 Classification of ultrasound images in the paramedian plane

A 2D ultrasound image of the lumbar spine provides a limited view of the anatomy that displays a single plane with at most three vertebrae. In order to register a 3D deformable model of the lumbar spine to a 2D ultrasound image, it is necessary to constrain the space of correspondences between them. We propose to achieve this by identifying the plane of the ultrasound image and subsequently segmenting the bone surfaces present in the image. The plane identification is expressed as a binary classification problem, where the ultrasound image is classified as belonging to the paramedian plane or not. If the ultrasound image belongs to the paramedian plane, we segment the bone surfaces present in the image which, in that particular plane, are expected to be the lamina.
By extracting this information from the image, we have characterized the particular bone structures present in the image and their orientation, and this is used to constrain the registration problem.

The ultrasound image classification model is based on a neural network trained on features extracted with a Hadamard transform [82]. The Hadamard transform is similar to a discrete Fourier transform in that it decomposes a discrete signal into a set of orthogonal functions called Walsh functions. Walsh functions are analogous to trigonometric functions in Fourier analysis in the sense that any discrete function can be represented as a superposition of Walsh functions. In the classification model, Hadamard features are extracted from the entire image as well as from partitions of the image, and these are concatenated into a feature vector that describes global and local information about the direction of ultrasound intensity changes. These feature vectors are then used to train an artificial neural network as a binary classifier that detects ultrasound images in the paramedian plane.

4.2.2 Semantic segmentation of lamina in the paramedian plane

Once the ultrasound image is classified as belonging to the paramedian plane, we proceed with the segmentation of bone surfaces which, in this particular plane, are the vertebral lamina. The segmented lamina serve as targets for the registration of the statistical shape model of the lumbar spine.

The model used for segmentation is a modified version of the U-Net, a deep convolutional neural network that achieves state-of-the-art performance on medical image segmentation tasks in several modalities, including ultrasound [62, 83]. This method provides an advantage over earlier methods for extracting bone from ultrasound, which rely on engineered features based on Log-Gabor filters, by learning features that directly maximize the segmentation performance over a dataset of ultrasound images [79, 80].
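The global-plus-local Hadamard feature extraction can be sketched as follows. This is a generic illustration, not the exact scheme of [82]: the partition grid, the number of retained coefficients, and the natural (non-sequency) coefficient ordering are all simplifying assumptions made here.

```python
import numpy as np
from scipy.linalg import hadamard

def hadamard_features(img, n_keep=16):
    """Project a square image (side length a power of two) onto 2D Walsh
    functions via H @ img @ H and keep a fixed low-index subset of
    coefficients (sequency reordering is omitted for brevity)."""
    n = img.shape[0]
    h = hadamard(n).astype(float)
    coeffs = h @ img @ h / n  # 2D Hadamard transform
    k = int(np.sqrt(n_keep))
    return coeffs[:k, :k].ravel()

def global_local_features(img, grid=2):
    """Concatenate whole-image features with features from a grid of
    image partitions, capturing global and local intensity structure."""
    n = img.shape[0]
    feats = [hadamard_features(img)]
    step = n // grid
    for r in range(grid):
        for c in range(grid):
            feats.append(hadamard_features(img[r*step:(r+1)*step, c*step:(c+1)*step]))
    return np.concatenate(feats)

img = np.random.rand(64, 64)
vec = global_local_features(img, grid=2)
assert vec.shape == (5 * 16,)  # 1 global block + 4 local blocks of 16 features
```

The resulting fixed-length vector is what a small feed-forward classifier, such as the 25-hidden-unit network described in Section 4.3.1, would consume.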
Consequently, the U-Net depends on a dataset containing a large number of variations in order for it to learn features that are robust and that generalize to new and unseen ultrasound images.

4.2.3 CT statistical shape model of the lumbar spine

We utilized a restricted version of the 3D statistical shape model presented by Rasoulian et al. [50, 84]. This model consists of a meshed representation of the lumbar vertebrae, which are deformable by a set of transformations estimated from a database of CT scans. The transformations are estimated via a generalization of principal component analysis, called principal geodesic analysis, which is performed on Riemannian manifolds [85]. In this specific case, the Riemannian manifold of interest is the Lie group of 3D similarity transformations, where the geodesic between two points on the manifold is a similarity transformation. As in principal component analysis, principal geodesic analysis computes principal geodesics that form an orthogonal basis which, in the context of the statistical shape model, represents transformations that are anatomically consistent with the CT scan database.

The original model by Rasoulian et al. independently models the shape and pose variations of the lumbar spine, where shape refers to the geometry of the surface of the vertebrae, and pose refers to the relative arrangement of the lumbar vertebrae. In this chapter, we consider only pose transformations due to the inability to estimate shape parameters from a single ultrasound plane. Since the term statistical shape model is more commonly known in the literature, any mentions of shape hereafter refer to what is defined as pose in the original paper. The model is then reduced to a triangle face-vertex mesh, where the vertices are a set of points {x_1^1, ..., x_m^l, ..., x_M^L}, x_m^l ∈ R³, consisting of M points belonging to L vertebrae, and a shape transformation function,

Φ(x_m^l, θ) = (x_m^l)^T ∏_{i=1}^{I} exp(θ_i log T_i^l)     (4.1)

where exp and log are the matrix exponential and matrix logarithm functions. I denotes the number of principal geodesics extracted from the CT scans. T_i^l is the 3D similarity transformation matrix corresponding to principal geodesic i applied to points belonging to vertebra l, and θ_i is a coefficient denoting the magnitude of the transformation. The shape transformations are illustrated in Figure 4.2.

4.2.4 Registration of 3D statistical shape model to 2D ultrasound

We sought to register the statistical shape model to the lamina segmented from an ultrasound image using a point-set registration method. First, we generated registration targets by extracting a point set from the binary mask obtained from the segmentation model. This was done by applying morphological thinning and extracting the pixel coordinates of the mask, resulting in a 2D point set. We then expand the dimensionality of the 2D point set by considering it a projection onto an arbitrary 3D plane which, for simplicity, we choose to be the xy plane. From the initial classification of the ultrasound image, we predicted that the image belongs to the paramedian view, which defines an approximate plane and vector in 3D.

Figure 4.2: Statistical shape model of the lumbar spine, constructed from a training set of 64 CT scans. The mean shape of the statistical shape model (center diagram on top and bottom rows) can be transformed (Equation 4.1) according to shape variations statistically determined from the training set (mode 1, top row; mode 3, bottom row).

Equivalently, these steps define the xy plane as the paramedian view. The paramedian view also defines the anatomy that is expected to be present in the ultrasound image, and we can now restrict the points in the statistical shape model that are used for registration.
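Equation 4.1 can be sketched numerically with `scipy.linalg.expm` and `logm`. This toy example is ours, not the thesis implementation: it assumes a row-vector homogeneous-coordinate convention (x @ M), a single fabricated principal geodesic T1, and one point.

```python
import numpy as np
from scipy.linalg import expm, logm

def shape_transform(x, thetas, Ts):
    """Apply Equation 4.1 to a 3D point x: scale each principal-geodesic
    transformation T_i (a 4x4 homogeneous matrix) by theta_i in the
    matrix-logarithm domain and compose the results."""
    M = np.eye(4)
    for theta, T in zip(thetas, Ts):
        M = M @ expm(theta * logm(T))
    xh = np.append(x, 1.0)  # homogeneous coordinates
    return (xh @ M)[:3]

# A toy principal geodesic: a small rotation about z plus a translation,
# written for the row-vector convention used above.
c, s = np.cos(0.1), np.sin(0.1)
T1 = np.array([[c, -s, 0.0, 0.0],
               [s,  c, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0],
               [0.5, 0.2, 0.0, 1.0]])
x = np.array([10.0, 0.0, 0.0])
assert np.allclose(shape_transform(x, [0.0], [T1]), x)  # theta = 0 -> identity
moved = shape_transform(x, [1.0], [T1])
assert np.allclose(moved, (np.append(x, 1.0) @ T1)[:3])  # theta = 1 recovers T1
```

Scaling θ_i in the log domain is what keeps intermediate transformations on the similarity-transformation manifold, rather than linearly blending matrix entries.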
With this information, we can approximately align the statistical shape model with the registration targets, and this serves as the initial setup for the registration algorithm.

The registration algorithm is similar to the approach used by Rasoulian et al., with the difference that a constrained optimization was performed to mitigate the ill-posedness of the 3D-to-2D registration problem. The original algorithm employs a point-set registration cost function derived from the Coherent Point Drift algorithm, which considers the points in the statistical shape model as centroids of a Gaussian mixture model (GMM) and the ultrasound points as data drawn from the mixture [84, 86, 87],

Q = (1 / 2σ²) Σ_{m,n,l} P(x_m^l | y_n) ‖y_n − RΦ(x_m^l; θ) − t‖² + (3/2) log σ² Σ_{m,n} P(x_m^l | y_n) + λ θ^T Γ θ

where R is a rotation matrix, t is a translation vector, σ² is the variance of the Gaussian mixture (assumed to be the same for all components), Φ is the shape transformation function given by Equation (4.1), and P(x_m^l | y_n) is the posterior probability of the GMM centroids, given a previous estimate of the registration parameters R, t, θ, and σ². In addition, we penalize the shape transformation parameters via a weighted L2 regularization, where λ is the norm penalty coefficient and Γ is a diagonal matrix with entries α_i^{-1}, where α_i is the variance explained by the i-th principal geodesic. The cost function is optimized in two expectation-maximization stages. In the first stage, we optimize for the rigid transformation parameters R and t until the variance of the Gaussian mixture, σ², crosses a certain threshold. This prioritizes a larger portion of the cost to be attributed to a rigid misalignment, which we found to improve the registration performance.
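The posterior P(x_m^l | y_n) used in the cost above is computed in the E-step. The sketch below is a simplified numpy illustration under assumptions of ours: isotropic variance, no outlier/uniform component (the full CPD formulation includes one), and a flat indexing of all model points.

```python
import numpy as np

def gmm_posteriors(model_pts, target_pts, sigma2, w=0.0):
    """E-step of a CPD-style registration: posterior probability that
    target point y_n was generated by model centroid x_m, for isotropic
    variance sigma2. `w` is a uniform-outlier weight (0 disables it)."""
    d2 = ((target_pts[:, None, :] - model_pts[None, :, :]) ** 2).sum(-1)
    g = np.exp(-d2 / (2.0 * sigma2))            # shape (N targets, M centroids)
    denom = g.sum(axis=1, keepdims=True)
    denom = denom + (w if w > 0 else 1e-12)     # guard against empty rows
    return g / denom

model = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
targets = np.array([[0.5, 0.1, 0.0], [9.7, -0.2, 0.0]])
P = gmm_posteriors(model, targets, sigma2=1.0)
assert np.allclose(P.sum(axis=1), 1.0)  # each target's posteriors sum to 1
assert P[0, 0] > 0.99 and P[1, 1] > 0.99
```

These soft correspondences are what weight the squared distances in Q, so each target point pulls on every nearby model point rather than on a single hard match.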
We subsequently allow the algorithm to optimize for both the rigid and shape transformation parameters θ until convergence.

4.3 Experiments and Results

4.3.1 Model training and evaluation

We extracted 438 2D ultrasound images of the lumbar spine, with 215 paramedian view images and 223 non-paramedian view images, from an existing dataset of 104 3D ultrasound scans acquired by an expert sonographer from 13 volunteers with informed written consent (University of British Columbia Clinical Research Ethics Board certificate number H07-0691). The 3D scans were acquired with a 3D motorized ultrasound transducer, which constructs a 3D scan via a motorized system that acquires a sequence of 2D images at different angles, and we consider the frames of the 3D scans as an approximation of native 2D ultrasound images. All volunteers were aged between 20 and 35 years and had a BMI under 30. All 3D ultrasound volumes were acquired with a SonixTouch ultrasound machine (Analogic Corporation, Richmond, Canada). Segmentation labels were manually generated by the expert sonographer. All images were resized to 364-by-464 pixels with a spatial resolution of 0.25 mm per pixel.

The 438 ultrasound images were split into training, validation, and test datasets containing 256, 90, and 92 images, respectively. The training and validation sets were used to train and select the hyperparameters of the paramedian view classification model, where we arrived at an artificial neural network with one hidden layer of 25 hidden units. The classification model was subsequently fit on a concatenation of the training and validation datasets, and its performance was evaluated on the test set.

For the lamina segmentation model, we separated the paramedian view images from the above dataset into training, validation, and test datasets consisting of 125, 43, and 43 paramedian view images, respectively. Similarly, we performed hyperparameter selection using the training and validation sets.
The selected model contained 32 base filters and transpose convolution layers, instead of the 64 base filters and upsampling layers in the original architecture. The resulting architecture was then fit to a concatenation of the training and validation sets and subsequently evaluated on the test set with the Dice coefficient. The classification and registration performance metrics are reported in Table 4.1.

Table 4.1: Performance of lamina classification, segmentation, and statistical shape model registration.

Task                               Performance metric   Value
Lamina classification              Accuracy             0.90
                                   Recall               0.84
                                   Precision            0.94
Lamina segmentation                Dice coefficient     74.5 ± 4.9%
Registration to extracted lamina   RMSE                 1.2 ± 0.2 mm
Model to ground truth distance     RMSE                 1.4 ± 0.3 mm

4.3.2 Registration and visualization

The performance of the registration algorithm was evaluated on segmentation mask predictions from the test set. Two metrics were used to quantify the performance: (1) the root mean squared error (RMSE) between the registration targets extracted from the predicted segmentation mask and the closest surface of the registered model, where the surface is defined by the resulting face-vertex mesh after registration; and (2) the RMSE between the registration targets extracted from the ground truth segmentation mask and the closest surface of the model. The registered model is then overlayed onto the ultrasound image by generating a cross-section of the model that intersects the paramedian plane (Figure 4.3) or by overlaying a wireframe representation of the model that depicts the 3D anatomy of the spine (Figure 4.4). Descriptive statistics plots are included in Appendix A (Figure A.3).

4.4 Discussion

Registration of the statistical shape model to the 2D targets is an ill-posed problem because the match between corresponding features is not unique and is highly dependent on initialization.
Our method leverages paramedian view classification and lamina segmentation to extract anatomical information that allows us to narrow down the number of potential correspondences between the ultrasound image and the statistical shape model. While paramedian view classification enables us to select a subset of the statistical shape model for registration, our method is not able to determine the vertebral level location.

Figure 4.3: Example visualization of the registered model to an ultrasound image of the L2-3 and L3-4 intervertebral levels in the paramedian view (a), and comparisons with the ground truth lamina segmentations from an expert sonographer (b, c, d).

Figure 4.4: Example wireframe visualization of the registered model onto ultrasound images of the L2-3 and L3-4 intervertebral levels in the paramedian view. The statistical shape model depicts anatomical features that are not present in the plane of ultrasound acquisition, where the complete visualization of the model provides an approximate idea of where the ultrasound plane intersects the lumbar spine.

Automatic vertebral level classification is a difficult problem given the limited acoustic window of a traditional 2D ultrasound transducer, although semi-automatic methods exist which require a specific scanning protocol [88]. Furthermore, our method was not able to detect the lateral location of the transducer relative to the mid-sagittal plane. This is because the spine is approximately symmetrical about the mid-sagittal plane, which makes paramedian view ultrasound images from both sides indistinguishable by the classifier.
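The side ambiguity can be illustrated numerically: for a model that is symmetric about the mid-sagittal plane, a correspondence-based cost is unchanged when the extracted points are mirrored, so no such cost can identify the side. A minimal sketch with made-up 2D points (not the thesis registration cost):

```python
# Hedged sketch: a nearest-neighbour cost against a mirror-symmetric model
# cannot distinguish left-side from right-side point sets.
import numpy as np

model = np.array([[-2.0, 1.0], [2.0, 1.0], [0.0, 0.0]])  # symmetric about x = 0

def cost(points):
    """Sum of distances from each point to its nearest model point."""
    d = np.linalg.norm(points[:, None, :] - model[None, :, :], axis=2)
    return float(d.min(axis=1).sum())

left = np.array([[-2.1, 1.2], [-0.1, 0.1]])
right = left * np.array([-1.0, 1.0])   # mirror about the mid-sagittal plane
print(cost(left), cost(right))         # identical costs
```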
Nevertheless, our method generalizes to any ultrasound view of the spine that can be classified by a machine learning model.

Our evaluation methodology is limited by the lack of ground truth correspondences between the model points and the target points on the ultrasound. To obtain these ground truth labels, it would be necessary to acquire CT scans and ultrasound images from the same patient, which is infeasible due to cost and radiation safety. A safer approach would be to let an expert generate correspondences by looking at unpaired images, but this would likely require additional ultrasound data to provide context. Paired CT and ultrasound images from phantoms would also be an option, but several phantoms would be required to capture the natural variations in lumbar spine shape. Lastly, there is no evaluation criterion for anatomical features of the model that have no corresponding counterpart in the ultrasound image. This is because we have restricted the individual vertebrae in our model to move rigidly, and we assume that a good fit to the posterior surfaces of the model equates to a good fit elsewhere in the vertebrae. This is an optimistic assumption, as there are non-rigid variations in the vertebrae of different people, but the information contained in a single ultrasound image is insufficient to estimate the properties of non-rigid deformations.

Regarding lamina segmentation, the resulting Dice coefficient mean and standard deviation of 74.5 ± 4.9% indicate considerable disagreement between the predicted segmentation maps and the manually segmented ground truths. Qualitatively, we observed that segments of the lamina were missed by the segmentation algorithm, resulting in discontinuities in the predicted segmentation mask (Figure 4.5). The registration algorithm was able to compensate in cases where only small segments of the lamina were missed, as the correctly segmented portions appear to provide sufficient information for a satisfactory solution.
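Such discontinuities can be flagged automatically, for example by counting contiguous runs of foreground pixels in the predicted mask. A minimal sketch on a toy one-dimensional mask row:

```python
# Hedged sketch: a predicted lamina mask with more than one contiguous
# segment suggests part of the lamina was missed by the segmentation model.
import numpy as np

def count_segments(row):
    """Number of contiguous runs of 1s in a binary row."""
    padded = np.concatenate([[0], row.astype(int)])
    return int(np.sum(np.diff(padded) == 1))  # count 0 -> 1 transitions

row = np.zeros(12, dtype=np.uint8)
row[1:5] = 1    # one lamina segment
row[8:11] = 1   # a second segment, separated by a gap
print(count_segments(row))  # 2 -> the predicted lamina is discontinuous
```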
Figure 4.5: Example of cases with poor segmentation performance. The registration algorithm was able to compensate for partially segmented lamina, as the correctly segmented portions appear to provide sufficient information. Regions of the lamina that were missed by the segmentation algorithm are highlighted in red.

The performance of our segmentation algorithm is likely attributable to the small size of our dataset, and we can expect the performance to improve significantly with a larger dataset of paramedian ultrasound images. Moreover, intraobserver variability in the manual labeling of ultrasound images may have introduced additional errors in the segmentation performance of the algorithm. The extent of the effect of intraobserver variability may be quantified with repeatability studies and/or estimated with uncertainty estimation techniques in machine learning.

4.5 Conclusion

The interpretation of ultrasound images by anesthesiologists poses a challenge that requires additional training. We proposed a lumbar spine ultrasound image augmentation method based on the registration of a 3D statistical shape model, constructed from CT scans, directly to 2D ultrasound images. The augmentation of ultrasound images with a patient-specific model of the lumbar spine provides more context to the ultrasound features and shows promise as a potential tool for guidance and training in ultrasound-guided anesthesia. Regarding our registration method, potential avenues for future work include methods for the stitching of ultrasound planes that do not require position tracking devices, which would allow the acquisition of adjacent planes for registration.
Furthermore, our work does not address the visualization of epidural needles in ultrasound, which is a crucial aspect of real-time ultrasound guidance.

Chapter 5

Conclusion

In this thesis, we presented two machine learning systems for ultrasound imaging in obstetrics: a system for automatic placenta segmentation from ultrasound sweeps, and a system for ultrasound image augmentation via CT statistical shape model registration for ultrasound-guided epidurals. Our contribution to automatic placenta segmentation is a fast and robust segmentation model that can automatically segment the placenta with accuracy comparable to manual segmentation. For ultrasound-guided epidurals, we contributed a system that automatically registers and superimposes a CT statistical shape model onto 2D ultrasound images of the lumbar spine.

For the work on automatic placenta segmentation from ultrasound sweeps, we analyzed four different deep learning architectures for spatiotemporal semantic segmentation applied to a labeled dataset of 133 ultrasound sweeps, consisting of a total of 4531 frames. This labeled dataset is a subset of a larger dataset requested from the EMMA clinic at the BC Women's Hospital, consisting of approximately 600 ultrasound sweeps of the placenta with associated maternal and newborn outcomes. As discussed in Chapter 3, a key motivation for developing an automatic segmentation system was to provide a way to automatically process large datasets for subsequent analysis of placental features in ultrasound and how they relate to placenta-mediated outcomes in pregnancy.
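As an illustration of the kind of downstream measurement such a pipeline enables, placental volume and a simple thickness proxy can be computed from a stack of predicted masks. This is a hedged sketch: the pixel spacing and inter-frame spacing are assumed values for illustration, not those of the clinical sweeps.

```python
# Hedged sketch: placental dimensions from a (T, H, W) stack of binary
# segmentation masks. Spacings below are illustrative assumptions.
import numpy as np

def volume_ml(masks, px_mm=0.25, frame_spacing_mm=1.0):
    """Approximate volume in millilitres from a (T, H, W) mask stack."""
    voxel_mm3 = px_mm * px_mm * frame_spacing_mm
    return masks.sum() * voxel_mm3 / 1000.0   # mm^3 -> mL

def max_thickness_mm(masks, px_mm=0.25):
    """Maximum per-column mask extent along depth, a simple thickness proxy."""
    return masks.sum(axis=1).max() * px_mm    # count along image depth

sweep = np.zeros((5, 100, 100), dtype=np.uint8)
sweep[:, 40:60, 10:90] = 1        # a 20-pixel-thick slab in every frame
print(volume_ml(sweep))           # 5*20*80 voxels * 0.0625 mm^3 = 0.5 mL
print(max_thickness_mm(sweep))    # 20 px * 0.25 mm = 5.0 mm
```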
The proposed automatic segmentation system offers a way to automatically measure placental dimensions such as volume and thickness, but future work may be carried out to analyze features such as placental texture and specific types of lesions.

Our work on lumbar spine ultrasound augmentation for ultrasound-guided epidurals is summarized by a 3D-to-2D registration approach, where a 3D statistical shape model of the lumbar vertebrae from CT scans is registered to 2D ultrasound images of the lumbar spine. In particular, we demonstrated a way to leverage image classification and segmentation to extract features from 2D ultrasound that provide context about the location where the image intersects the lumbar spine, and to use that information to register a statistical shape model that provides a patient-specific visualization of the lumbar spine. Furthermore, the proposed method is generalizable to any view of the lumbar spine that can be classified and segmented by machine learning models such as the ones used in Chapter 4. While our primary motivation was to develop a system for ultrasound epidural guidance, the proposed system may also be useful as an educational tool for medical students and residents, as the increasing adoption of ultrasound guidance has also resulted in the development of simulation-based methods that are more scalable than traditional cadaver models [73, 89, 90].

5.1 Limitations and future work

As mentioned above, one of the motivations of our work was to segment the placenta for subsequent analysis of its ultrasound appearance and how it is associated with placenta-mediated diseases. However, one may attempt to carry out this analysis by training image classification models directly on the ultrasound sweeps without any prior segmentation. We attempted this early on via transfer learning with state-of-the-art image classification models and experienced severe overfitting.
We subsequently concluded that our dataset size was too small for this particular type of analysis, and acquiring more ultrasound sweeps, especially sweeps with associated outcomes, was not an option. We also concluded that, to potentially reduce the dataset size requirement for classification, one could segment the placenta beforehand to incorporate the prior knowledge that only the placenta is relevant for predicting placenta-mediated outcomes. The intuition behind this is that a segmentation preprocessing step prevents the subsequent training of a classification model from finding correlations between the placenta-mediated outcomes and irrelevant pixels in the ultrasound sweep. However, we recognize that while we have the prior knowledge that certain outcomes are placenta-mediated, there is not yet, to the extent of our knowledge, a clinical consensus on the use of placental texture and lesions in ultrasound as predictors of placenta-mediated outcomes.

Regarding our work in lumbar spine augmentation, we demonstrated that our registration of a 3D statistical shape model to 2D ultrasound produces satisfactory visualizations based on particular validation criteria that describe how well the corresponding structures visible in each modality fit together. This set of validation criteria poses a key limitation of the proposed approach, as we found no reliable way to validate structures like the vertebral bodies and spinous processes, which are present in the statistical shape model but not in the ultrasound images. This same limitation applies to the earlier methods for statistical shape model registration to stitched ultrasound images and 3D ultrasound discussed in Chapter 4. Therefore, we recognize that a study with paired ultrasound and CT scans acquired from the same set of patients is necessary to verify the validity of the proposed approach in a clinical setting.
Further work in ultrasound augmentation includes the use of image-to-image translation models based on generative adversarial networks, which have recently been shown to transform images from one medical imaging modality to another (e.g., US to MR, CT to MR, PET to CT) [91–93]. However, reliable validation of the image-to-image translation task still requires paired images from both modalities.

Bibliography

[1] L. Porto and R. Rohling, “Improving interpretability of 2-D ultrasound of the lumbar spine,” in 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI). IEEE, 2020, pp. 1–5. → page vi

[2] L. J. Salomon, Z. Alfirevic, F. Da Silva Costa, R. Deter, F. Figueras, T. Ghi, P. Glanc, A. Khalil, W. Lee, R. Napolitano et al., “ISUOG practice guidelines: ultrasound assessment of fetal biometry and growth,” Ultrasound in Obstetrics & Gynecology, vol. 53, no. 6, pp. 715–723, 2019. → page 1

[3] A. Bhide, G. Acharya, C. Bilardo, C. Brezinka, D. Cafici, E. Hernandez-Andrade, K. Kalache, J. Kingdom, T. Kiserud, W. Lee et al., “ISUOG practice guidelines: use of Doppler ultrasonography in obstetrics,” Ultrasound in Obstetrics & Gynecology, vol. 41, no. 2, p. 233, 2013. → page 1

[4] J. S. Dashe, D. D. McIntire, and D. M. Twickler, “Maternal obesity limits the ultrasound evaluation of fetal anatomy,” Journal of Ultrasound in Medicine, vol. 28, no. 8, pp. 1025–1030, 2009. → page 1

[5] R. N. Uppot, D. V. Sahani, P. F. Hahn, M. K. Kalra, S. S. Saini, and P. R. Mueller, “Effect of obesity on image quality: fifteen-year longitudinal study for evaluation of dictated radiology reports,” Radiology, vol. 240, no. 2, pp. 435–439, 2006.

[6] O. Picone, I. Simon, A. Benachi, F. Brunelle, and P. Sonigo, “Comparison between ultrasound and magnetic resonance imaging in assessment of fetal cytomegalovirus infection,” Prenatal Diagnosis, vol. 28, no. 8, pp. 753–758, 2008. → page 1

[7] G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. Van Der Laak, B. Van Ginneken, and C. I.
Sa´nchez, “Asurvey on deep learning in medical image analysis,” Medical ImageAnalysis, vol. 42, pp. 60–88, 2017. → page 157[8] S. Liu, Y. Wang, X. Yang, B. Lei, L. Liu, S. X. Li, D. Ni, and T. Wang,“Deep learning in medical ultrasound analysis: a review,” Engineering,vol. 5, no. 2, pp. 261–275, 2019. → pages 1, 41[9] S. Burlone, L. Moore, and W. Johnson, “Overcoming barriers to accessingobstetric care in underserved communities,” Obstetrics & Gynecology, vol.134, no. 2, pp. 271–275, 2019. → page 2[10] A. Banke-Thomas, K. Wright, and L. Collins, “Assessing geographicaldistribution and accessibility of emergency obstetric care in sub-saharanafrica: a systematic review,” Journal of Global Health, vol. 9, no. 1, 2019.→ page 2[11] M. Haggstrom et al., “Medical gallery of mikael haggstrom 2014,”WikiJournal of Medicine, vol. 1, no. 2, p. 1, 2014. → pages xi, 2[12] L. Alamo, A. Anaye, J. Rey, A. Denys, G. Bongartz, S. Terraz, S. Artemisia,R. Meuli, and S. Schmidt, “Detection of suspected placental invasion by mri:do the results depend on observer’experience?” European Journal ofRadiology, vol. 82, no. 2, pp. e51–e57, 2013. → pages xi, 2[13] J. T. Henderson, J. H. Thompson, B. U. Burda, and A. Cantor,“Preeclampsia Screening: Evidence Report and Systematic Review for theUS Preventive Services Task Force,” JAMA, vol. 317, no. 16, pp.1668–1683, 04 2017. → pages 3, 23[14] P. Guerby and E. Bujold, “Early Detection and Prevention of IntrauterineGrowth Restriction and Its Consequences,” JAMA Pediatrics, 05 2020. →page 3[15] L. Ghulmiyyah and B. Sibai, “Maternal mortality frompreeclampsia/eclampsia,” Seminars in Perinatology, vol. 36, no. 1, pp. 56 –59, 2012, maternal Mortality. → pages 3, 23[16] A. Sotiriadis, E. Hernandez-Andrade, F. da Silva Costa, T. Ghi, P. Glanc,A. Khalil, W. Martins, A. Odibo, A. Papageorghiou, L. Salomon et al.,“Isuog practice guidelines: role of ultrasound in screening for and follow-upof pre-eclampsia,” Ultrasound in Obstetrics & Gynecology, vol. 53, no. 
1,pp. 7–22, 2019. → pages 3, 12, 13[17] K. Cheong, K. Leung, T. Li, H. Chan, Y. Lee, and M. Tang, “Comparison ofinter-and intraobserver agreement and reliability between three differenttypes of placental volume measurement technique (xi vocal™, vocal™ and58multiplanar) and validity in the in-vitro setting,” Ultrasound in Obstetricsand Gynecology, vol. 36, no. 2, pp. 210–217, 2010. → pages 3, 23, 24, 35[18] K. Allman, I. Wilson, and A. O’Donnell, Oxford Handbook of Anaesthesia.Oxford university press, 2016. → pages 4, 15, 39[19] J. M. Neal, R. Brull, J.-L. Horn, S. S. Liu, C. J. McCartney, A. Perlas, F. V.Salinas, and B. C.-h. Tsui, “The second american society of regionalanesthesia and pain medicine evidence-based medicine assessment ofultrasound-guided regional anesthesia: executive summary,” RegionalAnesthesia & Pain Medicine, vol. 41, no. 2, pp. 181–194, 2016. → page 4[20] J. M. Neal, “Ultrasound-guided regional anesthesia and patient safety:update of an evidence-based analysis,” Regional Anesthesia & PainMedicine, vol. 41, no. 2, pp. 195–204, 2016. → page 4[21] H.-E. Gueziri, C. Santaguida, and D. L. Collins, “The state-of-the-art inultrasound-guided spine interventions,” Medical Image Analysis, p. 101769,2020. → pages 4, 17, 40[22] A. Kim, G. Sendlewski, E. Zador, M. Kalsi, L. Zador, and V. Kurup,“Placing a lumbar epidural catheter,” New England Journal of Medicine,2018. → pages xi, 4[23] M. Pesteie, V. Lessoway, P. Abolmaesumi, and R. Rohling, “Automaticmidline identification in transverse 2-d ultrasound images of the spine,”Ultrasound in Medicine & Biology, 2020. → page 4[24] M. Pesteie, V. Lessoway, P. Abolmaesumi, and R. N. Rohling, “Automaticlocalization of the needle target for ultrasound-guided epidural injections,”IEEE Transactions on Medical Imaging, vol. 37, no. 1, pp. 81–92, 2017.[25] J. Hetherington, M. Pesteie, V. A. Lessoway, P. Abolmaesumi, and R. 
N.Rohling, “Identification and tracking of vertebrae in ultrasound using deepnetworks with unsupervised feature learning,” in Medical Imaging 2017:Image-Guided Procedures, Robotic Interventions, and Modeling, vol. 10135.International Society for Optics and Photonics, 2017, p. 101350K. → page 4[26] C. A. Andersen, S. Holden, J. Vela, M. S. Rathleff, and M. B. Jensen,“Point-of-care ultrasound in general practice: a systematic review,” TheAnnals of Family Medicine, vol. 17, no. 1, pp. 61–69, 2019. → page 559[27] T. Micks, K. Sue, and P. Rogers, “Barriers to point-of-care ultrasound use inrural emergency departments,” Canadian Journal of Emergency Medicine,vol. 18, no. 6, pp. 475–479, 2016. → page 5[28] T. Micks, D. Braganza, S. Peng, P. McCarthy, K. Sue, P. Doran, J. Hall,H. Holman, D. O’Keefe, P. Rogers et al., “Canadian national survey ofpoint-of-care ultrasound training in family medicine residency programs,”Canadian Family Physician, vol. 64, no. 10, pp. e462–e467, 2018. → page 5[29] R. Lindsey, A. Daluiski, S. Chopra, A. Lachapelle, M. Mozer, S. Sicular,D. Hanel, M. Gardner, A. Gupta, R. Hotchkiss et al., “Deep neural networkimproves fracture detection by clinicians,” Proceedings of the NationalAcademy of Sciences, vol. 115, no. 45, pp. 11 591–11 596, 2018. → page 5[30] M. E. Vandenberghe, M. L. Scott, P. W. Scorer, M. So¨derberg, D. Balcerzak,and C. Barker, “Relevance of deep learning to facilitate the diagnosis of her2status in breast cancer,” Scientific Reports, vol. 7, no. 1, pp. 1–11, 2017.[31] D. Ardila, A. P. Kiraly, S. Bharadwaj, B. Choi, J. J. Reicher, L. Peng, D. Tse,M. Etemadi, W. Ye, G. Corrado et al., “End-to-end lung cancer screeningwith three-dimensional deep learning on low-dose chest computedtomography,” Nature Medicine, vol. 25, no. 6, pp. 954–961, 2019. → page 5[32] M. Baad, Z. F. Lu, I. Reiser, and D. Paushter, “Clinical significance of usartifacts,” Radiographics, vol. 37, no. 5, pp. 1408–1423, 2017. → pages9, 11[33] L. Zhu, C.-W. Fu, M. S. 
Brown, and P.-A. Heng, “A non-local low-rankframework for ultrasound speckle reduction,” in Proceedings of the IEEEConference on Computer Vision and Pattern Recognition (CVPR), July2017. → page 9[34] F. Dietrichson, E. Smistad, A. Ostvik, and L. Lovstakken, “Ultrasoundspeckle reduction using generative adversial networks,” in 2018 IEEEInternational Ultrasonics Symposium (IUS), 2018, pp. 1–4.[35] D. Hyun, L. L. Brickson, K. T. Looby, and J. J. Dahl, “Beamforming andspeckle reduction using neural networks,” IEEE Transactions onUltrasonics, Ferroelectrics, and Frequency Control, vol. 66, no. 5, pp.898–910, 2019.[36] A. S. Leal and H. M. Paiva, “A new wavelet family for speckle noisereduction in medical ultrasound images,” Measurement, vol. 140, pp.572–581, 2019. → page 960[37] L. Zhu, W. Wang, X. Li, Q. Wang, J. Qin, K.-H. Wong, K.-S. Choi, C.-W.Fu, and P.-A. Heng, “Feature-preserving ultrasound speckle reduction via l0minimization,” Neurocomputing, vol. 294, pp. 48–60, 2018. → pages xi, 10[38] L. Salomon, Z. Alfirevic, C. Bilardo, G. Chalouhi, T. Ghi, K. Kagan, T. Lau,A. Papageorghiou, N. Raine-Fenning, J. Stirnemann et al., “Isuog practiceguidelines: performance of first-trimester fetal ultrasound scan.” Ultrasoundin Obstetrics & Gynecology, vol. 41, no. 1, p. 102, 2013. → page 11[39] K. Salvesen, C. Lees, J. Abramowicz, C. Brezinka, G. Ter Haar, K. Marsˇa´l,B. of the International Society of Ultrasound in Obstetrics, and G. (ISUOG),“Isuog statement on the safe use of doppler in the 11 to 13+ 6-week fetalultrasound examination,” Ultrasound in Obstetrics & Gynecology, vol. 37,no. 6, pp. 628–628, 2011. → page 12[40] G. R. ter Haar, J. S. Abramowicz, I. Akiyama, D. H. Evans, M. C. Ziskin,and K. Marsˇa´l, “Do we need to restrict the use of doppler ultrasound in thefirst trimester of pregnancy?” Ultrasound in Medicine and Biology, vol. 39,no. 3, pp. 374–380, 2013. → page 12[41] H. A. Guimara˜es Filho, L. L. D. da Costa, E. A. Ju´nior, C. R. Pires, L. M. 
M.Nardozza, and R. Mattar, “Xi vocal (extended imaging vocal): a newmodality for three-dimensional sonographic volume measurement,” Archivesof Gynecology and Obstetrics, vol. 276, no. 1, pp. 95–97, 2007. → pages14, 24[42] G. N. Stevenson, S. L. Collins, J. Ding, L. Impey, and J. A. Noble, “3-dultrasound segmentation of the placenta using the random walker algorithm:reliability and agreement,” Ultrasound in Medicine & Biology, vol. 41,no. 12, pp. 3182–3193, 2015. → pages 14, 24, 35[43] E. J. Baird, “Identification and management of obstetric hemorrhage,”Anesthesiology Clinics, vol. 35, no. 1, pp. 15–34, 2017. → pages xii, 14[44] M. Paech, R. Godkin, and S. Webster, “Complications of obstetric epiduralanalgesia and anaesthesia: a prospective analysis of 10 995 cases,”International Journal of Obstetric Anesthesia, vol. 7, no. 1, pp. 5–11, 1998.→ page 15[45] A. T. Watanabe, E. Nishimura, and J. Garris, “Image-guided epidural steroidinjections,” Techniques in Vascular and Interventional Radiology, vol. 5,no. 4, pp. 186–193, 2002. → page 1561[46] A. Perlas, L. E. Chaparro, and K. J. Chin, “Lumbar neuraxial ultrasound forspinal and epidural anesthesia: a systematic review and meta-analysis,”Regional Anesthesia & Pain Medicine, vol. 41, no. 2, pp. 251–260, 2016. →page 16[47] S. Yu, K. K. Tan, B. L. Sng, S. Li, and A. T. H. Sia, “Automaticidentification of needle insertion site in epidural anesthesia with a cascadingclassifier,” Ultrasound in Medicine & Biology, vol. 40, no. 9, pp. 1980–1990,2014. → pages xii, 16[48] K. Ahn, H.-J. Jhun, T.-K. Lim, and Y.-S. Lee, “Fluoroscopically guidedtransforaminal epidural dry needling for lumbar spinal stenosis using aspecially designed needle,” BMC Musculoskeletal Disorders, vol. 11, no. 1,p. 180, 2010. → pages xii, 16[49] M. Brudfors, A. Seitel, A. Rasoulian, A. Lasso, V. A. Lessoway, J. Osborn,A. Maki, R. N. Rohling, and P. 
Abolmaesumi, “Towards real-time,tracker-less 3d ultrasound guidance for spine anaesthesia,” InternationalJournal of Computer Assisted Radiology and Surgery, vol. 10, no. 6, pp.855–865, Jun 2015. → page 17[50] A. Rasoulian, A. Seitel, J. Osborn, S. Sojoudi, S. Nouranian, V. A.Lessoway, R. N. Rohling, and P. Abolmaesumi, “Ultrasound-guided spinalinjections: a feasibility study of a guidance system,” International Journal ofComputer Assisted Radiology and Surgery, vol. 10, no. 9, pp. 1417–1425,2015. → pages 17, 40, 43[51] D. Behnami, A. Sedghi, E. M. A. Anas, A. Rasoulian, A. Seitel,V. Lessoway, T. Ungi, D. Yen, J. Osborn, P. Mousavi et al., “Model-basedregistration of preprocedure mr and intraprocedure us of the lumbar spine,”International journal of computer assisted radiology and surgery, vol. 12,no. 6, pp. 973–982, 2017. → page 17[52] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in3rd International Conference on Learning Representations, ICLR 2015, SanDiego, CA, USA, May 7-9, 2015, Conference Track Proceedings, Y. Bengioand Y. LeCun, Eds., 2015. → pages 21, 31[53] J. A. Noble, N. Navab, and H. Becher, “Ultrasonic image analysis andimage-guided interventions,” Interface Focus, vol. 1, no. 4, pp. 673–685,2011. → page 2162[54] J. Gil, H. Wu, and B. Y. Wang, “Image analysis and morphometry in thediagnosis of breast cancer,” Microscopy Research and Technique, vol. 59,no. 2, pp. 109–118, 2002. → page 21[55] A. C. De Kat, J. Hirst, M. Woodward, S. Kennedy, and S. A. Peters,“Prediction models for preeclampsia: a systematic review,” Pregnancyhypertension, vol. 16, pp. 48–66, 2019. → page 23[56] G. Rizzo, A. Capponi, O. Cavicchioni, M. Vendola, and D. Arduini, “Firsttrimester uterine doppler and three-dimensional ultrasound placental volumecalculation in predicting pre-eclampsia,” European Journal of Obstetrics &Gynecology and Reproductive Biology, vol. 138, no. 2, pp. 147–151, 2008.→ page 23[57] L. Proctor, M. Toal, S. Keating, D. Chitayat, N. 
Okun, R. Windrim,G. Smith, and J. Kingdom, “Placental size and the prediction of severeearly-onset intrauterine growth restriction in women with lowpregnancy-associated plasma protein-a,” Ultrasound in Obstetrics andGynecology: The Official Journal of the International Society of Ultrasoundin Obstetrics and Gynecology, vol. 34, no. 3, pp. 274–282, 2009. → page 23[58] N. Milligan, M. Rowden, E. Wright, N. Melamed, Y. M. Lee, R. C.Windrim, and J. C. Kingdom, “Two-dimensional sonographic assessment ofmaximum placental length and thickness in the second trimester: areproducibility study,” The Journal of Maternal-Fetal & Neonatal Medicine,vol. 28, no. 14, pp. 1653–1659, 2015. → page 24[59] P. Looney, G. N. Stevenson, K. H. Nicolaides, W. Plasencia, M. Molloholli,S. Natsis, and S. L. Collins, “Automatic 3d ultrasound segmentation of thefirst trimester placenta using deep learning,” in 2017 IEEE 14thInternational Symposium on Biomedical Imaging (ISBI 2017). IEEE, 2017,pp. 279–282. → page 24[60] G. Wang, M. A. Zuluaga, W. Li, R. Pratt, P. A. Patel, M. Aertsen, T. Doel,A. L. David, J. Deprest, S. Ourselin et al., “Deepigeos: a deep interactivegeodesic framework for medical image segmentation,” IEEE Transactionson Pattern Analysis and Machine Intelligence, vol. 41, no. 7, pp. 1559–1572,2018.[61] R. Hu, R. Singla, R. Yan, C. Mayer, and R. N. Rohling, “Automated placentasegmentation with a convolutional neural network weighted by acousticshadow detection,” in 2019 41st Annual International Conference of the63IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, 2019,pp. 6718–6723. → page 24[62] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks forbiomedical image segmentation,” in Medical Image Computing andComputer-Assisted Interventions. Springer, 2015, pp. 234–241. → pages26, 43[63] F. Isensee, P. Kickingereder, W. Wick, M. Bendszus, and K. H. 
Maier-Hein,“No new-net,” in 4th International MICCAI Brainlesion Workshop, BrainLes2018 held in conjunction with the Medical Image Computing for ComputerAssisted Interventions Conference, MICCAI 2018. Springer Verlag, 2019,pp. 234–244. → page 26[64] E. M. A. Anas, P. Mousavi, and P. Abolmaesumi, “A deep learning approachfor real time prostate segmentation in freehand ultrasound guided biopsy,”Medical Image Analysis, vol. 48, pp. 107–116, 2018. → page 28[65] S. Xie, C. Sun, J. Huang, Z. Tu, and K. Murphy, “Rethinking spatiotemporalfeature learning: Speed-accuracy trade-offs in video classification,” pp.305–321, 2018. → page 28[66] S. Ji, W. Xu, M. Yang, and K. Yu, “3d convolutional neural networks forhuman action recognition,” IEEE Transactions on Pattern Analysis andMachine Intelligence, vol. 35, no. 1, pp. 221–231, 2012. → page 28[67] F. Milletari, N. Navab, and S.-A. Ahmadi, “V-net: Fully convolutional neuralnetworks for volumetric medical image segmentation,” in 2016 FourthInternational Conference on 3D vision (3DV). IEEE, 2016, pp. 565–571.→ page 31[68] A. Kendall and Y. Gal, “What uncertainties do we need in bayesian deeplearning for computer vision?” in Advances in Neural InformationProcessing Systems, 2017, pp. 5574–5584. → page 34[69] P.-Y. Huang, W.-T. Hsu, C.-Y. Chiu, T.-F. Wu, and M. Sun, “Efficientuncertainty estimation for semantic segmentation in videos,” in Proceedingsof the European Conference on Computer Vision (ECCV), September 2018.→ page 34[70] E. S. of Radiology (ESR et al., “Esr statement on portable ultrasounddevices,” Insights Into Imaging, vol. 10, no. 1, p. 89, 2019. → page 3564[71] E. Declercq and B. Chalmers, “Mothers’ reports of their maternityexperiences in the usa and canada,” Journal of Reproductive and InfantPsychology, vol. 26, no. 4, pp. 295–308, 2008. → page 39[72] P. Marhofer and V. W. Chan, “Ultrasound-guided regional anesthesia:current concepts and future trends,” Anesthesia & Analgesia, vol. 104, no. 5,pp. 1265–1269, 2007. 
→ page 39

[73] X. X. Chen, V. Trivedi, A. A. AlSaflan, S. C. Todd, A. C. Tricco, C. J. McCartney, and S. Boet, "Ultrasound-guided regional anesthesia simulation training: A systematic review," Regional Anesthesia & Pain Medicine, vol. 42, no. 6, pp. 741–750, 2017. → pages 40, 55

[74] S. Nagpal, P. Abolmaesumi, A. Rasoulian, I. Hacihaliloglu, T. Ungi, J. Osborn, V. A. Lessoway, J. Rudan, M. Jaeger, R. N. Rohling et al., "A multi-vertebrae CT to US registration of the lumbar spine in clinical data," International Journal of Computer Assisted Radiology and Surgery, vol. 10, no. 9, pp. 1371–1381, 2015. → page 40

[75] A. Seitel, S. Sojoudi, J. Osborn, A. Rasoulian, S. Nouranian, V. A. Lessoway, R. N. Rohling, and P. Abolmaesumi, "Ultrasound-guided spine anesthesia: feasibility study of a guidance system," Ultrasound in Medicine & Biology, vol. 42, no. 12, pp. 3043–3049, 2016. → page 40

[76] Q. Huang, F. Zhang, and X. Li, "Machine learning in ultrasound computer-aided diagnostic systems: a survey," BioMed Research International, vol. 2018, 2018. → page 41

[77] N. Baka, S. Leenstra, and T. van Walsum, "Ultrasound aided vertebral level localization for lumbar surgery," IEEE Transactions on Medical Imaging, vol. 36, no. 10, pp. 2138–2147, 2017. → page 41

[78] T. Ungi, H. Greer, K. Sunderland, V. Wu, Z. M. Baum, C. Schlenger, M. Oetgen, K. Cleary, S. Aylward, and G. Fichtinger, "Automatic spine ultrasound segmentation for scoliosis visualization and measurement," IEEE Transactions on Biomedical Engineering, 2020. → page 41

[79] I. Hacihaliloglu, A. Rasoulian, R. N. Rohling, and P. Abolmaesumi, "Statistical shape model to 3D ultrasound registration for spine interventions using enhanced local phase features," in Medical Image Computing and Computer-Assisted Interventions, K. Mori, I. Sakuma, Y. Sato, C. Barillot, and N. Navab, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013, pp. 361–368. → pages 42, 43

[80] E. M. A. Anas, A. Seitel, A. Rasoulian, P. S. John, D. Pichora, K. Darras, D. Wilson, V. A. Lessoway, I. Hacihaliloglu, P. Mousavi et al., "Bone enhancement in ultrasound using local spectrum variations for guiding percutaneous scaphoid fracture fixation procedures," International Journal of Computer Assisted Radiology and Surgery, vol. 10, no. 6, pp. 959–969, 2015. → pages 42, 43

[81] T. Grau, R. W. Leipold, J. Horter, R. Conradi, E. O. Martin, and J. Motsch, "Paramedian access to the epidural space: the optimum window for ultrasound imaging," Journal of Clinical Anesthesia, vol. 13, no. 3, pp. 213–217, 2001. → page 42

[82] M. Pesteie, P. Abolmaesumi, H. Al-Deen Ashab, V. A. Lessoway, S. Massey, V. Gunka, and R. N. Rohling, "Real-time ultrasound image classification for spine anesthesia using local directional Hadamard features," International Journal of Computer Assisted Radiology and Surgery, vol. 10, pp. 901–912, 2015. → page 42

[83] Y. Weng, T. Zhou, Y. Li, and X. Qiu, "NAS-Unet: Neural architecture search for medical image segmentation," IEEE Access, vol. 7, pp. 44247–44257, 2019. → page 43

[84] A. Rasoulian, R. Rohling, and P. Abolmaesumi, "Lumbar spine segmentation using a statistical multi-vertebrae anatomical shape+pose model," IEEE Transactions on Medical Imaging, vol. 32, no. 10, pp. 1890–1900, 2013. → pages 43, 46

[85] P. T. Fletcher, C. Lu, S. M. Pizer, and S. Joshi, "Principal geodesic analysis for the study of nonlinear statistics of shape," IEEE Transactions on Medical Imaging, vol. 23, no. 8, pp. 995–1005, 2004. → page 43

[86] A. Myronenko and X. Song, "Point set registration: Coherent point drift," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 12, pp. 2262–2275, 2010. → page 46

[87] B. Jian, B. C. Vemuri, and J. L. Marroquin, "Robust nonrigid multimodal image registration using local frequency maps," in Biennial International Conference on Information Processing in Medical Imaging. Springer, 2005, pp. 504–515. → page 46

[88] J. Hetherington, V. Lessoway, V. Gunka, P. Abolmaesumi, and R. Rohling, "SLIDE: automatic spine level identification system using a deep convolutional neural network," International Journal of Computer Assisted Radiology and Surgery, vol. 12, no. 7, pp. 1189–1198, 2017. → page 51

[89] T. Ungi, D. Sargent, E. Moult, A. Lasso, C. Pinter, R. C. McGraw, and G. Fichtinger, "Perk Tutor: an open-source training platform for ultrasound-guided needle insertions," IEEE Transactions on Biomedical Engineering, vol. 59, no. 12, pp. 3475–3481, 2012. → page 55

[90] Z. Keri, D. Sydor, T. Ungi, M. S. Holden, R. McGraw, P. Mousavi, D. P. Borschneck, G. Fichtinger, and M. Jaeger, "Computerized training system for ultrasound-guided lumbar puncture on abnormal spine models: a randomized controlled trial," Canadian Journal of Anesthesia/Journal canadien d'anesthésie, vol. 62, no. 7, pp. 777–784, 2015. → page 55

[91] J. A. Noble, "Anatomy-aware self-supervised fetal MRI synthesis from unpaired ultrasound images," in Machine Learning in Medical Imaging: 10th International Workshop, MLMI 2019, Held in Conjunction with MICCAI 2019, Shenzhen, China, October 13, 2019, Proceedings, vol. 11861. Springer Nature, 2019, p. 178. → page 56

[92] Y. Lei, J. Harms, T. Wang, Y. Liu, H.-K. Shu, A. B. Jani, W. J. Curran, H. Mao, T. Liu, and X. Yang, "MRI-only based synthetic CT generation using dense cycle consistent generative adversarial networks," Medical Physics, vol. 46, no. 8, pp. 3565–3581, 2019.

[93] A. Ben-Cohen, E. Klang, S. P. Raskin, S. Soffer, S. Ben-Haim, E. Konen, M. M. Amitai, and H. Greenspan, "Cross-modality synthesis from CT to PET using FCN and GAN networks for improved automated lesion detection," Engineering Applications of Artificial Intelligence, vol. 78, pp. 186–194, 2019. → page 56

Appendix A

Supporting Materials

Figure A.1: Fourier spectrum for the texture comparison examples in Figure 3.7. The figures are positioned in the same order, where the spectra on the left are from ultrasound images captured with the Philips iU22, and those on the right are from images captured with the GE Voluson E8. Note the difference in the distribution of regions with higher magnitude in the spectrum (brighter yellow regions), which indicates differences in the directionality and frequency content of the ultrasound texture.

Figure A.2: Dice coefficient distribution for the cross-validation predictions (N=133) for all placenta segmentation models: (A) U-Net with output concatenation; (B) U-Net with ConvLSTM features; (C) Stacked ConvLSTM; and (D) V-Net.

Figure A.3: Descriptive statistics plots for the results of lumbar spine augmentation (N=43): (a) Dice coefficient distribution for test examples for the lumbar segmentation model; and (b) distribution of root mean squared registration errors between the registered model and ground truth segmentation maps.
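For reference, the quantities plotted in Figures A.1–A.3 can be computed with a few lines of NumPy. The sketch below is illustrative only, not the evaluation code used in this thesis; the function names and the smoothing constant `eps` are our own choices.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary segmentation masks
    (2 * |A ∩ B| / (|A| + |B|)); eps avoids division by zero for empty masks."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection) / (pred.sum() + target.sum() + eps)

def rmse(a, b):
    """Root mean squared error between corresponding point sets of shape (N, D),
    as used for registration error between model and ground-truth points."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return np.sqrt(np.mean(np.sum((a - b) ** 2, axis=-1)))

def log_magnitude_spectrum(image):
    """Centered log-magnitude 2D Fourier spectrum of a grayscale image,
    the kind of texture visualization shown in Figure A.1."""
    spectrum = np.fft.fftshift(np.fft.fft2(np.asarray(image, dtype=float)))
    return np.log1p(np.abs(spectrum))
```

A perfect segmentation gives a Dice coefficient of 1 and a fully disjoint one gives 0, which is why the per-model distributions in Figure A.2 are bounded to [0, 1].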

