UBC Undergraduate Research

Feasibility of Determining Breast Density using Processed Mammogram Images McAvoy, Steven M. 2010

Feasibility of Determining Breast Density using Processed Mammogram Images

by Steven M. McAvoy

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF BACHELOR OF SCIENCE HONOURS in The I. K. Barber School of Arts & Sciences (Computer Science)

THE UNIVERSITY OF BRITISH COLUMBIA

April, 2010

© Steven M. McAvoy 2010

Abstract

Breast cancer is one of the most common cancers among women today and one of the most lethal. Breast tissue density has become an increasingly important factor in determining a patient’s overall breast cancer risk. With the advent of digital mammography, various computer algorithms have been developed to automate the calculation of breast density. Two types of images are produced by digital mammography machines: the raw image, acquired from the imaging sensor, and the processed image, to which proprietary visual enhancement techniques have been applied. Currently, automated breast density algorithms focus on the raw image, while radiologists use the processed image for visual inspection. The processed image is then stored within a patient’s medical file.

Discovering a means to determine breast density from processed images would allow radiologists to assess retroactively the breast cancer risk of any patient who has previously received a digital mammogram, using minimal financial and human resources.

The feasibility of using an existing breast density algorithm on processed images was investigated. The algorithm was modified to accept processed images as input. Thirty-nine craniocaudal mammograms, each available as both a raw image and its corresponding processed image, were used as an experimental control. The breast density of each image was calculated using both the original and modified algorithms and the resulting correlation was measured.
While the modified algorithm did not produce a strong correlation with the original algorithm, evidence suggests that further algorithm modifications may lead to the desired outcome.

Acknowledgements

I would like to thank Dr. Patricia Lasserre at UBC Okanagan for her guidance, extensive knowledge in image processing, and supervision during my Honours course, as well as Dr. Rasika Rajapakshe for the opportunity to study mammographic image processing at the BC Cancer Agency. I also wish to thank my mother and father, who have supported me throughout my life. Finally, I would like to thank my wife, Michelle, for her love, encouragement, and patience during my studies.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Figures
1 Introduction
  1.1 History of Mammography
  1.2 Breast Tissue Density
2 Current Breast Density Algorithm
  2.1 Raw vs. Processed Images
  2.2 Algorithm Overview
    2.2.1 Background Removal
    2.2.2 Edge Detection
    2.2.3 Erosion
    2.2.4 Dense vs. Fatty Tissue Separation
    2.2.5 Density Computation
3 Software Modifications
  3.1 Image File Metadata
  3.2 Visual Tools
  3.3 Processed Images as Input
  3.4 Data Logging
4 Results
  4.1 Analysis of Algorithm Processing Phases
  4.2 Analysis of Highest Gradient Magnitude Locations
  4.3 Analysis of Dense Tissue Segmentation
5 Conclusion and Future Work
Bibliography

List of Figures

2.1 Images produced in digital mammography
2.2 Breast density calculation algorithm used by Bden
2.3 Skin-line detection of a mammogram image
2.4 Binary image resulting from Otsu’s thresholding algorithm
3.1 Example of pixel inversion
4.1 Radiologists vs. Bden
4.2 Highest gradient magnitude pixel locations
4.3 Locations of the three highest gradient magnitudes
4.4 Result of thresholding on raw and processed images
4.5 Histograms of raw and processed images

Chapter 1 Introduction

In Canada, breast cancer is the most common cancer among women, and early detection remains the most effective strategy for increasing survivability [1]. Determining a woman’s risk of developing breast cancer allows for more effective screening practises that help detect breast cancer as early as possible. Many factors, such as age, genetic predisposition [7], family history, and breast tissue density [10], play a significant role in determining overall cancer risk.

1.1 History of Mammography

Mammography is a medical imaging technique which utilises low-intensity X-ray radiation to examine breasts.
The first mammography machines became commonplace in the late 1960s and recorded their images on analog X-ray film. Starting in 1976, mammography machines began to see service as a screening device against breast diseases. Modern-day mammography utilises digital imaging in which X-rays are converted to electrical signals via solid-state detectors.

Typically, two images of each breast are taken at different angles. The craniocaudal projection, which is orientated along the vertical axis of the body, shows central and inner breast tissue. The mediolateral projection, orientated along the horizontal axis, reveals the entire gland. In each image, X-ray radiation is attenuated at a greater rate by denser matter such as ductal and fibroglandular tissue. This tissue appears as a greyish white colour within the image. Less dense material, such as fat tissue, attenuates X-rays less effectively and manifests as dark regions.

Digital mammography machines output two image formats for each image that has been recorded. The first format, known as raw or for-processing, contains an unaltered representation of the X-ray intensity values, with white areas corresponding to strong intensity values and, conversely, dark areas corresponding to weak values. The second format, called processed or for-presentation, has the manufacturer’s proprietary image enhancement techniques applied and its pixel values inverted. This allows radiologists to examine an image which has been optimised for the human eye, at the expense of losing digital information during the enhancement process.

Since radiologists only use the processed images for clinical diagnosis, these are the only images which are stored. Provincial legal requirements govern the length of time for which these images must be archived.

1.2 Breast Tissue Density

To date, determining breast tissue density remains a manual and highly subjective process [6].
Moreover, a world-wide standard for determining breast tissue density has yet to be created. A common process for determining breast tissue density is BI-RADS (Breast Imaging Reporting and Data System) [2]; however, its adoption remains limited. Within the province of British Columbia, radiologists examine mammographic images and determine tissue density using nothing more than their experience and their best estimate [4]. Experienced radiologists’ estimates can differ by as much as 25% when examining the same image [6]. Unfortunately, since computing breast density manually is time consuming, and therefore expensive, it is almost never performed at mammography screening centres within British Columbia.

To enhance the effectiveness of the existing mammography screening program within British Columbia, an algorithm for computing breast density using digital image processing was created. The algorithm was implemented within a proof-of-concept software package named Bden. This algorithm obtains the raw mammogram image, resizes it to 1024x1024 pixels for faster computational performance, and finally computes the resulting breast tissue density as a percentage. Use of the raw image format was instrumental in the algorithm’s approach since this format contains the highest quantity and quality of image information.

While Bden remains proof-of-concept software, its potential to contribute valuable information to a patient’s overall cancer risk assessment is an exciting prospect. However, the additional cost of storing the raw images along with the processed images is considerable. A more desirable approach would be to analyse the risks posed by breast density using processed images alone.

The motivation behind this thesis is to maximize the potential of software like Bden by making it possible to assess breast cancer risk retroactively, at zero cost, for women who have previously received mammograms.
Because the digital archives contain only processed images for these women, we must determine whether Bden’s density algorithm can be altered to arrive at the same density percentage within an acceptable margin of error.

Chapter 2 Current Breast Density Algorithm

2.1 Raw vs. Processed Images

Mammography uses low-level X-ray radiation to image breast tissue. When a mammographic imaging device creates an X-ray image, the amount of X-ray attenuation, or loss, over a given area is recorded digitally. The recorded information contains the intensity of X-ray radiation received at each measuring point on a recording device. Each measuring point corresponds to the smallest unit within a digital image, called a pixel. A digital image is the visual representation of the recorded data, with each pixel’s intensity value corresponding to a shade of grey. Typically, the greater a pixel’s intensity, the brighter the shade of grey appears. Conversely, those objects which attenuate X-ray radiation appear darker. These types of images are known as raw images since they are produced using a direct mapping of the recorded intensity.

Since all human tissue has density, it appears darker than the surrounding background in mammogram images. Within the breast tissue lie regions which are denser than others. It is from these regions that breast density is eventually calculated. In figure 2.1(a), it is difficult to see any variation in tissue density within the breast because the grey-level values change only minutely. Since most medical diagnoses are still done via visual inspection, the raw image makes for a poor diagnostic tool.

To aid medical professionals, a second image is always produced for each raw image. The second image, called the processed or for-presentation image, is the product of various proprietary image enhancement techniques which have been performed on the raw image.
The result is an image which brings out fine detail within the breast tissue and removes background pixels. This image is better suited for human visual inspection and bears a resemblance to traditional X-ray images which were produced on film. With respect to image processing, the raw image is always preferred since it contains unaltered data.

Figure 2.1: Images produced in digital mammography. (a) Raw image. (b) Processed image.

2.2 Algorithm Overview

Bden’s determination of breast density is a multi-stage algorithm. A number of image processing steps are applied before breast density is computed. First, the background must be removed to ensure only breast tissue is analysed. Second, the edges within the image must be detected to determine the locations where dense and fatty tissue meet. Lastly, the dense tissue must be segmented from the entire breast. With the dense tissue segmented, the overall breast density can be computed. Figure 2.2 shows the sequence of steps executed during computation of breast density by Bden.

Figure 2.2: Breast density calculation algorithm used by Bden

2.2.1 Background Removal

Because X-rays leave the aperture of the X-ray source with approximately uniform intensity, X-rays which are not attenuated produce similar pixel intensity values. These unattenuated pixels correspond to the image background, while the object of interest attenuates the X-rays and appears darker. To accurately determine breast density, only the portion of the image containing breast tissue is required. It must be cleanly separated from the background in a process which creates a new image containing only those pixels which pertain to breast tissue. Bden performs this task by assuming that all raw mammogram images have a bi-modal histogram, in which the peak containing the pixels with the highest intensity is deemed the background.
The halfway point between the leftmost peak (breast tissue) and the rightmost peak (background) is chosen as the threshold value. Pixels with an intensity value lower than the threshold are copied into a new image called the skin-line image, shown in figure 2.3(b). The resulting skin-line image contains the breast tissue on a black background.

Figure 2.3: Skin-line detection of a mammogram image. (a) A raw mammogram image. (b) The resulting skin-line image.

With only the breast tissue remaining, digital processing can begin. The first step is to smooth, or blur, the image to reduce noise from the CCD and decrease sharp pixel intensity transitions. With respect to raw mammographic images, reducing intensity transitions makes it easier to detect the edges which separate large regions of dense tissue from fatty tissue. Bden performs the smoothing operation by convolving a 5x5 averaging kernel over the entire skin-line image. The averaging kernel begins at the top-left-most location within the skin-line image and, for each pixel p at row a and column b, a new value is calculated by summing the intensity values of the pixels within the 5x5 neighbourhood centred on p and dividing by 25, as seen in equation (2.1). This process is repeated a total of 5 times to take advantage of the cumulative effects.

\[ \bar{p} = \frac{1}{25} \sum_{i=a-2}^{a+2} \sum_{j=b-2}^{b+2} p_{ij} \tag{2.1} \]

2.2.2 Edge Detection

From visual inspection it can be observed that dense tissue often transitions into fatty tissue within a short radius. Detecting these small density transition regions yields subsections of the breast which definitively contain both dense and fatty tissue. To locate the possible positions of these subsections, the technique of image edge detection is applied. Edge detection is the process of identifying the boundaries between two regions with varying intensity values.
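The background removal and smoothing steps of Section 2.2.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Bden’s implementation: the function names `skin_line` and `mean_filter_5x5` are hypothetical, the two histogram peak intensities are assumed to be known already, and pixels within two positions of the image border are simply left unchanged because the text does not say how Bden treats them.

```python
def skin_line(image, tissue_peak, background_peak):
    """Keep pixels darker than the midpoint of the two histogram peaks.

    Pixels at or above the threshold are treated as background and set
    to 0, giving breast tissue on a black background.
    """
    threshold = (tissue_peak + background_peak) / 2
    return [[p if p < threshold else 0 for p in row] for row in image]


def mean_filter_5x5(image, passes=5):
    """Apply the 5x5 average of equation (2.1) to interior pixels, `passes` times."""
    rows, cols = len(image), len(image[0])
    current = [list(row) for row in image]
    for _ in range(passes):
        # Assumption: border pixels (no full 5x5 neighbourhood) are copied as-is.
        result = [list(row) for row in current]
        for a in range(2, rows - 2):
            for b in range(2, cols - 2):
                total = sum(current[i][j]
                            for i in range(a - 2, a + 3)
                            for j in range(b - 2, b + 3))
                result[a][b] = total / 25
        current = result
    return current
```

Each pass reads from the previous pass’s output and writes to a fresh copy, so a pixel’s new value never depends on neighbours that were already updated in the same pass.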
Edges are defined as any change in pixel intensity in either the horizontal or vertical direction. These regions are determined by examining the gradient across neighbouring pixels within the image. Using equation (2.2), the magnitude of the gradient can be computed and stored within a new image.

\[ |\nabla f(x, y)| = \sqrt{(\partial_x f(x, y))^2 + (\partial_y f(x, y))^2} \tag{2.2} \]

Bden employs the Sobel discrete differentiation operator to approximate gradient values. The operator consists of a horizontal mask and a vertical mask which approximate the partial derivatives in their respective directions. These two masks are convolved with the skin-line image to produce the resulting edge-detected image. For equations (2.3), (2.4), and (2.5), let A be the source image, G be an image which contains the gradient magnitude values, and $\ast$ denote the convolution operator.

\[ \partial_x A = G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \ast A \tag{2.3} \]

\[ \partial_y A = G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} \ast A \tag{2.4} \]

Using the approximated partial derivatives, equation (2.5) determines the magnitude of the gradient vector at each pixel. The higher the magnitude, the greater the rate of change of pixel intensity.

\[ G_{p_{ij}} = \sqrt{G_{x_{ij}}^2 + G_{y_{ij}}^2} \tag{2.5} \]

2.2.3 Erosion

Any skin-line image on which edge detection has been applied will contain the strongest gradient magnitudes along the skin’s surface. This is to be expected since the transition from the background to the breast is quite large. This gradient magnitude must be removed so as not to interfere with the ability to detect the internal subsections of dense and fatty tissue.

Erosion is a mathematical morphological operation which is commonly used in image processing to reduce the total area of a region within an image by a given amount. Typically applied to binary images, erosion employs an operator, or structuring element, which is passed over the entire image. Only those pixels at which the structuring element fits completely inside the region remain.
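As an illustration of the Sobel step in Section 2.2.2, the sketch below computes the gradient magnitude of equations (2.3) to (2.5) for a small image held as a list of rows. It makes two assumptions that the text does not settle: the masks are applied as a sliding weighted sum without flipping them (flipping only changes the sign of each derivative, which the magnitude ignores), and border pixels, which lack a full 3x3 neighbourhood, are given a magnitude of 0. The names `apply_mask` and `gradient_magnitude` are hypothetical.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # horizontal mask, equation (2.3)
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # vertical mask, equation (2.4)


def apply_mask(image, mask, r, c):
    """Weighted sum of the 3x3 neighbourhood centred on (r, c)."""
    return sum(mask[i][j] * image[r + i - 1][c + j - 1]
               for i in range(3) for j in range(3))


def gradient_magnitude(image):
    """Return an image of Sobel gradient magnitudes, equation (2.5)."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]  # assumption: borders stay 0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = apply_mask(image, SOBEL_X, r, c)
            gy = apply_mask(image, SOBEL_Y, r, c)
            out[r][c] = math.hypot(gx, gy)
    return out
```

For a vertical step from 0 to 10 the horizontal mask responds with 10 + 20 + 10 = 40 while the vertical mask cancels to 0, so the magnitude at the step is 40.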
More formally, equation (2.6) defines erosion. Let A be the image which is to be eroded by a structuring element B. The erosion of A by B is the set of all pixels z such that B, translated by z, is contained in A.

\[ A \ominus B = \{ z \mid (B)_z \subseteq A \} \tag{2.6} \]

To perform this operation, a 9x9 pixel structuring element is moved across the entire edge-detected image. At each pixel within the image, the structuring element is placed so that the current pixel lies at its centre. If each of the surrounding pixels within the 9x9 neighbourhood is of higher intensity than the background intensity (zero), then the current pixel is not modified. However, if any one of the surrounding pixels under the structuring element equals the background intensity, then all the pixels within the structuring element are set to this background.

2.2.4 Dense vs. Fatty Tissue Separation

Before computing density, dense tissue must be separated from the surrounding fatty tissue. An image processing technique known as thresholding is well-suited to accomplish this task [5]. Isolating the dense tissue from the breast area is achieved by selecting an optimal pixel intensity value which best represents the transition point from dense to fatty tissue. The result is a binary image in which intensity values below the optimal value are deemed dense tissue while values above this optimum are considered fatty tissue.

A lower bound for the intensity of dense tissue is easy to determine: it is the lowest intensity value within the actual breast. Calculating an upper bound, or the optimal threshold value, is more difficult since the point at which a pixel’s intensity is no longer classified as dense is unclear. To compute this upper bound, Otsu’s thresholding algorithm [8] is used. The algorithm takes a statistical approach to thresholding and selects the optimal intensity value based on which value gives the highest between-class variance [3].
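The erosion of Section 2.2.3 can be sketched as follows. Note that this is the textbook form of equation (2.6), in which only the centre pixel is cleared when the structuring element does not fit; the procedure described above for Bden clears every pixel under the element instead, so the sketch should be read as an approximation of the idea rather than a copy of Bden’s behaviour. Treating positions outside the image as background is a further assumption, and the name `erode` is hypothetical.

```python
def erode(image, size=9):
    """Erode the non-zero region of `image` with a size x size square element.

    A pixel keeps its value only if every pixel under the element,
    centred on it, is non-zero; otherwise it becomes background (0).
    """
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            fits = all(
                # Assumption: anything outside the image counts as background.
                0 <= i < rows and 0 <= j < cols and image[i][j] != 0
                for i in range(r - half, r + half + 1)
                for j in range(c - half, c + half + 1)
            )
            if fits:
                out[r][c] = image[r][c]
    return out
```

Because the result is written to a separate image, clearing one pixel cannot cascade into its neighbours during the same pass. With a 9x9 element, a band four pixels wide is removed from the outline of the region, which is what strips the strong skin-line response from the edge-detected image.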
Simply stated, the optimal intensity is the threshold that splits the pixels into two classes, those below the threshold and those above it, such that the variance between the two classes is at a maximum.

Let L be the total number of distinct intensity values within the skin-line image, M × N be the total number of pixels, and $n_i$ be the number of pixels of intensity i, such that $0 \le i \le L-1$ and $MN = n_0 + n_1 + \dots + n_{L-1}$. The probability $p_i$ that a pixel has intensity i is $n_i$ divided by the total number of pixels, MN, and the probabilities of all intensities sum to 1, as shown in equation (2.7).

\[ \sum_{i=0}^{L-1} p_i = 1, \quad p_i \ge 0 \tag{2.7} \]

Let $P_d$ and $P_f$ represent the probabilities of the dense and fatty tissue pixels, respectively, given a threshold value k. These probabilities can be found using equations (2.8) and (2.9).

\[ P_d(k) = \sum_{i=0}^{k} p_i \tag{2.8} \]

\[ P_f(k) = 1 - P_d(k) \tag{2.9} \]

The mean intensity values of the dense and fatty tissue classes resulting from the selected threshold value are obtained using equations (2.10) and (2.11). The mean pixel intensity of the entire image is given by equation (2.12). This value is referred to as the global mean; it does not depend on k and can be verified by summing the class means weighted by the probabilities of the dense and fatty pixels.

\[ m_d(k) = \frac{1}{P_d(k)} \sum_{i=0}^{k} i \, p_i \tag{2.10} \]

\[ m_f(k) = \frac{1}{P_f(k)} \sum_{i=k+1}^{L-1} i \, p_i \tag{2.11} \]

\[ m_G = \sum_{i=0}^{L-1} i \, p_i = P_d(k) \, m_d(k) + P_f(k) \, m_f(k) \tag{2.12} \]

The between-class variance of the dense and fatty classes is calculated using equation (2.13). The global variance of the entire skin-line image is calculated using equation (2.14) and remains constant regardless of the selected threshold value. Therefore, the success of the selected threshold value can be quantified as a function of k, the ratio of the between-class variance to the global variance, shown in equation (2.15).
\[ \sigma_B^2(k) = P_d(k) \, (m_d(k) - m_G)^2 + P_f(k) \, (m_f(k) - m_G)^2 \tag{2.13} \]

\[ \sigma_G^2 = \sum_{i=0}^{L-1} (i - m_G)^2 \, p_i \tag{2.14} \]

\[ \eta(k) = \frac{\sigma_B^2(k)}{\sigma_G^2} \tag{2.15} \]

The optimal threshold value, $k^*$, is the intensity value which produces the largest between-class variance (2.16).

\[ \sigma_B^2(k^*) = \max_{0 \le k \le L-1} \sigma_B^2(k) \tag{2.16} \]

With the optimal threshold value determined, the dense tissue within the skin-line image can be isolated. A binary image, B, is created from the skin-line image, A, using equation (2.17), which sets all pixels with an intensity below the optimal threshold value to an intensity value of 1 and all other pixels to an intensity value of 0.

\[ B(x, y) = \begin{cases} 1 & \text{if } A(x, y) < k^* \\ 0 & \text{if } A(x, y) \ge k^* \end{cases} \tag{2.17} \]

The entire skin-line image cannot be used to determine the optimal threshold, since its range of pixel intensities is large and its pixel population includes the background. To increase the accuracy of the optimal threshold value, subsections of the skin-line image which contain both dense and fatty tissue are analysed. One such subsection can be found by iterating through the edge-detected image and selecting the highest gradient magnitude. This location indicates the highest rate of change from dense tissue to fatty tissue. Pixels within a small radius of this location will contain both dense and fatty tissue intensities. Bden samples a 30x30 neighbourhood centred on the pixel with the highest gradient magnitude and applies Otsu’s algorithm to this subsection.

2.2.5 Density Computation

Using the binary image produced by Otsu’s thresholding algorithm (figure 2.4), the total number of pixels containing dense tissue can be found by summing the number of non-zero pixels. Similarly, the number of pixels which represent the total breast area is found by counting the number of non-zero pixels within the skin-line image.
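The threshold selection of Section 2.2.4 and the pixel counts just described can be sketched together. The code is a minimal illustration, not Bden’s source: the names `otsu_threshold`, `binarise`, and `density_percent` are hypothetical, the sampled pixels are assumed to be integers in the range 0 to `levels` - 1, and the density is expressed as the percentage of breast pixels that are dense, in line with Bden reporting density as a percentage. Class sizes are tracked as integer counts rather than probabilities, which is equivalent to equations (2.8) to (2.13) but avoids rounding problems when a class is nearly empty.

```python
def otsu_threshold(pixels, levels):
    """Return the k that maximises the between-class variance, equation (2.16)."""
    total = len(pixels)
    counts = [0] * levels
    for v in pixels:
        counts[v] += 1
    sum_all = sum(i * n for i, n in enumerate(counts))
    m_g = sum_all / total                       # global mean, equation (2.12)
    best_k, best_var = 0, -1.0
    n_d, sum_d = 0, 0                           # running count and intensity sum for i <= k
    for k in range(levels):
        n_d += counts[k]
        sum_d += k * counts[k]
        n_f = total - n_d
        if n_d == 0 or n_f == 0:
            continue                            # one class is empty for this k
        p_d, p_f = n_d / total, n_f / total     # equations (2.8), (2.9)
        m_d = sum_d / n_d                       # equation (2.10)
        m_f = (sum_all - sum_d) / n_f           # equation (2.11)
        var_b = p_d * (m_d - m_g) ** 2 + p_f * (m_f - m_g) ** 2   # equation (2.13)
        if var_b > best_var:
            best_k, best_var = k, var_b
    return best_k


def binarise(image, k_star):
    """Equation (2.17): 1 where the intensity is below k_star, 0 elsewhere."""
    return [[1 if v < k_star else 0 for v in row] for row in image]


def density_percent(binary_image, skin_line_image):
    """Non-zero pixels in the binary image as a percentage of breast pixels."""
    dense = sum(1 for row in binary_image for v in row if v != 0)
    breast = sum(1 for row in skin_line_image for v in row if v != 0)
    return 100.0 * dense / breast
```

In the scheme described above, `otsu_threshold` would be given the 900 pixels of a 30x30 neighbourhood, while `binarise` would be applied to the whole skin-line image. One detail is left open by the text: equation (2.8) places intensity k in the dense class, whereas equation (2.17) uses a strict inequality, and the sketch follows each equation literally.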
Figure 2.4: Binary image resulting from Otsu’s thresholding algorithm

The final stage of Bden’s algorithm is determining overall breast density. Inserting the number of dense and breast tissue pixels found above, density can be computed using equation (2.18).

\[ \text{Density} = \frac{\text{Number of dense tissue pixels}}{\text{Number of pixels in entire breast}} \times 100 \tag{2.18} \]

To minimize the possibility of erroneous density computation, two additional binary images are produced using different threshold values. The new threshold values are found by locating the next highest gradient magnitudes within the edge-detected image outside the previously selected 30x30 neighbourhood. This ensures that artefacts within the breast tissue, such as calcifications, which often show up as small objects with high densities, do not falsely influence the computed breast density. After all three density values have been computed, Bden selects the smallest density value as the density which best represents the breast.

Chapter 3 Software Modifications

Bden was developed as proof-of-concept software, and thus only features which were critical to its development were implemented. To determine how its density algorithm would work with processed images, some modifications were required to extend existing functionality or add entirely new functionality.

3.1 Image File Metadata

Mammogram images are stored in a medical image file format known as DICOM [9]. These files contain a vast amount of metadata regarding the stored image. The metadata is stored as a series of informational elements, called tags, which map unique identifiers to specific attributes. Unmodified, Bden was unable to open processed image files because it could not recognize most of the DICOM tags contained within the mammogram image files. In particular, the tag representing the presentation type (0x0008, 0x0068) was not recognized.
This tag contains values which indicate whether the stored image is a raw or processed image. Bden had been programmed to assume that all input images were raw images without verification of the presentation type tag. To rectify this shortfall, Bden’s metadata parser was upgraded to read the presentation type tag. To reduce the required source-code refactoring, a boolean member variable within the DicomFile class was added to indicate whether the image being read was raw or processed. This variable was made accessible to other classes within Bden so appropriate action could be taken.

3.2 Visual Tools

In its original state, Bden did not provide a method for viewing the intermediate images produced during the execution of its algorithm, with the exception of the skin-line image. The resulting images from the edge detection, erosion, and thresholding processes were not displayed due to the inflexibility of the graphics display API within Bden. To verify that the edge detection and erosion steps within the density algorithm were being performed correctly, Bden’s image output system was modified to allow an image to be drawn at any stage. This was accomplished by first modifying the mygl class to erase the display buffer using the correct OpenGL API calls each time a new image was to be drawn. Next, the myframe class was modified to include a new method, DisplayImage, which would draw a given image and pause execution of the algorithm. The density algorithm was then modified to include calls to DisplayImage at each stage of the algorithm so a visual inspection could be conducted.

3.3 Processed Images as Input

Bden’s algorithm always expects images to be in raw format, and thus it was decided to alter processed images to resemble raw images as much as possible. Because the processed images are essentially the inverse of the raw image, the pixel values simply needed to be inverted.
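The inversion itself can be sketched as below. The sketch assumes 12-bit pixel data, so that 4095 is the maximum intensity of a processed image, and it includes the follow-up step, described in this section, of setting the now pure-white background to 0 to stand in for skin-line detection. The function name `invert_processed` is hypothetical, and masking of any text burned into the pixel data is left out.

```python
def invert_processed(image, max_value=4095):
    """Invert a processed image and blank the resulting white background.

    Each pixel v becomes max_value - v; pixels that end up equal to
    max_value (the original black background) are then set to 0.
    """
    inverted = [[max_value - v for v in row] for row in image]
    return [[0 if v == max_value else v for v in row] for row in inverted]
```

For example, a processed pixel of 1000 becomes 3095, while a background pixel of 0 becomes 4095 and is then reset to 0. A side effect worth noting is that a processed pixel already at 4095 also ends up as 0 and is therefore indistinguishable from background.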
One barrier to this approach was the small area of white text located in the top-left corner of the processed image, as shown in figure 3.1(a). This text was part of the pixel data and needed to be removed prior to inverting the processed image’s pixel values. To accomplish this, a 336x191 box was centred on the text and pixels within the box were set to an intensity value of 0 to match the processed image’s background intensity value. With the text covered, the pixel values were then inverted by subtracting each pixel’s intensity value from 4095, the maximum possible intensity value within a processed image. The resulting image contained a pure white background, as seen in figure 3.1(b). Simulating the skin-line detection was accomplished by iterating over the entire image and setting the intensity value of each pixel to 0 if its value was previously 4095.

Figure 3.1: Example of pixel inversion. (a) Processed image. (b) Pixel inversion. (c) Background removal.

3.4 Data Logging

The ability to acquire data on the inner workings of Bden’s density computation process was necessary in order to compare how well the algorithm would work on processed images relative to raw images. A small logging facility was integrated into the core density algorithm. It produced comma-delimited output on such attributes as the selected optimal threshold, the locations of all three pixels with the highest gradient magnitude, and the computed breast density. The output from the logging facility could be easily imported into any spreadsheet application for data analysis.

Chapter 4 Results

When computing breast density using raw images, Bden performs fairly well. In figure 4.1, the density percentage calculated by Bden on both raw and processed images is compared with the density percentage calculated by radiologists.
On two separate occasions, two radiologists were each asked to compute breast density from a set of 39 raw images with the assistance of software which requires manual configuration of all input parameters. With their density percentages recorded, the density percentages for each of the raw images within the set were then calculated using Bden. The results in figure 4.1 show a moderate correlation between Bden’s density calculations on raw images and the average densities obtained from the radiologists. Processed images showed a much lower correlation. Had the algorithm successfully computed breast density on the processed images, the expectation would be a strong linear correlation between the calculations on the raw images and those on the processed images. To determine where further modification efforts could be focused, each phase of the algorithm was examined.

Figure 4.1: Radiologists vs. Bden

4.1 Analysis of Algorithm Processing Phases

Beginning with the first stage of the algorithm, the removal of background pixels was analysed using the newly created visualisation tool. From visual inspection, it was clear that the processed image’s pixel intensity values were successfully inverted and that the removal of the background pixels was completed without error. This was expected since the processed images contained a simplified background which used a single pixel intensity.

Next, the edge-detection phase was examined visually and also found to complete without error. This too was expected since the Sobel operator only detects the interface between regions with high and low intensity values; it is agnostic to raw or processed images.

A visual examination of the erosion phase concluded that it also completed without error. Again, this was expected since erosion is a mathematical morphological operation which is performed inside a neighbourhood of 9x9 pixels.
The removal of the skin-line is a task common to both raw and processed images.

4.2 Analysis of Highest Gradient Magnitude Locations

The examination of the algorithm continued using the newly added data logging facility. For each tested image, the locations of the three points containing the highest gradient magnitudes were recorded. An average of each image’s three highest gradient magnitude locations was computed. Figure 4.2 shows a plot of the average locations found for both raw and processed images, and figure 4.3 shows an example of these locations superimposed onto a raw and a processed image. A correlation between the selected locations within the raw image and those within the processed image is clearly visible. Some variation in the location of the gradient magnitudes was expected since sharp edges within the raw image were intensified during the conversion to a processed image. While the two locations within the raw and processed images may be different, they are expected to remain within a small neighbourhood since the strongest edges in the raw image are also the strongest edges within the processed image.

Figure 4.2: Highest gradient magnitude pixel locations

Figure 4.3: Locations of the three highest gradient magnitudes. (a) Raw image. (b) Processed image. (c) Combined.

4.3 Analysis of Dense Tissue Segmentation

In the final phase, isolation of dense tissue from the breast, differences in the algorithm’s behaviour begin to arise. Figure 4.4 shows the result of applying Otsu’s thresholding algorithm to a raw and a processed image. The white region represents the dense tissue, and there is a visible difference in the amount of tissue detected between the two image types.
(a) Dense tissue in raw image (b) Dense tissue in processed image
Figure 4.4: Result of thresholding on raw and processed images

To investigate further, the histograms of the 30x30 subsections used to calculate the optimal threshold value were explored for both image types. Figure 4.5(a) shows a bimodal distribution of pixel intensities (two distinct peaks), while Figure 4.5(b) shows a multimodal distribution (more than two peaks). The multimodal histogram is a product of two factors: the method used by Bden to create histograms and the application of histogram equalisation to the processed image. To simplify histogram analysis, Bden does not sample each individual intensity value. Instead, it plots the distribution of pixels within small ranges. Histogram equalisation evenly distributes pixels within a small intensity range over the entire range of available values. The application of histogram equalisation is unique to processed images, as this technique aids the human eye in detecting subtle detail and is applied as part of the proprietary image processing phase. Equalisation alone will not alter the modality of a histogram; it will only stretch the distribution. However, since the stretched distribution of pixels is much wider than Bden’s histogram sampling range, the multimodality appears.

(a) Raw image histogram (b) Processed image histogram
Figure 4.5: Histograms of raw and processed images

Chapter 5
Conclusion and Future Work

In this thesis, the algorithm used by Bden was described in detail and its ability to compute breast density using processed images was tested. Because only a weak correlation exists between the densities computed using raw and processed images, the conclusion is that Bden’s existing algorithm does not correctly compute breast density on processed images. The cause of the error was investigated using newly added visualisation tools.
Analysis indicates that obtaining breast density from processed images is feasible with modifications to Bden’s existing algorithm.

Each of the individual phases within the algorithm was examined, revealing that the final phase is the primary source of error. The appearance of a multimodal histogram within the 30x30 subsection of processed images, instead of the expected bimodal histogram, introduces input which is outside Bden’s domain when computing the optimal threshold value.

The fact that Otsu’s thresholding algorithm is most accurate when applied to bimodal histograms, and has demonstrated less-than-optimal performance when multimodal distributions are used [3], strengthens the argument that the thresholding phase is responsible for the inaccurate density calculations.

Further investigation into the reasons why the algorithm fails on processed images is still required. The use of alternative dense tissue segmentation techniques is one possible avenue. Currently, Bden applies a single thresholding algorithm to both image types; applying different algorithms to raw and processed images could yield better results.

Another approach may entail additional processing of the histogram of the 30x30 subsection obtained from the processed image. The additional processing could reshape the histogram back to a bimodal distribution, allowing the existing thresholding algorithm to remain in place. With improvements, it is feasible that Bden will successfully compute breast density on processed images.

Bibliography

[1] Chuan Zhou, Heang-Ping Chan, Nicholas Petrick, Mark A. Helvie, Mitchell M. Goodsitt, Berkman Sahiner, Lubomir M. Hadjiiski. Computerized image analysis: Estimation of breast density on mammograms. Medical Physics, 28(6), 2001.
[2] CJ D’Orsi, DB Kopans. Breast Imaging Reporting and Data Systems. American College of Radiology, Reston, Virginia, 1993.
[3] Rafael C. Gonzalez and Richard E. Woods. Digital Image Processing. Pearson Prentice Hall, Upper Saddle River, NJ, 2008.
[4] Jennifer A. Harvey. Quantitative assessment of percent breast density: Analog versus digital acquisition. Technology in Cancer Research and Treatment, 3(6), 2004.
[5] Jeffrey W. Byng, Martin J. Yaffe, Robert A. Jong. Analysis of mammographic density and breast cancer risk from digitized mammograms. RadioGraphics, 18(6), 1998.
[6] Karla Kerlikowske, M.D. The mammogram that cried Wolfe. New England Journal of Medicine, 356(3):267–299, 2007.
[7] Norman F. Boyd, Gillian S. Dite, Jennifer Stone, Anoma Gunasekara, Dallas R. English, Margaret R.E. McCredie. Heritability of mammographic density, a risk factor for breast cancer. New England Journal of Medicine, 347(12), 2002.
[8] Nobuyuki Otsu. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man and Cybernetics, 9(1):62–66, 1979.
[9] Peter Mildenberger, Marco Eichelberg, Eric Martin. Introduction to the DICOM standard, volume 12. 2002.
[10] John N. Wolfe. Breast patterns as an index of risk for developing breast cancer. American Journal of Roentgenology, 126(3), 1976.

