UBC Theses and Dissertations

Detection of soft tissue abnormalities in mammographic images for early diagnosis of breast cancer Sameti, Mohammad 1998

Detection of Soft Tissue Abnormalities in Mammographic Images for Early Diagnosis of Breast Cancer

by

Mohammad Sameti

B.Sc., Sharif University of Technology, Tehran, 1989
M.A.Sc., University of Waterloo, Waterloo, 1994

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Doctor of Philosophy in THE FACULTY OF GRADUATE STUDIES (Department of Electrical & Computer Engineering)

We accept this thesis as conforming to the required standard

The University of British Columbia
November 1998
© Mohammad Sameti, 1998

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
Vancouver, Canada

Abstract

Treatment of breast cancer is currently effective only if it is detected at an early stage. X-ray mammography is the most effective method for early detection; however, mammographic images are complex. Researchers have been utilizing image processing and image analysis techniques to assist radiologists in their difficult task of detecting tumors in mammographic images. To aid radiologists in earlier detection of breast cancer, a retrospective study of mammograms was conducted. In this pioneering study, screening mammograms taken prior to the detection of a malignant mass were analyzed. The aim is to determine whether there exist any signs of cancer development in the screening mammograms prior to the detection of a mass by the radiologist.
For 58 biopsy-proven breast cancer patients who were diagnosed by identifying a malignant mass in their mammograms, 224 previous screening mammograms were collected. These mammograms were reviewed by an expert radiologist and three regions were marked on each of the two mammographic projections of each case: 1) the region which corresponds to the site in which the malignant mass subsequently developed, 2) a similar normal region on the same mammogram, and 3) the normal region on the previous screening mammogram of the opposite breast which corresponds to region 1. Sixty-two texture and photometric image features were calculated for all the marked areas. A stepwise discriminant analysis found that six of these features best distinguish between the normal and abnormal regions. The best linear classification function resulted in an average classification rate of 72%. At its current stage, the system can be used by a radiologist to examine suspicious patterns in a mammogram. The regions which are flagged by the system have a 72% chance of developing a malignant mass by the time of the next screening. Therefore, further evaluation of these patients (e.g., a screening examination sooner than the usual one-year interval) can result in earlier detection of breast cancer.

A novel segmentation algorithm for mammogram partitioning based on fuzzy set theory was also devised. This algorithm considers the fact that malignant masses and parenchymal patterns have unclear and fuzzy boundaries in a mammogram. It also takes into account the effects of neighboring pixels for this segmentation. This algorithm was evaluated in combination with a texture feature extraction step for detection of malignant masses in mammograms. The mass detection scheme resulted in a 94.3% true-positive detection rate and 0.24 false positives per image on a set of 35 mammograms.
Contents

Abstract
Contents
List of Tables
List of Figures
Acknowledgements
Dedication
1 Introduction
  1.1 Motivation
  1.2 Human Female Breast
  1.3 Breast Cancer
  1.4 Screening Mammography
  1.5 Digital Mammography
  1.6 Objectives
  1.7 Structure of Thesis
2 Computer Assisted Reading of Mammograms: Background
  2.1 Detection of Microcalcifications
  2.2 State of the Art in Mass Detection
  2.3 Looking at the Previous Screenings
  2.4 ROC analysis
3 A Mammogram Segmentation Algorithm
  3.1 Two-level Segmentation
    3.1.1 Normalization
    3.1.2 Weight Initialization
    3.1.3 Fuzzy Membership Values
    3.1.4 Error Calculation
    3.1.5 Update Rule
  3.2 Multi-level Segmentation
  3.3 Application of the fuzzy segmentation algorithm
    3.3.1 Step 1: Mammogram Segmentation
    3.3.2 Step 2: Feature Extraction
    3.3.3 Results of applying the mass detection method on a mammogram data set
4 Retrospective Study of Mammograms
  4.1 Image Features
    4.1.1 Morphological Features
    4.1.2 Photometric Features
    4.1.3 Discrete Texture Features
    4.1.4 Markovian Texture Features
    4.1.5 Non-Markovian Texture Features
    4.1.6 Fractal Texture Features
    4.1.7 Run-Length Texture Features
  4.2 Preliminary Data Set
    4.2.1 Polygons for marking the boundaries
    4.2.2 A circle for marking the regions
  4.3 The Main Database
    4.3.1 Variation of the object diameter
    4.3.2 Evaluating the classification function
5 Conclusion and Future Suggestions
  5.1 Overview and Summary
  5.2 Conclusions
  5.3 Suggestions for Future Work
Bibliography
Appendix A A stepwise discriminant analysis
  A.1 Step 1
  A.2 Step 2
  A.3 Step 3
  A.4 Step 4
  A.5 Step 5

List of Tables

2.1 A summary of the several detection techniques
4.1 Selected features for various object diameters

List of Figures

1.1 Schematic of the human female breast
1.2 The craniocaudal positioning for mammography
1.3 The mediolateral oblique positioning for mammography
1.4 An ill-defined mass in CC and MLO views of a breast
1.5 A spiculate mass in the two mammographic projections
1.6 A sample of clustered microcalcifications
2.1 Block diagram of the algorithm by Li et al.
2.2 The schematic flow diagram of the Chang et al. method
2.3 The flow chart of the Kobatake et al. algorithm
2.4 The block diagram of the Petrick et al. method
2.5 An overview of the Diahi et al. technique
2.6 A sample ROC curve
3.1 Fuzzy membership function
3.2 Flow chart of the fuzzy segmentation algorithm
3.3 A mammogram containing a malignant mass
3.4 Another mammogram containing a malignant mass
3.5 The 3-level segmented mammogram
3.6 Another 3-level segmented mammogram
3.7 A sample mammogram for comparing two segmentation methods
3.8 The 3-level fuzzy segmented mammogram
3.9 The same mammogram segmented into 3 levels by thresholding
3.10 An image histogram for estimating threshold T
3.11 A mammogram divided into 256 × 256 ROIs
3.12 Image ROIs with segmented mass-candidates
3.13 Plot of the two features for the mass and normal regions
3.14 The ROC curve of the mass detection method
4.1 The radius vector r/. and its angle of an object
4.2 Two mammograms of the recent and previous screening, samples from the preliminary database
4.3 The previous screening mammogram marked by polygons
4.4 Two 256 × 256 ROIs with the marked polygons
4.5 Classification plot of the polygon method
4.6 Two circles marked two regions in a mammogram of the previous screening
4.7 Two 256 × 256 ROIs containing the marked circles
4.8 Classification plot for the circle method
4.9 Mammogram samples from the main database
4.10 Another sample from the main database
4.11 The MLO views of the same breast
4.12 Block diagram of the retrospective analysis
4.13 Two circles marked the mass-growing and a normal area
4.14 Two 256 × 256 ROIs with circle objects in the center
4.15 Classification plot for the object diameter of 140 pixels
4.16 Classification plot for the object diameter of 100 pixels
4.17 Classification plot for the object diameter of 120 pixels
4.18 Classification plot for the object diameter of 160 pixels
4.19 Classification plot for the object diameter of 180 pixels
4.20 Classification plot for the object diameter of 200 pixels
4.21 A normal mammogram as a test case
4.22 The same mammogram with object circles covering the whole mammogram

Acknowledgements

My first and foremost thanks go to my supervisor, Dr. Rabab Ward, for her friendly supervision, patience, technical guidance and support. My deepest appreciations go to Dr. Jacqueline Morgan-Parkes for her generous support in the clinical aspects of this work. I am also grateful to Dr. Branko Palcic for valuable discussions and guidance in all aspects of the research. This work would not be possible without the support of Screening Mammography Program of British Columbia, Dr. Linda Warren and Lisa Kan. My gratitude also goes to Dr. Calum MacAulay for his insights, and also my other friends in the Cancer Imaging Department of the British Columbia Cancer Research Center for creating a friendly work environment. I am also thankful to Dr.
Farzin Aghdasi and Daniel Nesbitt for their help in connecting me with this project. The financial support of the Science Council of British Columbia, Xillix Technologies Corp., and the Natural Sciences and Engineering Research Council of Canada is also greatly appreciated. My special thanks go to my friend, Bahman M. Farahani, for his enlightening discussions. My unique thanks go to my friends, Shahram & Farnaz Tafazoli, Mahan Movassaghi, Mehdi & Elham Kazemi-Nia, Keyvan & Bahareh Hashtrudi-Zaad, Sarah Bachmann and many others for their help and goodwill. Finally, I am indebted to my parents and brothers, Bijan and Bahman, for all their support and comfort.

MOHAMMAD SAMETI

The University of British Columbia
November 1998

To Nahid ...
A Young Victim of Breast Cancer

Chapter 1

Introduction

1.1 Motivation

Breast cancer is one of the most common cancers in women of the developed countries of the world and it is the cause of death in approximately 20% of all females who die from cancer in these countries. A recent survey has shown that, with the possible exception of China, breast cancer incidence rates have been increasing over the last 20 years in all age groups in all countries of the world for which rates are obtainable [1].

In Canada, although mortality rates for breast cancer have declined slightly in the past decade, the incidence of breast cancer continues to rise, with the highest increase occurring among women aged 60 and over. In Canada, the most frequently diagnosed cancer for women in 1996 continued to be breast cancer, with 18,600 new cases and 5,300 deaths caused by the disease [2]. Over their lifetimes 1 in 9 Canadian women will develop breast cancer, which results in the loss of 94,000 potential years of life per year.
In British Columbia, an estimated 2,800 new breast cancer cases were diagnosed in 1996 (out of 8,000 new cases of all cancers), and 600 deaths were caused by the disease in that year [2].

Treatment of breast cancer is currently effective only if it is detected at an early stage. The most effective method of early detection is x-ray mammographic screening [3]. However, mammographic images are complex. Despite a highly evolved human visual system, a radiologist requires many years of training to detect subtle abnormalities in a complex parenchymal pattern. To aid radiologists in this task, researchers have reported different image and vision processing techniques. Despite these efforts, there remains much room for improvement and this field continues to be an active area of research. Computer vision and image analysis methods are in fact being used successfully in other areas of cancer imaging, such as cell classification in cervical cancer screening programs [4, 5].

1.2 Human Female Breast

The human female breast is a well differentiated apocrine sweat gland which secretes milk during lactation. The two breasts are situated anterior to the right and left pectoral muscles, extending from the sternum medially to the mid-axillary line laterally. Each breast consists of a thin outer layer of skin, beneath which is a subdermal layer of fat tissue from several millimeters to about one centimeter thick. Beneath the fat layer lies the supportive connective tissue stroma, which contains blood vessels, lymph channels and variable amounts of fat tissue. The stroma also contains the glandular tissue, consisting of 15 or 20 lobes which subdivide into milk forming lobules and drain via an extensive ductal system through the external centrally positioned nipple (Figure 1.1).

Figure 1.1: Schematic of the glandular tissue and milk forming lobes in a human female breast [6].
The glandular tissue extends throughout the entire breast and is separated from the pectoral muscle by only a thin layer of retromammary fat tissue. The upper outer quadrant, because of an additional extension referred to as the axillary tail, is the thickest portion, and extends furthest from the nipple, towards the axilla. Virtually all breast cancers arise from the glandular tissue. Therefore the objective of mammography should be to visualize the glandular tissue with as much resolution and contrast as feasible, within the constraints of the desired low x-ray exposure. Most breast cancers occur centrally and laterally, in proportion to the relative amounts of glandular tissue in these areas, and it is important to choose the mammographic views which best evaluate these areas.

1.3 Breast Cancer

It has been realized for many years that cancer has a genetic component and at the level of the cell it can be said to be a genetic disease. Cancer cells contain many alterations which accumulate as tumors develop. Over the last 20 years, considerable information has been gathered on regulation of cell growth and proliferation leading to the identification of proto-oncogenes and tumor suppressor genes. The proto-oncogenes encode proteins which are components of the cell signaling pathways. Mutations in these genes act dominantly and lead to a gain in function accelerating cell division. In normal cell growth there is a finely controlled balance between growth-promoting and growth-restraining signals such that proliferation occurs only when required. The balance is tilted when increased cell numbers are required, for example during wound healing and during normal tissue turnover. Differentiation of cells during this process occurs in an ordered manner and proliferation ceases when no longer required.
In tumor cells this process is disrupted: continued cell proliferation occurs and loss of differentiation may be found. In addition, the normal process of programmed cell death may no longer operate.

Tumors can be divided into two main groups, benign or malignant. Benign tumors are rarely life threatening, grow within a well-defined capsule which limits their size, and maintain the characteristics of the cell of origin and are thus usually well differentiated. Malignant tumors invade surrounding tissues and spread to different areas of the body to generate further growths or metastases. It is this process which is life threatening. Malignant tumors are also classified as non-invasive or invasive. In breast cancer, the majority of cases (76%) are invasive ductal carcinoma [7].

Breast cancer usually presents as a postmenopausal disease. However, it can be of particularly poor prognosis when premenopausal. The disease usually is detected as a mass by self-examination or mammography, sometimes with skin involvement and nipple retraction. As mammography and biopsy methods are being increasingly applied toward early detection, smaller and smaller malignant lesions are characterized based on the appearance of microcalcifications in the tumor area and on histopathologic characteristics of frozen tissue sections.

It has been estimated that, from the time of earliest possible detection, a 1-cm tumor mass requires up to 8 years to grow [8]. During this preclinical period, the tumor may widely metastasize to distant sites. Metastases are initially detected in regional lymph nodes prior to more distant spread to bone, lung, brain, and other sites. The disease is staged I through IV, depending on primary tumor size, lymph node involvement, and combined histologic grading. Stages I and II are intraductal, III is locally invasive, and IV indicates more widely disseminated disease.
Local therapies of surgery and radiation are highly successful in stages I and II of breast cancer.

The risk of breast cancer seems to depend on a complex of familial, hormonal, and environmental factors. Epidemiologic analysis has shown that early puberty and late menopause are risk factors, whereas loss of ovarian function early in life is protective. Much current study surrounds dietary influences, with high dietary intake of fat considered to be possibly a major risk factor. Age is clearly a strong risk factor; breast cancer is predominantly postmenopausal in onset.

The prognosis of breast cancer and its response to therapy strongly depend on characteristics of the disease. As previously mentioned, a major distinction in disease classification is whether or not lymph nodes are involved. Some of the most important indicators of poor prognosis are poor nuclear grade, large tumor size, and increasing numbers of lymph nodes involved.

1.4 Screening Mammography

As early as the 1950s, it became apparent radiologically that breast cancers were associated with irregular mass lesions and calcifications. Large scale screening studies in the 1960s and 1970s produced startling results that asymptomatic breast tumors too small to palpate could be detected, with the screened patients faring better than the control groups. Mortality rate decreases of approximately 30% were observed when the screened patients were followed for more than a decade [9].

The role of mammography in breast cancer detection has changed over the last decade with the progressive scientific acceptance of the efficacy of mass screening for reducing breast cancer mortality. Modern screen film mammography can reliably detect very small invasive breast cancers, with long term survival of 90%.
Breast cancers less than 10 mm in size have a very low rate of axillary nodal metastasis (about 5%), the most important prognostic indicator of disease free survival [10].

The increased use of mammography has also resulted in a dramatic increase in the diagnosis of pre-invasive ductal carcinoma in situ (DCIS), evidenced primarily by clustered microcalcifications. With technical developments that have improved tissue visualization and decreased radiation dose, we can now find many tumors at a stage that was only uncovered by chance a decade ago. DCIS, which can essentially be totally cured, now represents 10% to 20% of all breast cancers found, and up to 50% of the nonpalpable cancers found in an intensive screening program.

Mammography is the most sensitive method of breast cancer detection and is the only recommended imaging technique for screening of the asymptomatic female population. A screening mammogram is defined as a 2-view per breast radiologic examination to detect unsuspected breast cancer in asymptomatic women. Figure 1.2 shows the positioning of the patient for acquiring the craniocaudal (CC) view of the left breast, and a sample CC projection mammogram.

Figure 1.2: Positioning for the craniocaudal film-screen view. An arrow in the mammogram indicates a mass in the outer hemisphere of the breast [8].

The other projection view of the breast is the mediolateral oblique (MLO), which produces the side view of the breast with a certain angle to capture the pectoral muscle on the x-ray film. Figure 1.3 shows the patient positioning for the MLO projection and a sample mammogram.

Figure 1.3: Positioning for the mediolateral oblique film-screen view. Mammogram of the MLO view includes the axillary tail. The pectoral muscles (arrowheads) are included in the image.
An arrow shows the position of the mass in the upper hemisphere [8].

An irregularly marginated mass on mammography is a primary sign of breast carcinoma. The majority of breast carcinomas have an infiltrative, irregular appearance with spiculation [11]. A variety of benign lesions, including fibrocystic changes, radial scars and fat necrosis may also present as an ill-defined mass radiographically. However, in many cases, biopsy is necessary to confirm the etiology of a poorly defined mammographic lesion. Secondary signs of malignancy, such as architectural distortion or microcalcifications associated with an irregular mass, are highly suspicious of carcinoma. Images of Figures 1.4 and 1.5 show the CC and MLO view mammograms of a breast with an ill-defined malignant mass on each.

Approximately 50% of breast cancers are associated with calcifications [6]. The appearance of the malignant calcifications is described as clustered, irregular in size and shape and ranging from very small to 3 mm in size. If microcalcifications are present without a mass, benign and malignant disease may be more difficult to differentiate. With improvement in mammography technology and better resolution, size and shape of the calcifications are more visible on the radiograms and radiologists are faced with dealing with various types of calcifications. Figure 1.6 shows an area of a mammogram with calcification.

In many ways breast masses are more difficult to detect than microcalcifications because they can be simulated or obscured by normal breast parenchyma [12]. Soft tissue abnormalities and masses are studied in this thesis.
According to the guidelines provided by the Canadian Medical Association [13], when an abnormality is detected on screening mammography, radiologists assign the case to one of the following categories:

• Category 1, Benign. Not due to cancer.
• Category 2, Low risk. Probability of cancer less than 2%.
• Category 3, Intermediate risk. Probability of cancer 2% to 10%.
• Category 4, High risk. Probability of cancer over 10%.

Figure 1.4: Both projection views of a breast containing an ill-defined malignant mass, which appears brighter in the MLO view.

Figure 1.5: Both projection views of another sample mammogram with a spiculate malignant mass present in both CC and MLO view.

Figure 1.6: A sample mammogram with a cluster of microcalcifications. The magnified view of the lesion area is shown.

The follow up and management decisions will, therefore, vary according to the category of the abnormality. Digital mammography attempts to reduce the error that may occur in the process of this categorization.

Besides mammography, other breast imaging techniques are also used. These techniques include ultrasonography and magnetic resonance imaging (MRI). The ultrasonic imaging technology is emerging as a useful tool, though as yet it lacks the necessary resolution. It is most effective in differentiating cystic growths [14]. Breast MRI is currently under investigation and is widely used in localization of lesions [15, 16].

1.5 Digital Mammography

The term digital mammography refers to any system or technology in which digital images of mammograms are utilized. Design and implementation of a digital x-ray imaging system for mammography have been studied by a number of research groups around the world and some prototypes are currently in clinical trial [17, 18].
Other related technologies in digital mammography include image processing, computer-aided diagnosis and telemammography. Image processing enhances the appearance of breast lesions; computer-aided diagnosis helps radiologists to identify potential breast cancers in the image; and telemammography involves transferring of mammograms to another site for initial reading or consultation. These related technologies, rendered possible by mammogram digitization of the conventional screen-film systems, are greatly facilitated by direct digital recording of the image, making clinical implementation more practical and cost effective.

Computer-aided diagnosis (CAD) can be defined as a diagnosis made by a radiologist who uses the output of image analyses of the digitized radiograph when making his or her decision. Computer-aided diagnosis is designed to provide radiologists with possible lesion locations and quantitative measures of the lesion. Hence, CAD can provide information to help detect lesions and further provide information to aid the radiologist in deciding whether the lesion is malignant. Computer-aided diagnosis can be used as a "second opinion" by the radiologist. Recently there have been several major advances in the field of CAD as applied to mammography. An overview of these advances is presented in Chapter 2. This thesis focuses on the design and development of CAD techniques for detection of soft tissue abnormalities in digital mammograms.

1.6 Objectives

The ultimate objective is to develop a fully automated scheme for early detection of soft tissue abnormalities in digital mammograms. This will aid radiologists in the task of detecting subtle abnormalities in mammograms, and help diagnose more breast cancers in the earliest stages possible.
To this end, we studied the mammograms taken a year prior to the detection of the malignant masses by radiologists. We examined the hypothesis that signs of cancer are present in mammograms taken prior to cancer detection and, even though these signs were not completely developed to be fully visualized, they may be detected by image analysis techniques. To identify the very early signs of a malignant mass, we should find the specific features of these masses at their very early formation stage. To render the system efficient and fully automated, we should also develop a segmentation method so that the search for suspicious regions is confined to some areas.

1.7 Structure of Thesis

In the next chapter of this thesis, a comprehensive review of various mass detection algorithms for digitized mammograms is presented. It also includes an overview of microcalcification techniques and other issues concerning computer-aided diagnosis in mammography. Chapter 3 introduces a novel approach for segmentation of digitized mammograms and presents an example of its applications. Study of the mammograms taken prior to detection of a malignant mass is the focus of Chapter 4. In this chapter the novel idea of digitally analyzing the area of a mammogram which later developed a visible malignant mass is introduced and inspected. The final chapter, Chapter 5, concludes this thesis and presents possible avenues for future research in this field.

the art and outrage of breast cancer
"Diagnosis"
18"x22" Mixed Media

"Horror and astonishment are written all over this woman's face as the deadly seriousness of her diagnosis hits her. She holds out hand, covered in her own blood, not believing this could ever happen to her.
This painting expresses my utter disbelief as I sat on the end of the examining table and heard my surgeon say, 'I'm sorry to tell you this, but it is malignant.' I felt unreal and full of dread. This couldn't be happening to me - but it was. And might again. I am still afraid."
Mary Ellen Edwards-McTamaney 1994

Chapter 2

Computer Assisted Reading of Mammograms: Background

Early investigators in the field outlined many basic rationales, approaches, and limitations of computer-aided diagnosis (CAD) in mammography. One of the earliest investigations into digital mammographic images was reported by Winsberg et al. in 1967 [19], who described a method that compares density patterns in various areas within an individual breast and between right and left breasts. The images for this method were obtained by scanning the mammograms with an optical scanner.

An algorithm was developed by Kimme et al. in 1975 to localize abnormal breast regions [20]. They calculated seven textural features of breast images and compared corresponding regions of the left and right breasts. In 1972, Ackerman and Gose used a computer to extract and subsequently merge four properties of mammographic lesions (calcifications, spiculation, roughness and shape) in order to classify them as benign or malignant [21]. During the late 1970s, Wee et al. [22] and Fox et al. [23] developed methods to characterize clusters of microcalcifications by computer as benign or malignant.

In 1977, Smith et al. introduced a measure of malignancy called "linear mass ratio" to distinguish between malignant and benign cases [24]. In their method the location of the abnormality has to be determined manually. Then, for a line profile passing through the mass, the "linear mass ratio" is calculated. For 33 mammograms, a threshold was set to separate cancer from fibrocystic diseases.
Fourteen parameters of three basic textural features (intensity, roughness, and directionality) were constructed by Hand et al. in 1979 to detect the suspicious areas on xeromammograms [25]. Their results, based on findings in 30 xeromammograms, with 10 malignant, 10 benign and 10 normal images, achieved a sensitivity of 87% (refer to Section 2.4 for the definition of sensitivity) with a large number of additional detected suspicious areas.

Early researchers in the field realized that the extremely large memory and computational requirements of digital mammography and computer analysis of mammographic images limited the practical application of their techniques. Digital computers continued to evolve rapidly, however, and more sustained interest in computer-based approaches developed within a decade.

Virtually no articles on the topic of computer-aided diagnosis in mammography appeared in medical publications until the late 1980's. Since that time, great interest in the field has arisen and a significant number of centers are now actively pursuing such work. Most recent work in the field has focused on the detection of particular targets or on approaches to the characterization of detected abnormalities. Recent research is also distinguished from that of the early investigators by greater use of image processing, more sophisticated feature analysis, and use of artificial intelligence methods.

In the following sections of this chapter, a brief review of microcalcification detection techniques and a complete overview of various methods for detection of malignant tumors in digitized mammograms are presented. Then some research works which have looked at the previous screening mammograms for mass detection, and also at the estimation of breast cancer risk factors, are reviewed.
Finally, a brief introduction to receiver operating characteristic (ROC) analysis is presented in Section 2.4.

2.1  Detection of Microcalcifications

Microcalcifications form an ideal subject for computer detection algorithms because of their clinical relevance, their potential subtlety, and the lack of coexisting normal structures that have the same appearance. Spiesberger first specifically studied the detection of microcalcifications, not only concentrating on the identification of individual calcifications, but also evaluating strategies for the detection of clusters [26].

One of the most influential works in computer-aided detection of microcalcifications was presented in 1988 by Chan, Doi and coworkers at the University of Chicago [27, 28]. They used an image subtraction approach after filtering the digital mammograms for microcalcification enhancement and detection. Mathematical morphology, new clustering filters, and artificial neural networks have been used to improve the overall performance of this basic scheme, most notably by Nishikawa and colleagues [29, 30].

Many other approaches to microcalcification detection have been reported in the past years, including work by Fam et al. [31], Davies and Dance [32, 33], Astley et al. [34], Karssemeijer [35, 36], Kegelmeyer and Allmen [37], Strickland and Hahn [38], and Netsch [39]. The common feature in all these methods is that one or more filters are used to determine some local contrast measures or features at each pixel inside a region of interest, usually representing the whole breast. Microcalcifications have high local contrast; however, there also exist other high-contrast structures such as vessel walls and thin strings of connective tissue. Each of the above-mentioned works uses a different approach to segment these high-contrast calcifications while reducing the number of falsely classified normal structures.

2.2  State of the Art in Mass Detection

In general, breast masses and abnormalities of the soft tissue are more difficult to detect than microcalcifications, because they can be simulated or obscured by normal breast parenchymal patterns [12].

Lai et al.'s paper in 1989 presents a method for detecting one type of breast tumor, circumscribed masses, in mammograms [40]. Their method employs a modified median filter to enhance mammogram images, and template matching to detect breast tumors. In the template matching step, suspicious areas are picked by thresholding their cross-correlation values. A percentile method is used to determine the threshold for each image.

The same group at the University of Alberta published two more papers on detection of breast tumors. The first one, in 1991, describes an asymmetry approach to automatic detection of masses [41]. Strong structural asymmetries between corresponding regions in the left and right breasts are taken as evidence of the possible presence of a tumor in that region. The approach employs two steps: alignment of the mammograms, and an asymmetry check between corresponding positions. Several asymmetry measures based on image properties such as brightness, roughness, brightness-to-roughness and directionality are used to capture different types of asymmetries.

In the second paper, published in 1992, Ng and Bischof presented a method for detection and classification of both stellate and circumscribed lesions [42]. It was assumed that both types of masses appear as approximately circular bright masses with a fuzzy boundary, and that stellate lesions are, in addition, surrounded by a radiating structure of sharp fine lines. For detection of tumors, the same method of template matching described in [40] was used. They employed three approaches to recognize the radiating structure: edge-oriented, field-oriented and spine-oriented.
The last approach was reported to produce the best results.

Giger and colleagues at the University of Chicago have developed a computer-vision scheme for the detection of masses on mammograms that is based on deviations from the architectural symmetry of normal right and left breasts, with asymmetries indicating potential masses [43, 44, 45, 46, 47]. For a given patient, the inputs to the system are the four conventional mammograms: the right and left craniocaudal views and the right and left mediolateral oblique views. After the automatic registration of the corresponding left and right breast images, a nonlinear subtraction technique is used, in which gray-level thresholding is performed on the individual mammograms prior to subtraction. Ten images thresholded with different cut-off gray levels are obtained from the right breast image, and ten such images from the left breast image. Then subtraction of the corresponding right and left breast images is performed to generate 10 bilateral subtraction images. Run-length analysis is then used to link the data in the various subtracted images. This linking process accumulates the information from the set of 10 subtraction images into two images that contain locations of suspected masses for the left and right breasts. Next, feature extraction techniques, which include morphologic filtering and analysis of size, shape and distance from border, are used to reduce the number of false-positive detections.

Schmidt et al., from the same Chicago group [48], also investigated the performance of their computerized scheme for detection of masses on a database of cases containing lesions missed prospectively by radiologists reading the mammograms in routine clinical practice. Their computer-aided diagnosis (CAD) programs were capable of detecting approximately half of the lesions which were missed due to observational errors.
The results presented in [48] are based on analysis of approximately 25% of the cases in the Chicago group's missed-lesion database containing 178 potential false-negative cases.

Brzakovic et al. introduced a method for the detection of masses that makes use of fuzzy pyramid linking [49, 50, 51]. The essence of the method is a multi-resolution image segmentation. The method involves three tasks: creating the image pyramid, redefining the image pyramid using fuzzy pyramid linking, and segmenting the image. Each level of the pyramid is a weighted combination of 4 x 4 pixel neighborhoods from the level below. Then, a fuzzy membership function is used to link the pixels of each level to the pixels in the level above. The image segmentation is achieved in a top-down pass through the pyramid levels.

Kegelmeyer developed a method which automatically detects stellate lesions in digitized mammograms [52, 53, 54, 55]. His method uses an analysis of the histogram of edge orientations in local windows that are passed across the image to extract some image features (the Laws texture features reported in [56]). Then, instead of statistical methods, a binary decision tree (BDT) is used to classify the features. Detection results in five test images yielded a sensitivity of 83% with 0.6 false-positive findings per image. He and his colleagues later extended their method by developing a technique based on the circle Hough transform to detect circumscribed lesions in screening mammography [57].

Another method for recognition of stellate lesions in digital mammograms is presented by Karssemeijer [58]. His approach is based on statistical analysis of a map of pixel orientations. To estimate these orientations, a new method based on second order operators is proposed.
The line-based orientation estimates are used to construct two operators which are sensitive to radial patterns of straight lines [59].

Woods and Bowyer from the University of South Florida also published a paper on computer detection of stellate lesions [60]. The process involves two steps: segmentation of candidate regions and classification of the candidates. The segmentation step uses a region growing process to expand the small regions detected according to their brightness. Then, a set of features is computed for each candidate region and statistical pattern recognition techniques are utilized to classify the candidate as either a stellate lesion or normal breast tissue.

Chan et al. [61] investigated the effectiveness of using texture features derived from spatial gray level dependence (SGLD) matrices (or Markovian texture features, as explained in Section 4.1.4) for classification of masses and normal breast tissue in digitized mammograms. One hundred and sixty-eight ROIs (regions of interest) with masses and 504 normal ROIs were examined in this work, and 8 features were calculated for each region. These features are: correlation, entropy, energy, inertia, inverse difference moment, sum average moment, sum entropy and difference entropy. Their method resulted in an area under the ROC curve of 0.84 for the training set. For a test set, the area under the ROC curve was found to be 0.82 (see Section 2.4 for more about ROC analysis).

In another approach for detection of tumors in digitized mammograms, the research group at the University of South Florida employed two steps of segmentation and classification [62]. In the segmentation step, regions of interest were first extracted from the images by adaptive thresholding. A further reliable segmentation was achieved by a modified Markov random field (MRF) model-based method (Figure 2.1).
In the classification step, the MRF-segmented regions were classified as suspicious or normal by a fuzzy binary decision tree based on a series of radiographic density-related features. On a database of 95 images (45 abnormal), 90% sensitivity with 2 false-positives per image was the outcome of applying their method.

Figure 2.1: Block diagram of the multi-resolution MRF segmentation algorithm introduced by Li et al. [62].

Chang, Zheng and Gur from the University of Pittsburgh introduced a method which identifies a maximum of five suspicious mass regions per image. Their method was tested with a database of 510 images including 162 verified masses [63]. It included a series of five rule-based processes that selected one region with each of the 5 characterizations (Figure 2.2). This multi-stage process achieved a sensitivity of 95% while limiting false-positive detection rates to below an average of two per image. The robustness of this method was also evaluated [64].

Figure 2.2: The schematic flow diagram of the Chang et al. CAD method for mass detection [63]. After segmentation of the digitized mammogram, the five stages detect, in turn: a region with a global minimum in the smoothed image; a region with a local minimum in the original image; a region with a local minimum in the filtered image; a small region of rounded shape and low contrast; and a small region of rounded shape and high contrast. Overlap reduction then yields the detection results.

Using two independent computer-aided diagnosis (CAD) schemes, the same group from the University of Pittsburgh investigated the potential of improving the sensitivity of mass detection by applying a logical "or" operation, and of improving the specificity by using a logical "and" operation [65]. Two independent mass detectors, one with Gaussian band-pass filtering and multilayer topographic feature analysis [66], and the other with a five-stage search for a single suspicious region [63], were applied to a large image database with 428 digitized mammograms with 220 verified masses. With an "or" operation, the combined results yielded 100% sensitivity with a false-positive detection rate of 2.07 per image. A logical "and" operation produced a reduction of the false-positive rate to 0.4 per image, but sensitivity also decreased to 90%.

Kobatake and Yoshinaga (Tokyo University of Agriculture and Technology) proposed a method for detection of spiculated masses in which line skeletons and a modified Hough transform were employed [67, 68]. Their system was designed using 19 training images, and was tested on a 34-image set. The correct classification rate was 74%. Figure 2.3 shows the flow diagram of their method.

In another work, Petrick et al. from the University of Michigan presented an approach for segmenting suspicious mass regions using a new adaptive density-weighted contrast enhancement (DWCE) filter in conjunction with a Laplacian-Gaussian (LG) edge detector [69, 70].
Then a set of morphological and texture features were extracted and used by a classification algorithm to differentiate regions within the image. For two independent sets of 84 mammograms used alternately for training and testing, 4.4 false-positives per image at a true-positive detection rate of 90%, and 2.3 false-positives per image at a true-positive detection rate of 80% resulted. To reduce the false-positive rate the authors incorporated a tissue-specific adaptive enhancement filter into the overall segmentation scheme. This reduced the false-positive rates to 3.8 and 2.0 for 90% and 80% true-positive detection rates, respectively [71]. Figure 2.4 shows the block diagram of this mass detection system.

Figure 2.3: The flow chart of the processing of the Kobatake et al. tumor detection system [68].

Figure 2.4: The block diagram of the Petrick et al. method using two stages of DWCE filtering [71].

The technique developed in [72] by Pohlman et al. aims at distinguishing benign and malignant breast lesions in digitized mammograms. The lesions were segmented from their surrounding background using an adaptive region growing technique. In a set of 51 mammograms containing lesions of known pathology, six different morphological descriptors were used to classify the tumors. The results were demonstrated by different graphs, and some comments and discussion regarding the presented results appeared in [73, 74].

The breast cancer detection method introduced by Diahi et al. [75] divides the mammogram into small square areas (vignettes) which are analyzed by extracting some specific features. A specialized artificial neural network (a 3-layer network trained with the back-propagation algorithm) is subsequently used as a classifier. This method is used for classification of microcalcifications and circumscribed opacities. The classification rates for vignettes with and without microcalcifications were 94% and 96% respectively; for vignettes with and without opacities, they were 91% and 97% respectively. Figure 2.5 shows an overview of their technique.

Figure 2.5: An overview of the Diahi et al. breast cancer detection technique [75].

A new method for automatically detecting malignant masses in digitized mammograms is also reported in [76] by Miller and Ramsey. Their approach is based on a non-linear method of multi-scale analysis. They show that it is possible to detect more than 85% of all malignant masses by this method, irrespective of their size.

The article from Parr et al. focuses on the detection and classification of anatomically different types of linear structures to enable accurate detection of abnormal line patterns [77]. They demonstrate the automatic detection of lines via a non-linear multi-scale directional ridge operator and present a method of modeling the cross-sectional intensity profiles of the linear structures. A statistical representation of these patterns for circumscribed lesion detection is also addressed in [78].

Polakowsky and colleagues introduced a new model-based vision (MBV) algorithm to structurally identify suspicious ROIs (regions of interest), eliminate false-positives, and classify the remaining regions as malignant or benign [79]. Difference of Gaussian (DoG) filters are used to highlight suspicious regions in a mammogram. Then size, shape, contrast and Laws texture features are used to develop the prediction module's mass models. Derivative-based feature saliency techniques are employed to determine the best features for classification. The best nine features produced an overall classification accuracy of 100% for the segmented malignant masses with a false-positive rate of 1.8 per full breast image.

Rangayyan et al. from the University of Calgary used a region-based measure of image edge profile acutance, which characterizes the transition in density of a region of interest (ROI) along normals to the ROI at every boundary pixel of the tumor boundaries, and proposed its application to discriminate between benign and malignant tumors [80, 81]. In addition, they studied the complementary use of various shape factors based upon the shape of the ROI, such as compactness, Fourier descriptors, moments, and chord-length statistics, to distinguish between circumscribed and spiculated tumors. The classification effects of the above-mentioned features were studied on a total of 54 mammograms which included 16 circumscribed benign, 7 circumscribed malignant, 12 spiculated benign, and 19 spiculated malignant lesions. The results indicate the importance of including lesion edge definition with shape information for classification of tumors.

In an attempt to predict breast cancer with artificial neural networks, 266 biopsied lesions were randomly selected in 254 adult patients for the study [82]. There were 96 malignant and 170 benign lesions. On the basis of nine mammographic findings and patient age, a 3-layer back-propagation network was developed to predict whether the malignant lesions were in situ or invasive.
The mammographic findings were estimated by two radiologists, and were not extracted directly from images using image processing and image analysis techniques. A total accuracy of 79% was obtained by applying this method. (The term in situ means "in the natural or original position or place"; it refers to a disease which has not invaded the surrounding tissues.)

Woods and Bowyer reviewed five detection algorithms in terms of the general detection framework of two-phase detection schemes composed of pixel-level segmentation and region-level classification. Table 2.1 shows their comparison of these five methods. Their review shows some fundamental advantages to concentrating efforts on the early pixel-level analysis of the segmentation phase [83].

Algorithm / reference       | Segmentation: pixel features           | Classification: region features    | TP rate | FPs per image
University of Chicago [46]  | intensity & contrast                   | size, contrast & circularity       | 85%     | 3.0
Li et al. [62]              | intensity & contrast                   | shape, contrast, size & smoothness | 90%     | 2.0
Brzakovic et al. [51]       | intensity & contrast                   | size, compactness & mean intensity | 67%     | 0.1
Karssemeijer [58]           | 2-line orientation texture measures    | size                               | 89%     | 0.4
Kegelmeyer [55]             | 1-line orientation & 4 general texture | none                               | 97%     | 0.28

Table 2.1: Summary of several detection algorithms and reported performance [83].

Each of the above presented schemes utilizes only some characteristics of mass lesions and therefore performs reasonably well on the mammographic lesions containing those features. However, they may not achieve a high sensitivity rate for other types of mass lesions. Hence, despite the number of publications on this subject, the search for a more general algorithm which can detect different mass types continues.

2.3  Looking at the Previous Screenings

In the clinical evaluation of screening mammograms, radiologists always compare the recent mammograms with images of the previous screening in search of any significant change of patterns. Brzakovic and coworkers used the same approach in digital mammograms and presented a method for mammogram analysis which detects cancer signs by comparing newly acquired mammograms with previous screenings of the same patient [84, 85]. The comparison is carried out regionally between appropriate mammograms in three sequential steps: mammogram registration, mammogram partitioning, and analysis and comparison of regional intensity statistics. The registration step is done by detecting potential control points in each mammogram and then establishing correspondence between those points [86]. The same approach is also used in [50, 51] to perform the mammogram partitioning step. To compare corresponding regions in mammograms, intensity statistics for each pixel in a region are generated. A significant change in the statistics may lead to detection of an abnormality [87].

Sallam and Bowyer presented a similar approach in which a two-stage image unwarping technique for registration of time-sequence mammograms is introduced [88]. The next step of the algorithm analyzes the difference image by using the difference histogram to select a threshold for extracting the significant differences [89]. The registration step is a critical part of this algorithm since it has to compensate for the difficult-to-recover time-sequence differences which are caused by a complicated nonlinear three-dimensional deformation projected onto the image plane. The wavelet transform is used to improve the performance of the registration step.
In several other studies, radiologists have looked at mammograms retrospectively and analyzed cancers which were missed in the earlier screening mammograms [90, 91, 92]. These studies estimate that approximately 25% of cancer signs which are visible on retrospective review are not detected in the existing breast cancer screening programs.

Researchers have also applied image analysis techniques on digitized mammograms to characterize mammographic parenchymal patterns and to predict breast carcinoma risk factors [93, 94, 95]. For example, Caldwell et al. described a method of characterizing mammographic parenchymal patterns which is based on the calculation of the fractal dimension of digitized mammograms. For a set of 70 mammograms, they showed that the average weighted proportion agreement among three radiologists in classifying patterns according to the Wolfe grades [96, 97] was 85%, while agreement between the radiologists and the fractal classifier was 84%. The developed method may prove to be useful in establishing an index of risk for breast cancer and, ultimately, in determining intervals between examinations for individuals in a mammographic screening program.

In Chapter 4 of this thesis, we also study the previous screening mammograms, however with a different approach. Our aim is to achieve earlier detection of breast cancer. We explain how the previous mammograms are searched for textural and photometric features which can distinguish between the area that subsequently became a malignant mass, and other areas containing normal mammographic patterns.

2.4  ROC analysis

Diagnostic accuracy has been quantified in terms of a variety of measures. These measures include:

•  percent correct, in which the percentage of correctly diagnosed cases is reported.
•  sensitivity and specificity, in which the percentage of correctly diagnosed positive cases and the percentage of correctly diagnosed normal cases are reported.

•  receiver operating characteristic (ROC) curves.

ROC analysis is now widely accepted as the most meaningful standard of diagnostic accuracy, in part due to the limitations of other measures [98]. A conventional ROC curve shows how the true-positive fraction (TPF) changes with the false-positive fraction (FPF) as a "threshold of abnormality" or "critical confidence level" is varied (Figure 2.6). The true-positive fraction is the fraction of all actually positive patients or images that is correctly called positive by the diagnostic test. The false-positive fraction is the fraction of all actually negative patients or images that is falsely called positive by the test. The true-positive fraction is also referred to as the sensitivity of the diagnostic method and is often reported as a percentage. Many researchers also report the average number of false-positives detected per image, instead of the false-positive fraction. Specificity is another measure that is similar to sensitivity and can replace the false-positive fraction. Specificity is the percentage of the normal patterns (or images, or ROIs) which are correctly classified as normal. Since it is desirable to have a high ratio of true-positive fraction to false-positive fraction, higher ROC curves indicate better diagnostic accuracy, so the area under an estimated ROC curve (usually denoted by A_z) is often used as a simple global index of diagnostic accuracy.

Conventional ROC analysis only applies to those diagnostic studies where, for each case, it is required to choose between two (i.e., positive and negative) states of truth.
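As a minimal sketch of the quantities defined above (not the procedure used in any of the cited studies), the following Python fragment sweeps a decision threshold over hypothetical confidence scores, collects the resulting (FPF, TPF) pairs, and estimates the area under the empirical curve with the trapezoidal rule. The function names, scores and labels are illustrative assumptions, and the trapezoidal area is only a simple stand-in for a fitted A_z value.

```python
def roc_points(scores, labels):
    """Return (FPF, TPF) pairs obtained by sweeping the decision threshold.

    scores: confidence that each case is abnormal (higher = more suspicious).
    labels: 1 for actually positive cases, 0 for actually negative cases.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # strictest threshold: nothing is called positive
    # Visit each distinct score once, from strictest to most lenient threshold.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal estimate of the area under the empirical ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical scores for 4 positive and 4 negative cases (illustration only).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
az = area_under_curve(curve)
print(curve)
print(az)
```

Each point on the curve corresponds to one threshold setting: the TPF coordinate is the sensitivity at that setting, and one minus the FPF coordinate is the specificity.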
Although diagnostic decision-making tasks are not always binary, by defining the two states of truth appropriately, conventional ROC analysis can be meaningfully applied to many real clinical situations. However, generalized forms of ROC analysis have been proposed to address this problem, such as localization ROC (LROC) and free-response ROC (FROC). Definitions of the LROC and FROC curves and a summary of other issues regarding ROC analysis are presented in [99]. (Another term for specificity is the "true-negative percentage".)

Figure 2.6: A sample ROC curve (horizontal axis: false-positive fraction).

the art and outrage of breast cancer

"Circus de Vida"
24" x 43" Colored Pencil on Wood

"My first mastectomy, in 1991, presaged a half decade of obsessive creation, my way of coping with, instead of acknowledging, a lopsided scar-slashed chest. I refused to look. For five years, rather, I busied myself with earning two more degrees beyond my M.A., teaching full-time, and creating over eight hours a day. Eventually, my creativity became a means for healing rather than evasion. This is a self-portrait, replete with personal symbols of the clown's emotional upheaval and anger with her life. She is the crucified clown trying to walk the tightrope, and she is the clown who tries to balance her life spheres as she makes an obscene gesture. Inner healing cannot begin until the first step is taken: expressing one's anger and rage."
J Harm DeMitle 1995

Chapter 3

A Mammogram Segmentation Algorithm

Intensity variation in mammograms is the first observation made by a radiologist. Changes in the gray-level values of a certain area in a mammogram, along with other kinds of information such as shape, size and texture, may lead a radiologist to the detection of an abnormality.
Therefore, dividing a mammogram into different regions according to their intensity is the primary step in the detection of abnormalities.

Dividing an image into different regions, i.e., image segmentation, is based on two basic properties of gray-level values: discontinuity and similarity. Point, line, edge and boundary detection methods primarily rely on discontinuity information; thresholding, region growing, and region splitting and merging methods rely more on similarity information.

To perform the segmentation on a digital mammogram, each pixel of the image has to be assigned to one region. The method presented in the following sections performs a multi-level segmentation (i.e., many regions) based on a fuzzy membership value assigned to each pixel. First, the algorithm divides the image into two regions. Then each region is divided into two new segments (subregions), and the process can be continued until the desired number of segmentation levels is reached.

In designing this segmentation algorithm, the fuzzy sets approach [100] is employed to make the process of assigning pixels to different regions more accurate. This is because the boundaries of parenchyma and malignant masses in a mammogram do not have sharp transitions and appear unclear. This algorithm forms a part of a scheme for soft tissue abnormality detection (such as a mass), and since such abnormalities are larger than a certain size, to decide which region a pixel belongs to, we need to look at the intensity information of the surrounding pixels as well as the pixel's own gray-level value [101].

3.1  Two-level Segmentation

To achieve the segmentation of the image into two levels, our algorithm requires five steps. These steps are explained in Sections 3.1.1 to 3.1.5.

3.1.1 Normalization

The mammographic images used in this study are saved in the "PIC" image format, in which each pixel has a gray-level value between 0 and 255. For each image, the maximum and the minimum gray levels are determined, and the pixel values are accordingly normalized to floating-point numbers between 0.0 and 1.0. This normalization is linear, so that each pixel whose gray level lies between the minimum and the maximum (inclusive) gray levels of the original image corresponds to a normalized pixel value.

3.1.2 Weight Initialization

As previously mentioned, in addition to the intensity of a pixel, the gray-level values of its surrounding pixels should also contribute to the assignment of the pixel to a certain region. Therefore, an m × m mask with different weight coefficients is used. The weights are set so that the neighborhood effect is included. The mask size is chosen to be 5 × 5 pixels, and its weight values are:

$$W(i,j) = \exp\left(-\frac{\|(i,j)\|}{\alpha}\right) \tag{3.1}$$

where $\|(i,j)\|$ is the Euclidean distance between position $(i,j)$ of the mask and the central position. The constant parameter $\alpha$ controls the shape of the exponential. Increasing $\alpha$ causes the mask to have larger weight values, and thus a higher contribution from the neighboring pixels. The mask will have a weight value of 1 at the center and values smaller than 1 in the other positions.

Another constant parameter, $\beta$, is also introduced to control the maximum possible effect of the neighboring pixels in each iteration. The weights are normalized such that:

$$\sum_{(i,j) \neq \text{mask center}} W(i,j) = \beta \tag{3.2}$$

Again, increasing $\beta$ allows the neighboring pixels to have more effect in deciding which region the central pixel belongs to.

3.1.3 Fuzzy Membership Values

After the normalization and weight initialization steps, a fuzzy membership value is assigned to each pixel of the image.
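As an illustration of the two preparatory steps (Sections 3.1.1 and 3.1.2), the following Python sketch normalizes an image and builds the neighborhood mask. It assumes that the exponential in Eq. 3.1 has the form exp(-d/α), with d the Euclidean distance to the mask center, and that the center weight is held at 1 while only the neighbor weights are rescaled to sum to β. The function names are hypothetical and are not taken from the original implementation.

```python
import math

def normalize_gray_levels(image):
    """Linearly map gray levels so the image minimum becomes 0.0 and the maximum 1.0."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:  # flat image; returning zeros is our own convention
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) / (hi - lo) for p in row] for row in image]

def neighborhood_mask(size=5, alpha=1.0, beta=1.0):
    """Build a size x size mask: centre weight 1, neighbour weights summing to beta."""
    half = size // 2
    # Assumed reading of Eq. 3.1: the weight decays exponentially with distance / alpha.
    mask = [[math.exp(-math.hypot(r - half, c - half) / alpha)
             for c in range(size)] for r in range(size)]
    neighbour_sum = sum(map(sum, mask)) - mask[half][half]
    scale = beta / neighbour_sum              # Eq. 3.2: neighbours sum to beta
    mask = [[w * scale for w in row] for row in mask]
    mask[half][half] = 1.0                    # centre weight stays 1 (assumption)
    return mask
```

With this reading, a larger `alpha` flattens the mask before rescaling, and `beta` alone fixes the total weight given to the neighbors relative to the center pixel.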
This is done by applying a fuzzy membership function (Figure 3.1) to each pixel of the image. The fuzzy membership value assigned to the pixel indicates the degree of its membership to one of two sets. The membership value determines how close a pixel is to becoming a member of either set. Here, our fuzzy sets are the set of pixels which belong to region "0", and the set of pixels which belong to region "1".

Figure 3.1: Fuzzy membership function. [horizontal axis: normalized gray level; vertical axis: fuzzy membership value]

Threshold $T$ is the gray-level value corresponding to 0.5 on the fuzzy membership value axis (i.e., the Y axis). The location of $T$ relative to the function curve is fixed, i.e., a larger $T$ shifts the membership function to the right, and a smaller $T$ shifts the function to the left. The threshold value $T$ is an important factor which has to be chosen carefully. The value of the threshold $T$ may be chosen based on the histogram information of the image. A natural choice is the minimal value between two peaks of the image histogram. Another way of finding the threshold is based on an entropy measure defined by Kapur et al. in [102]. In our implementation of the algorithm, the histogram valley minimum is chosen as the threshold value. Note that if a linear function (with slope 1) is chosen as the fuzzy membership function, and the threshold $T$ is chosen to be 0.5, the fuzzy membership values actually become the normalized image gray-level values.

3.1.4 Error Calculation

At this stage of the algorithm, after normalization, neighborhood weight initialization and fuzzy membership assignment, an iterative process starts to decide which pixel belongs to which region.
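The histogram-valley choice of the threshold T described in Section 3.1.3 can be sketched as follows. The thesis does not say how the two peaks are located, so the bin count, the moving-average smoothing, the "two tallest local peaks" rule, and the fallback value are all assumptions made for this illustration, and the function names are hypothetical.

```python
def smoothed_histogram(pixels, bins=64, radius=2):
    """Histogram of values in [0, 1], smoothed with a moving average."""
    counts = [0] * bins
    for value in pixels:
        counts[min(int(value * bins), bins - 1)] += 1
    smooth = []
    for i in range(bins):
        window = counts[max(0, i - radius):i + radius + 1]
        smooth.append(sum(window) / len(window))
    return smooth

def valley_threshold(pixels, bins=64, radius=2):
    """Normalized gray level at the lowest histogram point between the two tallest peaks."""
    h = smoothed_histogram(pixels, bins, radius)
    padded = [-1.0] + h + [-1.0]        # lets the first and last bins count as peaks
    peaks = [i for i in range(bins)
             if h[i] > 0 and padded[i + 1] >= padded[i] and padded[i + 1] > padded[i + 2]]
    if len(peaks) < 2:
        return 0.5                      # fallback for unimodal data (our convention)
    left, right = sorted(sorted(peaks, key=lambda i: h[i], reverse=True)[:2])
    valley = min(range(left, right + 1), key=lambda i: h[i])
    return (valley + 0.5) / bins        # centre of the valley bin
```

On real mammograms the histogram is rarely this clean, so the smoothing radius and the peak rule would need tuning per data set.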
An error function is used to indicate whether all the image pixels are assigned to either region "0" or region "1". The error function is defined as:

$$E = \sum_{i,j} O(i,j)\,\bigl(1 - O(i,j)\bigr) \tag{3.3}$$

where $O(i,j)$ is the fuzzy membership value of the image pixel located at the $i$th row and $j$th column. Note that $E$ is always a non-negative number, and it has its minimum, zero, only if the fuzzy membership value, $O(i,j)$, of each image pixel is either 1 or 0. Also note that $E$ is maximum if $O(i,j) = 0.5$ for all $i, j$. Our aim is to update the values of $O(i,j)$, $\forall i,j$, so as to drive $E$ to its minimum value, which is zero. At that stage the value of each $O(i,j)$ will be equal to 0 or 1 and the corresponding pixel will belong to region "0" or region "1". To update the values of $O(i,j)$, an iterative procedure is used. If in each iteration $O(i,j)$ changes in the same direction as that of reducing the error $E$, the segmentation moves closer to its final stage.

3.1.5 Update Rule

Now that the error function is defined, the fuzziness measure of each pixel (represented by its fuzzy membership value) has to be updated so as to reduce the error function value. Inspired by the gradient method and also the back-propagation algorithm for training neural networks [103], we define the updating rule as follows:

$$O^{+}(i,j) = O(i,j) + \eta\left(-\frac{\partial E(i,j)}{\partial O(i,j)}\right) O(i,j)\,\bigl(1 - O(i,j)\bigr) \tag{3.4}$$

where $E(i,j)$ is the error caused by pixel $(i,j)$ and $\eta$ is the learning rate. The term $O(i,j)\bigl(1 - O(i,j)\bigr)$ becomes zero for a pixel whose fuzzy membership value is 0 or 1, meaning the pixel is already classified and no further change in its fuzzy membership value is needed.

The main problem with the updating rule of Eq. 3.4 is that in calculating the change in a pixel's fuzzy membership value, only the information from the considered pixel is used.
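For reference, the gradient term in Eq. 3.4 can be worked out explicitly. The short derivation below assumes that the per-pixel error E(i,j) is the corresponding summand of Eq. 3.3:

```latex
\begin{align*}
E(i,j) &= O(i,j)\,\bigl(1 - O(i,j)\bigr) \\
\frac{\partial E(i,j)}{\partial O(i,j)} &= 1 - 2\,O(i,j) \\
-\frac{\partial E(i,j)}{\partial O(i,j)} &= 2\,\bigl(O(i,j) - 0.5\bigr) \\
O^{+}(i,j) &= O(i,j) + 2\eta\,\bigl(O(i,j) - 0.5\bigr)\,O(i,j)\,\bigl(1 - O(i,j)\bigr)
\end{align*}
```

Under this assumption a pixel with O(i,j) > 0.5 is pushed toward 1 and a pixel with O(i,j) < 0.5 toward 0, which accounts for the factor 2η and the term (O − 0.5) that appear in the neighborhood form of the rule.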
However, in segmenting a mammographic image for the purpose of mass detection, the information of the neighboring pixels as well as that of the pixel itself is needed. To satisfy this requirement, Eq. 3.4 is changed to:

$$O^{+}(i,j) = O(i,j) + \eta \left[\sum_{k,l \in N^{O}} W(k,l)\left(-\frac{\partial E(k,l)}{\partial O(k,l)}\right)\right] O(i,j)\,\bigl(1 - O(i,j)\bigr) \tag{3.5}$$

where $W(i,j)$ are the weight coefficients of the mask defining a neighborhood $N^{O}$ around $O(i,j)$. Using the error function of Eq. 3.3, Eq. 3.5 can be written as:

$$O^{+}(i,j) = O(i,j) + 2\eta \left[\sum_{k,l \in N^{O}} W(k,l)\,\bigl(O(k,l) - 0.5\bigr)\right] O(i,j)\,\bigl(1 - O(i,j)\bigr) \tag{3.6}$$

Using Eq. 3.6 to update the fuzzy membership value of a pixel will reduce the error measure in each iteration, and does not change the values of already assigned pixels.

Another effect of the term $O(i,j)\bigl(1 - O(i,j)\bigr)$ in Eq. 3.6 is to produce small steps when $O(i,j)$ has a value close to 0 or 1. But we know that for those pixels with a fuzziness value close to 0 (or 1), it is highly unlikely that they become members of region "1" ("0"), and therefore the process can be accelerated for those pixels. To do so, Eq. 3.6 is changed to:

$$O^{+}(i,j) = O(i,j) + \Delta O(i,j), \quad \text{where} \quad \Delta O(i,j) = 2\eta \sum_{k,l \in N^{O}} W(k,l)\,\bigl(O(k,l) - 0.5\bigr) \tag{3.7}$$

where we keep $O^{+}(i,j)$ limited to $0 \le O^{+}(i,j) \le 1$. Notice that Eq. 3.7 makes a quick decision for pixels with fuzzy values close to 0 or 1 (because $\bigl(O(k,l) - 0.5\bigr)$ has a large value), but for a fuzzier pixel ($\sum_{k,l \in N^{O}} W(k,l)\bigl(O(k,l) - 0.5\bigr)$ closer to zero), there is a smaller change in its value, causing a delay in segmentation of the pixel. In other words, for a pixel with $O(i,j)$ close to 0.5, a quick assignment to one of the regions is not desired. The update rule of Eq. 3.7 delays the assignment of such pixels until their neighboring pixels (which are not as fuzzy) become members of either set, and then pushes that pixel towards becoming a member of the appropriate region.
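The iteration described above can be sketched in a few lines of Python. This is a minimal illustration rather than the original implementation: the default step follows Eq. 3.6, the `accelerated` flag drops the O(1 − O) factor as we read Eq. 3.7, neighbors that fall outside the image are simply ignored (an assumption, since border handling is not described), and all function names are hypothetical.

```python
def fuzziness_error(O):
    """Eq. 3.3: sum of O(1 - O); zero only when every pixel is exactly 0 or 1."""
    return sum(o * (1.0 - o) for row in O for o in row)

def update_once(O, W, eta, accelerated=False):
    """One pass of the neighbourhood update over a membership image O with mask W."""
    rows, cols, half = len(O), len(O[0]), len(W) // 2
    updated = [row[:] for row in O]
    for i in range(rows):
        for j in range(cols):
            push = 0.0
            for a, mask_row in enumerate(W):
                for b, weight in enumerate(mask_row):
                    k, l = i + a - half, j + b - half
                    if 0 <= k < rows and 0 <= l < cols:   # skip off-image neighbours
                        push += weight * (O[k][l] - 0.5)
            step = 2.0 * eta * push
            if not accelerated:                           # Eq. 3.6 keeps the O(1 - O) factor
                step *= O[i][j] * (1.0 - O[i][j])
            updated[i][j] = min(1.0, max(0.0, O[i][j] + step))  # keep 0 <= O+ <= 1
    return updated

def segment(O, W, eta=0.5, max_iterations=1000, accelerated=False):
    """Iterate until the error is exactly zero or the iteration budget is spent."""
    for _ in range(max_iterations):
        if fuzziness_error(O) == 0.0:
            break
        O = update_once(O, W, eta, accelerated)
    return O
```

In the non-accelerated form the membership values only approach 0 and 1 asymptotically, so a practical stopping test would compare the error against a small tolerance instead of exact zero.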
One advantage of this algorithm is that the effect of neighboring pixels is controlled by the factors $2\eta$ and $W(k,l)$, and can be adjusted according to the type of application. Also notice that reducing the error function (Eq. 3.3) in each iteration is still satisfied by the update rule of Eq. 3.7. The flow diagram of this segmentation algorithm is shown in Figure 3.2.

Figure 3.2: Flow chart of the fuzzy segmentation algorithm for each level of segmentation. [Boxes: start; image input; gray level normalization; mask weight initialization; histogram calculation & threshold extraction; fuzzy membership value assignment; error calculation; update fuzzy membership value; a decision on the error with a "No" branch back into the loop; stop]

3.2 Multi-level Segmentation

To divide the image into two regions, the above algorithm stops after observing zero error in an iteration of the process explained above. But if more than two regions are required, the algorithm attempts to divide each of these regions into two new segments. This process may be continued until the desired number of segmentation levels is obtained.

For a 3-level segmentation, as an example, new threshold values have to be found for the pixels of the original image belonging to region "0" (call it $T_0$) and for the pixels of the original image belonging to region "1" ($T_1$). Now the fuzzy membership function for region "0" is limited between 0 and 0.5 and shifted to $T_0$, and that of region "1" is limited between 0.5 and 1.0 and shifted to $T_1$. The weight coefficients of the neighborhood mask can either remain the same, or be defined differently for each region. Two separate error functions have to be defined:

$$E_0 = \sum_{i,j \in \text{region "0"}} O(i,j)\,\bigl|0.5 - O(i,j)\bigr| \tag{3.8}$$

and

$$E_1 = \sum_{i,j \in \text{region "1"}} \bigl|O(i,j) - 0.5\bigr|\,\bigl(1.0 - O(i,j)\bigr) \tag{3.9}$$

Notice that the absolute value of the term $\bigl(0.5 - O(i,j)\bigr)$ (or $\bigl(O(i,j) - 0.5\bigr)$) is used. That is because the pixels of region "0" can have values greater than 0.5, and the pixels of region "1" can have values less than 0.5. The factor of 0.5 in the update equation, Eq. 3.7, also has to be changed to 0.25 and 0.75 for updating the pixels of regions "0" and "1", respectively.

The images of Figures 3.3 and 3.4 are two digitized mammograms. Both mammograms contain malignant masses which appear brighter than other areas. The fuzzy segmentation algorithm for 3-level segmentation is applied to these images and the results are shown in Figures 3.5 and 3.6. (Also see [101].)

Figure 3.5: Mammogram of Figure 3.3 segmented into 3 levels.

To compare the result of our segmentation algorithm with the common segmentation method, thresholding, the sample mammogram of Figure 3.7 was considered. The image of Figure 3.8 is the result of applying the 3-level fuzzy segmentation algorithm to that image, and Figure 3.9 is the result of applying 3-level thresholding to the same image. Thresholding results in many small regions. It also causes many disconnected regions with rough edges. However, the fuzzy segmentation algorithm eliminates the unimportant small regions and smooths the edges. Therefore, a small calcification or a film artifact will not appear in the result of a fuzzy segmented image.

Figure 3.7: A sample mammogram for comparing the two segmentation methods.

3.3 Application of the fuzzy segmentation algorithm

Many algorithms developed for detection of malignant masses in digitized mammograms consist of two steps: a segmentation step, which produces a number of suspicious regions, and a classification step, which usually uses some image features to discriminate between the malignant tumor and other normal regions.
(A summary of these algorithms is presented in Section 2.2 and also in Woods and Bowyer's article [83].) Our developed segmentation algorithm was evaluated by employing the algorithm as the first step of a mass detection technique. The following sections explain the two-step mass detection method and present the results of applying this mass detection method to a set of mammograms.

Figure 3.8: Mammogram of Figure 3.7 segmented into 3 levels using the fuzzy segmentation algorithm.

Figure 3.9: The same mammogram of Figure 3.7 segmented into 3 levels using the thresholding method.

3.3.1 Step 1: Mammogram Segmentation

Intensity information of a mammographic image is used in this step to divide the image into different regions. Details of an iterative algorithm which employs the fuzzy sets approach and achieves mammogram segmentation are introduced in Sections 3.1 and 3.2. In this implementation of our fuzzy segmentation algorithm, only two levels of segmentation are used. Each mammogram is manually divided into 256 × 256 pixel regions (ROIs) and the algorithm is applied to each ROI individually. This enables the intensity dynamic range of each ROI to determine the threshold $T$; therefore the algorithm will perform better for segmentation of subtle masses. The threshold $T$ is chosen to be the gray-level value at the valley between two peaks of the image histogram of each ROI. Figure 3.10 shows a sample histogram and the value chosen for the threshold.

3.3.2 Step 2: Feature Extraction

The fuzzy segmentation algorithm identifies some regions which are suspected to be masses, called mass-candidates hereafter. In the second step of our detection method, 24 discrete texture image features are calculated for each mass-candidate.
These features are thoroughly discussed in Section 4.1.3 of this thesis, but here is a summary of these features [104].

To calculate the discrete texture features, each of the mass-candidates is divided into three discrete regions according to their optical density values. The optical density for each pixel is defined by Eq. 3.10:

$$OD_{ij} = \log\left(\frac{I_{ij}}{I_0}\right) \tag{3.10}$$

in which $I_{ij}$ is the intensity of pixel $(i,j)$ and $I_0$ is the background intensity, chosen to be the mode of the gray-level values in the background area. The three discrete regions are defined by two optical density thresholds. In order to make the measurements independent of mass brightness variation, the thresholds are scaled to the mean optical density of the region. Each pixel of a mass-candidate is assigned to one of the three density regions, low, medium and high, resulting in disconnected regions.

Figure 3.10: A sample image histogram for determining the threshold $T$ for the segmentation algorithm.

The fraction of the total mass-candidate area which is occupied by each optical density region defines the first set of features. For the low density region, e.g., it is defined as:

$$\mathit{low\_OD\_area} = \frac{A_{low}}{A} \tag{3.11}$$

where $A$ is the total mass-candidate area and $A_{low}$ is the total area of the low density region (number of pixels in the low density region). The ratio of the integrated optical density of each density level forms the second set of features:

$$\mathit{low\_OD\_ratio} = \frac{IOD_{low}}{IOD} \tag{3.12}$$

where

$$IOD = \sum_{i,j} OD_{ij} \tag{3.13}$$

$$IOD_{low} = \sum_{i,j \in low} OD_{ij} \tag{3.14}$$

For the medium and high optical density regions, features similar to those of Eqs. 3.11 and 3.12 are defined. Another set of features involves comparing the mean optical density (MOD) of the low and other regions.
For example:

$$\mathit{low\_VS\_med\_OD} = \frac{MOD_{low}}{MOD_{med}} \tag{3.15}$$

where $MOD_{low} = IOD_{low}/A_{low}$ and $MOD_{med} = IOD_{med}/A_{med}$. Two other features, low_VS_high_OD and low_VS_medhi_OD, are defined similarly, where for the latter the combined medium and high density regions are considered.

Three other features are low_den_obj, med_den_obj and high_den_obj, which are the numbers of discrete 8-connected subcomponents of the mass-candidates consisting of more than one pixel of low, medium and high optical density, respectively.

Compactness of the low optical density region is calculated as:

$$\mathit{low\_OD\_compactness} = \frac{P_{low}^{2}}{4\pi A_{low}} \tag{3.16}$$

where $P_{low}$ is the sum of all perimeters of the low density region components. The same feature is similarly calculated for the medium, high and med-high regions.

The next set of features measures the average distance between the geometrical center of the mass-candidate and all pixels from each density level. For the low density region, e.g.,

$$\mathit{low\_average\_distance} = \frac{\sum_{i,j} d(center, \mathit{low\_pixel}_{ij})}{A_{low}\,R} \tag{3.17}$$

where $d(center, \mathit{low\_pixel}_{ij})$ is the distance, in pixels, between a low density pixel and the center of the mass-candidate region ($center$). $R$ is the mean radius of the mass-candidate.

An asymmetry measure of the distribution of different levels of optical density is given by the final set of discrete texture features calculated. For the low density region:

$$\mathit{low\_center\_region} = \frac{d(CM_{low}, center)}{R} \tag{3.18}$$

where $d(CM_{low}, center)$ denotes the distance between $center$ and the center of gravity of the low density region ($CM_{low}$).

To determine the best features that can separate the mass-candidates (obtained by the segmentation step) into either masses or normal regions, a stepwise discriminant analysis was employed. (See Section 4.2.1 and Appendix A for more details about this stepwise discriminant analysis.) It was found that the two features, low_average_distance and low_VS_med_OD, can best distinguish between the two groups of masses and normal regions.

3.3.3 Results of applying the mass detection method on a mammogram data set

A set of 36 mammograms (from 18 breast cancer patients) with a malignant mass in each of the two projections was collected. These cases were randomly chosen from breast cancer patients who were diagnosed after the detection of a malignant mass in their mammograms. All the mammographic films were digitized at a spatial resolution of 100 μm/pixel and a photometric resolution of 12 bits/pixel, using the Analytical Imaging Mammography (AIM) system [105, 106, 107]. However, for the purpose of this work, only the 8 most significant bits of each pixel were used (1 byte/pixel). Each pair of these 36 mammograms consisted of x-ray images of the same breast (with a tumor) from two different views, craniocaudal (CC) and mediolateral oblique (MLO). An expert radiologist reviewed all the mammograms and marked the position of the mass on each film. Because of an incomplete imaging view, one of the MLO mammograms did not contain the mass; therefore, 35 masses were available.

In order to do the first step of the detection method, i.e., the image segmentation, each image is divided into 256 × 256 pixel ROIs. One of these ROIs is positioned such that it contains the mass (Figure 3.11). Two levels of segmentation are produced for each ROI by our fuzzy segmentation algorithm, introducing a number of mass-candidates in each mammographic image.
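To make the discrete texture features of Section 3.3.2 more concrete, the sketch below computes three of them for a mass-candidate given as a flat list of pixel intensities. Several points are assumptions made for the illustration: the logarithm in Eq. 3.10 is taken as base 10, the two density thresholds are passed in as absolute optical density values, the ratio in Eq. 3.15 is read as MOD_low over MOD_med, and the function name is hypothetical.

```python
import math

def discrete_density_features(intensities, background, low_cut, high_cut):
    """Area fraction, IOD ratio and MOD ratio for the low optical density region."""
    # Eq. 3.10 with a base-10 logarithm (the base is our assumption).
    od = [math.log10(v / background) for v in intensities]
    low = [d for d in od if d < low_cut]
    med = [d for d in od if low_cut <= d < high_cut]
    if not low or not med:
        raise ValueError("low and medium density regions must both be non-empty")
    mod_low = sum(low) / len(low)      # mean optical density of the low region
    mod_med = sum(med) / len(med)      # mean optical density of the medium region
    return {
        "low_OD_area": len(low) / len(od),     # Eq. 3.11
        "low_OD_ratio": sum(low) / sum(od),    # Eq. 3.12
        "low_VS_med_OD": mod_low / mod_med,    # Eq. 3.15 as read here
    }
```

Features for the medium and high regions would follow the same pattern with the corresponding pixel subsets.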
Figure 3.12 shows two ROI samples and the mass-candidates that resulted from applying our segmentation algorithm.

To avoid unnecessary computations, before the feature extraction step is performed, the following optional procedure may be used. The mass-candidates which have an area of 1200 pixels or less are disregarded. This will eliminate the bright areas of a mammogram, such as a large calcification, which are smaller than about 0.4 cm in diameter. These areas may be picked up as mass-candidates by the segmentation algorithm, but are very unlikely to be of the malignant mass type. Notice that the elimination of the small mass-candidates is not an essential step of our detection method, but it does reduce the computational time.

Figure 3.11: A mammogram containing a malignant mass is divided into 256 × 256 ROIs. The left-most ROI contains the malignant mass.

In the second step, the discrete texture features are calculated for each of the remaining mass-candidates. For the features low_average_distance and low_VS_med_OD, the resultant data are plotted in Figure 3.13. Notice the clustering of the masses group in the lower left corner of the plot.

In calculation of the discrete texture features, the thresholds that divide each mass-candidate into the three low, medium and high discrete regions (as discussed in Section 3.3.2) were set to values of 1/3 and 2/3 of the maximum optical density of some selected mass-candidates. A better choice of these threshold values, based on a larger database, may improve the accuracy and reduce the false-positive rate.

Figure 3.12: Image samples of two ROIs and the marked mass-candidates. (a): An ROI containing a malignant mass. (b): An ROI with normal mammographic patterns. (c): The fuzzy segmentation algorithm applied to the ROI of image (a). Notice the marked boundary of the suspicious area after segmenting the ROI into two regions. (d): ROI image of (b) processed with the same fuzzy segmentation algorithm, resulting in a number of mass-candidates.

Figure 3.13: The vertical and horizontal axes represent the two texture features which were found to be the best classifiers. The two classes of normal patterns and malignant masses are plotted (not all the normal cases are shown). Notice the clustering of the masses in the lower left corner. [horizontal axis: low_average_distance; legend: normal regions, masses]

The best linear classification function for separating malignant masses from other normal mammogram patterns in Figure 3.13 achieved a 94.3% true-positive detection rate and 0.24 false-positives per image, using the jackknifed classification method [108]. Figure 3.14 shows the ROC curve for this experiment. Employing a non-linear classification function to classify masses from normal regions in the image texture feature space may improve the performance of the method. An artificial neural network is a good example of a non-linear classifier.

Figure 3.14: The ROC curve for applying the mass detection method to 35 mammograms, each containing a malignant mass. [panels: "Classification, ROC curve" and "Jackknifed, ROC curve"; horizontal axis: % false-positive]

Our mass detection CAD algorithm uses only one mammogram for the process and yet achieves a high success rate of 94% sensitivity with only 0.24 false-positives per image. This compares favorably with results of other studies (refer to Table 2.1). The only study with a higher sensitivity and a similar false-positive rate was reported by Kegelmeyer [55], which only targets stellate masses (one type of malignant mass). Our database was not classified according to mass type, and it contains other mass types such as circumscribed and irregular, as well as stellate masses. However, for such a comparison to be meaningful, a common database is required for testing different mass detection schemes.

Using a single mammographic image in this technique results in a relatively fast processing time. Considering the additional information that can be obtained from the mammogram of the opposite breast, it is feasible to expand our CAD algorithm to achieve an even lower false-positive rate.

The strength of this algorithm lies in the segmentation step, whose design is based on the properties of mammographic patterns. The threshold $T$ in the algorithm, to which the fuzzy membership function is shifted, is an important factor of our segmentation process and should be selected carefully based on the image histogram information.

the art and outrage of breast cancer

"Persephone's Return", 31" x 28", Oil Pastel, Water Color, Colored Pencil

"After I was diagnosed with cancer, I became fascinated by the story of Persephone, the Greek goddess of eternal spring, of innocence. Abducted to the Underworld, Persephone ate the seeds of the pomegranate, symbol of fruition and creativity. Eventually, she was released, innocent no longer. I imagine that she felt she had a new chance to find her life again, to embrace the light. Like Persephone, I journeyed in the dark realms and used the seeds of creativity to find my way home. By imagining myself as the goddess of eternal spring, I was able to escape from the pain, the grieving, the dark and barren landscape the doctors painted for me. I have returned to the light, to living moments as they come and embracing every second I have."

Joyce Hadtke 1995

Chapter 4

Retrospective Study of Mammograms

Survival from breast cancer is directly related to the stage at diagnosis. The earlier the detection, the higher the chances of successful treatment. In an attempt to improve early detection, a study has been undertaken to analyze the screening mammograms of breast cancer patients taken prior to cancer detection. For women who were diagnosed with breast cancer by detection of a malignant mass in their mammograms, a search for any structural or textural distortion in the screening mammograms prior to diagnosis was undertaken. This study intends to examine the hypothesis that:

"there exist differences between the region that subsequently became a malignant mass, and other normal areas of the mammographic images taken in the last screening examination prior to detection of the tumor."

To investigate this hypothesis, collecting a mammogram database of cancer patients with at least one screening mammogram examination prior to diagnosis of a malignant mass was an essential step. For each such case, the site in the previous screening mammogram which corresponds to the malignant mass that subsequently developed in the next screening mammogram is hereafter called the mass-growing region. This site was located in every previous screening mammogram. The next step was to examine different types of image features in order to find the best features to distinguish between the mass-growing region and other normal regions of the mammograms.

In the following sections of this chapter, the set of image features used in distinguishing the mass-growing regions from other normal patterns in the mammograms is introduced. Some of these features are also used for measuring the cell nuclear texture in image cytometry [109] and they have been modified for our application. Following this, two approaches for the detection of mass-growing regions on a small database (the preliminary study) are explained. The preliminary study on a small database tests the promise of the hypothesis. In this study we only used the mammograms of 20 patients, with and without the assumption of knowing the region boundaries. Finally, evaluation of the method is performed on a larger set of mammograms (the main database) and the results are presented.

4.1 Image Features

The following sets of features are applied to the mammograms. They can be classified as:

1. Morphological features

2. Photometric features

3. Textural features

The features studied in each set are described in the following sections. The textural features include the discrete, Markovian, non-Markovian, fractal and run-length texture features. There are 62 photometric and texture features and 45 morphological features, explained in the following. The morphological features, which define the shape, size and boundary variation of the object, were only used in the preliminary study of the previous screenings. In the preliminary study, it was assumed that the boundaries of the normal and abnormal regions are known. This assumption was removed in evaluation of the main database, and a fixed-size circle was used to mark the boundary of each region.
These features are calculated over an image object, which is an area of the image marked by a closed boundary or an image mask. Two methods for marking the objects are introduced later, in Section 4.2.

4.1.1 Morphological Features

Morphological features estimate the object size, shape and boundary variations.

• Area, $A$, is defined as the total number of pixels belonging to the object.

• x_centroid, y_centroid are the coordinates of the geometrical center of the object, defined with respect to the image origin:

$$\mathit{x\_centroid} = \frac{\sum_{i \in object} i}{A} \tag{4.1}$$

$$\mathit{y\_centroid} = \frac{\sum_{j \in object} j}{A} \tag{4.2}$$

where $i$ and $j$ are image pixel coordinates. (The notation $\sum_{i \in object}$ is the sum over the variable $i$ for pixels belonging to the object.)

• mean_radius, max_radius are the mean and maximum values of the length of the object's radial vectors from the object centroid to its edge pixels:

$$\mathit{mean\_radius} = \bar{r} = \frac{\sum_{k=1}^{N} r_k}{N} \tag{4.3}$$

$$\mathit{max\_radius} = \max(r_k) \tag{4.4}$$

where $r_k$ is the $k$th radial vector, i.e. the vector between the $k$th pixel on the object border and (x_centroid, y_centroid), and $N$ is the number of pixels on the object edge.

• var_radius is the variance of the length of the object's radial vectors, $r_k$:

$$\mathit{var\_radius} = \frac{\sum_{k=1}^{N} (r_k - \bar{r})^2}{N - 1} \tag{4.5}$$

where $\bar{r}$ is the mean of $r_k$ (Eq. 4.3).

• sphericity is calculated as a ratio of the radii of two circles centered at the object centroid. One circle is the largest circle that is fully inscribed inside the object perimeter, corresponding to the minimum length of the object's radial vectors. The other circle is the minimum circle that completely circumscribes the object's perimeter, corresponding to the maximum length of the object's radial vectors.
$$\mathit{sphericity} = \frac{\mathit{min\_radius}}{\mathit{max\_radius}} = \frac{\min(r_k)}{\max(r_k)} \tag{4.6}$$

• eccentricity may be interpreted as the ratio of the major axis to the minor axis of the best-fit ellipse which describes the object, and gives the minimal value of 1 for circles. Eccentricity is calculated as:

$$\mathit{eccentricity} = \sqrt{\frac{\lambda_{max}}{\lambda_{min}}} \tag{4.7}$$

where $\lambda_{max}$ and $\lambda_{min}$ are the maximal and minimal eigenvalues of the second central moment matrix of the object's pixels. The second central moment matrix is:

$$\begin{bmatrix} x_{mmnt2} & xy_{crossmmnt2} \\ xy_{crossmmnt2} & y_{mmnt2} \end{bmatrix}$$

and each term is calculated as:

$$x_{mmnt2} = \sum_{\substack{i=1 \\ i \in object}}^{L} \left(i - \mathit{x\_centroid}\right)^2$$

$$xy_{crossmmnt2} = \sum_{\substack{i=1 \\ i \in object}}^{L} \; \sum_{\substack{j=1 \\ j \in object}}^{M} \left(i - \mathit{x\_centroid}\right)\left(j - \mathit{y\_centroid}\right)$$

$$y_{mmnt2} = \sum_{\substack{j=1 \\ j \in object}}^{M} \left(j - \mathit{y\_centroid}\right)^2$$

where $L$ and $M$ are the number of columns and rows of the image, respectively.

• inertia_shape is a measure of the "roundness" of an object. It is calculated as the moment of inertia of the object, normalized by the area squared. It gives the minimal value of 1 for circles.

$$\mathit{inertia\_shape} = \frac{2\pi \sum_{i,j \in object} R_{ij}^{2}}{A^{2}} \tag{4.8}$$

where $R_{ij}$ is the distance of the pixel from the object centroid.

• compactness is another measure of the object's roundness and is calculated as:

$$\mathit{compactness} = \frac{P^{2}}{4\pi A} \tag{4.9}$$

where $P$ is the object perimeter. Compactness gives the minimal value of 1 for circles.

• obj_orient represents the object orientation, measured as a deflection of the main axis of the object from the Y direction:

$$\mathit{obj\_orient} = \frac{180}{\pi}\left[\frac{\pi}{2} + \tan^{-1}\left(\frac{\lambda_{max} - y_{mmnt2}}{xy_{crossmmnt2}}\right)\right] \tag{4.10}$$

where $\lambda_{max}$ is the maximal eigenvalue of the second central moment matrix introduced in Eq. 4.7. A geometrical interpretation of obj_orient is that it is the angle (measured in degrees and in a clockwise sense) between the Y axis and the major axis of the best-fit ellipse. This feature may be appropriate for comparison of the tumor progression in mammograms of consecutive screenings.
• elongation, is a measure of the extent of the object along the principal direction (corresponding to the major axis) relative to the direction orthogonal to it. These lengths are estimated using Fourier series coefficients of the radial function of the object. The radial function of the object, $r(\theta)$, is calculated by sweeping the radius vector ($r_k$, introduced in Eq. 4.3) in equal angular steps and creating a function of the angle $\theta$ (Figure 4.1).

$$\text{elongation} = \frac{a_0 + 2\sqrt{a_2^2 + b_2^2}}{a_0 - 2\sqrt{a_2^2 + b_2^2}} \tag{4.11}$$

where $a_0$, $a_2$ and $b_2$ are Fourier coefficients of the radial function of the object, $r(\theta)$, defined by:

$$r(\theta) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(n\theta) + \sum_{n=1}^{\infty} b_n \sin(n\theta) \tag{4.12}$$

Figure 4.1: The radius vector $r_k$ and the corresponding angle $\theta$ are shown for an object.

• low-freq-ft, gives an estimate of the coarseness in the variations of the object boundary. It is measured as the energy of the lower harmonics of the Fourier spectrum of the object's radial function (chosen from the 3rd to the 11th harmonics).

$$\text{low-freq-ft} = \sum_{n=3}^{11} (a_n^2 + b_n^2) \tag{4.13}$$

where $a_n$ and $b_n$ are defined in Eq. 4.12.

• high-freq-ft, gives an estimate of the fine boundary variations. It is measured as the energy of the high frequency Fourier spectrum (chosen from the 12th to the 32nd harmonics) of the object's radial function.

$$\text{high-freq-ft} = \sum_{n=12}^{32} (a_n^2 + b_n^2) \tag{4.14}$$

• hrmn01-ft, ..., hrmn32-ft, are estimates of boundary variations and are calculated as the magnitude of the Fourier coefficients of the object radial function for each harmonic from the 1st to the 32nd.

$$\text{hrmn}N\text{-ft} = \sqrt{a_N^2 + b_N^2}, \quad N = 1, \ldots, 32 \tag{4.15}$$

where $a_N$ and $b_N$ are the Nth harmonic Fourier coefficients defined in Eq. 4.12.

4.1.2 Photometric Features

Photometric features give estimations of functions of the absolute intensity and the optical density levels of the object, as well as their distribution characteristics.
For an object area in a mammogram, the intensity of each pixel is its gray level value. The optical density value for each pixel of a mammogram object is defined as:

$$OD_{ij} = \log\left(\frac{I_{ij}}{I_0}\right) \tag{4.16}$$

where $OD_{ij}$ is the optical density of the pixel located at the ith column and the jth row, $I_{ij}$ is the intensity value of that pixel, and $I_0$ is the background intensity. Note that the above definition of optical density is modified by a minus sign from its original format [110]. For objects within a mammogram, where the background intensity is usually less than the intensity of the object pixels, changing the minus sign to a plus is required to produce positive values for OD. The background intensity is chosen to be the mode intensity value of the image pixels excluding the pixels belonging to the object. Other possible choices for $I_0$ could be the mean intensity value, or the first image pixel intensity value. In the calculation of these features for our databases, $I_0$ is the mode value, unless stated otherwise.

• OD-sum, is the unnormalized measure of the integrated optical density of the object.

$$\text{OD-sum} = \sum_{i,j \in object} OD_{ij} \tag{4.17}$$

where $OD_{ij}$ is defined in Eq. 4.16.

• var-intensity, mean-intensity, are the variance and mean of the intensity function of the object.

$$\text{mean-intensity} = \bar{I} = \frac{\sum_{i,j \in object} I_{ij}}{A} \tag{4.18}$$

$$\text{var-intensity} = \frac{\sum_{i,j \in object} (I_{ij} - \bar{I})^2}{A - 1} \tag{4.19}$$

where A is the object area measured as the total number of pixels within the object.

• OD-max, is the largest value of the optical density of the object.

$$\text{OD-max} = \max_{i,j \in object}(OD_{ij}) \tag{4.20}$$

• OD-var, is the normalized variance of the optical density function of the object.

$$\text{OD-var} = \frac{\sum_{i,j \in object} (OD_{ij} - \overline{OD})^2}{(A - 1)\,\overline{OD}^2} \tag{4.21}$$

where $\overline{OD}$ is the mean value of the optical density of the object, which is calculated by:

$$\overline{OD} = \frac{\sum_{i,j \in object} OD_{ij}}{A} \tag{4.22}$$

In Eq. 4.21, the variance is divided by the square of the mean optical density in order to make the measurement independent of the local intensity of the mammogram.

• OD-skewness, measures the asymmetry of the optical density distribution about the mean. It is calculated as the normalized third moment of the optical density function of the object.

$$\text{OD-skewness} = \frac{\sum_{i,j \in object} (OD_{ij} - \overline{OD})^3}{(A - 1)\left(\sum_{i,j \in object} (OD_{ij} - \overline{OD})^2\right)^{3/2}} \tag{4.23}$$

The skewness is divided by the variance of the optical density, OD, in order to make this feature independent of the object contrast.

• OD-kurtosis, is a measure of the flatness of the object's optical density function.

$$\text{OD-kurtosis} = \frac{\sum_{i,j \in object} (OD_{ij} - \overline{OD})^4}{(A - 1)\left(\sum_{i,j \in object} (OD_{ij} - \overline{OD})^2\right)^{2}} \tag{4.24}$$

4.1.3 Discrete Texture Features

The discrete texture features are based on the segmented regions of the object. These regions belong to the low, medium and high optical densities. The segmentation is based on two thresholds of the optical density. Any pixel with an optical density value greater than the first threshold, OD-high-thresh, belongs to the high OD region, and the pixels with optical density less than OD-high-thresh and greater than the second threshold, OD-med-thresh, belong to the medium OD region. The remaining pixels are assigned to the low OD region. The assignment of the pixels, which is based on their OD values, results in three regions which are not necessarily connected. OD-med-thresh and OD-high-thresh are chosen as two fixed fractions of the maximum OD value of some selected objects in the mammogram samples. In the following features the superscripts low, med and hi correspond to the low, medium and high optical density discrete regions, respectively.
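The optical density of Eq. 4.16 and the three-region segmentation described above can be illustrated with a minimal sketch. It is not the thesis implementation: the base-10 logarithm, the function names, the list-of-lists image layout and the handling of pixels that fall exactly on a threshold are assumptions, and the two thresholds are left as parameters.

```python
import math
from collections import Counter

def background_mode(image, mask):
    """Mode intensity of the pixels outside the object (mask is False there)."""
    outside = [v for row, mrow in zip(image, mask)
               for v, inside in zip(row, mrow) if not inside]
    return Counter(outside).most_common(1)[0][0]

def optical_densities(image, mask, background):
    """OD = log10(I / I0) for every object pixel (plus-sign form of Eq. 4.16).

    Assumes strictly positive intensities; base 10 is an assumption.
    """
    return [math.log10(v / background)
            for row, mrow in zip(image, mask)
            for v, inside in zip(row, mrow) if inside]

def region_area_fractions(od_values, od_med_thresh, od_high_thresh):
    """Fractions of the object area in the low, medium and high OD regions."""
    counts = {"low": 0, "med": 0, "hi": 0}
    for od in od_values:
        if od > od_high_thresh:
            counts["hi"] += 1
        elif od > od_med_thresh:
            counts["med"] += 1
        else:
            # Pixels at or below the medium threshold (assumed tie handling).
            counts["low"] += 1
    area = len(od_values)
    return {name: n / area for name, n in counts.items()}
```

Given an image and a boolean object mask, `background_mode` supplies the background intensity, `optical_densities` the per-pixel OD values, and `region_area_fractions` the three area ratios of the first discrete texture feature.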
• low-OD-area, med-OD-area, hi-OD-area, represent the ratio of the area of the low, medium and high optical density regions of the object to the total object area.

$$\text{low-OD-area} = \frac{A^{low}}{A} \tag{4.25}$$

$$\text{med-OD-area} = \frac{A^{med}}{A} \tag{4.26}$$

$$\text{hi-OD-area} = \frac{A^{hi}}{A} \tag{4.27}$$

• low-OD-sum, med-OD-sum, hi-OD-sum, are calculated as the value of the integrated optical density of the low, medium and high density regions, respectively, divided by the total integrated optical density, OD-sum.

$$\text{low-OD-sum} = \frac{\sum_{i,j \in object} OD_{ij}^{low}}{\text{OD-sum}} \tag{4.28}$$

$$\text{med-OD-sum} = \frac{\sum_{i,j \in object} OD_{ij}^{med}}{\text{OD-sum}} \tag{4.29}$$

$$\text{hi-OD-sum} = \frac{\sum_{i,j \in object} OD_{ij}^{hi}}{\text{OD-sum}} \tag{4.30}$$

• low-OD-cmp, med-OD-cmp, hi-OD-cmp, medhi-OD-cmp, are characteristic of the compactness of the low, medium, high, and combined medium and high density regions, respectively. Each optical density region is treated as a single (possibly disconnected) object.

$$\text{low-OD-cmp} = \frac{(P^{low})^2}{4\pi A^{low}} \tag{4.31}$$

$$\text{med-OD-cmp} = \frac{(P^{med})^2}{4\pi A^{med}} \tag{4.32}$$

$$\text{hi-OD-cmp} = \frac{(P^{hi})^2}{4\pi A^{hi}} \tag{4.33}$$

$$\text{medhi-OD-cmp} = \frac{(P^{med} + P^{hi})^2}{4\pi (A^{med} + A^{hi})} \tag{4.34}$$

where P is the sum of the perimeters of the disconnected regions for each of the optical density intervals.

• low-avg-dst, med-avg-dst, hi-avg-dst, medhi-avg-dst, represent the average separation between the low, medium, high, and combined medium and high density pixels from the center of the object, normalized by the object mean-radius.

$$\text{low-avg-dst} = \frac{\sum_{i,j \in object} R_{ij}^{low}}{A^{low} \cdot \text{mean-radius}} \tag{4.35}$$

$$\text{med-avg-dst} = \frac{\sum_{i,j \in object} R_{ij}^{med}}{A^{med} \cdot \text{mean-radius}} \tag{4.36}$$

$$\text{hi-avg-dst} = \frac{\sum_{i,j \in object} R_{ij}^{hi}}{A^{hi} \cdot \text{mean-radius}} \tag{4.37}$$

$$\text{medhi-avg-dst} = \frac{\sum_{i,j \in object} R_{ij}^{med} + \sum_{i,j \in object} R_{ij}^{hi}}{(A^{med} + A^{hi}) \cdot \text{mean-radius}} \tag{4.38}$$

where $R_{ij}$ is defined by analogy to Eq. 4.8 as the distance from pixel i, j to the object centroid, and the object mean-radius is defined by Eq. 4.3.
• med-vs-low-OD, hi-vs-low-OD, medhi-vs-low-OD, represent the ratios of the average optical density of the medium, high, and combined medium and high density regions versus the low density region.

$$\text{med-vs-low-OD} = \frac{\left(\sum_{i,j \in object} OD_{ij}^{med}\right) / A^{med}}{\left(\sum_{i,j \in object} OD_{ij}^{low}\right) / A^{low}} \tag{4.39}$$

$$\text{hi-vs-low-OD} = \frac{\left(\sum_{i,j \in object} OD_{ij}^{hi}\right) / A^{hi}}{\left(\sum_{i,j \in object} OD_{ij}^{low}\right) / A^{low}} \tag{4.40}$$

$$\text{medhi-vs-low-OD} = \frac{\left(\sum_{i,j \in object} OD_{ij}^{med} + \sum_{i,j \in object} OD_{ij}^{hi}\right) / (A^{med} + A^{hi})}{\left(\sum_{i,j \in object} OD_{ij}^{low}\right) / A^{low}} \tag{4.41}$$

• low-obj, med-obj, hi-obj, are simply the number of discrete subcomponents of objects consisting of more than one pixel of low, medium and high optical densities, respectively. These features produce a measure of how smoothly the optical density value changes across the object.

4.1.4 Markovian Texture Features

Markovian texture features represent an attempt to characterize gray level variations between adjacent pixels in the image. Most commonly, one calculates conditional probabilities over the entire object pixels, and various statistics of this distribution are defined as features.² However, to be computationally efficient, the sum and difference histograms are used [111]. These are defined as follows:

• $H_s(l)$, the sum histogram, is the probability of two neighboring pixels having gray levels which sum to l.

• $H_d(m)$, the difference histogram, is the probability of neighboring pixels having gray level differences of m.

Both histograms are calculated for the object. For the purpose of this implementation, the gray level dynamic range of the object is quantized to 40 levels.

² A common practice is to define the co-occurrence matrix, $\Delta$, of object pixels. Each element of that matrix, $\delta_{\mu\nu}$ (the element on the $\mu$th row and the $\nu$th column), stands for the conditional probability of the pixel of gray level $\mu$ occurring next to a pixel of gray level $\nu$. The features introduced in this section can be calculated using the co-occurrence matrix by performing the summation over the matrix elements.

• entropy, represents a measure of disorder in the object gray level organization. Large values of entropy correspond to a very disorganized distribution, such as a salt-and-pepper random field.

$$\text{entropy} = -\left(\sum_{l} H_s(l) \log H_s(l) + \sum_{m} H_d(m) \log H_d(m)\right) \tag{4.42}$$

• energy, in contrast to entropy, gives large values for an object with a spatially organized gray scale distribution. Energy gives large values to an object with large regions of constant gray level.

$$\text{energy} = \sum_{l} (H_s(l))^2 + \sum_{m} (H_d(m))^2 \tag{4.43}$$

• contrast, gives large values for an object with frequent large gray level variations. It is based on the estimation of intensity differences between neighboring pixels.

$$\text{contrast} = \sum_{m} m^2 H_d(m) \tag{4.44}$$

• homogeneity, is the opposite of contrast and measures the smoothness of the object image intensity. A large value for homogeneity indicates an object with slight and spatially smooth gray level variation.

$$\text{homogeneity} = \sum_{m} \frac{1}{(1 + m)^2} H_d(m) \tag{4.45}$$

• correlation, produces a large value if an object contains large connected subcomponents of constant gray level and with large gray level differences between adjacent components.

$$\text{correlation} = \frac{1}{2}\left(\sum_{l} (l - 2I_q)^2 H_s(l) - \sum_{m} m^2 H_d(m)\right) \tag{4.46}$$

where $I_q$ is the mean intensity of the object calculated for the 40-level quantized gray scale.

• clump-shade, gives large absolute values for objects with a few distinct clumps of uniform intensity having large contrast with the rest of the object.

$$\text{clump-shade} = \frac{\sum_{l} (l - 2I_q)^3 H_s(l)}{\left(\sum_{l} (l - 2I_q)^2 H_s(l)\right)^{3/2}} \tag{4.47}$$

Negative values for clump-shade correspond to dark clumps against a light background, while positive values indicate light clumps against a dark background.
• clump-prominence, measures the darkness of the clumps.

$$\text{clump-prominence} = \frac{\sum_{l} (l - 2I_q)^4 H_s(l)}{\left(\sum_{l} (l - 2I_q)^2 H_s(l)\right)^{2}} \tag{4.48}$$

Large values of clump-prominence indicate a dominance of clumps which have a high contrast to the background intensity levels.

4.1.5 Non-Markovian Texture Features

Non-Markovian texture features describe texture in terms of a global estimation of gray level differences of the object.

• den-light-spot, den-dark-spot, are the total numbers of local maxima and local minima, $N^{max}$ and $N^{min}$ (density of light and dark spots), respectively, of the object intensity function based on the image averaged by a 3 × 3 window, and normalized by the object area.

$$\text{den-light-spot} = \frac{N^{max}}{A} \tag{4.49}$$

$$\text{den-dark-spot} = \frac{N^{min}}{A} \tag{4.50}$$

where A is the area of the object.

• range-average, is the intensity difference between the average intensity of the local maxima and the average intensity of the local minima.

$$\text{range-average} = \frac{\sum I^{max}}{N^{max}} - \frac{\sum I^{min}}{N^{min}} \tag{4.51}$$

where $I^{max}$ and $I^{min}$ are the intensity values of the local maxima and minima.

• range-extreme, is the intensity difference between the largest of the local maxima and the smallest of the local minima of the object intensity function.

$$\text{range-extreme} = \max(I^{max}) - \min(I^{min}) \tag{4.52}$$

• cntr-of-gravity, represents the distance from the geometrical center of the object to the "center of body" of the optical density function, normalized by the mean-radius of the object.

$$\text{cntr-of-gravity} = \frac{\sqrt{\left(\dfrac{\sum_{i,j \in object} i \cdot OD_{ij}}{\text{OD-sum}} - \text{x-centroid}\right)^2 + \left(\dfrac{\sum_{i,j \in object} j \cdot OD_{ij}}{\text{OD-sum}} - \text{y-centroid}\right)^2}}{\text{mean-radius}} \tag{4.53}$$

where OD-sum is defined in Eq. 4.17; x-centroid and y-centroid are defined in Eq. 4.1 and Eq. 4.2, respectively.
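As a small worked illustration of Eq. 4.53, the sketch below computes cntr-of-gravity for an object given as a mapping from pixel coordinates to optical density. The dictionary layout and the function name are assumptions made for this example only; mean-radius (Eq. 4.3) is passed in rather than recomputed.

```python
import math

def cntr_of_gravity(od, mean_radius):
    """Distance between the geometric centroid and the OD-weighted center,
    normalized by the mean radius.

    od maps (i, j) object pixel coordinates to optical density values.
    """
    area = len(od)
    od_sum = sum(od.values())
    # Geometric center of the object (Eqs. 4.1 and 4.2).
    x_centroid = sum(i for i, _ in od) / area
    y_centroid = sum(j for _, j in od) / area
    # Optical-density-weighted center.
    x_od = sum(i * value for (i, _), value in od.items()) / od_sum
    y_od = sum(j * value for (_, j), value in od.items()) / od_sum
    return math.hypot(x_od - x_centroid, y_od - y_centroid) / mean_radius
```

For an object with uniform optical density the two centers coincide and the feature is zero; the value grows as the dense part of the object moves away from the geometric center.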
4.1.6 Fractal Texture Features

The fractal texture features are based on the surface area of the three-dimensional surface of the object's optical density plotted versus the x and y image spatial coordinates. Since optical density is a discrete function (defined on a set of image pixels), the plot will have the form of a bar graph. Thus, each pixel is assigned a unit area in the x-y plane. The area of the sides of the three-dimensional structure is proportional to the change in the pixel optical density with respect to its neighbors. Large values of fractal areas correspond to large objects containing small subcomponents with high optical density variations between them [112, 113].

• fractal1-area, is the area of the three-dimensional surface of the object's optical density, explained above.

$$\text{fractal1-area} = \sum_{i,j \in object} \left(|OD_{ij} - OD_{i,(j-1)}| + |OD_{ij} - OD_{(i-1),j}| + 1\right) \tag{4.54}$$

• fractal2-area, is another fractal area, but based on an image in which four adjacent pixels forming corners of squares are averaged into single pixels, thereby representing a change of scale of the fractal1-area feature. Obviously, the scaling can be done on another level, e.g., nine (3 × 3).

$$\text{fractal2-area} = \sum_{i_2,j_2 \in object} \left(|OD_{i_2 j_2} - OD_{i_2,(j_2-1)}| + |OD_{i_2 j_2} - OD_{(i_2-1),j_2}| + 1\right) \tag{4.55}$$

where $OD_{i_2 j_2}$ is a scaled optical density function of the image with 4 pixels averaged into one.

• fractal-dimn, is calculated as the difference between the logarithms of fractal1-area and fractal2-area, divided by log 2. The fractal-dimn value must lie between 2 and 3, with larger values corresponding to images of higher spatial resolvable detail.

$$\text{fractal-dimn} = \frac{\log(\text{fractal1-area}) - \log(\text{fractal2-area})}{\log 2} \tag{4.56}$$

This feature gives a measure of the fractal behavior of the image, associated with the rate at which the measured surface area increases at finer scales.
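The three fractal features can be sketched as follows. This is an illustrative simplification, not the thesis code: the whole rectangular array is treated as the object, a missing neighbor at the array border is assumed to contribute a zero difference, and the function names are hypothetical.

```python
import math

def surface_area(od):
    """Bar-graph surface area of a 2-D optical density array (Eq. 4.54 form).

    Each pixel adds 1 (its top face) plus the absolute OD steps to its
    previous neighbor along each axis. Border pixels have no previous
    neighbor and are assumed to add no side area.
    """
    total = 0.0
    for i, row in enumerate(od):
        for j, value in enumerate(row):
            step_a = abs(value - row[j - 1]) if j > 0 else 0.0
            step_b = abs(value - od[i - 1][j]) if i > 0 else 0.0
            total += step_a + step_b + 1.0
    return total

def downscale_2x2(od):
    """Average each 2 x 2 block of pixels into a single pixel."""
    return [[(od[i][j] + od[i][j + 1] + od[i + 1][j] + od[i + 1][j + 1]) / 4.0
             for j in range(0, len(od[0]) - 1, 2)]
            for i in range(0, len(od) - 1, 2)]

def fractal_dimn(od):
    """Eq. 4.56: log-ratio of the areas measured at the two scales."""
    area1 = surface_area(od)
    area2 = surface_area(downscale_2x2(od))
    return (math.log(area1) - math.log(area2)) / math.log(2)
```

Under these assumptions a perfectly flat patch gives a value of 2, the lower end of the range quoted above, because halving the resolution divides the number of unit top faces by four.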
4.1.7 Run-Length Texture Features

Run-length features describe texture in terms of gray level runs, representing sets of consecutive pixels having the same gray level value. The length of the run is the number of pixels in the run. The following features are calculated over the image with intensity function values quantized down to 8 quantization levels. The run-length texture features are defined using gray level length matrices, $R^{\theta}_{pq}$, for each of the four principal directions $\theta = 0, \frac{\pi}{4}, \frac{\pi}{2}$ and $\frac{3\pi}{4}$. Each element of matrix $R^{\theta}_{pq}$ specifies the number of times that the object contains a run of length q, in a given direction $\theta$, consisting of pixels lying in quantization level p (out of 8 quantization levels).

• short0-runs, short$\frac{\pi}{4}$-runs, short$\frac{\pi}{2}$-runs, short$\frac{3\pi}{4}$-runs, give large values for objects in which short runs, oriented at each of the principal directions, dominate.

$$\text{short0-runs} = \frac{\sum_{p,q \in object} \dfrac{R^{0}_{pq}}{q^2}}{\sum_{p,q \in object} R^{0}_{pq}} \tag{4.57}$$

• long0-runs, long$\frac{\pi}{4}$-runs, long$\frac{\pi}{2}$-runs, long$\frac{3\pi}{4}$-runs, give large values for objects in which long runs dominate.

$$\text{long0-runs} = \frac{\sum_{p,q \in object} q^2 R^{0}_{pq}}{\sum_{p,q \in object} R^{0}_{pq}} \tag{4.58}$$

• gray0-level, gray$\frac{\pi}{4}$-level, gray$\frac{\pi}{2}$-level, gray$\frac{3\pi}{4}$-level, are features which estimate gray level non-uniformity. The lowest values of these features happen when runs are equally distributed throughout the gray levels.

$$\text{gray0-level} = \frac{\sum_{p \in object} \left(\sum_{q \in object} R^{0}_{pq}\right)^2}{\sum_{p,q \in object} R^{0}_{pq}} \tag{4.59}$$

• run0-length, run$\frac{\pi}{4}$-length, run$\frac{\pi}{2}$-length, run$\frac{3\pi}{4}$-length, are features which estimate the non-uniformity of the run lengths, taking on their lowest values when the runs are equally distributed throughout the lengths.

$$\text{run0-length} = \frac{\sum_{q \in object} \left(\sum_{p \in object} R^{0}_{pq}\right)^2}{\sum_{p,q \in object} R^{0}_{pq}} \tag{4.60}$$

• run0-percent, run$\frac{\pi}{4}$-percent, run$\frac{\pi}{2}$-percent, run$\frac{3\pi}{4}$-percent, are calculated as the ratio of the total number of possible runs to the object's area, A.
These features assume their lowest values for images with the most linear structure.

$$\text{run0-percent} = \frac{\sum_{p,q \in object} R^{0}_{pq}}{A} \tag{4.61}$$

This feature has a maximal value of one for the most spatially disorganized and intensity randomized structure of an object.

The run-length texture features usually are not used as independent measurements at each orientation. If an average or other combination of each set of these features is used, the features remain invariant to the possible rotation of the object.

4.2 Preliminary Data Set

In the first attempt to examine the hypothesis that differences exist between the mass-growing area and other normal areas of the mammograms in the screening examination prior to cancer detection, we assume that the exact boundaries of the suspicious areas in each mammogram are known. If our hypothesis that these areas are distinguishable from other similar but normal areas in the mammograms of the same screening examination proves to be true, then we will drop this assumption and repeat the study.

For this preliminary study, mammograms of 20 breast cancer patients were collected. These cases were randomly selected from British Columbia breast cancer patients who have had at least one screening examination before detection of cancer. All cases were collected through the Screening Mammography Program of British Columbia. These are biopsy proven breast carcinomas in which a malignant mass was identified in the patients' mammograms, resulting in diagnosis. Each screening examination consists of 4 mammograms, two projections of each breast: the craniocaudal (CC) and the mediolateral oblique (MLO) view. For all the collected cases a malignant mass was discovered in both projections in one breast. All mammograms of the current screening and the screening prior to cancer detection were reviewed by an experienced radiologist.
The radiologist identified the site of the malignant mass in the mammograms of diagnosis, and then marked the corresponding mass-growing areas in the mammograms of the same breast from the previous screening. Figure 4.2 shows the CC view of a breast containing a malignant mass, and the mammogram of the same breast taken in the previous screening.

Besides the mass-growing area, on each of the two projections of the previous screening, the radiologist identified a normal region which was similar to the mass-growing region. In addition to these 4 regions (one mass-growing region and one normal region on each projection), the radiologist determined the area which corresponded to the mass-growing region in the previous screening mammogram of the opposite breast. Thus, 6 regions were marked for each set of previous screening mammogram examinations: four normal, one in each mammogram, and two abnormal (or mass-growing) areas, one in each projection view of the cancer developing breast.

Figure 4.2: Mammogram samples from the preliminary database. The top image was taken in 1994, where a malignant mass was detected. The mammogram of the same breast taken in the previous screening, 1993, is shown in the bottom image. The malignant mass and its corresponding area on the previous mammogram are marked by arrows.

All mammograms were digitized at approximately 100 µm per pixel using the AIM³ system. This system contains a MicroImager CCD camera which produces 12 bits for the intensity of each digitized pixel. However, only the 8 most significant bits of each pixel intensity (256 gray levels) are employed for the purpose of this study. This can reduce the digitization noise and also the complexity of computation. Each mammogram is digitized to a 1024 × 1280 pixel image.
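Keeping the 8 most significant bits of a 12-bit sample amounts to discarding the 4 least significant bits. A minimal sketch, in which the function name and the range check are assumptions for illustration:

```python
def to_8_bit(sample_12bit):
    """Keep the 8 most significant bits of a 12-bit intensity sample."""
    if not 0 <= sample_12bit <= 0xFFF:
        raise ValueError("expected a 12-bit value in the range 0..4095")
    # A right shift by 4 drops the 4 least significant bits.
    return sample_12bit >> 4
```

Every group of 16 adjacent 12-bit levels maps to the same 8-bit gray level, which is how small digitization fluctuations are suppressed.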
Two methods for examination of the selected regions are considered. In the first method the boundary of each region, normal or abnormal, is drawn manually as a polygon. Then all the features explained in Section 4.1 are calculated for each region marked by a polygon. The second method uses a fixed-size circle, rather than a polygon with no fixed size, which covers the object of interest. The morphological features are not calculated in this case, since the object morphology is equivalent for all normal and abnormal objects because of the fixed size and shape of the objects studied. The following sections explain the outcome of applying each method to the collected mammograms.

4.2.1 Polygons for marking the boundaries

In this approach, the boundary of each distinguished region is marked by drawing a polygon as the boundary. Examples of these polygons are shown in Figure 4.3. To calculate the image features for each of the marked objects, a 256 × 256 region of interest (ROI) is considered around each polygon area, containing the polygon. Each ROI is placed such that the polygon is approximately located at the center of the ROI. Figure 4.4 shows examples of such regions. Using an ROI containing the object, rather than the whole image, for the feature calculation routines provides independence from the global intensity dynamic range of the image. Each 256 × 256 ROI image is fed to the feature calculation routines, and the object intensity and background intensity values are determined accordingly.

³ Analytical Imaging Mammography system, a mammogram digitization and analysis system designed and introduced by F. Aghdasi, D. Nesbitt and R. Ward [106, 107].

Figure 4.3: A mammogram of the previous screening (also shown in Figure 4.2) with the mass-growing region and a normal region marked by two polygons.
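The placement of a 256 × 256 ROI around a marked polygon can be sketched as below. The text only states that the polygon is approximately centered; using the center of the polygon's bounding box and shifting the window back inside the image near the borders are assumptions of this example, as are the function names.

```python
def roi_origin(polygon, image_width, image_height, roi_size=256):
    """Top-left corner of an ROI roughly centered on a polygon.

    polygon is a list of (x, y) vertices. The window is shifted where
    needed so that it stays inside the image (assumed border handling).
    """
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    center_x = (min(xs) + max(xs)) // 2
    center_y = (min(ys) + max(ys)) // 2
    left = min(max(center_x - roi_size // 2, 0), image_width - roi_size)
    top = min(max(center_y - roi_size // 2, 0), image_height - roi_size)
    return left, top

def extract_roi(image, left, top, roi_size=256):
    """Crop a roi_size x roi_size window from a list-of-rows image."""
    return [row[left:left + roi_size] for row in image[top:top + roi_size]]
```

For example, a polygon whose bounding box is centered at (500, 600) in an image assumed to be 1024 pixels wide and 1280 pixels high yields an ROI whose top-left corner is at (372, 472).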
After calculating the different types of image features for each of the marked areas (objects), a stepwise discriminant analysis is conducted to find the combination of features that best distinguishes between the normal and abnormal (mass-growing) regions. This combination of variables (features) is called a classification function. The discriminant analysis begins with no variables (features) in the classification function. Variables are entered or removed one at a time according to their calculated "F-values" [114]. In each step, the F-values are calculated considering:

- the variables which have already entered the classification function,
- the distance between the groups (two groups, normal and abnormal, here),
- the significant variables outside the classification function.

Figure 4.4: Two 256 × 256 ROIs extracted from the mammographic image of Figure 4.3 with the marked polygons in the center.

The discriminant analysis evaluates the F-values at each step, and the variable with the highest F-value is entered into the classification function. This is the variable that adds the most to the separation of the groups. The analysis also evaluates the F-values for the already entered variables, and the variable with the lowest F-value (which is also lower than a certain threshold) is removed from the classification function. These step calculations continue until the desired number of variables is reached, or no other significant variable is found. The final number of selected variables, the forward and backward steps, and the F-to-enter and F-to-remove threshold values are set by the user according to the application. The linear classification function has the general form of:

$$Y = C_0 + C_1 \text{Feature}_1 + C_2 \text{Feature}_2 + \cdots + C_n \text{Feature}_n \tag{4.62}$$

where $\text{Feature}_1, \text{Feature}_2, \ldots, \text{Feature}_n$ are the selected features (variables) and the coefficients $C_0, C_1, \ldots, C_n$ are calculated for each group. For instance, in the case of two groups of normal and abnormal, two classification functions with the same selected features and different coefficients are produced. The discriminant analysis evaluates these two functions for each input case. Then it assigns the case to the group for which its classification function has the largest value. These functions can also be used to classify new input cases.⁴

⁴ See Appendix A for more details about this stepwise discriminant analysis.

The best linear discriminant functions resulted in 81.6% average classification for the two groups of normal and abnormal regions. The number of features used in this classification is limited to three. Any greater number of features increases the dimensionality of the feature space and therefore, for a limited number of independent cases in the database, it produces data dependent results. The ratio of the number of samples to the number of selected variables is usually considered to be greater than 30 for a minimum dependency of results on the data. The features that are selected in this classification function are hrmn14-ft, hrmn21-ft and den-dark-spot, which are defined by Eq. 4.15 and Eq. 4.50, respectively.

The stepwise discriminant analysis assumes equal prior probabilities for both groups, normal and abnormal, and therefore it produces the best average classification results. However, unequal prior probabilities affect the computation of the constant term in the classification function (coefficient $C_0$ in Eq. 4.62) and thus the computation of the posterior probabilities and the classification results for each group. (See Eq. A.19 in Appendix A.) The graphs of Figure 4.5 result from applying different values of prior probabilities. The x-axis is the percentage of the normal cases which are classified correctly. The y-axis is the percentage of the abnormal cases which are classified correctly as abnormal. Note that the direction of the x-axis is reversed to produce a graph similar to an ROC curve.

Figure 4.5: Classification plot of the percentage of normal and abnormal groups (panels: "Classification, polygon" and "Jackknifed Class., polygon"). The x-axis (y-axis) is the percentage of the normal (abnormal) cases which were classified correctly. The Jackknife classification method resulted in the plot on the right.

The plot on the right of Figure 4.5 is the classification result using the Jackknife method. This method classifies each case into the group with the highest posterior probability according to the classification function computed from all the data except the case being classified. This is a special case of the general cross-validation method in which the classification functions are computed on a subset of cases, and the probability of misclassification is estimated from the remaining cases [115]. When each case is left out in turn, the method is known as jackknife. This method reduces the chance of having an overly optimistic estimate of the probability of misclassification.

4.2.2 A circle for marking the regions

In order to draw a polygon to represent the boundary of each mass-growing region, the abnormal pattern should be known, which is the case for our database. However, in a practical situation where the mammograms of the next screening with a malignant mass are not available, recognition of the exact boundary of the mass-growing structure is not feasible. Therefore, the second approach for marking the normal and abnormal areas is considered.
In this method a fixed-size circle is used to mark the general area of the mass-growing region and the selected normal patterns. Figure 4.6 shows a sample of these circles located on an abnormal region and a normal region, both belonging to one mammogram.

Figure 4.6: Mammogram of the previous screening (also shown in Figure 4.3) with the mass-growing region and a normal region marked by two circles.

An immediate result of using a fixed-size circle as the object of feature calculation is the elimination of morphological features. Since all the objects are of the same shape and size, the morphological features will result in equal values for all the objects, normal and abnormal alike. This removes any presumption of knowing the mass-growing region boundaries. For calculation of the photometric and texture features for each of the circular objects, similar to the polygon case, a 256 × 256 ROI is considered to contain and surround the object. The center of the ROI is aligned with the center of the object circle. Two of these regions are shown in Figure 4.7. Then each ROI is considered as the image input containing an object for the feature calculation routines. The resulting feature values for the two groups of normal and abnormal cases then form the input to the stepwise discriminant analysis.

Figure 4.7: Two 256 × 256 ROIs extracted from the mammographic image of Figure 4.6 with the marked circle in the center of each.

The features that best distinguished between the two groups are correlation, OD-sum and short0-runs, defined by Eq. 4.46, Eq. 4.17 and Eq. 4.57, respectively. The best discriminant function resulted in 75.3% average classification. Figure 4.8 shows the percentage of the abnormal versus normal classification.
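The role of the prior probabilities in the constant term of a linear classification function, and the jackknife (leave-one-out) estimate used for both marking methods, can be illustrated with a deliberately reduced sketch. It uses a single feature and the textbook linear classification function with a pooled variance; the stepwise feature selection and the statistical package used in the study are not reproduced, and all names are hypothetical.

```python
import math

def fit_classification_functions(samples, priors=None):
    """Fit Y_g = c0 + c1 * x for each group g from one-feature samples.

    samples maps a group name to a list of feature values. Returns a
    dict mapping each group to its (c0, c1) pair. Equal priors are
    assumed when none are given.
    """
    groups = sorted(samples)
    if priors is None:
        priors = {g: 1.0 / len(groups) for g in groups}
    means = {g: sum(samples[g]) / len(samples[g]) for g in groups}
    n_total = sum(len(samples[g]) for g in groups)
    squares = sum((x - means[g]) ** 2 for g in groups for x in samples[g])
    pooled_var = squares / (n_total - len(groups))
    # The prior enters only through the constant term c0.
    return {g: (math.log(priors[g]) - means[g] ** 2 / (2 * pooled_var),
                means[g] / pooled_var)
            for g in groups}

def classify(functions, x):
    """Assign x to the group whose classification function is largest."""
    return max(functions, key=lambda g: functions[g][0] + functions[g][1] * x)

def jackknife_rates(samples, priors=None):
    """Fraction of each group classified correctly when every case is
    classified by functions fitted without that case."""
    correct = {g: 0 for g in samples}
    for g, values in samples.items():
        for k, x in enumerate(values):
            reduced = dict(samples)
            reduced[g] = values[:k] + values[k + 1:]
            if classify(fit_classification_functions(reduced, priors), x) == g:
                correct[g] += 1
    return {g: correct[g] / len(samples[g]) for g in samples}
```

Sweeping the prior of one group from 0 to 1 and recording the two per-group rates at each setting traces out a curve of the kind plotted in the classification figures.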
87  Figure 4.8:  Classification, circle  Jackknifed Class., circle  % normal  % normal  Classification plot of the percentage of normal and a b n o r m a l groups.  T h e Jackknife classification method resulted in the plot on the right.  4.3  The Main Database  Considering the encouraging outcome of the study on a small d a t a set of m a m m o grams, collection of a larger d a t a base to examine the hypothesis is a natural next step. Because of the variations in the types of breast cancer and also different types of masses, a larger number of cases can reduce the dependency of analysis on the d a t a set. It also allows a larger number of features to be used in the discriminant function and yet maintain the ratio of samples per each feature. M a m m o g r a m s of all women who were diagnosed with breast cancer by detection of a malignant mass in their m a m m o g r a m s in 1996 or before were collected from five British C o l u m b i a Lower M a i n l a n d breast screening clinics. A l l cases are biopsy proven cancers and each patient had at least one screening examination prior to the detection of the t u m o r in their subsequent m a m m o g r a m . T h e diagnostic study and  88  F i g u r e 4.9:  M a m m o g r a m samples from the main database.  T h e top image was  taken in 1995, a malignant mass detected. M a m m o g r a m of the same breast taken in the previous screening examination, 1994, is shown in the b o t t o m image.  mass-growing  site is marked on the 1994 image.  89  The  F i g u r e 4.10: A n o t h e r sample from the main database. T h e C C view of the diagnostic screening is on top and the image of the previous screening is shown on the b o t t o m .  90  F i g u r e 4.11:  T h e M L O view of the same breast as in F i g u r e 4.10.  T h e image on  the left is from diagnostic screening, and image on the right is taken in the previous screening.  
the previous screening examination were, on average, 13.6 months apart (minimum number of months between the two screenings: 10 months, maximum: 18 months, mode: 13 months). These women had also agreed to participate in research studies. The collected cases consisted of 58 patients, which included the 20 previously collected mammograms of the small database studied earlier. Figures 4.9, 4.10 and 4.11 show sample mammograms of this database. The site of the malignant mass in the diagnosis mammogram, and the mass-growing site of the previous screening examination, are marked by arrows.

[Block diagram, in order: screening mammograms; REVIEW by the radiologist; MARKING 6 regions in each set of previous screening mammograms; DIGITIZATION using the AIM system; CIRCLING every marked region on the images; EXTRACTING the features; CLASSIFICATION by the stepwise discriminant analysis; classification function and results.]

Figure 4.12: Block diagram of the previous screening analysis.

Figure 4.12 shows the block diagram of our previous screening analysis. All the mammograms of the recent and previous screenings were reviewed by an expert radiologist. For each of the CC and MLO projections of the mammograms for the previous screening, 3 areas were marked, as was done for the preliminary data set. These 3 areas were: one mass-growing region, one normal region of the same mammogram, and one normal region in the opposite breast which corresponds to the mass-growing area. For 4 of the 58 cases, one of the projections of the previous screening study was either missing or incomplete. Therefore, 2 mammograms for each of these 4 cases, and 4 mammograms for each of the other 54 cases, were available to analyze. Two hundred and twenty-four previous screening mammograms were digitized at about 150 µm per pixel using the AIM system.
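As a quick check of the counts stated above, the arithmetic can be written out explicitly. The sketch below is only a bookkeeping aid with hypothetical variable names; it assumes that each available projection contributes one mammogram per breast and three marked regions (one mass-growing, two normal), as the text describes.

```python
TOTAL_CASES = 58
INCOMPLETE_CASES = 4          # one of the two projections missing or incomplete
PROJECTIONS_PER_CASE = 2      # CC and MLO
BREASTS = 2                   # each projection images both breasts
REGIONS_PER_PROJECTION = 3    # 1 mass-growing + 1 normal (same breast) + 1 normal (opposite breast)

complete_cases = TOTAL_CASES - INCOMPLETE_CASES

# Projections actually available across all cases.
projections_available = (complete_cases * PROJECTIONS_PER_CASE
                         + INCOMPLETE_CASES * (PROJECTIONS_PER_CASE - 1))

mammograms = projections_available * BREASTS
regions = projections_available * REGIONS_PER_PROJECTION
abnormal_regions = projections_available          # one mass-growing region per projection
normal_regions = regions - abnormal_regions

print(mammograms, regions, abnormal_regions, normal_regions)
```

Under these assumptions the totals come to 224 mammograms and 336 marked regions, of which 112 are mass-growing and 224 are normal.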
Similar to the second approach applied on the small database, a fixed-size circle is used to mark each of the 336 selected regions (112 mass-growing or abnormal, and 224 normal regions, of which 112 belong to the opposite breast). The size of the circle is chosen such that it covers the area corresponding to that of the malignant mass subsequently developed in the mammograms at cancer diagnosis (Figure 4.13). In other words, by inspecting all the mammograms of the recent screening with a malignant mass in each, it was determined that all the tumors are less than about 2 cm in diameter. Therefore, the diameter of the circle used for marking the mass-growing regions is chosen to be 140 pixels, which translates to just over 2 cm on the mammogram film.

Figure 4.13: The previous screening mammogram of Figure 4.9 with the mass-growing region on the right and a normal region on the left, both marked by circles.

A 256 × 256 ROI was considered to surround each object, and for each circled object all the photometric and texture features were calculated (Figure 4.14). To find the best features for separating the abnormal from the normal group, the same stepwise discriminant analysis was applied. Because of the larger number of samples in this database, six image features were allowed in the construction of the discriminant function. The six features which best classified the two groups were:

o correlation, defined by Eq. 4.46
o cntr_of_gravity, defined by Eq. 4.54
o med_vs_low_OD, defined by Eq. 4.39
o low_avg_dst, defined by Eq. 4.35
o fractal_dimn, defined by Eq. 4.56
o medhi_OD_cmp, defined by Eq. 4.34

Figure 4.14: Two 256 × 256 ROIs extracted from the mammographic image of Figure 4.13 with the marked circle in the center of each.

The best linear discriminant function resulted in 71.8% average classification between the normal and abnormal groups [116]. Variation of the constant parameter of the discriminant function resulted in the plots of Figure 4.15. The jackknifed classification results are also plotted in the same figure.

Figure 4.15: Classification plot of the percentage of normal and abnormal groups. The Jackknife classification method resulted in the plot on the right. Diameter of the object circle is 140 pixels. [Panels: "Classification, d=140" and "Jackknifed Class., d=140". Axis: % normal.]

4.3.1 Variation of the object diameter

To investigate the effects of various sizes of the object circle, the same process was repeated for different diameters of the circle. For instance, a circle of 160 pixels in diameter was used to mark the mass-growing region. Two other circles of the same size were also used to mark the two other normal regions of the same screening mammograms. The classification plots of the calculated features for the various object circle sizes are shown in Figures 4.16 to 4.20.

Figure 4.16: Classification plot of the percentage of normal and abnormal groups. The Jackknife classification method resulted in the plot on the right. Diameter of the object circle is 100 pixels. [Panels: "Classification, d=100" and "Jackknifed Class., d=100". Axis: % normal.]

Figure 4.17: Classification plot of the percentage of normal and abnormal groups. The Jackknife classification method resulted in the plot on the right. Diameter of the object circle is 120 pixels.

Figure 4.18: Classification plot of the percentage of normal and abnormal groups. The Jackknife classification method resulted in the plot on the right. Diameter of the object circle is 160 pixels. [Panels: "Classification, d=160" and "Jackknifed Class., d=160". Axis: % normal.]

Figure 4.19: Classification plot of the percentage of normal and abnormal groups. The Jackknife classification method resulted in the plot on the right. Diameter of the object circle is 180 pixels.

Figure 4.20: Classification plot of the percentage of normal and abnormal groups. The Jackknife classification method resulted in the plot on the right. Diameter of the object circle is 200 pixels.

In both cases of the larger and smaller diameters for the object circle, the same discriminant analysis was applied. The performance of the resulting classification function decreased compared to that for the circle diameter of 140 pixels. As shown in Figures 4.16 and 4.17, the area under the curve decreases as the circle diameter becomes smaller. The average classification results were 68.5% and 65.8% for diameters of 120 and 100 pixels, respectively. Both are lower than the result for the original 140-pixel diameter (71.8%, Figure 4.15). For circle diameters larger than 140 pixels, the performance is lower again. For circle diameters of 160, 180 and 200 pixels, the best average classification percentages were 67.6%, 64.4% and 62.8%, respectively (see also Figures 4.18, 4.19 and 4.20).

The outcome of this experiment indicates that our primary choice of the object diameter, 140 pixels, results in the best performance. The diameter of 140 pixels translates into just over 2 cm on the x-ray mammogram film, and it was selected as the circle diameter because 2 cm in diameter is the size of the largest malignant mass in the mammograms of the recent screening in our database (at which the patients were diagnosed with cancer).
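The diameter comparison above can be summarised with a short sketch that tabulates the reported average classification rates and converts a circle diameter from pixels to millimetres. The percentages are the ones quoted in this section; the names `average_classification` and `diameter_mm` are hypothetical, and the pixel size is taken as exactly 150 µm although the text gives it only as "about 150 µm".

```python
PIXEL_SIZE_UM = 150  # approximate digitization resolution stated in the text

# Average classification (%) reported for each object circle diameter (pixels).
average_classification = {
    100: 65.8,
    120: 68.5,
    140: 71.8,
    160: 67.6,
    180: 64.4,
    200: 62.8,
}


def diameter_mm(diameter_pixels, pixel_size_um=PIXEL_SIZE_UM):
    """Convert a circle diameter from pixels to millimetres on the film."""
    return diameter_pixels * pixel_size_um / 1000.0


# Diameter with the highest reported average classification.
best_diameter = max(average_classification, key=average_classification.get)

for d in sorted(average_classification):
    print(f"d={d:3d} px  ({diameter_mm(d):4.1f} mm): {average_classification[d]:.1f}%")
print("best diameter:", best_diameter, "pixels")
```

With these figures the rate peaks at 140 pixels and falls off on both sides, and 140 pixels corresponds to about 21 mm, in line with the "just over 2 cm" stated above.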
For a circle diameter smaller than 140 pixels, some areas of the previous screening mammograms which developed a mass in the following screening are not included. Therefore, some of the available information is neglected. For a larger object diameter, we are including some areas of the mammograms which are normal and labelling them as abnormal regions. This can result in lower performance of the training process.

In each of the above cases of different circle diameters, the number of features allowed in the discriminant function was again limited to six. Four of these six features were common to all cases with different circle diameters (with the exception of diameter = 200). The four common features were: correlation (Eq. 4.46), cntr_of_gravity (Eq. 4.54), med_vs_low_OD (Eq. 4.39) and low_avg_dst (Eq. 4.35). However, two of the six selected features were not the same in all circle diameter cases. This was to be expected, since the areas covered by circle objects of different diameters are not the same. New areas covered by an object of a larger circle diameter may introduce other features which are more effective in the classification. The six features selected by the discriminant analysis in each of the circle diameter cases are listed in Table 4.1.
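The overlap between the selected feature sets can be verified mechanically from the entries of Table 4.1. The sketch below is illustrative only: the feature names are written with underscores, which is an assumption about the original spelling because the extracted text renders the separators inconsistently.

```python
# Features selected for each circle diameter (pixels), as listed in Table 4.1.
selected_features = {
    100: {"correlation", "cntr_of_gravity", "med_vs_low_OD", "low_avg_dst",
          "OD_skewness", "medhi_avg_dst"},
    120: {"correlation", "cntr_of_gravity", "med_vs_low_OD", "low_avg_dst",
          "fractal_dimn", "medhi_avg_dst"},
    140: {"correlation", "cntr_of_gravity", "med_vs_low_OD", "low_avg_dst",
          "fractal_dimn", "medhi_OD_cmp"},
    160: {"correlation", "cntr_of_gravity", "med_vs_low_OD", "low_avg_dst",
          "fractal_dimn", "med_OD_cmp"},
    180: {"correlation", "cntr_of_gravity", "med_vs_low_OD", "low_avg_dst",
          "med_OD_cmp", "energy"},
    200: {"correlation", "cntr_of_gravity", "med_vs_low_OD",
          "med_OD_cmp", "energy"},  # only 5 features were selected
}

# Features shared by every diameter except 200 pixels.
common_without_200 = set.intersection(
    *(features for d, features in selected_features.items() if d != 200)
)

# Features shared by all six diameters.
common_all = set.intersection(*selected_features.values())

# Features that appear for some diameters but not for all of them.
varying = set.union(*selected_features.values()) - common_all

print(sorted(common_without_200))
print(sorted(common_all))
print(sorted(varying))
```

The intersection without the 200-pixel case reproduces the four common features named in the text; including the 200-pixel case drops low_avg_dst from the shared set.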
Row  DIAMETER=140                DIAMETER=120                DIAMETER=160
1    correlation (Eq. 4.46)      correlation (Eq. 4.46)      correlation (Eq. 4.46)
2    cntr_of_gravity (Eq. 4.54)  cntr_of_gravity (Eq. 4.54)  cntr_of_gravity (Eq. 4.54)
3    med_vs_low_OD (Eq. 4.39)    med_vs_low_OD (Eq. 4.39)    med_vs_low_OD (Eq. 4.39)
4    low_avg_dst (Eq. 4.35)      low_avg_dst (Eq. 4.35)      low_avg_dst (Eq. 4.35)
5    fractal_dimn (Eq. 4.56)     fractal_dimn (Eq. 4.56)     fractal_dimn (Eq. 4.56)
6    medhi_OD_cmp (Eq. 4.34)     medhi_avg_dst (Eq. 4.38)    med_OD_cmp (Eq. 4.32)

Row  DIAMETER=100                DIAMETER=180                DIAMETER=200
1    correlation (Eq. 4.46)      correlation (Eq. 4.46)      correlation (Eq. 4.46)
2    cntr_of_gravity (Eq. 4.54)  cntr_of_gravity (Eq. 4.54)  cntr_of_gravity (Eq. 4.54)
3    med_vs_low_OD (Eq. 4.39)    med_vs_low_OD (Eq. 4.39)    med_vs_low_OD (Eq. 4.39)
4    low_avg_dst (Eq. 4.35)      low_avg_dst (Eq. 4.35)      med_OD_cmp (Eq. 4.32)
5    OD_skewness (Eq. 4.23)      med_OD_cmp (Eq. 4.32)       energy (Eq. 4.43)
6    medhi_avg_dst (Eq. 4.38)    energy (Eq. 4.43)           (only 5 features)

Table 4.1: Selected features for various object circle diameters. Four features are common among all cases (with the exception of diameter = 200 pixels). Rows 5 and 6 of the table are different for different diameters.

4.3.2 Evaluating the classification function

The aim of this section is to evaluate, on a test set, the performance of the classification function obtained in the previous section from the training set (main database with 224 images). The classification function which resulted from applying the stepwise discriminant analysis to the 336 normal and abnormal regions (with the circle diameter of 140 pixels) is used for this evaluation. This function was applied to a set of normal mammograms. The normal mammograms collected as the test set were the mammograms of the normal breast, usually from the screening examinations prior to the year of cancer detection.
Two approaches were exercised:

• First, all the parenchymal patterns and structures of the normal mammograms were covered by a number of object circles. This approach attempts to simulate the case in which a radiologist uses our analysis to verify every mammographic structure and estimate the remote chance of cancer development. A sample of these normal mammograms with a number of object circles placed on the image is shown in Figure 4.21. The classification function resulting from the training set was used to evaluate 143 such normal regions, of which 75.5% were classified correctly. This result is consistent with the classification plots obtained from our training set.

Figure 4.21: A normal mammogram with object circles placed on every parenchymal pattern and structure. One of the test cases.

• In the second approach, for a number of normal mammograms, each mammogram is completely covered with neighboring circles. This approach simulates a possible automatic analysis of mammograms which simply examines every region on the mammographic image. A sample mammogram covered by neighboring circles is shown in Figure 4.22.

Figure 4.22: A normal mammogram with neighboring object circles covering the whole mammogram. One of the test cases.

Note that overlapping circles are also possible. They can fill the uncovered areas between the circles so that no mammographic pattern is missed. The downside of overlapping circles is the increase in computation time at the feature calculation stage. To reduce the number of unnecessary feature calculations, a simple thresholding was conducted before the feature calculation step. In this thresholding step, circle objects whose image pixels are dark (e.g., gray level value less than 15) are omitted.
This includes the circles positioned on the background area, and also circles on the dark parts of the mammogram with no pattern or structure present. The features were calculated for all the objects remaining after the thresholding step. The classification function resulting from the training set was then applied to 276 regions of these normal mammograms. The percentage of these regions which were classified correctly was 82.2%. This classification percentage is higher than that of the training set (72.1%). This is expected, since the normal regions selected for the training set were chosen to be similar to the mass-growing regions. In fact, the classification percentage for these test regions is expected to be higher than 82.2%, for two reasons. First, the background objects which were eliminated before the feature calculation step can be counted as correctly classified normal regions. Second, some regions in the mammogram do not contain any suspicious-looking structures and should be easily classified as normal. However, it should be recognized that the training was only performed on the normal cases similar to the abnormal regions. Therefore, the classification function was not trained to handle the normal objects which are easily distinguished from the abnormal regions. For a fully automated system, the training step needs to be completed for all types of normal regions, either by including them in the training or by adding a preliminary step to exclude (and classify as normal) the normal cases which are easy to identify.

the art and outrage of breast cancer.

"The Mastectomy Quilt"
64" x 52" Textile

"After my bilateral mastectomy, I became alarmed by stories of women who found lumps in their breasts but avoided treatment because they feared disfigurement. I was also concerned that many women do not realize they have a choice about implants and cosmetic surgery.
It is not necessary to conform to society's image. The story of the quilt reads from left to right. A healthy, whole woman walks along with everything right in her world. Then she gets a mammogram. She receives a diagnosis. She has the surgery. After her recovery, she goes back to the doctor, who asks if she would like to have more surgery for implants. She says no! The message of the quilt is enjoyment of life amid flowers and music; fear of disfigurement is no excuse for postponing a mammogram. A changed body is not important - life is!"
Suzanne Marshall 1992

Chapter 5

Conclusion and Future Suggestions

5.1 Overview and Summary

Early detection of breast cancer has resulted in a reduction in the mortality rate for women affected by this disease. Image processing and image analysis techniques help radiologists in the difficult task of mammographic tumor detection. Ultimately, this will result in earlier detection of cancer and reduce the number of observational misses. The study presented in this thesis focuses on early detection of soft tissue abnormalities in mammographic images, which can be difficult to detect due to their unclear boundaries and their similarity to surrounding normal parenchymal patterns.

Segmentation of mammograms is an essential step in most mass detection algorithms. We devised a mammogram segmentation algorithm based on the properties of mammographic patterns. The algorithm employs fuzzy set theory and considers the effects of neighboring pixels to achieve a multi-level partitioning. The performance of our segmentation algorithm was evaluated by comparing the partitioned mammogram to the outcome of a thresholding method. The algorithm was also combined with a feature extraction step to form a mass detection scheme.
This detection scheme uses only a single mammogram for detection, and yet achieves a 94% sensitivity with fewer than 0.25 false-positives per image. This compares favorably with the results of other researchers. Considering the relatively low computational time (because only a single mammogram is used), our mass detection scheme has the potential of being combined with other detection methods which use the image of the opposite breast. This can further improve the false-positive rate.

In Chapter 4 of this thesis, the study of the previous screening mammograms was presented. This is another step towards earlier detection of breast cancer. In this study, instead of using image analysis techniques to detect any existing cancers on a mammogram, we focused on the mammograms taken prior to the detection of a malignant mass. The mammograms in which cancer masses were detected were only used to identify the corresponding mass locations on the previous screening examination. These areas are referred to as mass-growing areas, as masses were later detected in these locations. The mass-growing area was compared with other normal patterns in the previous screening mammograms. A total of 452 mammograms were collected. All the mammograms were reviewed by an expert radiologist, and the sites of the mass-growing areas, as well as other normal areas whose appearances are similar to those of the mass-growing areas, were marked on the 224 mammograms of the previous screening examination. For each of the marked normal and abnormal regions, 62 photometric and texture image features were calculated. A discriminant analysis was then applied to find the features that best distinguish between the normal and abnormal regions.
The linear classification function with the best 6 features resulting from the discriminant analysis produced 71.8% average classification between the normal and abnormal groups.

5.2 Conclusions

Utilizing the properties of mammographic patterns plays an important role in devising a mammogram segmentation algorithm. Our fuzzy segmentation algorithm exploits these properties and therefore has a high performance. As the first step of a mass detection algorithm, our fuzzy segmentation algorithm has the advantage that it produces a smaller number of mass-candidate areas and thus a smaller number of false-positives. This also reduces the unnecessary feature calculations in the second step of the mass detection scheme, and therefore results in a less computationally expensive method. Our proposed mass detection method resulted in a high sensitivity, and since it uses only a single mammogram, it can be combined with other detection methods to achieve even higher performance.

In the retrospective study of mammograms, which we have pioneered, it was shown that there exist differences between the region that subsequently becomes a malignant mass and other normal areas of the mammographic images taken in the last screening examination prior to the detection of a mass. In other words, for the mammograms which were called normal at the time of screening (and developed a malignant mass in the following screening examination), our feature extraction technique was able to detect signs of cancer development in 72% of the cases studied. In clinical practice, this can result in further investigation of a suspicious region, and/or a reduction of the time interval until the next screening, which at present is one year. For these cases, the time interval may be shortened to 6 months.
Our feature extraction system, at its current stage, can be used by a radiologist to mark any region on the screening mammogram which appears to be suspicious. The system then calculates and classifies the features for this region. The region will be flagged if the system classifies it as abnormal.

The main contributions of this work are:

- The fuzzy segmentation algorithm which was specifically designed for the partitioning of mammograms.

- The study of the mammograms of the previous screening examination, by applying image processing techniques, and determining the textural differences between the normal and abnormal image regions.

- The introduction of modified texture image features for application to mammographic images. These features can be used in a mass detection scheme as well as in the retrospective study of mammograms.

The most significant contribution of our work, however, is that we have pioneered the retrospective study of mammograms. We have shown, for the first time, that the areas where masses were detected by radiologists could be detected as suspicious by the system a year earlier. As a result, radiologists could conduct closer follow-up on such patients in order to detect a malignancy earlier than would occur otherwise.

5.3 Suggestions for Future Work

The following are a few suggestions for continuing this research.

• Clinical evaluation of the retrospective study of mammograms is probably the most important follow-up for this research. Our feature extraction technique was trained on a database with 224 normal and 112 abnormal regions. The system was also tested on a number of other normal mammograms. Testing on a larger number of abnormal cases can confirm the performance of the system, and this can be achieved through a clinical evaluation.
The system at the current stage has to be operated by a radiologist.

• The mass detection scheme presented in Section 3.3 achieved a high sensitivity rate. However, the scheme was tested on a relatively small database of 36 mammograms, and training on a larger database is required to complete the evaluation. Our mass detection method uses only a single mammogram, and considering the valuable information which can be obtained from the image of the opposite breast, combining our detection method with other available schemes which use the opposite mammogram would form an appropriate step towards further research.

• To create a fully automated feature extraction system for the retrospective study of mammograms, the current system can be combined with the fuzzy segmentation algorithm. In the current feature extraction system, the areas which are to be examined have to be selected manually (usually by a radiologist). If proper threshold values are determined for the fuzzy segmentation algorithm, it can be employed as a preliminary step for the feature extraction system. The segmentation algorithm will find the suspicious areas on the previous screening mammograms, and then feature calculation and classification will determine the abnormal (or high risk) regions. This combination of segmentation and feature extraction steps requires a larger training set and verification of the optimal values for the system parameters.

• For both the feature extraction system and the mass detection scheme, linear classification functions were used. A non-linear discriminant function, such as an artificial neural network, can probably improve the classification result.

• Direct digital mammography systems are currently being evaluated in clinical trials; therefore, fully digitized mammograms (without the intermediate film medium) are expected to be available soon.
Considering the high contrast and the larger gray level dynamic range of these images (compared to images resulting from digitization of mammographic films), more photometric and texture information can be extracted from these images. An interesting and important area for future research is to apply our feature extraction technique to the directly digitized mammograms. This may expose a new set of features that better distinguish the normal from the mass-growing areas.

the art and outrage of breast cancer.

"Artists use the transformative power of paint and clay to magnify and explain mere singular experience. As an artist who teaches, I wondered how I could make my experience with breast cancer relevant to my teenage students. Using the familiar image of Botticelli's Venus as a metaphor, I painted an altered vision of feminine beauty. When asked to describe the differences between the original painting and my version, the girl students talked about changes in the flowers and the absence of the ocean. The boys said, 'She's only got one breast.' So the girls avoided the subject, while the boys focused obsessively on it. To me, her eyes, serene but touched by sadness, reflect awareness of her wound, but her beauty is unmarred. We are, after all, more than the sum of our parts."
Carole Honicelli 1995

Bibliography

[1] G. Ursin, L. Bernstein, and M. C. Pike. Trends in cancer incidence and mortality. In R. Doll, J. F. Fraumeni Jr., and C. S. Muir, editors, Cancer Surveys, Vol. 19, page 241. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1994.
[2] National Cancer Institute of Canada. Canadian cancer statistics 1996. Toronto, Canada, January 1996.
[3] M. Moskowitz. Mammography to screen asymptomatic women for breast cancer. American Journal of Roentgenology, 143:457-459, 1984.
[4] D. M. Garner, A. Harrison, J. Korbelik, B. Palcic, C. MacAulay, J. Matisic, and G. H. Anderson. Automated cervical cell prescreening system (access). In ASCP/CAP, Scientific Symposium 3200: Recent Advances in Cytology Automation, Las Vegas, 1992.
[5] B. Palcic and B. Jaggi. Image cytometry system for morphometric measurements of live cells. In D. L. Wise, editor, Bioinstrumentation: Research, Developments and Applications, pages 923-991. Butterworth Publishers, 1990.
[6] E. S. Paredes. Atlas of Film-Screen Mammography. Urban & Schwarzenberg, Inc., Baltimore, MD, 1989.
[7] F. Macdonald and C. H. J. Ford. Molecular Biology of Cancer. BIOS Scientific Publishers Limited, Oxford, UK, 1997.
[8] G. W. Mitchel, Jr. and L. W. Bassett. The Female Breast and Its Disorders. Williams & Wilkins, Baltimore, MD, 1990.
[9] S. A. Feig. Decreased breast cancer mortality through mammographic screening: Results of clinical trials. Radiology, 167:659-668, 1988.
[10] L. Tabar, G. Fagerberg, and S. W. Duffy et al. Update of the Swedish two-county program of mammographic screening for breast cancer. Radiol Clin North Am, 30:187-195, 1992.
[11] M. J. Homer. Mammographic Interpretation, A Practical Approach, Second Edition. McGraw Hill Companies, Inc., New York, NY, 1997.
[12] C. J. Vyborny and M. L. Giger. Computer vision and artificial intelligence in mammography. American Journal of Roentgenology, 162:699-708, 1994.
[13] The Steering Committee on Clinical Practice of the Care and Treatment of Breast Cancer. Investigation of lesions detected by mammography. Canadian Medical Association Journal, 158(3 Suppl):s9-s14, February 1998.
[14] K. Richter, S. H. Heywang-Kobrunner, and K-J.
Winzer et al. Detection of malignant and benign breast lesions with an automated US system: results in 120 cases. Radiology, 205:823-830, 1997.
[15] E. S. Fobben, C. Z. Rubin, L. Kalisher, A. G. Dembner, M. H. Seltzer, and E. J. Santoro. Breast MR imaging with commercially available techniques: radiologic-pathologic correlation. Radiology, 196:143-152, 1995.
[16] T. A. Coons. MRI's role in assessing and managing breast disease. Radiologic Technology, 67(4):311-340, 1996.
[17] A. R. Cowen, G. J. S. Parkin, and P. Hawkridge. Direct digital mammography image acquisition. European Radiology, 7:918-930, 1997.
[18] J. P. Hogge, D. S. Artz, and M. T. Freedman. Update in digital mammography. Critical Reviews in Diagnostic Imaging, 38(1):89-113, 1997.
[19] F. Winsberg, M. Elkin, J. Macy, V. Brodaz, and W. Weymouth. Detection of radiographic abnormalities in mammograms by means of optical scanning and computer analysis. Radiology, 89:211-215, 1967.
[20] C. Kimme, B. J. O'Loughlin, and J. Sklansky. Automatic detection of suspicious abnormalities in breast radiographs. In A. Klinger, K. S. Fu, and T. L. Kunii, editors, Data Structures, Computer Graphics, and Pattern Recognition, pages 429-447. Academic Press, New York, 1975.
[21] L. V. Ackerman and E. E. Gose. Breast lesion classification by computer and xeroradiography. Cancer, 30:1025-1035, 1972.
[22] W. G. Wee, M. Moskowitz, N-C. Chang, Y-C. Ting, and S. Pemmeraju. Evaluation of mammographic calcifications using a computer program. Radiology, 116:717-720, 1975.
[23] S. H. Fox, U. M. Pujare, W. G. Wee, M. Moskowitz, and R. V. P. Hutter. A computer analysis of mammographic microcalcifications: global approach.
In Proceedings of the IEEE 5th International Conference on Pattern Recognition, pages 624-631, New York, 1980.
[24] K. T. Smith, S. L. Wagner, R. B. Guenther, and D. C. Solmon. The diagnosis of breast cancer in mammograms by the evaluation of density patterns. Radiology, 125:383-386, 1977.
[25] W. Hand, J. L. Semmlow, L. V. Ackerman, and F. S. Alcorn. Computer screening of xeromammograms: a technique for defining suspicious areas of the breast. Computer and Biomedical Research, 12:445-460, 1979.
[26] W. Spiesberger. Mammogram inspection by computer. IEEE Trans. on Biomedical Engineering, 26:213-219, 1979.
[27] H-P. Chan, K. Doi, C. J. Vyborny, K. L. Lam, and R. A. Schmidt. Computer-aided detection of microcalcifications in mammograms: methodology and preliminary clinical study. Investigative Radiology, 23:664-671, 1988.
[28] H-P. Chan, K. Doi, and C. J. Vyborny et al. Improvement in radiologists' detection of clustered microcalcifications in mammograms: the potential of computer-aided diagnosis. Investigative Radiology, 25:1102-1110, 1990.
[29] R. M. Nishikawa, M. L. Giger, and K. Doi et al. Computer-aided detection and diagnosis of masses and clustered microcalcifications from digital mammograms. SPIE, 1905:422-432, 1993.
[30] Y. Wu, K. Doi, M. L. Giger, and R. M. Nishikawa. Computerized detection of clustered microcalcifications in digital mammograms: applications of artificial neural networks. Medical Physics, 19:555-560, 1992.
[31] B. W. Fam, S. L. Olson, P. F. Winter, and F. J. Scholz. Algorithm for the detection of fine clustered calcifications on film mammograms. Radiology, 169:333-337, 1988.
[32] D. H. Davies and D. R. Dance.
Automatic computer detection of clustered calcifications in digital mammograms. Physics in Medicine and Biology, 35:111-118, 1990.
[33] D. H. Davies and D. R. Dance. The automatic computer detection of subtle calcifications in radiographically dense breasts. Physics in Medicine and Biology, 37:1385-1390, 1992.
[34] S. Astley, I. Hutt, and S. Adamson et al. Automation in mammography: computer vision and human perception. SPIE, 1905:716-730, 1993.
[35] N. Karssemeijer. A stochastic method for automated detection of microcalcifications in digital mammograms. In Information Processing in Medical Imaging, pages 227-238. Springer-Verlag, New York, 1991.
[36] N. Karssemeijer. Adaptive noise equalization and detection of microcalcification clusters in mammography. IJPRAI, 7:1357-1367, 1993.
[37] W. P. Kegelmeyer, Jr and M. C. Allmen. Dense feature maps for detection of calcifications. In A. G. Gale et al., editor, Digital Mammography, pages 3-12. Elsevier Science B.V., Amsterdam, New York, 1994.
[38] R. N. Strickland and H. I. Hahn. Wavelet transforms for detecting microcalcifications in mammograms. IEEE Trans. on Medical Imaging, 15(2):218-229, 1996.
[39] T. Netsch. Detection of microcalcification clusters in digital mammograms: a scale-space approach. In B. Arnolds et al., editor, Digitale Bildverarbeitung in der Medizin. GMDS, Freiburg, 1996.
[40] S. M. Lai, X. Li, and W. F. Bischof. On techniques for detecting circumscribed masses in mammograms. IEEE Trans. on Medical Imaging, 8(4):377-386, 1989.
[41] T. K. Lau and W. F. Bischof. Automated detection of breast tumors using the asymmetry approach. Computer and Biomedical Research, 24:273-295, 1991.
[42] S. L. Ng and W. F. Bischof. Automated detection and classification of breast tumors.
Computer and Biomedical Research, 25:218-237, 1992.
[43] F. F. Yin, M. L. Giger, K. Doi, C. E. Metz, C. J. Vyborny, and R. A. Schmidt. Computerized detection of masses in digital mammograms: analysis of bilateral subtraction images. Medical Physics, 18(5):955-963, 1991.
[44] F. F. Yin, M. L. Giger, C. J. Vyborny, K. Doi, and R. A. Schmidt. Comparison of bilateral-subtraction and single-image processing techniques in the computerized detection of mammographic masses. Investigative Radiology, 28(6):473-481, 1993.
[45] Y. Wu, M. L. Giger, K. Doi, C. J. Vyborny, R. A. Schmidt, and C. E. Metz. Artificial neural networks in mammography: application to decision making in the diagnosis of breast cancer. Radiology, 187:81-87, 1993.
[46] R. M. Nishikawa, M. L. Giger, K. Doi, C. J. Vyborny, and R. A. Schmidt. Computer aided detection and diagnosis of masses and clustered microcalcifications from digital mammograms. In K. W. Bowyer and S. Astley, editors, State of the Art in Digital Mammographic Image Analysis, pages 82-102. World Scientific Pub., 1994.
[47] M. L. Giger, P. Lu, Z. Huo, U. Bick, C. J. Vyborny, R. A. Schmidt, W. Zheng, C. E. Metz, D. Wolverton, R. M. Nishikawa, W. Zouras, and K. Doi. CAD in digital mammography: computerized detection and classification of masses. In A. G. Gale et al., editor, Digital Mammography, pages 281-287. Elsevier Science B.V., Amsterdam, New York, 1994.
[48] R. A. Schmidt, R. N. Nishikawa, K. Schreibman, M. L. Giger, K. Doi, J. Papaioannaou, P. Lu, J. Stucka, and G. Birkhahn. Computer detection of lesions missed by mammography. In A. G. Gale et al., editor, Digital Mammography, pages 405-408. Elsevier Science B.V.
, Amsterdam, New York, 1994.
[49] D. Brzakovic, S. M. Luo, and P. Brzakovic. An approach to automated detection of tumors in mammograms. IEEE Trans. on Medical Imaging, 9(3):233-241, 1990.
[50] D. Brzakovic, P. Brzakovic, and M. Neskovic. An approach to automated screening of mammograms. SPIE, 1905:690-701, 1993.
[51] D. Brzakovic and M. Neskovic. Mammogram screening using multiresolution-based image segmentation. In K. W. Bowyer and S. Astley, editors, State of the Art in Digital Mammographic Image Analysis, pages 103-127. World Scientific Pub., 1994.
[52] W. P. Kegelmeyer, Jr. Computer detection of stellate lesions in mammograms. SPIE, 1660:446-454, 1992.
[53] W. P. Kegelmeyer, Jr. Evaluation of stellate lesion detection in a standard mammogram data set. SPIE, 1905:787-798, 1993.
[54] W. P. Kegelmeyer, Jr. Evaluation of stellate lesion detection in a standard mammogram data set. In K. W. Bowyer and S. Astley, editors, State of the Art in Digital Mammographic Image Analysis, pages 262-279. World Scientific Pub., 1994.
[55] W. P. Kegelmeyer, Jr, J. Prundeda, P. Bourland, A. Hillis, M. Riggs, and M. Nipper. Computer aided mammographic screening for spiculated lesions. Radiology, 191:331-337, 1994.
[56] R. C. Gonzalez and R. E. Woods. Digital Image Processing. Addison-Wesley Publishing Company, Menlo Park, CA, 1992.
[57] B. R. Groshong and W. P. Kegelmeyer, Jr. Evaluation of a Hough transform method for circumscribed lesion detection. In K. Doi, M. L. Giger, R. M. Nishikawa, and R. A. Schmidt, editors, Digital Mammography'96, pages 361-366. Elsevier Science B.V., Amsterdam, New York, 1996.
[58] N. Karssemeijer. Recognition of stellate lesions in digital mammograms. In A. G. Gale et al., editor, Digital Mammography, pages 211-219. Elsevier Science B.V., Amsterdam, New York, 1994.
[59] G. M. te Brake and N. Karssemeijer. Detection of stellate breast abnormalities. In K. Doi, M. L. Giger, R. M. Nishikawa, and R. A. Schmidt, editors, Digital Mammography'96, pages 342-346. Elsevier Science B.V., Amsterdam, New York, 1996.
[60] K. S. Woods and K. W. Bowyer. Computer detection of stellate lesions. In A. G. Gale et al., editor, Digital Mammography, pages 221-229. Elsevier Science B.V., Amsterdam, New York, 1994.
[61] H-P. Chan, D. Wei, M. A. Helvie, B. Sahiner, D. Alder, M. M. Goodsitt, and N. Petrick. Computer-aided classification of mammographic masses and normal tissue: linear discriminant analysis in texture feature space. Physics in Medicine and Biology, 40:857-876, 1995.
[62] H. D. Li, M. Kallergi, L. P. Clarke, V. K. Jain, and R. A. Clark. Markov random field for tumor detection in digital mammography. IEEE Trans. on Medical Imaging, 14(3):565-575, 1995.
[63] Y-H. Chang, B. Zheng, and D. Gur. Computerized identification of suspicious regions for masses in digitized mammograms. Investigative Radiology, 31(3):146-153, 1996.
[64] Y-H. Chang, B. Zheng, and D. Gur. Robustness of computerized identification of masses in digitized mammograms. Investigative Radiology, 31(9):563-568, 1996.
[65] B. Zheng, Y-H. Chang, and D. Gur. Mass detection in digitized mammograms using two independent computer-assisted diagnosis schemes. American Journal of Roentgenology, 167:1421-1424, 1996.
[66] B. Zheng, Y-H. Chang, and D. Gur.
Computerized detection of masses in digitized mammograms using single image segmentation and a multilayer topographic feature analysis. Academic Radiology, 2:959-966, 1995.
[67] H. Kobatake, Y. Yoshinaga, and M. Murakami. Automatic detection of malignant tumors on mammogram. In Proceedings of IEEE Conference on Image Processing, pages 407-410, 1994.
[68] H. Kobatake and Y. Yoshinaga. Detection of spicules on mammogram based on skeleton analysis. IEEE Trans. on Medical Imaging, 15(3):235-245, 1996.
[69] N. Petrick, H-P. Chan, B. Sahiner, and D. Wei. An adaptive density-weighted contrast enhancement filter for mammographic breast mass detection. IEEE Trans. on Medical Imaging, 5(1):59-67, 1996.
[70] N. Petrick, H-P. Chan, D. Wei, B. Sahiner, M. R. Helvie, and D. D. Alder. Automated detection of breast masses on mammograms using adaptive contrast enhancement and texture classification. Medical Physics, 23(10):1685-1696, 1996.
[71] N. Petrick, H-P. Chan, B. Sahiner, M. A. Helvie, M. M. Goodsitt, and D. D. Adler. Computer-aided breast mass detection: false positive reduction using breast tissue composition. In K. Doi, M. L. Giger, R. M. Nishikawa, and R. A. Schmidt, editors, Digital Mammography'96, pages 373-378. Elsevier Science B.V., Amsterdam, New York, 1996.
[72] S. Pohlman, K. A. Powell, N. A. Obuchowski, W. A. Chilcote, and S. Grundfest-Broniatowski. Quantitative classification of breast tumors in digitized mammograms. Medical Physics, 23(8):1337-1345, 1996.
[73] R. M. Nishikawa. Comment on "Quantitative classification of breast tumors in digitized mammograms [Med. Phys. 23:1337-1345 (1996)]". Medical Physics, 24(2):313, 1997. Letter to the Editor.
[74] K. A. Powell. Response to "Comment on 'Quantitative classification of breast tumors in digitized mammograms [Med. Phys. 23:1337-1345 (1996)]'". Medical Physics, 24(2):315, 1997. Letter to the Editor.
[75] J. G. Diahi, C. Frouge, A. Giron, and B. Fertil. Artificial neural networks for detection of breast cancer in mammography. In K. Doi, M. L. Giger, R. M. Nishikawa, and R. A. Schmidt, editors, Digital Mammography'96, pages 329-334. Elsevier Science B.V., Amsterdam, New York, 1996.
[76] L. Miller and N. Ramsey. The detection of malignant masses by non-linear multiscale analysis. In K. Doi, M. L. Giger, R. M. Nishikawa, and R. A. Schmidt, editors, Digital Mammography'96, pages 335-340. Elsevier Science B.V., Amsterdam, New York, 1996.
[77] T. C. Parr, S. M. Astley, C. J. Taylor, and C. R. M. Boggis. Model based classification of linear structures in digital mammograms. In K. Doi, M. L. Giger, R. M. Nishikawa, and R. A. Schmidt, editors, Digital Mammography'96, pages 351-356. Elsevier Science B.V., Amsterdam, New York, 1996.
[78] T. C. Parr, C. J. Taylor, S. M. Astley, and C. R. M. Boggis. A statistical representation of pattern structure for digital mammography. In K. Doi, M. L. Giger, R. M. Nishikawa, and R. A. Schmidt, editors, Digital Mammography'96, pages 357-360. Elsevier Science B.V., Amsterdam, New York, 1996.
[79] W. E. Polakowski, D. A. Cournoyer, S. K. Rogers, M. P. DeSimio, D. W. Ruck, and J. W. Hoffmeister. Computer-aided breast cancer detection and diagnosis of masses using difference of Gaussians and derivative-based feature saliency. IEEE Trans. on Medical Imaging, 16(6):811-819, 1997.
[80] R. M. Rangayyan, N. M. El-Faramawy, J. E.
Leo Desautels, and O. A. Alim. Measures of acutance and shape for classification of breast tumors. IEEE Trans. on Medical Imaging, 16(6):799-810, 1997.
[81] R. M. Rangayyan, L. Shen, Y. Shen, J. E. Leo Desautels, H. Bryant, T. J. Terry, N. Horeczko, and M. S. Rose. Improvement of sensitivity of breast cancer diagnosis with adaptive neighborhood contrast enhancement of mammograms. IEEE Trans. on Information Technology in Biomedicine, 1(3):161-170, 1997.
[82] J. Y. Lo, J. A. Baker, P. J. Kornguth, J. D. Iglehart, and C. E. Floyd, Jr. Predicting breast cancer invasion with artificial neural networks on the basis of mammographic features. Radiology, 203:159-163, 1997.
[83] K. Woods and K. Bowyer. A general view of detection algorithms. In K. Doi, M. L. Giger, R. M. Nishikawa, and R. A. Schmidt, editors, Digital Mammography'96, pages 385-390. Elsevier Science B.V., Amsterdam, New York, 1996.
[84] D. Brzakovic, N. Vujovic, and M. Neskovic. Early detection of cancerous changes by mammogram comparison. SPIE, 2308:1520-1531, 1994.
[85] D. Brzakovic, N. Vujovic, M. Neskovic, and K. Fogarty. Mammogram analysis by comparison with previous screenings. In A. G. Gale et al., editor, Digital Mammography, pages 131-140. Elsevier Science B.V., Amsterdam, New York, 1994.
[86] N. Vujovic, D. Brzakovic, and K. Fogarty. Detection of cancerous changes in mammograms using intensity and texture measures. SPIE, 2434:37-47, 1995.
[87] N. Vujovic, P. Bakic, and D. Brzakovic. Detection of potentially cancerous signs by mammogram followup. In K. Doi, M. L. Giger, R. M. Nishikawa, and R. A. Schmidt, editors, Digital Mammography'96, pages 421-424. Elsevier Science B.V., Amsterdam, New York, 1996.
[88] M. Sallam and K.
Bowyer. Registering time sequences of mammograms using a two-dimensional image unwarping technique. In A. G. Gale et al., editor, Digital Mammography, pages 121-130. Elsevier Science B.V., Amsterdam, New York, 1994.
[89] M. Sallam and K. Bowyer. Detecting abnormal densities in mammograms by comparison to previous screening. In K. Doi, M. L. Giger, R. M. Nishikawa, and R. A. Schmidt, editors, Digital Mammography'96, pages 417-420. Elsevier Science B.V., Amsterdam, New York, 1996.
[90] R. G. Bird, T. W. Wallace, and B. C. Yankaskas. Analysis of cancers missed at screening mammography. Radiology, 184:613-617, 1992.
[91] J. E. Harvey, L. L. Fajardo, and C. A. Inis. Previous mammograms in patients with impalpable breast carcinoma: retrospective vs blinded interpretation. American Journal of Roentgenology, 161:1167-1172, 1993.
[92] C. J. Savage, A. G. Gale, E. E. Pawley, and A. R. M. Wilson. To err is human, to compute divine? In A. G. Gale et al., editor, Digital Mammography, pages 405-414. Elsevier Science B.V., Amsterdam, New York, 1994.
[93] J. W. Byng, N. F. Boyd, R. A. Jong, E. Fishell, and M. J. Yaffe. Automated analysis of mammographic densities. Physics in Medicine and Biology, 41:909-923, 1996.
[94] J. W. Byng, J. P. Critten, N. F. Boyd, L. Little, G. Lockwood, R. A. Jong, E. Fishell, D. Tritchler, and M. J. Yaffe. Analysis of digitized mammograms for the prediction of breast cancer risk. In K. Doi, M. L. Giger, R. M. Nishikawa, and R. A. Schmidt, editors, Digital Mammography'96, pages 185-190. Elsevier Science B.V., Amsterdam, New York, 1996.
[95] J. W. Byng, M. J. Yaffe, G. A. Lockwood, L. E. Little, D. L. Tritcher, and N. F. Boyd.
Automated analysis of mammographic densities and breast carcinoma risk. Cancer, 80(1):66-74, 1997.
[96] J. N. Wolfe. Breast patterns as an index of risk for developing breast cancer. American Journal of Roentgenology, 126:1130-1139, 1976.
[97] J. N. Wolfe. Risk for breast cancer development determined by mammographic parenchymal pattern. Cancer, 37:2486-2492, 1976.
[98] C. E. Metz. ROC methodology in radiologic imaging. Investigative Radiology, 21:720-733, 1986.
[99] C. E. Metz. Evaluation of digital mammography by ROC analysis. In K. Doi, M. L. Giger, R. M. Nishikawa, and R. A. Schmidt, editors, Digital Mammography'96, pages 61-68. Elsevier Science B.V., Amsterdam, New York, 1996.
[100] L. A. Zadeh. Fuzzy sets. Information and Control, 8:338-353, 1965.
[101] M. Sameti and R. K. Ward. A fuzzy segmentation algorithm for mammogram partitioning. In K. Doi, M. L. Giger, R. M. Nishikawa, and R. A. Schmidt, editors, Digital Mammography'96, pages 471-474. Elsevier Science B.V., Amsterdam, New York, 1996.
[102] J. N. Kapur, P. K. Sahoo, and A. K. C. Wong. A new method for gray-level picture thresholding using the entropy of the histogram. Computer Vision, Graphics and Image Processing, 29:273-285, 1985.
[103] C. G. Y. Lau. Neural networks, theoretical foundations and analysis. IEEE Press, 1992.
[104] M. Sameti, R. K. Ward, B. Palcic, and J. Morgan-Parkes. Texture feature extraction for tumor detection in mammographic images. In Proceedings of the 1997 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM'97), pages 831-834, Victoria, BC, Canada, August 1997.
[105] F. Aghdasi, R. K. Ward, and B. Palcic.
Restoration of mammographic images in the presence of signal-dependent noise. SPIE, 1905:740-751, 1993.
[106] F. Aghdasi. Digitization and analysis of mammographic images for early detection of breast cancer. PhD thesis, University of British Columbia, Vancouver, BC, Canada, 1994.
[107] D. Nesbitt. Automated detection of microcalcifications in digitized mammogram film images. Master's thesis, University of British Columbia, Vancouver, BC, Canada, 1995.
[108] M. Sameti, R. K. Ward, J. Morgan-Parkes, and B. Palcic. A method for detection of malignant masses in digitized mammograms using a fuzzy segmentation algorithm. In Proceedings of the 19th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pages CD-ROM, Chicago, IL, October-November 1997.
[109] A. Doudkine, C. MacAulay, N. Poulin, and B. Palcic. Nuclear texture measurements in image cytometry. Pathologica, 87:286-299, 1995.
[110] A. K. Jain. Fundamentals of Digital Image Processing. Prentice-Hall, Englewood Cliffs, New Jersey, 1989.
[111] M. Unser. Sum and difference histograms for texture classification. IEEE Trans. Pattern Anal. and Mach. Int., 8:118-125, 1986.
[112] C. B. Caldwell, S. J. Stapleton, D. W. Holdsworth, R. A. Jong, W. J. Weiser, G. Cooke, and M. J. Yaffe. Characterisation of mammographic parenchymal pattern by fractal dimension. Physics in Medicine and Biology, 35(2):235-247, 1990.
[113] C. MacAulay and B. Palcic. Fractal texture features based on optical density surface area. Analytical and Quantitative Cytology and Histology, 12:394-398, 1990.
[114] W. J. Dixon (chief editor). BMDP Statistical Software Manual. University of California Press, Berkeley, CA, 1992.
[115] A. A. Afifi and V. Clark.
Computer-aided Multivariate Analysis. Lifetime Learning Publications, Belmont, CA, 1990.
[116] M. Sameti, J. Morgan-Parkes, R. K. Ward, and B. Palcic. Classifying image features in the last screening mammograms prior to detection of a malignant mass. In N. Karssemeijer, editor, Digital Mammography'98. Kluwer Academic Publishers, The Netherlands, 1998. (in press).

Appendix A

A stepwise discriminant analysis

The following notation is used for description of the stepwise discriminant analysis:

$p$ = number of variables (features) available
$q$ = number of variables (features) entered at a given step
$t$ = total number of groups
$g$ = number of groups used to define the discriminant functions
$n_i$ = number of cases in group $i$
$n$ = total number of cases in the $g$ defining groups
$x_{ijr}$ = value of variable $r$ in case $j$ of group $i$
$h$ = number of hypotheses
$h_{ki}$ = coefficient for group $i$ in hypothesis $k$
$p_i$ = prior probability of group $i$

Assume, for simplicity, that the first $g$ of the $t$ groups are used to define the classification functions [114].

The following steps of the algorithm are intended to find the best variables and the best discriminant functions that can distinguish between the $g$ groups. There are $g$ groups (out of the total $t$ groups) which are used for the discriminant analysis. In group $i$ (of these $g$ groups), there are $n_i$ cases.¹ Each case is a sample from the data, whose group is known. There are $p$ variables available for each group, and at a given step, $q$ of them are entered into the set of variables for the final discriminant function. During this analysis, variables are first selected according to their calculated F-values, and then, in the backward stepping, some of those variables may be removed. The stepping continues until the optimum variables are obtained. Some pre-conditions or hypotheses ($h$ and $p_i$) can also be involved in the process.
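As a rough illustration of the forward selection described above, the sketch below picks the first variable to enter. It assumes that no variable has been entered yet and that the default contrasts are used, in which case the F value of a variable reduces to the ordinary one-way analysis-of-variance F ratio. The function names, the toy data and the F-to-enter threshold of 4.0 are hypothetical choices made for this example and are not taken from the thesis or from [114].

```python
# Illustrative sketch only: first forward step of a stepwise discriminant
# analysis, with no variables entered yet (q = 0). Names, toy data and the
# threshold value are assumptions made for this example.

def f_to_enter(groups, r):
    """One-way ANOVA F value of variable r across the given groups.

    groups: list of groups, each a list of cases (lists of feature values).
    """
    g = len(groups)
    n = sum(len(cases) for cases in groups)
    grand_mean = sum(case[r] for cases in groups for case in cases) / n
    within = 0.0   # pooled within-group sum of squares
    between = 0.0  # between-group sum of squares
    for cases in groups:
        mean = sum(case[r] for case in cases) / len(cases)
        within += sum((case[r] - mean) ** 2 for case in cases)
        between += len(cases) * (mean - grand_mean) ** 2
    return (between / (g - 1)) / (within / (n - g))


def first_variable_to_enter(groups, f_enter=4.0):
    """Return (index, F) of the variable with the highest F, or None."""
    p = len(groups[0][0])
    f_values = [f_to_enter(groups, r) for r in range(p)]
    best = max(range(p), key=lambda r: f_values[r])
    if f_values[best] < f_enter:
        return None  # no variable passes the F-to-enter threshold
    return best, f_values[best]


# Toy data: two groups (say normal and abnormal), two features per case.
normal = [[1.0, 5.0], [2.0, 6.0], [3.0, 5.5]]
abnormal = [[6.0, 5.2], [7.0, 6.1], [8.0, 5.4]]
print(first_variable_to_enter([normal, abnormal]))
```

In this toy data the first feature separates the two groups far better than the second, so it is the one selected. Later steps need partial F values that account for the variables already entered, which is what the sweep-based computation in the following sections provides.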
A.1  Step 1

The method of provisional means is used to compute the group means,

$$\bar{x}_{ir} = \sum_{j=1}^{n_i} x_{ijr} / n_i, \qquad i = 1, \ldots, t, \quad r = 1, \ldots, p \tag{A.1}$$

group standard deviations,

$$s_{ir} = \left( \sum_{j=1}^{n_i} (x_{ijr} - \bar{x}_{ir})^2 / (n_i - 1) \right)^{1/2}, \qquad i = 1, \ldots, t, \quad r = 1, \ldots, p \tag{A.2}$$

and pooled within groups sums of cross-product deviations:

$$w_{rs} = \sum_{i=1}^{g} \sum_{j=1}^{n_i} (x_{ijr} - \bar{x}_{ir})(x_{ijs} - \bar{x}_{is}), \qquad r = 1, \ldots, p, \quad s = 1, \ldots, p \tag{A.3}$$

The latter are used to compute the within group correlations:

$$r_{ij} = w_{ij} / (w_{ii} w_{jj})^{1/2}, \qquad i = 1, \ldots, p, \quad j = 1, \ldots, p \tag{A.4}$$

¹ In our application, classification of mammographic regions, $t$ and $g$ are both equal to 2, the normal and abnormal groups.

A.2  Step 2

Let $H = (h_{ki})$ be the $h \times g$ matrix of hypothesis contrasts. If no contrasts are specified, $h$ is set to $g - 1$ and

$$h_{ki} = \begin{cases} 1 & i \le k \\ -k & i = k + 1 \\ 0 & \text{otherwise} \end{cases}$$

These contrasts test the equality of all $g$ group means. The stepwise procedure is defined in terms of the matrices

$$W = (w_{rs}) \tag{A.5}$$

and

$$M = W + \bar{X}' H' (H N^{-1} H')^{-1} H \bar{X} \tag{A.6}$$

where $\bar{X} = (\bar{x}_{ir})$ is a $g \times p$ matrix, and $N$ is the diagonal matrix $\mathrm{diag}(n_1, \ldots, n_g)$ of group sizes. The entry and removal of variables is defined in terms of the results of sweeping on the diagonal elements of $W$ and $M$. Assuming, for simplicity, that the first $q$ variables have already been swept, write

$$W = \begin{pmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{pmatrix}, \qquad M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}$$

where $W_{11}$ and $M_{11}$ are $q \times q$. At each step let

$$A = \begin{pmatrix} -W_{11}^{-1} & W_{11}^{-1} W_{12} \\ W_{21} W_{11}^{-1} & W_{22} - W_{21} W_{11}^{-1} W_{12} \end{pmatrix} \tag{A.7}$$

$$B = \begin{pmatrix} -M_{11}^{-1} & M_{11}^{-1} M_{12} \\ M_{21} M_{11}^{-1} & M_{22} - M_{21} M_{11}^{-1} M_{12} \end{pmatrix} \tag{A.8}$$

$B$ is not actually computed, since only diagonal elements are needed. These diagonal elements are computed from the matrix

$$\begin{pmatrix} A & T \\ T' & C \end{pmatrix}$$

which is defined at step zero to be

$$\begin{pmatrix} W & \bar{X}' \\ \bar{X} & 0 \end{pmatrix}$$

and is updated at each step by sweeping or reverse sweeping the diagonal elements of $A$.
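The sweep operation referred to above is not spelled out in the text. The sketch below shows one common symmetric formulation that reproduces the block structure and signs of (A.7); it is offered as an assumption about the intended operator rather than as the exact routine used in [114]. The function names and the 2 x 2 example matrix are hypothetical.

```python
# Illustrative sketch of sweep and reverse sweep on a square matrix stored
# as a list of lists. The sign convention is chosen so that sweeping the
# first q pivots yields the blocks shown in (A.7). Names are assumptions.

def sweep(a, k):
    """Return a copy of matrix a swept on diagonal element k."""
    d = a[k][k]
    size = len(a)
    out = [row[:] for row in a]
    for i in range(size):
        for j in range(size):
            if i == k and j == k:
                out[i][j] = -1.0 / d          # pivot becomes -1/d
            elif i == k or j == k:
                out[i][j] = a[i][j] / d       # pivot row and column scaled
            else:
                out[i][j] = a[i][j] - a[i][k] * a[k][j] / d
    return out


def reverse_sweep(a, k):
    """Undo sweep(a, k), restoring the matrix as it was before the sweep."""
    d = a[k][k]
    size = len(a)
    out = [row[:] for row in a]
    for i in range(size):
        for j in range(size):
            if i == k and j == k:
                out[i][j] = -1.0 / d
            elif i == k or j == k:
                out[i][j] = -a[i][j] / d
            else:
                out[i][j] = a[i][j] - a[i][k] * a[k][j] / d
    return out


w = [[4.0, 2.0], [2.0, 3.0]]
a1 = sweep(w, 0)    # first variable entered
a2 = sweep(a1, 1)   # both variables entered: equals minus the inverse of w
print(a1)
print(a2)
print(reverse_sweep(a1, 0))
```

After sweeping on the first pivot, the remaining diagonal element is the residual sum of squares of the second variable given the first, which is the quantity the entry statistics are built from. Because a reverse sweep undoes a sweep exactly, a variable can be removed without recomputing the matrix from scratch.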
The diagonal elements of $B$ are computed using the fact that

$$B = Q'Q + A \tag{A.9}$$

where

$$Q = \left( H (N^{-1} - C) H' \right)^{-1/2} H T'$$

The following statistics are computed at each step:

1. F values for testing differences between each pair of groups:

$$F_{ij} = \frac{(n - g - q + 1)\, n_i n_j}{q (n_i + n_j)} \cdot \frac{D_{ij}^2}{n - g}, \qquad i, j = 1, \ldots, g \tag{A.10}$$

where

$$D_{ij}^2 = (n - g)(\bar{X}_i - \bar{X}_j)' W_{11}^{-1} (\bar{X}_i - \bar{X}_j)$$

is the (squared) Mahalanobis distance between groups $i$ and $j$, and where $\bar{X}_i$ is the vector of means for group $i$ for the $q$ variables that have been entered.

2. F values for each variable. If variable $r$ has been entered,

$$F_r = \frac{a_{rr} - b_{rr}}{b_{rr}} \cdot \frac{n - g - q + 1}{h} \tag{A.11}$$

with $h$ and $n - g - q + 1$ degrees of freedom. If variable $r$ has not been entered,

$$F_r = \frac{b_{rr} - a_{rr}}{a_{rr}} \cdot \frac{n - g - q}{h} \tag{A.12}$$

with $h$ and $n - g - q$ degrees of freedom.

3. Wilks' $\Lambda$ statistic for the hypothesis defined by $H$,

$$\Lambda = \det(W_{11}) / \det(M_{11}) \tag{A.13}$$

with $(q, h, n - g)$ degrees of freedom. $\Lambda$ is computed by initially setting it equal to one and updating it at each step by multiplying its previous value by $a_{rr}/b_{rr}$, where $r$ is the index of the variable entered or removed at the step.

4. The F approximation to $\Lambda$ [115],

$$F = \frac{1 - \Lambda^{1/s}}{\Lambda^{1/s}} \cdot \frac{ms + 1 - hq/2}{hq} \tag{A.14}$$

where

$$m = n - g - \frac{q - h + 1}{2}, \qquad s = \begin{cases} \sqrt{\dfrac{h^2 q^2 - 4}{h^2 + q^2 - 5}} & h^2 + q^2 \ne 5 \\ 1 & h^2 + q^2 = 5 \end{cases}$$

The numbers of degrees of freedom for $F$ are $hq$ and $ms + 1 - hq/2$. The approximation is exact if either $h$ or $q$ is 1 or 2.

5. Tolerance values,

$$t_r = a_{rr} / w_{rr}, \qquad r = q + 1, \ldots, p \tag{A.15}$$

A.3  Step 3

To move from one step to the next, a variable is removed or added according to the first of the following rules which applies.

Rule 1  In one method, if one or more entered variables are available and have F values less than the F-to-remove threshold, the one with the smallest F value is removed.
Rule 2  In the other method, if one or more entered variables are available, the one with the smallest F value is removed if by its removal Wilks' $\Lambda$ will be smaller than it was when the same number of variables were previously entered.

Rule 3  If one or more non-entered variables are available, with tolerance above the tolerance threshold, and are either at a forced level or have F values above the F-to-enter threshold, the one with the highest F value is entered.

Variables in forced levels are available only if their level is equal to the working level, if they are not entered, and if they have tolerance above the tolerance threshold. Variables in non-forced levels are considered available only if their level is less than or equal to the working level and, for non-entered variables, if their tolerance is above the tolerance threshold. The working level begins at 1 and moves to the next level when none of the rules apply. If the specified maximum level has already been reached, the specified maximum number of steps has been reached, or the specified maximum number of variables has been entered, the stepping terminates.

A.4  Step 4

When the stepping is complete, or when the number of variables entered is equal to a specified number, the following are computed.

(1) Group classification function coefficients, which are defined as

$$\beta_i = (n - g)\, W_{11}^{-1} \bar{X}_i \quad (\text{a } q \times 1 \text{ vector}), \qquad i = 1, \ldots, g \tag{A.16}$$

and the corresponding constants

$$\alpha_i = \ln p_i - (n - g)\, \bar{X}_i' W_{11}^{-1} \bar{X}_i / 2, \qquad i = 1, \ldots, g \tag{A.17}$$

where $p_i$ is the prior probability for group $i$. Note that $W_{11}^{-1} \bar{X}_i$ is computed as a sub-matrix of $T$ and $\bar{X}_i' W_{11}^{-1} \bar{X}_i = -c_{ii}$. (See below.)
(2) The squared Mahalanobis distance of case $j$ in group $i$ from the mean of group $k$

$$D_{ijk}^2 = (n - g) \sum_{r=1}^{q} \sum_{s=1}^{q} (x_{ijr} - \bar{x}_{kr})\, a_{rs}\, (x_{ijs} - \bar{x}_{ks}), \qquad i = 1, \ldots, t, \quad j = 1, \ldots, n_i, \quad k = 1, \ldots, g \tag{A.18}$$

(3) The posterior probability that case $j$ from group $i$ came from group $k$

$$P_{ijk} = \frac{p_k \exp(-D_{ijk}^2 / 2)}{\sum_{r=1}^{g} p_r \exp(-D_{ijr}^2 / 2)}, \qquad i = 1, \ldots, t, \quad j = 1, \ldots, n_i, \quad k = 1, \ldots, g$$

A.5  Step 5

The eigenvalue problem

$$H \bar{X} W_{11}^{-1} \bar{X}' H' u_i = H (-C) H' u_i = \lambda_i H N^{-1} H' u_i \tag{A.19}$$

where $\bar{X}$ is a $g \times q$ matrix of variable means for all groups, is solved for eigenvalues $\lambda_1 \ge \cdots \ge \lambda_h$ and eigenvectors $u_1, \ldots, u_h$ normalized so that

$$u_i' H N^{-1} H' u_i = 1$$

The coefficients $\gamma_i$ of the $i$th canonical discriminant function defined by $H$ are given by

$$\gamma_i = W_{11}^{-1} \bar{X}' H' u_i \left( \frac{n - g}{\lambda_i} \right)^{1/2}, \qquad i = 1, \ldots, h \tag{A.20}$$

The canonical variables (values of the canonical functions)

$$f_{ijr} = \sum_{s=1}^{q} \gamma_{rs} (x_{ijs} - \bar{x}_s)$$

are also computed.

Classification functions

The classification functions can be used to classify new cases into the groups used in the analysis. This can be done by defining a new group and rerunning the analysis, specifying that only the original groups be used to define the classification functions. These functions can also be applied directly to new data. The classification scores can also be converted to posterior probabilities. Let $g$ be the number of groups and $s_{ij}$ be the classification score of the $i$th case for the $j$th group; then the posterior probability that case $i$ belongs to group $j$ is

$$P_{ij} = \frac{\exp(s_{ij})}{\sum_{k=1}^{g} \exp(s_{ik})}$$
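To make the use of the classification functions concrete, the following sketch builds the coefficients of (A.16) and the constants of (A.17) and then converts classification scores into posterior probabilities with the formula above. It is a minimal illustration under stated assumptions: exactly two entered features (so that $W_{11}$ can be inverted in closed form), equal priors, and a score defined as $s_{ij} = \alpha_j + \beta_j' x_i$. The function names and the toy data are hypothetical and are not taken from the thesis.

```python
import math

# Illustrative sketch of (A.16), (A.17) and the score-to-posterior formula.
# Assumptions: exactly two entered features, so W11 is 2 x 2 and can be
# inverted in closed form; names and toy data are hypothetical.

def classification_functions(groups, priors):
    """Return (alpha, beta): constants and coefficient vectors per group."""
    g = len(groups)
    n = sum(len(cases) for cases in groups)
    means = [[sum(case[r] for case in cases) / len(cases) for r in range(2)]
             for cases in groups]
    # Pooled within-group sums of cross-product deviations, as in (A.3).
    w = [[0.0, 0.0], [0.0, 0.0]]
    for cases, mean in zip(groups, means):
        for case in cases:
            for r in range(2):
                for s in range(2):
                    w[r][s] += (case[r] - mean[r]) * (case[s] - mean[s])
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    w_inv = [[w[1][1] / det, -w[0][1] / det],
             [-w[1][0] / det, w[0][0] / det]]
    alpha, beta = [], []
    for mean, prior in zip(means, priors):
        # beta = (n - g) * W11^{-1} * mean vector, as in (A.16)
        b = [(n - g) * (w_inv[r][0] * mean[0] + w_inv[r][1] * mean[1])
             for r in range(2)]
        beta.append(b)
        # alpha = ln(prior) - (n - g) * mean' W11^{-1} mean / 2, as in (A.17)
        alpha.append(math.log(prior) - 0.5 * (b[0] * mean[0] + b[1] * mean[1]))
    return alpha, beta


def posteriors(x, alpha, beta):
    """Posterior probability of each group for a new case x."""
    scores = [a + b[0] * x[0] + b[1] * x[1] for a, b in zip(alpha, beta)]
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [wt / total for wt in weights]


normal = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [2.0, 2.0]]
abnormal = [[6.0, 5.0], [7.0, 7.0], [8.0, 6.0], [7.0, 6.0]]
alpha, beta = classification_functions([normal, abnormal], [0.5, 0.5])
print(posteriors([2.0, 2.0], alpha, beta))
print(posteriors([7.0, 6.0], alpha, beta))
```

A case located at the mean of one group receives a posterior probability close to one for that group. Subtracting the largest score before exponentiating leaves the probabilities unchanged and only guards against numerical overflow.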
