UBC Theses and Dissertations

A statistical classification of breast cancer patients by degree of nodal metastases

Wilson, Sandra Lee


Recently the traditional primary treatment for breast carcinoma, the Halsted radical mastectomy, has been challenged. Some feel that other methods may be more appropriate for certain women. Quality of life and the patient's preferences are being considered in addition to the strictly medical aspects of the problem. One procedure that attempts to improve the quality of life for certain women is the selective biopsy. Women who are proven at biopsy to have lymph node metastases are spared a mastectomy and treated by radiation, since surgery cannot remove all of the cancer. A study of selective biopsy patients diagnosed between 1955 and 1963 was undertaken at the British Columbia Cancer Institute (BCCI) to assess the procedure in British Columbia. After survival was studied for selective biopsy patients and others, it was concluded that the procedure should continue to be recommended.

Since only 14% of the patients now referred to BCCI have had a selective biopsy, I decided to try to find a statistical method for assessing the probability of nodal metastases. The problem is one of statistical classification. The literature on the theory of several statistical models was reviewed, and two models were chosen for the problem: linear discriminant analysis and logistic regression. The classification procedure most often used is discriminant analysis. However, the linear discriminant model assumes that the vector of observations is normally distributed with a covariance matrix common to the groups. Medical data are often non-normal and even discrete. The logistic probability model works well with such data.

Both models were then used to study the selective biopsy problem. The patients of the BCCI study were used as a training set to estimate the parameters of the discriminant function and the logistic probability function. Each estimated function was then used to classify these patients as a measure of the goodness of fit of the models.
The logistic regression correctly classified slightly more of the patients than the discriminant analysis did. Because logistic regression is fitted iteratively, its execution time was longer than that of discriminant analysis, but not beyond practical limits. The variables that were significant in the statistical analyses could be used to help the physician make a clinical assessment of the lymph nodes of a woman with breast carcinoma. The variables also indicate areas where further research would be useful.
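
The comparison described above can be illustrated with a minimal sketch in Python. This is not the thesis's implementation. The data are synthetic, the two predictors (a continuous "size" and a binary "flag") are hypothetical stand-ins for clinical variables, and plain gradient ascent is assumed as the iterative fitting method because the abstract does not name one. Each model is fitted to a training set and then used to classify that same set, as in the goodness-of-fit check described in the abstract.

```python
import math
import random


def make_patients(n, mean_size, p_flag, label, rng):
    """Synthetic rows (size, flag, label). Invented data, not patient records."""
    return [(rng.gauss(mean_size, 1.0),
             1.0 if rng.random() < p_flag else 0.0,
             label) for _ in range(n)]


def fit_lda(rows):
    """Linear discriminant with a pooled 2x2 covariance matrix.

    Returns (w, c); a row is assigned to class 1 when w.x + c > 0.
    """
    groups = {k: [r[:2] for r in rows if r[2] == k] for k in (0, 1)}
    means = {k: [sum(col) / len(g) for col in zip(*g)]
             for k, g in groups.items()}
    # Pooled within-group covariance (the common-covariance assumption).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for k, g in groups.items():
        for x in g:
            d = [x[0] - means[k][0], x[1] - means[k][1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(rows) - 2
    s = [[v / dof for v in row] for row in s]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [means[1][i] - means[0][i] for i in range(2)]
    w = [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]
    mid = [(means[1][i] + means[0][i]) / 2 for i in range(2)]
    log_prior_ratio = math.log(len(groups[1]) / len(groups[0]))
    c = -(w[0] * mid[0] + w[1] * mid[1]) + log_prior_ratio
    return w, c


def fit_logistic(rows, rate=0.1, steps=2000):
    """Logistic regression fitted by gradient ascent on the log-likelihood.

    Returns [intercept, size coefficient, flag coefficient].
    """
    b = [0.0, 0.0, 0.0]
    n = len(rows)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for size, flag, y in rows:
            x = (1.0, size, flag)
            z = sum(bi * xi for bi, xi in zip(b, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for i in range(3):
                grad[i] += (y - p) * x[i]
        b = [bi + rate * g / n for bi, g in zip(b, grad)]
    return b


def accuracy(rows, score):
    """Fraction of rows whose sign of score(size, flag) matches the label."""
    hits = sum((score(r[0], r[1]) > 0) == (r[2] == 1) for r in rows)
    return hits / len(rows)


rng = random.Random(0)
rows = (make_patients(100, 2.0, 0.2, 0, rng)
        + make_patients(100, 4.0, 0.7, 1, rng))

w, c = fit_lda(rows)
b = fit_logistic(rows)

# Classify the training set itself with each fitted function.
lda_acc = accuracy(rows, lambda s, f: w[0] * s + w[1] * f + c)
log_acc = accuracy(rows, lambda s, f: b[0] + b[1] * s + b[2] * f)
print("discriminant analysis:", lda_acc)
print("logistic regression:  ", log_acc)
```

The discriminant function is computed in closed form, while the logistic fit loops over the data many times, which mirrors the difference in execution time noted in the abstract. Classifying the training set itself tends to overstate accuracy on new patients, so the printed figures measure fit rather than predictive performance.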

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.