UBC Theses and Dissertations
Quality filtering and normalization for microarray-based CGH data
Khojasteh Lakelayeh, Mehrnoush
Altered DNA copy numbers are characteristic of solid tumors. Microarray-based Comparative Genomic Hybridization (CGH) has emerged as a promising technology with the potential to identify minute genomic changes, on the order of single DNA copy number changes, at the gene level. In the image analysis step of a two-color microarray experiment, the data extracted from the two microarray images are ratios: for each spot, the ratio of its fluorescence intensity in one image to that of the corresponding spot in the other image. Unless the sources of experimental error are identified, and the errors are corrected or the data corrupted by significant errors are removed, microarray results can lead to incorrect experimental conclusions.

This research focuses on improving the "image analysis" step of array-CGH experiments. The aim is to reduce the variability and increase the validity of the experimental results. Two issues are addressed in this work: 1) identifying spots likely to be of poor quality, and 2) normalizing the data to remove systematic errors.

To address the first issue, we present a novel approach to quality filtering of microarray spots. We use a variety of shape and image texture measures and design a binary decision tree to discriminate between spots likely to produce meaningful data and spots with unreliable measurements. The proposed procedure is shown to reduce the variability in the data caused by low-quality spots.

To address the second issue, possible sources of systematic variation are examined and, accordingly, a three-step normalization scheme is used to remove them. In the first step, the spatial bias of each spot's ratio is estimated as the median of the ratios of the spots inside a sliding window centered on that spot, and this spatial bias is removed from the data. In the second step, microplate effects are removed from the data. In the final step, the intensity-dependent bias is estimated by fitting a LOESS curve to the log ratios of the spots as a function of spot intensity, and this bias is subtracted from the log ratios. This normalization scheme was shown to increase the accuracy and precision of microarray data.
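The first and last normalization steps can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the window size, the LOESS span `frac`, the use of log ratios for the spatial step, and the use of `None` to mark spots already rejected by quality filtering are all assumptions made for illustration. The microplate step is omitted because the abstract does not say how it is performed.

```python
import math
from statistics import median


def remove_spatial_bias(log_ratios, half_window=1):
    """Subtract from each spot the median log ratio of the window centered on it.

    log_ratios: 2D list (rows x columns) of log ratios laid out as on the array;
    None marks a spot that was filtered out (an assumption of this sketch).
    """
    n_rows, n_cols = len(log_ratios), len(log_ratios[0])
    corrected = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            if log_ratios[r][c] is None:
                continue
            # Collect valid spots in the window, clipped at the array edges.
            window = [
                log_ratios[i][j]
                for i in range(max(0, r - half_window), min(n_rows, r + half_window + 1))
                for j in range(max(0, c - half_window), min(n_cols, c + half_window + 1))
                if log_ratios[i][j] is not None
            ]
            corrected[r][c] = log_ratios[r][c] - median(window)
    return corrected


def loess_fit(x, y, frac=0.5):
    """Local linear regression with tricube weights, evaluated at every x[i]."""
    n = len(x)
    k = max(2, math.ceil(frac * n))  # number of neighbours that set the bandwidth
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def remove_intensity_bias(log_ratios, log_intensities, frac=0.5):
    """Subtract the LOESS-estimated intensity-dependent bias from the log ratios."""
    bias = loess_fit(log_intensities, log_ratios, frac)
    return [lr - b for lr, b in zip(log_ratios, bias)]
```

In this sketch the spatial step works on a grid that mirrors the physical spot layout, while the intensity step works on flat lists of per-spot values, so the grid would need to be flattened (skipping filtered spots) between the two steps.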