UBC Theses and Dissertations
Multi-label learning for image analysis with limited annotation
Yu, Tianze
Deep learning, especially through Convolutional Neural Networks (CNNs), has revolutionized image analysis. Image classification, which assigns labels to images based on their content, has advanced significantly because CNNs can extract image features automatically. Real-world images, spanning natural scenes, social media, medical imaging, and aerial views, often contain multiple objects and therefore require multiple annotations for precise deep learning-based classification. However, comprehensive image annotation is challenging. The sheer volume of images makes exhaustive labeling impractical, and specialized fields such as medical imaging and remote sensing demand expert knowledge for accurate annotation, which makes the process lengthy and expensive. Given these constraints, there is a pressing need for learning with limited annotations, where only a few labels are available per image. To address this, researchers are turning to semi-supervised and weakly supervised methods. These techniques use the available labels to predict the missing ones, aiming to limit the performance degradation caused by absent labels.
This thesis investigates multi-label learning for image analysis under limited annotation. First, we explore multi-label image classification with partial annotations. We introduce a method in which annotators label only the most prominent features that they are confident about in multi-label images, which reduces potential annotation errors and speeds up the process. Our research suggests that partial labels can be beneficial, especially in areas where full annotation is challenging or costly. Next, we address learning from incomplete annotations by examining scenarios in which only one positive label is annotated per image. We assess the effects of various label selection strategies and offer practical annotation guidelines.
Furthermore, because aerial images often cover vast areas and warrant multiple labels while the available datasets are single-labeled, we introduce a domain adaptation technique with integrated self-correction. This method leverages abundant single-label images for weakly supervised learning. Lastly, we extend partial annotation learning to hand pose estimation and show that more annotations do not always yield better results. Annotation quality and the balance between the number of images and the number of annotations are pivotal factors.
Attribution-NonCommercial-NoDerivatives 4.0 International