UBC Theses and Dissertations

Continual foreground segmentation using limited data

Osman, Islam

Abstract

Current state-of-the-art foreground segmentation models achieve great success when trained on massive labeled datasets. However, these models have limitations: 1) they cannot capture the fine details of the exterior silhouette of objects; 2) their performance degrades when they are trained with limited data; and 3) their performance is limited to the scope of the problems they were trained on. When these models are trained on a broader scope of problems (i.e., multiple datasets), their performance degrades due to catastrophic forgetting. This thesis proposes novel deep-learning models that overcome these problems. For the first problem, three deep-learning models are proposed: MODY-Net, REFNet, and TransBlast. The results show that REFNet solves the problem of missing fine details and achieves state-of-the-art performance, with an F-measure approximately 4% higher than that of other foreground segmentation models. To overcome the second problem of learning with limited data, three novel few-shot learning techniques are proposed: KTNet, ORIF-TR, and FeSh-Net. KTNet combines few-shot and self-supervised learning for both in-domain and out-of-domain image classification. ORIF-TR and FeSh-Net then reformulate few-shot learning for foreground segmentation. FeSh-Net uses few-shot learning to learn from an exemplar frame to segment the current frame, improving on the current state-of-the-art performance by 7.7%. To solve the last problem (catastrophic forgetting), three novel lifelong-learning techniques with a server-client architecture are proposed: TBPI, KDNet, and DGT. These techniques allow a large model (the server) to learn from continuous data without forgetting. At inference time, a small model (the client) is instantiated from the server to segment foreground objects with high performance and low inference time.
DGT outperforms the state of the art (REFNet) by 4.65% at an inference speed of 11.4 FPS. Finally, a single model called UFSeg is proposed to overcome all of these problems. UFSeg combines the best model proposed in each chapter: REFNet, FeSh-Net, and DGT. UFSeg outperforms state-of-the-art foreground segmentation models by 1.34%. UFSeg can also be applied to out-of-domain real-world examples, such as frames from webcams or public IP cameras.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International