UBC Theses and Dissertations

Data-efficient deep learning for predictive modelling of conventional single slope solar stills: leveraging transfer learning and tailored data augmentation strategies

Migaybil, Hashim

Abstract

Conventional single-slope solar stills are essential for decentralized freshwater production, yet their performance optimization is limited by small datasets and the nonlinear dynamics of desalination. This doctoral thesis addresses these constraints by developing and evaluating data-efficient supervised machine learning frameworks to predict freshwater productivity (Pstd, L/m²·day). The study integrates a novel high-performance solar still design with two complementary learning paradigms: Transfer Learning (TL) and tailored Data Augmentation (DA).

The research begins with the design and MATLAB/SIMULINK simulation of a photovoltaic-assisted single-slope solar still engineered for improved thermal performance. The hybrid system achieved a peak efficiency of 45%, and its 730-sample dataset served as the “source” domain for TL.

The first paradigm introduces a cross-design TL framework. A source Artificial Neural Network (ANN) was pre-trained on the hybrid system simulation data, and its learned weights were transferred and fine-tuned to model a conventional solar still using only 365 experimental samples. The optimized TL-based ANN (5-64-64-1) outperformed both randomly initialized ANNs and Multiple Linear Regression (MLR), achieving an Overall Index of Model Performance (OIMP) of 0.872 and demonstrating superior predictive accuracy and generalization.

The second paradigm develops a tailored DA strategy to directly expand the conventional still’s limited dataset. Gaussian noise–based jittering was applied to sequential inputs within a 7-day look-back window to generate synthetic training data for a one-dimensional Convolutional Neural Network (CNN-1D). The optimized CNN-1D model, comprising three 128-filter convolutional layers, substantially outperformed baseline CNN and Support Vector Regression (SVR) models, achieving an RMSE of 0.045 and an OIMP of approximately 0.97. A threshold-based classification method was also introduced to translate raw predictions into interpretable productivity categories.

Overall, this thesis provides a comparative evaluation of TL and DA approaches, validating their effectiveness in addressing data scarcity in solar still modeling. Key contributions include a novel cross-design TL framework, a specialized DA technique for time-series solar still data, and highly accurate predictive models. The findings provide practical, cost-effective tools for optimizing conventional solar stills and underscore the broader potential of advanced machine learning in renewable energy–driven desalination.
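
As an illustration of the augmentation idea described in the abstract, the following is a minimal sketch of Gaussian-noise jittering over 7-day look-back windows. It is not the thesis implementation: the function names, the number of synthetic copies, the noise level `sigma`, and the five-feature toy data are all assumptions made for the example, and the sketch presumes the inputs have already been scaled so that a single `sigma` is meaningful for every feature.

```python
import random

LOOKBACK_DAYS = 7  # look-back window length stated in the abstract


def make_windows(features, targets, lookback=LOOKBACK_DAYS):
    """Pair each run of `lookback` consecutive days with the next day's target."""
    windows = []
    for end in range(lookback, len(features)):
        window = [list(day) for day in features[end - lookback:end]]
        windows.append((window, targets[end]))
    return windows


def jitter(window, sigma, rng):
    """Return a copy of `window` with zero-mean Gaussian noise added to every input."""
    return [[value + rng.gauss(0.0, sigma) for value in day] for day in window]


def augment(windows, copies=3, sigma=0.02, seed=0):
    """Keep every original window and append `copies` jittered versions of each.

    Only the inputs are perturbed; each synthetic window keeps its original target.
    `copies` and `sigma` are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    augmented = list(windows)
    for window, target in windows:
        for _ in range(copies):
            augmented.append((jitter(window, sigma, rng), target))
    return augmented


if __name__ == "__main__":
    # Toy data: 10 days, 5 made-up input features per day (assumed, not thesis data).
    toy_features = [[float(day + k) for k in range(5)] for day in range(10)]
    toy_targets = [0.1 * day for day in range(10)]

    original = make_windows(toy_features, toy_targets)
    expanded = augment(original, copies=3, sigma=0.02, seed=0)
    print(len(original), len(expanded))  # prints "3 12"
```

In a workflow of this kind, jittering would normally be applied to the training split only, so that validation and test metrics are computed on unperturbed measurements.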

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International