Prediction and anomaly detection in water quality with explainable hierarchical learning through parameter sharing

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

Prediction and anomaly detection in water quality with explainable hierarchical learning through parameter sharing Mohammad Mehr, Ali

Abstract

Decisions made on water quality have high implications for diverse industries and general population. In a 2020 study, Guo et al. report that the current literature on modeling spatiotemporal variabilities in surface water quality at large scales across multiple catchments is very poor. In this thesis, we introduce a simple, explainable, and transparent machine learning model that is derived from linear regression with hierarchical features for efficient prediction and for anomaly detection on large scale spatiotemporal datasets. Our model learns offsets for various features in the dataset while utilizing a hierarchy among the features. These offsets can enable generalization and be used in anomaly detection. We show some interesting theoretical results on such hierarchical models. We built a water pollution platform for exploratory data analysis of water quality data in large scales. We evaluate the predictions of our model on the Waterbase - Water Quality dataset by the European Environmental Agency. We also investigate the explainability of our model. Finally, we investigate the performance of our model in classification tasks while analyzing its ability to do regularization and smoothing as the number of observations grows in the dataset.

Item Metadata

Title	Prediction and anomaly detection in water quality with explainable hierarchical learning through parameter sharing
Creator	Mohammad Mehr, Ali
Publisher	University of British Columbia
Date Issued	2020
Description	Decisions made on water quality have high implications for diverse industries and general population. In a 2020 study, Guo et al. report that the current literature on modeling spatiotemporal variabilities in surface water quality at large scales across multiple catchments is very poor. In this thesis, we introduce a simple, explainable, and transparent machine learning model that is derived from linear regression with hierarchical features for efficient prediction and for anomaly detection on large scale spatiotemporal datasets. Our model learns offsets for various features in the dataset while utilizing a hierarchy among the features. These offsets can enable generalization and be used in anomaly detection. We show some interesting theoretical results on such hierarchical models. We built a water pollution platform for exploratory data analysis of water quality data in large scales. We evaluate the predictions of our model on the Waterbase - Water Quality dataset by the European Environmental Agency. We also investigate the explainability of our model. Finally, we investigate the performance of our model in classification tasks while analyzing its ability to do regularization and smoothing as the number of observations grows in the dataset.
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2020-09-08
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NoDerivatives 4.0 International
DOI	10.14288/1.0394253
URI	http://hdl.handle.net/2429/75910
Degree	Master of Science - MSc
Program	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2020-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nd/4.0/
Aggregated Source Repository	DSpace

Open Collections

UBC Theses and Dissertations

UBC Theses and Dissertations

Prediction and anomaly detection in water quality with explainable hierarchical learning through parameter sharing Mohammad Mehr, Ali

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights