Open Collections
UBC Faculty Research and Publications
Developing an Ensembled Machine Learning Prediction Model for Marine Fish and Aquaculture Production
Rahman, Labonnah Farzana; Marufuzzaman, Mohammad; Alam, Lubna; Bari, Md Azizul; Sumaila, Ussif Rashid; Sidek, Lariyah Mohd
Abstract
The fishing industry is identified as a strategic sector for raising domestic protein production and supply in Malaysia. Global changes in climatic variables have affected, and continue to affect, marine fish and aquaculture production, yet machine learning (ML) methods have not been extensively used to study aquatic systems in Malaysia. ML-based algorithms can be paired with feature importance (i.e., identifying the features that have the most predictive power) to achieve better prediction accuracy and provide new insights into fish production. This research aims to develop an ML-based prediction model for marine fish and aquaculture production. Based on the feature importance scores, we select a group of climatic variables for three different ML models: linear, gradient boosting, and random forest regression. Twenty years (2000–2019) of climatic variables and fish production data were used to train and test the ML models. Finally, an ensemble approach named voting regression combines the three ML models. Performance metrics are generated, and the results show that the ensembled ML model obtains R² values of 0.75, 0.81, and 0.55 for marine water, freshwater, and brackish water, respectively, outperforming the single ML models in predicting all three types of fish production (in tons) in Malaysia.
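The workflow summarized in the abstract (rank features by importance, fit linear, gradient boosting, and random forest regressors, then combine them by voting regression and score with R²) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code: the data are synthetic stand-ins, and the choice of a random forest to produce the importance scores, the number of retained features (`top_k`), the train/test split, and all hyperparameters are assumptions that the record does not specify.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's data): six hypothetical
# climatic variables and a production-like target.
n_samples, n_features = 240, 6
X = rng.normal(size=(n_samples, n_features))
y = (3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] ** 2
     + rng.normal(scale=0.5, size=n_samples))

# Assumed split; the record does not state how train/test sets were formed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 1: rank features by importance on the training data and keep the
# top_k highest-scoring ones (ranking model and top_k are assumptions).
ranker = RandomForestRegressor(n_estimators=200, random_state=0)
ranker.fit(X_train, y_train)
top_k = 3
selected = np.argsort(ranker.feature_importances_)[::-1][:top_k]

# Step 2: the three base regressors named in the abstract.
base_models = [
    ("linear", LinearRegression()),
    ("gbr", GradientBoostingRegressor(random_state=0)),
    ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
]

# Step 3: voting regression averages the base models' predictions.
ensemble = VotingRegressor(estimators=base_models)
ensemble.fit(X_train[:, selected], y_train)

ensemble_pred = ensemble.predict(X_test[:, selected])
ensemble_r2 = r2_score(y_test, ensemble_pred)
print(f"ensemble R^2: {ensemble_r2:.3f}")

# R^2 of each fitted base model, for comparison with the ensemble.
for (name, _), model in zip(base_models, ensemble.estimators_):
    single_r2 = r2_score(y_test, model.predict(X_test[:, selected]))
    print(f"{name} R^2: {single_r2:.3f}")
```

With default settings, `VotingRegressor` weights each base model equally, so its prediction is the plain mean of the three individual predictions. Whether the ensemble beats every single model depends on the data; the scores printed here say nothing about the study's reported values.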
Item Metadata
Title | Developing an Ensembled Machine Learning Prediction Model for Marine Fish and Aquaculture Production
Publisher | Multidisciplinary Digital Publishing Institute
Date Issued | 2021-08-14
Language | eng
Date Available | 2021-09-16
Provider | Vancouver : University of British Columbia Library
Rights | CC BY 4.0
DOI | 10.14288/1.0402160
Citation | Sustainability 13 (16): 9124 (2021)
Publisher DOI | 10.3390/su13169124
Peer Review Status | Reviewed
Scholarly Level | Faculty
Aggregated Source Repository | DSpace