Open Collections
UBC Theses and Dissertations
Applying modern machine learning to the number of latent variables problem in principal components analysis and principal axis factoring
Draper, Zakary Andrew
Abstract
The related questions of how many components to retain in principal components analysis and how many factors to retain in principal axis factoring have been the subject of many studies over the past hundred years. Retaining too many, or too few, components or factors may lead to the development of constructs based on erroneous findings. There are many component and factor retention rules; however, because the validity of these rules is often dependent on the characteristics of the data being tested, no single rule is valid for all datasets. This paper presents a new approach to component and factor retention: using machine learning to incorporate information from several previously developed retention rules, including parallel analysis, the minimum average partial test, and others, into a single classification function. Four classifiers were trained to predict the number of components or factors in simulated datasets. Three of these classifiers achieved the highest overall accuracy of all the rules tested and were unbiased in their predictions across 129,600 samples. The best classifier showed an absolute increase in accuracy of 10.9% over the most accurate traditional retention rule. These results suggest that machine learning classification could substantially improve confidence in exploratory factor analysis findings.
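The approach described in the abstract can be illustrated with a minimal sketch: simulate datasets with a known number of latent factors, compute features derived from traditional retention rules (here, the leading eigenvalues plus the counts given by parallel analysis and the Kaiser criterion), and train a classifier on those features. This is not the author's code; all function names, feature choices, and parameters below are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def simulate(n_obs, n_vars, n_factors):
    """Simulate data with a known number of latent factors (common-factor model)."""
    loadings = rng.uniform(0.5, 0.9, size=(n_vars, n_factors))
    scores = rng.normal(size=(n_obs, n_factors))
    noise = rng.normal(scale=0.5, size=(n_obs, n_vars))
    return scores @ loadings.T + noise

def features(data, n_sim=20):
    """Feature vector: leading eigenvalues plus two classical retention-rule counts."""
    n_obs, n_vars = data.shape
    eig = np.sort(np.linalg.eigvalsh(np.corrcoef(data.T)))[::-1]
    # Parallel analysis: mean eigenvalues of random data of the same shape.
    rand_eigs = np.mean(
        [np.sort(np.linalg.eigvalsh(
            np.corrcoef(rng.normal(size=(n_obs, n_vars)).T)))[::-1]
         for _ in range(n_sim)],
        axis=0)
    pa_count = int(np.sum(eig > rand_eigs))  # parallel-analysis rule
    kaiser = int(np.sum(eig > 1.0))          # Kaiser criterion (eigenvalues > 1)
    return np.concatenate([eig[:5], [pa_count, kaiser]])

# Build a small training set of simulated samples with 1-3 true factors.
X, y = [], []
for true_k in (1, 2, 3):
    for _ in range(30):
        X.append(features(simulate(200, 9, true_k)))
        y.append(true_k)

# Train a classifier that combines the rule-derived features.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
pred = clf.predict([features(simulate(200, 9, 2))])
```

The thesis trains on far larger simulation grids and feeds in several more retention rules (such as the minimum average partial test); the sketch only shows the overall shape of the pipeline: rule outputs become features, and the number of factors becomes the class label.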
Item Metadata
Title | Applying modern machine learning to the number of latent variables problem in principal components analysis and principal axis factoring
Creator | Draper, Zakary Andrew
Publisher | University of British Columbia
Date Issued | 2019
Language | eng
Date Available | 2019-10-03
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0383243
Degree Grantor | University of British Columbia
Graduation Date | 2019-11
Scholarly Level | Graduate
Aggregated Source Repository | DSpace