Open Collections
UBC Theses and Dissertations
Spline Gaussian Cluster-Weighted Models
Xue, Ling
Abstract
Cluster-weighted models (CWMs) are a class of finite mixture models that jointly model the distribution of covariates and the conditional distribution of a response variable given those covariates. They are effective in uncovering latent subpopulations where both the marginal covariate structure and the regression relationship vary across clusters. However, standard Gaussian CWMs require specifying a parametric regression form within each cluster, limiting their ability to capture nonlinear relationships commonly observed in practice.
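To make the joint structure concrete, the following is a minimal sketch of a linear Gaussian CWM with a single continuous covariate, where component g contributes `pi[g] * N(y | beta0[g] + beta1[g] * x, sigma2[g]) * N(x | mu[g], tau2[g])`. The function name, the parameter names, and the example values are illustrative assumptions, not notation or data taken from the thesis.

```python
import numpy as np

def normal_pdf(v, mean, var):
    """Univariate Gaussian density, evaluated elementwise."""
    return np.exp(-0.5 * (v - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def cwm_responsibilities(x, y, pi, mu, tau2, beta0, beta1, sigma2):
    """Posterior cluster probabilities and mixture density for a
    linear Gaussian CWM with one covariate (assumed parameterization)."""
    x = np.asarray(x, dtype=float)[:, None]  # shape (n, 1)
    y = np.asarray(y, dtype=float)[:, None]  # shape (n, 1)
    pi, mu, tau2, beta0, beta1, sigma2 = (
        np.asarray(a, dtype=float) for a in (pi, mu, tau2, beta0, beta1, sigma2)
    )
    # Conditional density of y given x, times marginal density of x,
    # times the mixing proportion: one column per cluster.
    joint = pi * normal_pdf(y, beta0 + beta1 * x, sigma2) * normal_pdf(x, mu, tau2)
    density = joint.sum(axis=1)  # joint mixture density p(x, y)
    return joint / density[:, None], density

# Hypothetical two-cluster example: the clusters differ both in where the
# covariate lives (mu) and in the regression line (beta0, beta1).
resp, dens = cwm_responsibilities(
    x=[-2.0, 2.0], y=[-1.0, 3.0],
    pi=[0.5, 0.5], mu=[-2.0, 2.0], tau2=[1.0, 1.0],
    beta0=[1.0, 5.0], beta1=[1.0, -1.0], sigma2=[1.0, 1.0],
)
```

Each row of `resp` sums to one. Because both the covariate density and the regression density enter the product, a point is assigned to a cluster only if it is plausible under both parts of that cluster's model.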
To address this limitation, we propose a spline-based extension of the Gaussian CWM that uses B-spline functions to model the within-cluster regression structure. A roughness penalty on the spline coefficients is introduced to control smoothness and prevent overfitting, and this penalization is incorporated into estimation through a fully unsupervised Expectation-Maximization (EM) algorithm. The EM procedure updates spline coefficients via weighted penalized least squares while estimating variance and mixing proportions. In addition, we introduce a mixed-type kernel spline Gaussian CWM that extends the framework to datasets with both continuous and categorical covariates. Although not empirically evaluated here, the model is fully formulated, and an EM-based estimation scheme is provided.
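The weighted penalized least squares update can be sketched for a single cluster as follows. This is an illustration under stated assumptions rather than the thesis's implementation: the roughness penalty is taken to be a difference penalty on adjacent B-spline coefficients (P-spline style), the variance update is the responsibility-weighted residual variance, and all names (`bspline_basis`, `penalized_spline_update`, `lam`, `z`) are hypothetical.

```python
import numpy as np

def bspline_basis(x, knots, degree):
    """B-spline design matrix via the Cox-de Boor recursion.

    `knots` is the full padded, non-decreasing knot vector; the result
    has len(knots) - degree - 1 columns.
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    # Degree-0 basis: indicators of the half-open intervals [t_i, t_{i+1}).
    B = ((x[:, None] >= t[None, :-1]) & (x[:, None] < t[None, 1:])).astype(float)
    # Close the last non-empty interval on the right so x == t[-1] is covered.
    last = np.nonzero(t[:-1] < t[1:])[0][-1]
    B[x == t[-1], last] = 1.0
    for d in range(1, degree + 1):
        B_new = np.zeros((x.size, t.size - d - 1))
        for i in range(t.size - d - 1):
            left_den = t[i + d] - t[i]
            right_den = t[i + d + 1] - t[i + 1]
            if left_den > 0:
                B_new[:, i] += (x - t[i]) / left_den * B[:, i]
            if right_den > 0:
                B_new[:, i] += (t[i + d + 1] - x) / right_den * B[:, i + 1]
        B = B_new
    return B

def penalized_spline_update(B, y, z, lam, diff_order=2):
    """One M-step for a single cluster, given responsibilities z.

    Solves (B' Z B + lam * D' D) coef = B' Z y, where D is a difference
    matrix (an assumed form of the roughness penalty).
    """
    D = np.diff(np.eye(B.shape[1]), n=diff_order, axis=0)
    BtZ = B.T * z  # weights observation i by z[i]
    coef = np.linalg.solve(BtZ @ B + lam * (D.T @ D), BtZ @ y)
    resid = y - B @ coef
    sigma2 = np.sum(z * resid ** 2) / np.sum(z)  # weighted residual variance
    return coef, sigma2, np.mean(z)              # np.mean(z): mixing proportion

# Hypothetical data: a noisy sine curve, with every point fully assigned
# to this cluster (z = 1) to keep the example deterministic.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)
z = np.ones_like(x)
degree = 3
interior = np.linspace(0.0, 1.0, 9)[1:-1]
knots = np.concatenate([np.zeros(degree + 1), interior, np.ones(degree + 1)])
B = bspline_basis(x, knots, degree)
coef, sigma2, pi_g = penalized_spline_update(B, y, z, lam=0.1)
```

In a full EM loop, `z` would come from the E-step and this update would run once per cluster. Larger values of `lam` shrink the fit toward a polynomial of degree `diff_order - 1`, which is how the penalty trades flexibility against overfitting.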
We evaluate the spline-based model using simulated and real datasets that capture a range of cluster-specific relationships with varying degrees of overlap and noise. The results indicate that polynomial Gaussian CWMs perform well when the regression structure is globally simple, as in the Linear Crossover dataset and the real NPreg dataset, where quadratic polynomials recover the generating form effectively. Spline Gaussian CWMs provide greater flexibility in settings with local nonlinearities, such as the Nonlinear Single and Double Crossover datasets, where they align better with the true group-wise regression functions. Penalized spline CWMs show the most stable performance, mitigating overfitting in challenging cases such as the Parallel and Half-Overlapping datasets and offering improved classification accuracy relative to unpenalized splines. On fluctuating designs, spline-based approaches capture localized variation better than high-degree polynomials, which risk instability at boundaries.
Overall, this thesis develops nonparametric extensions of Gaussian CWMs by integrating spline regression, penalization, and kernel-based methods. These extensions broaden the applicability of CWMs to complex data structures and provide a foundation for future work on mixed-type and nonparametric mixture modelling.
Item Metadata
| Field | Value |
| --- | --- |
| Title | Spline Gaussian Cluster-Weighted Models |
| Creator | Xue, Ling |
| Publisher | University of British Columbia |
| Date Issued | 2025 |
| Language | eng |
| Date Available | 2025-12-15 |
| Provider | Vancouver : University of British Columbia Library |
| Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
| DOI | 10.14288/1.0451012 |
| Degree Grantor | University of British Columbia |
| Graduation Date | 2026-02 |
| Scholarly Level | Graduate |
| Aggregated Source Repository | DSpace |