UBC Theses and Dissertations
Hierarchical structure and ordinal features in class-based linear models Wang, Wan Shing Martin
In many real world datasets, we seek to make predictions about entities, where the entities are in classes that are interrelated. A commonly studied problem, known as the reference class problem, is how to combine information from relevant classes to make predictions about entities. The intersection of all classes that an entity is a member of constitutes the most specific class for that entity. When seeking to make predictions about such intersection classes for which we have not observed much (or any) data, we would like to combine information from more general classes to create a prior. If there is no data for the intersection, we would have to rely entirely on the prior. However, if data exists but is scarce, we seek to balance the prior with the available data. We first investigate a model where we assign weights to classes, and additively combine weights to make predictions. The use of regularisation forces generalisation; the signal gets pushed up to more general classes. To make a prediction for an unobserved intersection of classes, we would use the weights from the individual classes that comprise the intersection. We introduce several variants that average the predictions, as well as a probabilistic mix of these variants. We then propose a bounded ancestor method, which balances the creation of an informed prior with observed data for classes varying amounts of observations. When dealing with ordinal properties, such as shoe size, we can dynamically create new classes and subclasses in ways that are conducive to creating more informative priors. We do this by splitting the ordinal properties. Throughout, we test on the MovieLens and UCSD Fashion datasets. We found that a combination of the three bounded ancestor method variants resulted in the best performance, and the best combination varied between datasets. We found that a simple model that assigns weights to classes and additively makes predictions slightly outperformed the bounded ancestor method for supervised classification. For the bounded ancestor method, we found that splitting ordinal properties in different ways had minimal impact on the error metrics we used.
Item Citations and Data
Attribution-NonCommercial-NoDerivatives 4.0 International