UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Partially-observed graph abstractions for authorship identification and process interference prediction Plesch, Rudolf


Pairwise interactions between objects can be modeled as a graph, which is a set of nodes and edges that each connect a pair of nodes. We consider the problem of predicting whether edges exist between nodes based on other pairs of nodes that we have observed. From a partially-observed graph we extract several neighbourhood and path features, then evaluate several machine learning algorithms for predicting whether edges will exist between unobserved nodes. K-Nearest Neighbours was found to be the best classifier. We propose the novel use of path on a weighted graph as a feature used for prediction. We apply this abstraction to predicting collaboration between authors in an on-line publication database. The unweighted graph contains an edge if two authors collaborated and the weighted graph encodes the number of collaborations. Prediction error rates were less than 3% under ten-fold cross-validation. We also apply this abstraction to predicting whether processes running on the same hardware will compete for resources. The unweighted graph contains an edge if a process executed in more time than when running with another than by itself. The weighted graph had an edge weight that was the increase in execution time. Prediction error rates were less than 14% under ten-fold cross-validation.

Item Media

Item Citations and Data


Attribution-NonCommercial-NoDerivatives 4.0 International