UBC Theses and Dissertations

Classification of real and falsified narratives in a repeated-measures text corpus
Ran, Wei


A great deal of research has been devoted to finding reliable ways of detecting deception. The deception literature recognizes that human deceptive behavior is highly complex: behavioral differences due to deception (deception cues) are small and probably context dependent. In general, a repeated-measures design that controls variability at the individual level is an effective way of amplifying small behavioral effects; however, no study so far has explored the benefit of adding individual-level random effects to deception detection models. This dissertation focuses on a developing field of language-based deception detection research that uses natural language processing (NLP; e.g., Fitzpatrick, Bachenko, & Fornaciari, 2015; Heydari, Tavakolia, Salima, & Heydari, 2015). We tested a novel NLP-based deception detection scheme that uses multiple language samples from the same individual. We collected a repeated-measures corpus of truthful and fabricated texts from 152 individuals (4 truthful and 4 fabricated statements per individual). Truth-telling and fabrication scenarios were created using video recordings of real-life negative events as stimuli. Several sets of cues, including n-grams, part-of-speech (POS) tags, and psycholinguistic cues, were extracted from the text samples using NLP techniques. Our results showed that mixed-effects variants of popular classification models, including logistic regression, decision tree/random forest, and artificial neural network models, generalize across contexts better than their regular fixed-effects counterparts. This research should encourage further development of repeated-measures deception detection schemes and of classification models that can fully exploit such a data structure.
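The intuition behind controlling individual-level variability can be illustrated with a minimal sketch on synthetic data. It is not the dissertation's corpus, cues, or models: the single numeric cue, the effect size, and every function name below are assumptions made for illustration, and within-person centering is used as a simple stand-in for the individual-level random intercept that a mixed-effects model would estimate.

```python
import random
import statistics


def simulate_corpus(n_people=152, n_per_class=4, effect=0.3,
                    baseline_sd=1.0, noise_sd=0.3, seed=0):
    """Return (person_id, label, cue) rows; label 1 = fabricated.

    All parameter values are illustrative assumptions, not estimates
    from the dissertation.
    """
    rng = random.Random(seed)
    rows = []
    for person in range(n_people):
        # Each individual has a personal baseline on the cue.
        baseline = rng.gauss(0.0, baseline_sd)
        for label in (0, 1):
            for _ in range(n_per_class):
                cue = baseline + effect * label + rng.gauss(0.0, noise_sd)
                rows.append((person, label, cue))
    return rows


def center_within_person(rows):
    """Subtract each individual's mean cue value (labels are not used)."""
    by_person = {}
    for person, _, cue in rows:
        by_person.setdefault(person, []).append(cue)
    means = {p: statistics.fmean(v) for p, v in by_person.items()}
    return [(p, y, cue - means[p]) for p, y, cue in rows]


def threshold_accuracy(rows):
    """Classify as fabricated when the cue exceeds the overall mean."""
    cutoff = statistics.fmean([cue for _, _, cue in rows])
    hits = sum((cue > cutoff) == bool(label) for _, label, cue in rows)
    return hits / len(rows)


rows = simulate_corpus()
raw_acc = threshold_accuracy(rows)
centered_acc = threshold_accuracy(center_within_person(rows))
print(f"raw cue accuracy:      {raw_acc:.3f}")
print(f"centered cue accuracy: {centered_acc:.3f}")
```

In this toy setting the small deception effect is swamped by between-person differences until each person's own baseline is removed. The centering step relies on every individual contributing the same number of truthful and fabricated statements, as in the 4 + 4 design described above.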


Attribution-NonCommercial-NoDerivatives 4.0 International