AN EMPIRICAL ANALYSIS OF LEXICAL POLARITY AND CONTEXTUAL VALENCE SHIFTERS FOR OPINION CLASSIFICATION  by  Adam Longton B.Sc., University of British Columbia, 2003  A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF  MASTER OF SCIENCE  in  The Faculty of Graduate Studies  (Computer Science)  THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)  October 2008  © Adam Longton, 2008  Abstract This work is concerned with the automatic understanding of evaluative text. We investigate sentence level opinion polarity prediction by assigning lexical polarities and deriving sentence polarity from these with the use of contextual valence shifters. A methodology for iterative failure analysis is developed and used to refine our lexicon and identify new contextual shifters. Algorithms are presented that employ these new shifters to improve sentence polarity prediction accuracy beyond that of a state-of-the-art existing algorithm in the domain of consumer product reviews. We then apply the best configuration of our algorithm to the domain of movie reviews.  11  Table of Contents Abstract  .  Table of Contents  ii iii  List of Tables  v  List of Figures  vi  Acknowledgements  vii  1  Introduction  1  2  Literature Review  5  3  2.1  Subjective Language Classification  5  2.2  Opinion Analysis  7  2.3  Evaluation and Appraisal in Linguistics  8  Framework 3.1  Overview  10  3.2  Corpus Processing  12  3.2.1  4  10  Part of Speech Tagging  12  3.3  Baseline Algorithms  13  3.4  Lexical Polarity and Lexicon Expansion  14  3.4.1  WordNet  14  3.4.2  Log Likelihood  15  3.5  Contextual Valence Shifters  15  3.6  Coverage and Accuracy  16  Experiments and Analysis  17  4.1  Lexicons and Corpora  17  4.2  Iterative Failure Analysis  18  4.3  Baseline Experiments  20  4.3.1  Hu-Liu Algorithm  20  4.3.2  Feature Based Effective Opinion  21  4.3.3  Polysemy and Bipolarity in WordNet Expansion  23  4.4 4.4.1  Lexical Polarity Analysis Intelligent HM Lexicon  24 24  111  5  4.4.2  Polarity in other Parts of Speech and the GI Lexicon  26  4.4.3  Combined Lexicon  27  4.4.4  Nonpolar and Bipolar Words and the Robust Lexicon  29  4.4.5  The Augmented Lexicon  30  4.5  Contextual Valence Shifters  31  4.6  Remaining Language Features Affecting Polarity  37  4.7  Default Scoring Algorithms  38  4.8  Application to Another Domain: Movie Reviews  39  4.9  Discussion of Results  43  Conclusions and Further Work  45  References  48  Appendix Project Code  51  -  iv  List of Tables Table 1: Hu-Liu Implementation  21  Table 2: Hu Lexicon Analysis  22  Table 3: HulexWN Coverage for HL Corpus  22  Table 4: HM Lexicon with WordNet Expansion  25  Table 5: HM Lexicon without WordNet  25  Table 6: HM Coverage for ilL Corpus  25  Table 7: XGI Lexicon Adjectives  26  Table 8: XGI Lexicon Other Parts of Speech  27  Table 9: XGI Lexicon All Parts of Speech  27  Table 10: XGI Coverage for HL Corpus  27  Table 11: Combined Lexicon HX  28  Table 12: HX Coverage for HL Corpus  28  Table 13: Robust Lexicon myHX  29  Table 14: MyHX Coverage for HL Corpus  30  Table 15: Augmented Lexicon myHX+  31  Table 16: MyHX+ Coverage for HL Corpus  31  Table 17: MyHX+ with Negation Tuning  32  Table 18: Augmented Lexicon MyHX+ with Shifters  34  Table 19: MyHX+ with Shifters Coverage for HL Corpus  35  Table 20: HX and myHX with Shifters Coverage for HL Corpus  35  Table 21: Hu Liu Lexicon with WordNet on Pang Corpus  39  Table 22: HM Lexicon with and without WordNet  40  Table 23: Part of Speech Analysis in Combined and Robust 
Lexicons  40  Table 24: Robust Lexicon with Negation Shifter  41  Table 25: Augmented Lexicon on Pang  41  Table 26: MyHX+ with Shifters  42  Table 27: MyHX+ with Shifters and WordNet Expansion  42  Table 28: MyHX+ with Shifters, WordNet, and Log-Likelihood Expansion  42  -  -  -  -  -  -  -  -  -  -  -  v  List of Figures Figure 1: Opinion Classification System Data and Modules  11  Figure 2: Opinion Classification System Flow  11  vi  Acknowledgements I would like to thank Giuseppe Carenini for his continual guidance, encouragement, and inspiration. I would also like to thank Kevin Leyton-Brown for his insightful suggestions. Many thanks go to those who have helped me along the way, including David Kirkpatrick, Will Evans, Nando de Freitas, Patrice Belleville, Ian Cavers, and Gregor Kiczales. Finally I would like to thank my family for all their help and my wife Sarah for her endless support.  vii  1 Introduction An important problem in the field of natural language processing is the automatic understanding of evaluative text. By evaluative text we mean text in which the author is expressing an opinion or sentiment toward a topic. The internet contains a large volume of this kind of writing and in a given domain there is usually more information than an individual or party could possibly manually process. One example domain is that of consumer reviews. It is commonplace for businesses to provide a mechanism on their website for consumer feedback. Both the business and the consumers value the information found here for making decisions but because of the potentially large amount of information there is a need for fast interpretation and a summary of general trends in the opinions presented. Opinions about businesses and their products and services can also extend beyond their own websites, into blogs and other news media. This information in turn can influence investors and consumers. Reviews are important to many industries. Consider the entertainment industry where opinions about movies and music are highly influential. The service industry is another important domain where for example reviews of hotels and restaurants can be a determining factor in people’s decision making. There is also value in the automatic analysis of public opinion from a political and governmental perspective. Detecting trends in opinions of policies and platforms could help shape the political and social landscape.  The goal of this work is to investigate current methods for discovering the polarity of people’s opinion in text and to present various techniques for improving these methods. By polarity of opinion, we mean whether an author is expressing a positive or negative opinion toward a subject of discussion. This feature of text can be viewed from different levels of discourse. One might be interested in opinions found at the word, phrase, clause, sentence, document, or multi-document level. This depends on the application. However,  1  information from lower levels of discourse might be useful in making predictions about the opinions expressed at higher levels. Methods fall into two broad categories: the lexical and grammatical levels of analysis. The lexical task is to identify polarity information at the lexical or individual word item level. This is commonly done by starting with a lexicon (“seed set”) of known polarity words, sometimes expanding this set with various techniques, and then assigning a polarity score to words in a test corpus using these polar words. 
Once the level of discourse has been chosen, the grammatical or contextual task is to compute a score for the items at that level using local lexical information and possibly other contextual information.  For example, say we are interested in sentence level opinions. We might assign the words in the sentence a score, such as +1 or -1 (the lexical task), and then add the word scores to obtain a sentence score (the contextual task). To improve results, we might consider the effects of grammatical constructions that affect polarity such as negation and modality. In this case, we must identify these grammatical features in the sentence and compute an adjusted sentence polarity score based on their effect on polarity information contained at the lexical level. For example, negation tends to invert the polarity that would otherwise be derived from the lexical entries alone. Determining exactly how various grammatical constructions can shift polarity is a task in itself.  Our current work is primarily concerned with sentence level polarity prediction. We perform this task by following the commonly applied unsupervised strategy of starting from a set of seed words of known polarity. Our distinguishing contributions include the validation of using polarity words with parts of speech beyond adjectives and the demonstration of iterative failure analysis to refine our seed set and prune bipolar words. Furthermore, we contribute to the classification of various contextual shifters that adjust predictions based on linguistic context. We feel that although the important applications include document classification (reviews, editorials, and so on) or document and multi document opinion summarization, the atomic components of polarity are the lexical items, and a bottom-up approach to computing overall polarity is appropriate. We have chosen the sentence level because it is a convenient and relatively localized context that contains  2  both lexical items and grammatical constructions. We hypothesize that discoveries made at this level may contribute to tasks dealing with higher levels, or at least provide insight into the interaction between polarity-bearing components and polarity-shifting constructions in general.  This level of polarity prediction has also been looked at by Yu and Hatzivassiloglou [YHO3] and Hu and Liu [HLO4]. However, the amount of linguistic knowledge that is employed in both cases is sparse. An important contribution of our work is the integration of knowledge drawn from linguistics to provide a more solid theoretical foundation.  We present various basic algorithms in Chapter 3, and then through an iterative process of failure analysis using a corpus of consumer product reviews, we develop context-based improvements to produce a more accurate algorithm.  To support the external validity of our findings, we have also applied the best configuration of our algorithm to another domain  —  that of movie reviews. We found that  our algorithm performs significantly better than the baseline in this context.  
To summarize the key contributions of this thesis:  •  we validate the use of polarity words from parts of speech other than adjectives, including nouns, adverbs, and verbs, for sentence opinion classification  •  we demonstrate the use of iterative failure analysis as a means to efficiently refine polarity word sets by eliminating words that are bipolar (both positive and negative in different contexts) or those which have nonpolar (neutral) senses  •  we show the limitations of opinion classification that uses feature words for additional sentiment information  •  we develop a set of contextual valence shifters that adjust sentence polarity using our test corpus as well as a methodology for discovering further shifters  3  •  we provide a more solid theoretical basis for the use of shifter constructs in inferring overall sentence opinion by correlating our findings with linguistics literature  4  2 Literature Review Recently there has been a growing body of work related to evaluative language processing. The broader area of research that includes this task is concerned with detecting and classifying subjective language. Subjective language, in contrast to objective or purely factual language, can be defined as that which contains evaluations, emotions, and other content that is not factually verifiable but is rather a form of personal expression. Although our concern is primarily with opinions, this work is relevant in that similar techniques can be found across the field and detecting subjective language more generally is needed if opinions are to be extracted from large collections of different kinds of text. Within the area of opinion classification specifically there is also a growing body of research. However, it tends to be light in the application of linguistic features and structure and focuses on more statistical and atomic properties of text. It is our intention to determine the limitations of these approaches and the point where richer linguistic features become necessary to understand opinion of text more accurately.  2.1  Subjective Language Classification  Work has been done by Wiebe et a!. [W+99] to determine what the appropriate categories of subjectivity are and what linguistic features might be predictive of subjectivity. Machine learning has been applied to this task and the need to develop appropriate and accurate classifiers was shown by Wiebe et a!. [W+99] and Riloff et al. [R+03]. To perform supervised learning experiments, text annotated for subjectivity is required. [W+99] have developed such corpora. These corpora constitute a first attempt at this  5  kind of annotation, and are relatively small, but larger corpora have also been recently developed by Wilson and Wiebe [WWO3I.  One of the first issues at hand is the level of classification. Classification can be done at the document level, where entire texts such as news articles are classified as subjective or objective (overall seeming to express opinions or just stating facts). This was investigated for instance in Wiebe et al. [W+O 1]. Alternatively, classification can be done at the sentence level as in [W+99], or within sentences, at the clause or even expression and individual word level, as in the work of Turney and Littman [TLO3].  In addition to the level of classification, the categories of classification must also be decided upon. The simplest and coarsest category set is the binary classification  -  subjective or objective. 
But because this can be difficult to decide for some sentences, or sometimes a finer level of detail is desired, other categorizations have been looked at. Yu and Hatzivassiloglou [YHO3] look at the polarity of subjectivity, that is, whether an opinion is positive or negative in sentiment. Gordon et al. [G+03] looks at particular kinds of attitudes. [W+99] uses subjective and objective categories but with levels of certainty attached, on a scale from 1 to 4.  One of the first attempts to do classification at the sentence level was the work of Wiebe et al. [W+991. They developed two disjoint, manually annotated corpora consisting of complete articles randomly selected from the Wall Street Journal Treebank Corpus. Four judges each independently tagged the non-compound sentences and conjuncts of compound sentences for whether they were subjective or not.  Once the corpora were tagged, machine learning experiments were performed. Probabilistic classifiers for subjectivity were used that were developed by Bruce and Wiebe [BW99]. These were based on the class of probability models known as decomposable models and were used to find text features that are probabilistically indicative of the subjective tag. Five part-of-speech features, two lexical features, and a paragraph feature were found and used in the experiments. They considered the naive  6  Bayes, full independence, and full interdependence probabilistic models as well as models generated from those using forward and backward search. The model then chosen was the one with the best accuracy for each training set.  Work has been since done to identify richer linguistic features to be used in classifiers. Wiebe [WOO] uses clustering methods, and Riloff et al. [R+03] has looked at extraction patterns. Specific parts of speech have also been more carefully looked at in their usefulness for subjectivity tagging in [WOO] and [R+O3j.  2.2 Opinion Analysis  There has also been work involving the task of predicting the polarity of opinions present in subjective language. One of the main contributions to this analysis is the work by Yu and lzlatzivassiloglou [YHO3j. This was motivated by earlier work done by Hatzivassiloglou and McKeown [HM97], where they took an initial seed set of positive and negative adjectives and then grew this list by looking at the participation of these words in conjunctions. If a known positive word occurs in an “and” with another word, then this word can be inferred to be positive too. This is the same for negative words. Similarly, if it occurs with “but” and another word, the other word has the opposite polarity.  Pang et al. [P+O2] used different machine learning techniques to classify movie reviews as positive or negative in sentiment. They examine the Naïve Bayes, Maximum Entropy, and Support Vector Machines classifiers, and discuss differences between sentiment and topic classification. Turney and Littman [TLO3J have also looked at review classification, but used web search hits to calculate polarity scores. They infer polarity from association with known polarity words using two measures of association  —  Pointwise Mutual  Information and Latent Semantic Analysis. Although not concerned with opinions, Cilibrasi and Vitanyi [CVO4] have recently developed the notion of Google distance to  7  measure semantic relationships between words. This could be useful in performing web based classification tasks like that of [TLO3].  
There has also been a line of research that involves opinion polarity detection that is specific to given subject targets. These could be the names of the company or product in question, the movie being reviewed, or a s.pecific actor in a movie, depending on the desired task. Yi et al. [Y+04] have developed a system called Sentiment Analyzer that first extracts what it thinks are the relevant subject terms, and then assigns an opinion polarity to statements about them. This is done using a sentiment dictionary of words and phrases that have been labeled with a known sentiment value. A similar approach has been taken by Hu and Liu [HLO4]. They give scores to sentences that include specific product features based on counting positive and negative constituent words. They also add the context sensitive step of inverting the polarity in the presence of negation words like ‘not’.  2.3 Evaluation and Appraisal in Linguistics  In the field of linguistics, the use of language to express opinions is variously known as evaluation, authorial stance, and appraisal. Linguists organize types of evaluative language into hierarchies and categorize the terms and forms associated with different levels of appraisal into markers of stance. We are interested in modeling some of these markers of authorial stance and applying them to our automatic opinion classification scheme.  Martin’s article in Text [M03] breaks down language used for appraisal into three classes: engagement, attitude, and graduation. Some of these contain features that can shift a lexical item’s contribution of positive or negative opinion to the text. Engagement involves positioning one opinion in relation to another and includes features like projection, modality, polarity, and concession. These will evidently provide a context  8  within which a lexical item can be interpreted and allow for possible adjustments to the item’s polarity contribution to the sentence. For example, projection might involve a report of a polarized statement but not be making such a statement. Concession might be similarly affirming some fact, like “He seemed angry”, while modality might be acknowledging a possibility without committing to an opinion on it, in for example, “It might be really fun”. Graduation involves gradability and includes features that raise or lower the degree of an evaluation, which can again shift the contribution.  The collection of linguistic papers compiled in the book Evaluation in Text: Authorial Stance and the Construction of Discourse, edited by Hunston and Thompson [HTOO], contains discussions of contextual situations that convey or affect polarity. Biber and Finegan’ s stance markers are discussed in [THOOJ and these include the following categories:  1. adverbs indicating affect, certainty, and doubt (e.g. definitely) 2. adjectives indicating affect, certainty, and doubt (e.g. happy) 3. verbs indicating affect, certainty, and doubt; (e.g. enjoy) 4. hedges (vague language e.g. about, sort of) 5. emphatics (e.g. for sure, really) 6. modals indicating possibility, necessity, and prediction (e.g. could be, should have been)  These too indicate features that have the ability to contribute or modify the opinion being expressed in evaluative text. Moon’s Fixed Expressions and Idioms in English: A Corpus-Based Approach {RM98] also has a section on evaluation, and describes situations that cause a shift or “subversion” of evaluative orientation. We will refer to these when compiling lists of polarity sensitive linguistic patterns.  
9  3 Framework To achieve the sentence-level polarity prediction task, we have formulated a generalized algorithm that reflects the basic approach taken in the related work and have compiled a set of improvements to this algorithm that we investigate in Chapter 4. The set of improvements includes techniques developed by other researchers, modifications of existing techniques, and techniques we have invented.  3.1 Overview The simplest baseline we can use for comparison is to take the polarity of the most frequent class (e.g., positive) as the polarity of all our test sentences. Thus any algorithm must at least achieve better accuracy than the frequency of the most frequent class. The common aspect of the various sentence polarity prediction techniques in the related work seems to be the use of a polarity lexicon. This is a precompiled list of positive and negative words. These words are either used to assign polarity scores to other words or to simply contribute to the score of a sentence. This is what we mean by a generalized algorithm. Beyond the most-frequent-class baseline, a simple algorithm is to compare the sets of positive and negative lexicon words found in a sentence and to take the class of the larger set as the prediction. This will be taken as the basic algorithm upon which various improvement techniques can be applied.  10  Part of Speech Taggers  Corpora •  •  Hu-Liu Product Reviews Pang-Lee Movie Reviews  • •  NLProcessor Stanford Tagger  Lexicons • • • •  HatzivassiloglouMcKeown General Inquirer Hu-Liu Seed Set Longton-Carenini Refined Lexicons  • •  Lexicon Expansion Algorithms WordNet LogLikelihood  Contextual Valence Shifter Modules • •  Hu-Liu Algorithm Longton-Carenini Algorithm  Figure 1: Opinion Classification System Data and Modules  Our framework consists of the following components: corpora, part of speech taggers, lexicons, lexicon expansion algorithms, baseline sentence scoring and default (tie or no polarity words) scoring, and contextual valence shifters. We have implemented a script pipeline that performs opinion classification with these components. It tags and processes an opinion sentence corpus, expands a polarity word lexicon, and then scores  Figure 2: Opinion Classification System Flow  11  the sentences of the corpus. It does this using the lexicon and contextual valence shifters that modify the polarity contribution of the lexicon words based on various linguistic constructs. Figure 1 lists the data and modules available in our system and Figure 2 describes the flow of inputs and outputs in our script pipeline.  In Figure 2, the opinion sentence corpus is part of speech tagged and then formatted. The formatted corpus is then scored using a polarity word lexicon. There is an optional step of feeding the formatted corpus into one of the polarity lexicon expansion algorithms before performing scoring. These components are detailed in the next sections.  3.2 Corpus Processing  Our experiments require a corpus of text annotated at the sentence level for polarity. Our scoring script takes a part-of-speech tagged corpus in XML format where each sentence is given attributes for its annotated score and its feature words. The first phase in our script pipeline is responsible for this processing.  3.2.1 Part of Speech Tagging  Part-of-Speech tagging is a common operation in natural language processing {JMOOI. It involves assigning a part of speech (e.g. 
adjective, noun, adverb, verb, preposition) to every word in a corpus, based on a syntactic analysis of the sentence. We experimented with two third party part-of-speech tagging tools, namely NLProcessor [NOO] and the StandfordTagger [S04], which produced similar behavior showing high accuracy. This allowed us to identify lexicon words with the correct part of speech in our corpora.  12  3.3 Baseline Algorithms Our basic algorithm is based on the one proposed by Hu and Liu [HLO4J. We start with a lexicon of polar adjectives scored as +1 or -1 (eg. amazing has the score +1, bad has the score -1). Sentences are scored by taking the average of the scores of all the lexicon words found in the sentence. The polarity of the opinion expressed in the sentence is the sign of this score. Beyond this basic algorithm, Hu and Liu employ a lexicon expansion step to their 30 hand picked adjectives, adjust word orientation in the presence of negation words (like no, not, and yet), and add heuristics to break ties. The first heuristic applied is to take the average effective opinion of the sentence and relies on the fact that Hu and Liu work on product reviews. This means averaging over the counts of polarity words that are closest to the feature terms in the sentence, rather than all the polarity words. If there is still a tie, the polarity of the previous opinion sentence is taken.  As mentioned above, one aspect that distinguishes Hu and Liu’s work from ours is that they are considering names of product features (which they extract before the polarity classification task) as items within the sentence that have a special role (related to the opinion being expressed). This is because the corpus they are using consists of customer reviews of commercial products (like cameras and cell phones), and their task is to recognize opinions toward specific products and their features. Identifying the target of an opinion may prove to be an important factor in opinion analysis, but we have separated this task from that of detecting the polarity of a context already known to express an opinion. We do however include this operation in our set of improvement tools.  The work of Yu and Hatzivassiloglou [YHO3] provides a different algorithm. They also start with a polarity lexicon (this time the HM lexicon), and expand this list. For all the nouns, verbs, adjectives, and adverbs in their corpus (which consists of 8000 Wall Street  13  Journal articles), they compute a polarity score using a log-likelthood equation. Once this expanded lexicon of real-number valued polarities has been generated, it is used to predict the polarity of a sentence by taking the average score for that sentence. We have implemented this algorithm as a component of our infrastructure.  3.4 Lexical Polarity and Lexicon Expansion One of the components of our algorithm is the automatic expansion of an initial seed lexicon to include a greater number of words. This improves the coverage of the algorithm (i.e., more sentences will contain words in the lexicon and will therefore be scored).  3.4.1 WordNet WordNet is a semantic lexicon for English [M+90]. It is a database of dictionary entries along with lists of connections between words. These relationships are of various types, and we are interested in the relations for synonymy and antonymy (sets of words with similar and opposite meaning). 
Hu and Liu describe an algorithm for taking a seed list of polar words and expanding it by adding words found in the corpus that are similar and opposite to the known polar words. The assumption is that similar words carry similar polarity. In WordNet, a given adjective will have a list of senses. Each sense will be associated with a set of synonyms called a synset, and sometimes also two other more loosely associated sets — the see-also set and the similar-to set. For example, the adjective fast has ten senses. Sense eight has quick as a synonym, is similar to hurried, and says to see also firm. We include options in our infrastructure for varying which of these sets to draw synonyms from.

3.4.2 Log Likelihood

The log-likelihood equation of [YHO3] computes polarity based on a collocation assumption: that a positive word co-occurs with other positive words more frequently than with negative ones, and likewise for a negative word. The score for a word W_i with part of speech POS_j (where j can be adjective, noun, adverb, or verb) is

  log [ ( (Freq(W_i, POS_j, Adj_p) + e) / Freq(W_all, POS_j, Adj_p) ) / ( (Freq(W_i, POS_j, Adj_n) + e) / Freq(W_all, POS_j, Adj_n) ) ]

where Adj_p and Adj_n are the positive and negative seed adjectives, Freq(W_all, POS_j, Adj) is the collocational frequency of all words W_all of part of speech POS_j with Adj, and e is a smoothing constant (Yu and Hatzivassiloglou use a value of 0.5). The sign of the score is the word's polarity. The outer fraction within the log is a ratio of two values for the word: its relative collocation with positive words and its relative collocation with negative words. When the fraction is greater than one, the log will be positive, corresponding to a stronger association with positive words. Likewise, when it is between zero and one, the log will be negative, corresponding to a stronger association with negative words. This process acts as a lexicon expansion step in that it includes all the corpus words in the set, with varying polarity strengths. We can also cut off inclusion below a threshold to avoid deviations due to noise.

3.5 Contextual Valence Shifters

One of our goals is to apply more contextual information in calculating a sentence's polarity. Polanyi and Zaenen [PZO4] provide motivation for this task. They discuss a number of linguistic phenomena, from negation to irony to discourse structure, which could shift the polarity that would have otherwise been predicted by only considering the lexical level. They provide some preliminary ideas on how to implement some of these, but do not report any experimental results.

We would like to better justify their suggestions, and add to this list, by consulting the linguistics literature. Our experimental results section details how we identify some of these shifters in our development corpus and includes descriptions and examples of each.

3.6 Coverage and Accuracy

Since our algorithms are based on the presence of scored polar words, not every sentence we evaluate will have a nonzero average score with a definite positive or negative orientation. Both null sentences that contain no scored polar words and zero-score sentences where the average of the scored words is zero are possible. But for results to be comparable across different configurations, we need a way to increase coverage of our scoring to 100 percent. Therefore we employ a series of default scoring algorithms to classify the sentences that do not have a nonzero score. These are detailed in the experimental results section.
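For concreteness, the log-likelihood scoring step of Section 3.4.2 can be sketched as follows, assuming the collocation counts have already been gathered from a part-of-speech tagged corpus. This is an illustration only: the function names, the dictionary layout, and the threshold-based cutoff are ours and do not correspond to the project code in the Appendix.

    import math

    def log_likelihood_polarity(word, pos, freq, total_freq, epsilon=0.5):
        """Sketch of a [YHO3]-style polarity score for a (word, part-of-speech) pair.

        freq[(word, pos, 'pos')]  -- collocation count of the word with positive seed adjectives
        freq[(word, pos, 'neg')]  -- collocation count of the word with negative seed adjectives
        total_freq[(pos, 'pos')]  -- collocation count of all words of this part of speech
                                     with positive seed adjectives (likewise for 'neg')
        epsilon                   -- smoothing constant (0.5 in Yu and Hatzivassiloglou)
        """
        rel_pos = (freq.get((word, pos, 'pos'), 0) + epsilon) / total_freq[(pos, 'pos')]
        rel_neg = (freq.get((word, pos, 'neg'), 0) + epsilon) / total_freq[(pos, 'neg')]
        # Sign gives the predicted polarity; magnitude gives its strength.
        return math.log(rel_pos / rel_neg)

    def expand_lexicon(candidates, freq, total_freq, threshold=0.0):
        """Keep candidate (word, pos) pairs whose |score| exceeds a cutoff, to limit noise."""
        expanded = {}
        for word, pos in candidates:
            score = log_likelihood_polarity(word, pos, freq, total_freq)
            if abs(score) > threshold:
                expanded[(word, pos)] = score
        return expanded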
16  4 Experiments and Analysis In this section we perform a series of experiments that show how we arrive at an improved sentence polarity prediction algorithm.  4.1 Lexicons and Corpora There are two main kinds of data we use in our experiments: polarity lexicons and corpora. Polarity lexicons consist of positive and negative words. We have two main collections: the Hatzivassiloglou and McKeown (HM) lexicon, and the General Inquirer (GI) lexicon. HM contains 1336 (657 positive, 679 negative) adjectives. These were compiled manually by [HM97]. The lexicon we are referring to as GI is actually two subsets of a collection of words known as the General Inquirer database developed by Stone et al.[S+66]. The two subsets we have extracted from the database are those with tags Pos and Neg, corresponding to positive and negative words. In its raw form, our extracted GI lexicon contains 4207 (1914 positive, 2293 negative) words. They are not explicitly part of speech tagged. They instead come with a variable number of other tags (besides Pos and Neg), some of which imply part of speech membership (such as the Noun tag, or the verb tags DAV (verb descriptive of an action), IAV (verb interpretive of an action) and SV (state verb)). Some but not all also have a “comments” section appended to the entry. However, not all words have tags that imply part of speech. Some might have multiple allowed parts of speech that are only suggested by the comments section, and those tags that imply part of speech membership are not given to all words in that category. Because of these inconsistencies, additional processing must be done if the part of speech tags are required, as they are in our work.  17  We have a number of corpora that will potentially be useful in our polarity prediction task. The first is the “customer review data” of {HLO4] (the ilL corpus). This is a collection of commercial product reviews extracted from amazon.com. They are divided among five products (two digital cameras, a cell phone, a dvd player, and an mp3 player), and there is a total of 3908 sentences. There are 1700 of these that contain opinions about features of the products under review (like “digital zoom” or “size”), and they are annotated with scores representing the polarity and strength of the author’s opinion toward each feature reference.  Other corpora that we have collected include the movie review database of Pang et al. [P+02] which consists of sentence summaries of movie reviews annotated with a score, derived from www.rottentomatoes.com.  4.2  Iterative Failure Analysis  Our plan is to first determine what combination of existing techniques provides the best accuracy in the sentence polarity prediction task. We start with reproducing the Hu and Liu algorithm, using their customer review corpus. From there we add various techniques to improve accuracy. In particular, we vary the polarity lexicon and the word scoring method, which involves counting adjectives or using the log-likelihood equation. We also try expanding the lexicons. We can do this by using WordNet synsets or by applying the log-likelihood collocation equationand extracting words with sufficiently polar scores. From these variations we determine the best combination of algorithms and parameters.  In this work we adopt an empirical methodology offailure analysis. This involves the careful analysis of some of the sentences whose opinion polarity we fail to correctly predict. 
The failures in a subset of the corpus are categorized into basic classes with the  18  help of our linguistic references and the largest class motivates the addition of a feature to the system in the next iteration. For example, many sentences fail because of a lack of adjectives and the presence of unaccounted-for polar words of other parts of speech. Examples of these are given below. This motivated the investigation of the General Inquirer lexicon which contains non-adjective polar words.  This approach is similar to the technique known as boosting. Boosting is a general method for improving the accuracy of any given learning algorithm. It is in this sense called a metalearner, or ensemble technique [MOO]. The idea is to combine many simple and only weakly accurate classifiers (just better than random) into a single highly accurate classifier. The classifiers are trained sequentially, and on the examples most difficult to classify on previous rounds. Boosting was developed by Freund and Schapire [FS96] and they presented it in the form of the algorithm AdaBoost. It has since been proven successful in many learning tasks such as routing, image retrieval and medical diagnosis [SO2] and in natural language processing tasks such as part-of-speech tagging and word-sense disambiguation [E+OO]. Our approach differs from boosting in that we seek to discover the classifiers in our data, which may include those that have been previously identified as well as new classifiers. We want to establish what the important classifiers of opinion are first before combining them in a more optimal way.  When we analyze the misclassifications, we separate the lexical from the contextual failures that is, failure caused by polar words and those caused by language that shifts -  the contribution of a polar word. Using the failures that resulted from ignoring context information as a guide, we hypothesize a collection of contextual valence shifters, based on linguistic knowledge, that are responsible for shifting the polarity being expressed at the lexical level. We then retry the experiment accounting for these contextual adjustments. After obtaining the configuration that gives the greatest increase in accuracy, we apply the system with and without the contextual valence shifters to another corpus to test the range of applicability across domains and to prevent overfitting.  19  4.3 Baseline Experiments  4.3.1 Hu-Liu Algorithm  In the following sections, we present tables of results across product review corpora, details of coverage for the concatenated Hu-Liu corpus (all five products), and lists of example failure sentences. We report our experimental procedure as a sequence of interations of failure analysis from which we measure incremental improvements.  In the first phase of our sequence, we implement the Hu-Liu algorithm. We contacted Ming Hu to ask about acquiring their initial seed set of polar words, but were told that the orgininal 30 were unavailable. Instead they were able to provide an intermediate set of 82 words, which we took as our seed set. With this set we perform the WordNet expansion algorithm but find that this produces results significantly under their reported accuracies [HLO4]. We attribute this to sensitivity on the seed set, given that we did not use the same input to the WordNet expansion algorithm as they did. These results are presented in Table 1. 
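As a rough illustration of the scoring step re-implemented here (Section 3.3), the sketch below averages the lexicon scores of the words in a sentence, inverting a word's contribution when a negation keyword appears within a fixed window before it, and falling back to the previous sentence's polarity when no nonzero score is available. It is a simplified stand-in for our script pipeline, not the code in the Appendix; the boundary word list and window handling are illustrative placeholders for the tunable options described below.

    NEGATIONS = {"no", "not", "yet"}           # negation keywords, following Hu and Liu
    BOUNDARIES = {"but", ",", ".", "because"}  # illustrative stop words bounding negation scope

    def word_score(token, lexicon):
        """Return +1/-1 for a lexicon word (keyed by lowercased word and part of speech), else 0."""
        return lexicon.get((token["word"].lower(), token["pos"]), 0)

    def sentence_polarity(tokens, lexicon, prev_polarity, window=5):
        """Average lexicon word scores, flipping a word preceded by an unbounded negation
        within `window` tokens; default to the previous sentence's polarity otherwise."""
        total, hits = 0.0, 0
        for i, tok in enumerate(tokens):
            score = word_score(tok, lexicon)
            if score == 0:
                continue
            for j in range(i - 1, max(0, i - window) - 1, -1):
                w = tokens[j]["word"].lower()
                if w in BOUNDARIES:
                    break              # negation does not cross a syntactic boundary word
                if w in NEGATIONS:
                    score = -score
                    break
            total += score
            hits += 1
        if hits == 0 or total == 0:
            return prev_polarity       # default scoring: take the previous opinion sentence
        return 1 if total > 0 else -1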
In addition to scoring the five sub-corpora separately and computing the flat average as Hu and Liu have done, we compute the weighted average which gives more weight to the accuracies for sets with more sentences. We also run the algorithm on the corpus formed by concatenating the five sets of product reviews for comparison. This is referred to as the ilL corpus. The Always Positive column shows the proportion of sentences in that corpus that are annotated as having positive opinion. This is equivalent to the results obtained running the algorithm with an empty lexicon.  20  Table 1: Hu-Liu Implementation Accuracy Always Positive Product Canon digital camera 183/236 0.775 Nikon digital camera 129/159 0.811 0.72 1 Nokia cell phone 186/258 Creative mp3 player 419/706 0.593 Apex DVD player 148/34 1 0.434 Average Weighted Average  1065/1700  0.667 0.626  HL Corpus  1065/1700  0.626  Hu-Liu 0.927 0.946 0.764 0.842 0.730  Longton HL 174/236 126/159 196/258 468/706 202/34 1  0.737 0.792 0.760 0.663 0.592  1 166/1700  0.709 0.686  1142/1700  0.672  0.842  4.3.2 Feature Based Effective Opinion  The Hu-Liu corpus consists of sentences that contain an opinion about particular features of the products being reviewed. These features are attached explicitly to the sentences, and one of the components of their algorithm is to use these feature words in breaking ties. When a sentence score is found to be zero, the effective opinion of the sentence is computed. This is the average of the polarities of the lexicon words that are closest to the feature words in the sentence. Furthermore, there is an additional set of logic that is used in the presence of but-conjunctions. The effective opinion of the but-clause takes priority over the overall sentence score. We separate out these two levels of effective opinion usage as separate options in our implementation.  To see how the various pieces of the Hu-Liu algorithm contribute to the overall accuracy, Table 2 shows the accuracies achieved when sequentially enabling different parts of the algorithm. Each configuration is specified by an abbreviated term of the lexicon used followed by the set of options given to the scoring script. The Hu-Liu lexicon expanded with WordNet is abbreviated hulexWN. Optionj refers to adjectives being the part of speech that are candidates for scoring (later we will see n, r, and v stand of noun, adverb,  21  and verb). Option g means using the negation adjustment and d means using a list of stop words across which negation is inactive. This helps increase accuracy by effectively establishing syntactic boundaries. Options e and b refer to the effective opinion tie breaking and effective opinion but-clause components of the Hu-Liu algorithm. The configuration hulexWN -jdgeb corresponds to our implementation of the reported Hu-Liu experiment (called Longton HL above). As seen in Table 2, negation usually improves accuracy, while only some products benefit from the effective opinion options.  Table 2: Hu Lexicon Analysis Accuracy hulexWN Product Apex player 184/341 0.5396 Canon camera 180/236 0.7627 Creative mp3 457/706 0.6473 Nikoncamera 131/159 0.8239 Nokia cell 194/258 0.75 19 —.  
Average W.Average HLCorpus  1146/1700 1135/1700  0.705 1 0.6741 0.6677  hulexWN -jdg  hulexWN -jdge  hulexWN -jdgeb  186/341 179/236 486/706 134/159 199/258  0.5455 0.7585 0.6884 0.8428 0.7713  180/341 178/236 478/706 135/159 191/258  0.5279 0.7542 0.677 1 0.8491 0.7403  202/341 174/236 468/706 126/159 196/258  0.5924 0.7373 0.6629 0.7925 0.7597  1184/1700 1169/1700  0.7213 0.6965 0.6877  1162/1700 1151/1700  0.7097 0.6835 0.6771  1166/1700 1142/1700  0.7089 0.6859 0.6718  Table 3: HuIexWN Coverage for IlL Corpus Configuration hulexWN -j hulexWN -jdg -  Accuracies Total Nonnull Nonzero Zero Null NonzeroReal NonzeroEff Coverage Nonnull Nonzero  hulexWN -idge  hulexWN -idgeb  1135/1700 758/1037 716/957 42/80 377/663  0.6677 0.7310 0.7482 0.5250 0.5686  1169/1700 782/1037 728/950 54/87 387/663  0.6877 0.7541 0.7663 0.6207 0.5837  1151/1700 779/1037 764/1019 15/18 372/663 728/950 36/69  0.6771 0.7512 0.7498 0.8333 0.5611 0.7663 0.5217  1142/1700 761/1037 739/1009 22/28 381/663  0.6718 0.7339 0.7324 0.7857 0.5747  1037/1700 957/1700  0.6100 0.5629  950/1700  0.5588  1019/1700  0.5994  1009/1700  0.5935  Table 3 provides a breakdown of the scoring based on lexicon coverage. This breakdown is given for the concatenated HL Corpus as a summary and this pattern is continued in  22  subsequent sections. In the Accuracies section, Total is the sum of three mutually exclusive subsets  —  Nonzero, Zero, and Null. Null sentences are those that do not contain  any lexicon words. Nonzero and Zero are sentences that contain lexicon words and have a nonzero or zero score, respectively (their union forms Nonnull). When the e option is used, tie breaking is attempted on Zero sentences. If a nonzero effective opinion score is found, the sentence counts as Nonzero. In this case Nonzero is further divided into those that were nonzero initially and those that were due to effective opinion tie breaking. This breakdown is reported by the NonzeroReal and NonzeroEff sets. There are two coverage measurements listed. Nonnull is the fraction of sentences containing lexicon words and Nonzero is the fraction of sentences with a nonzero score. Nonnull coverages that are the same as the configuration to their left are omitted to highlight that the configurations contain the same lexicon words.  4.3.3 Polysemy and Bipolarity in WordNet Expansion  Investigating the failures, we find that many are due to wrongly scored WordNet generated adjectives. Often these are words that are either polysemous as both polar and neutral words or even as both positive and negative words. Here are some examples with the scored words marked.  •  The screen is large(-), defined, and easy to read, and the silver unit is naturally cool.  •  The locations of various(+) buttons on one side or the other is somewhat illogical.  In the first sentence, large is being used to express a positive opinion, but was added to our lexicon via its synonym big, which was added as a synonym of our negative seed word bad. The third sense of bad listed in WordNet is a synonym of big, as in “a bad storm  “.  The second sentence is negative, but the neutral word various was incorrectly  given a positive score via its synonym versatile.  23  We try variations on the parameters of the WordNet algorithm. These include the different possible sets to draw synonyms from (the see-also, synset, and similar-to sets), and the number of words to use from the set starting at the beginning (since earlier listed ones often are the more common and sometimes less ironic synonyms).  
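For reference, these expansion options correspond to different WordNet relations. A minimal sketch of the expansion using NLTK's WordNet interface is shown below; the use of NLTK is our assumption for illustration (the project code accessed WordNet differently), and the handling of polarity conflicts is deliberately naive.

    from nltk.corpus import wordnet as wn   # requires the NLTK WordNet data

    def expand_seed(seed, max_senses=None, use_similar_to=False, use_see_also=False):
        """Grow a {word: +1/-1} adjective lexicon: synonyms keep the seed word's
        polarity, antonyms take the opposite polarity."""
        expanded = dict(seed)
        for word, polarity in seed.items():
            synsets = wn.synsets(word, pos=wn.ADJ)
            if max_senses is not None:
                synsets = synsets[:max_senses]   # earlier senses tend to be more common
            for synset in synsets:
                related = [synset]
                if use_similar_to:
                    related += synset.similar_tos()
                if use_see_also:
                    related += synset.also_sees()
                for syn in related:
                    for lemma in syn.lemmas():
                        expanded.setdefault(lemma.name().replace("_", " "), polarity)
                        for ant in lemma.antonyms():
                            expanded.setdefault(ant.name().replace("_", " "), -polarity)
        return expanded

    # e.g. expand_seed({"amazing": 1, "bad": -1}, max_senses=3)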
These didn’t improve results significantly so we consulted Ming Hu on their choices. They confirmed that they used only the synset with no restriction on the number of senses. Our results with these parameters don’t improve our accuracy.  4.4 Lexical Polarity Analysis  WordNet expansion introduces a significant amount of noise to our lexicon. To see if the small seed set provided by Hu and Liu was simply too sparse or if WordNet expansion helps at all for a larger seed set, we investigate the use of a larger manually constructed lexicon of polarity words.  4.4.1 Intelligent HM Lexicon  We run our scoring algorithm on the Hu-Liu corpus this time using Hatzivassiloglou and McKeown’s manually constructed lexicon of 1336 positive and negative words. This intelligent lexicon outperforms Hu-Liu’ s WordNet expanded seed set, as seen in Tables 5 and 6. We even find that WordNet expansion negatively affects HM (adding mostly wrong-scored words). This is reported in Table 4. We find that with the intelligent lexicon, effective opinion tie breaking does not improve results over the default previous sentence score. To measure statistical significance of the improvement in accuracies across our subcorpora, we perform a two-tailed paired t-test (described in [C95J) on our  24  Ui  zzOzzz1zzI—  99999  9  —  .—-.  ——  •—  J • ——  CC CC  cU’oooo DjC(C  .  —  -  L’-)  J1  C  C C  — —1  c  00  C  I’J  CCCCCCC  C  —uC—  —0000C D0000C  CCCCC  -00\-  Ct-J  00JL  —C—C--J—  oobo-  (—_.)  c CCD  LC00L) 00 D.C-. • CJ—  C C  —  .-. —  c C  -  oo  C -‘.-I  C C C C C  C  —  CC— 00” UiUi—  CCC  CC CC  ——  .—  —  CC  —— Jt)  UiL t) L.) 00  999  CC CC  ‘-  ——  .—-•..  I)  —— J (..)ç)  CC CC  —1  C C  C  rj  oo C  —  —  CCC C4 C 00 ‘.0  oooob  CCCCC  00 ‘,Q  uuCt_  00 00 •••• t’) —  —  —•  ..  ‘-..  Ui4  -. —.  .-. —  bb  CCCCC  C’..)4• oo’.0a—  —  C—c0o  ———  —‘.C  99999  00’.a  “..  ‘.cJC00— C)\O0000  4C00C  CCCCC  oo’.aa—  tJ — 00 — 000000Q  C. CD  z  C’ CD rO  ro  C’  -.  ::-::;:; ‘.0  CC( J(C Ji Ui C  CCC  CC  ——  00\C.  —  —  z  a CD  CD  —  a  0  C  —. CD  :::  ooo-ja’i’j  ——  -o  —  \C ,C  —  —  C.D4U  LLIi00—-  CD  f-’>  -I  00  E’JQ  —  c4  (I(M  c  cc cc C  —  0  ooc -‘ j  -  (D  o—cc  ooocccccr  — — .Ui  —  —  5—  -00’.0 00—-00Q.Ui  000  CC CC  —--  ——  •5-55  —— —— Ui0  CCC —--a--C— CCUi Q’0C  CC CC  -_)-5  ——  —5-5-  —— )  ‘.0—00 4t)L  ‘.c—i.  99 o-.—;—i  CC CC  — — —--  -5-.  OOC  —  -  -  —00C )— ‘.00°C  999  CC CC  —— —‘-.D  —-  4 ‘.0  —  —  0<’  ()It  r•  —  —  —  Ui  -.-‘  ,-.--  —S  •S  5--  5-—  ‘.0—  Ui00LJ  C———)  CCCCC  UiUiCUi-  5--  —C4Ui’.0  ——4—  00I—)  CCCCC ccUi0C4 Ui4rJ  00’.0CC-  ———  ‘.c’.coo—  0JUit’J—-  000’.0-J UiC00——  CCCCC  — —a .) Ui Ui C Ui 00’.-  -•-  — — — .0.)\000—  00’.0C-..O.  Ui—1UiC’00  ‘  99999  UiUiCt.  ‘—  J—JUi  —  ‘.0 00 ‘.0 Ui-—C’.0  —  —  CD  —.  C 0 CD  Z  -  CD  Z  -  .!.. C-  Z  ‘C  -  —.  C  0  ICD  Z  CD  -  —  CD  CD  C) CD  C)  I-.  I-.  CD  “  I  CD  C  C  CD —  CD  c’ ri  C)  C)  CD  c,  C)  CD  c,  C  c’  C • C  CD C  4.4.2 Polarity in other Parts of Speech and the GI Lexicon  Failure analysis of the results using HM reveals the next major source of error: failure due to sparsity of adjectives and the non-scoring of clearly polar non-adjective words. The following are some examples:  •  It would only transfer 30 or so songs, and then come up with an error(-).  •  I gave it only 3 stars due to the fact that the 1st one broke(-) when I dropped it from a fairly short distance (less than 2ft).  
•  The “scene” mode works well(+)for the remainder of shots that are not going to be in a “regular” setting.  These sentences would be scored correctly if the marked noun, verb, and adverb were included in our lexicon. To address this, we test the General Inquirer (GI) lexicon. We develop the best possible tagged form we can manage automatically, which we call the “XGI” (extended GI). Each word is given all of the adjective, noun, adverb, and verb parts of speech, at the risk of some error due to polarity-disagreeing polysemy across parts of speech for a given word. The results show an increase in coverage as well as the potential of all of the added parts of speech helping to improve accuracy. The adjectives from XGI are tested alone for comparison to HM in Table 7, and Table 8 shows our tests of the different parts of speech in isolation and in combination. All four produces the best accuracies so far (the best being 0.786) and this is reported in Tables 9 and 10.  Table 7: XGI Lexicon Adjectives Accuracy xgi —j HLCorpus 1185/1700 I 0.6971 -  xgi -jdg 1230/1700  I  0.7235  xgi -jdge 1218/1700  I  0.7165  26  CD  —.  —.  C)  i-..  0  CD  CD CD  CD  CD  o  CD  CD  CD  c  C)  —  CD  0  CD  )  CD  o o  CD  C  o  CD  CD  .  ‘-‘  CD  C  —.  —.  .  ZZOZZZNZZ-3  —  —  —  © — — —  ,-  —  r -  —  —-D ) 4 — - —  —.) .) 4  C .)  —) C C -  —  -.-  —  L.  —  00  —  C  :•  c  c  .-)  —  —  )  ——  ----)  0CC  CC CC  —— —-  C 0  — — (-  —  —tC  —  LC  CCCCCCC  — C’ \CUi  4.L1\C Ui — — — — — (.) . —  t.4CCt’J  c  -.  -.  44C  CC© -.-J UiUiIX ‘.C”CU  CC CC  --.-  ——  ‘Jt’) CC ———  ——  00004 0000\C  C  —  9C9 —) — — 4 rC UiUic.  —  —  —  —  CC CC  —  —  —  -LIiJ00 4CQ0 >4 ro  -.  -.  ra  O  —  C)  LIiUiOO  -)c’J———  C  ro  I -.  ro  —.  >4  —— >4  —  rj  —  >  crji  Cc0C9  (-_  .)000—  —  — —  —  0©999  -  —--c c  c. o — r.. r—i  00  t  0—  D  McI  00  :L)  cc  cc  cc  —-•.:i  —  )L)  —  -  tD  CrQCO  ‘DNN  00 © O 00 CD  —  —  — —  —J\0 C-  CJC  0Q000-.1  COCOC  rJiUiCUi4. 00CC’C’—  t’J—Ui— CUiCC. CC4)  C)C)  ‘iDC’  0000—00C CC’JUi00  CCCCC  00C’C’—  tJ—LJi—’) C r — ‘C Ui 00 00 t 00 — L’J —. —)-JUi — .— UiUiCUiJ  00CC00  —00C’000 00 .I C’ Lj) — (J)\C00  C00CC  Ui Ui C LJ 00 O C’ 0  ‘  t) -l 00 00  C ) C’ \0  —  -i  C  -  cDrDC  CD  <  -.  I  -.  >4 ro  .  I  -.  —.  >4  -  —.  DO  >4  R  -‘  -  Cd)  C  —  —.  >4 rO  Ui  —.  >4 0  U Ui  L’J  -.-  C  —.  >4 JO  Uil  00  — —  >4  >4  0  I  0 —.  —  C  —a  --a  —  Ui  ‘.0 -  C  —a  C  C  —  C  Ui  C  >4  —.  —.  00 >4 00 JO  —  C’  9  C C  —1  U ‘.0  C ro  . --a C o C --a  —  00 Ui  ‘.0 >4  0  Iii  —.  >4  —  00  00  C  C  -‘  -i  I  —.  ‘.0  Ui  c  9  C C  —.  C —a  —  —  00  ‘.0  üi  C  c  -I  —)C C C  a  C’ U  9  C  C  —:i  —  —  —a ——  C 0  —  C’  D  —a  9  C C  —-a  00  —.  -J ro  C  —  C  C C  —a  C  C  —a  C  —I  ‘.0—.C—.”0—.  ‘.c)>4  —  ‘.0 -  C  —O-C C C C  Cl  Ui  —  — —>4 —>4 —OC000rO  ‘.0  c  C  .—) C C  UI  —. 00  —  —.  C  CI)  C  C  -  C  -  a positive and negative entry for help: V, and HM and GI had a conflict in polarity for flashy:J (positive in HM and negative in GI). We removed these words in constructing the union. Tables 11 and 12 report the results. Performing the two-tailed paired t-test this time between the HM and Combined lexicon results yields a p-value of 0.011, again showing a statistically significant increase in accuracy.  Table 11: Combined Lexicon HX Accuracy hx —jnrv Product Apex player 219/341 0.6422 Canon camera 201/236 0.8517 Creative mp3 482/706 0.6827 Nikon camera 134/159 0.8428 Nokia cell 205/258 0.7946 Average W. 
Average IlL Corpus  124 1/1700 1241/1700  hx -jnrvdg 246/341 197/236 530/706 134/159 210/258  0.7628 0.7300. 13 17/1700 0.7300 13 17/1700  Table 12: HX Coverage for IlL Corpus Configuration hx -jnrv hx -jnrvdg Accuracies Total 1241/1700 0.7300 1317/1700 Nonnull 1065/1416 0.7521 1135/1416 Nonzero 963/1242 0.7754 1028/1247 Zero 102/174 0.5862 107/169 Null 176/284 0.6197 182/284 NonzeroReal NonzeroEff Coverage Nonnull 1416/1700 0.8329 Nonzero 1242/1700 0.7306 1247/1700  hx -jnrvdge 0.7214 0.8348 0.7507 0.8428 0.8140  249/341 194/236 526/706 134/159 211/258  0.7302 0.8220 0.7450 0.8428 0.8178  0.7927 0.7747 0.7747  1314/1700 13 14/1700  0.79 16 0.7729 0.7729  -  hx -jnrvdge 0.7747 0.8016 0.8244 0.6331 0.6409  1314/1700 1134/1416 1116/1386 18/30 180/284 1028/1247 88/139  0.7729 0.8009 0.8052 0.6000 0.6338 0.8244 0.6331  0.7335  1386/1700  0.8153  28  4.4.4 Nonpolar and Bipolar Words and the Robust Lexicon  The next iteration of failure analysis reveals nonpolar and bipolar words in this combined lexicon. Polarity of words across their different senses varies and so given the context, some of our lexicon words are either used in a way that does not carry the labeled polarity or indeed carries the opposite polarity. Examples of these are as follows:  •  The main(+) problem with the Nomad Jukebox Zen Xtra 30GB is the software.  •  It looks very cool(-), and seems quite small(-) to me and very light.  In the first sentence main is neutral instead of positive, and in the second sentence both cool and small are positive instead of negative. We subjectively remove words from the subset found in the corpus that are deemed nonpolar or bipolar, but without direct reference to the failures so as to partially but not exhaustively prune the combined lexicon in a subjective but domain-independent fashion. This robust lexicon significantly improves performance to 0.835. The complete results are shown in Tables 13 and 14.  Table 13: Robust Lexicon myHX Accuracy myhx -jnrv Product Apex player 228/341 0.6686 Canon camera 200/236 0.8475 Creative mp3 510/706 0.7224 Nikon camera 135/159 0.8491 Nokia cell 2 19/258 0.8488 Average W.Average FIL Corpus  1292/1700 1291/1700  0.7873 0.7600 0.7594  myhx -jnrvdg  myhx -jnrvdge  245/341 209/236 570/706 140/159 228/258  0.7 185 0.8856 0.8074 0.8805 0.8837  249/341 2 13/236 565/706 137/159 225/258  0.7302 0.9025 0.8003 0.8616 0.872 1  1392/1700 1392/1700  0.8351 0.8188 0.8188  1389/1700 1389/1700  0.8334 0.8171 0.8171  29  Table 14: MyHX Coverage for HL Configuration myhx -jnrv Accuracies Total 1291/1700 0.7594 Nonnull 961/1154 0.8328 Nonzero 912/1069 0.8531 Zero 49/85 0.5765 Null 330/546 0.6044 NonzeroReal NonzeroEff Coverage Nonnull 1 154/1700 0.6788 Nonzero 1069/1700 0.6288 -  Corpus myhx -jnrvdg  myhx -jnrvdge  1392/1700 1028/1154 978/1072 50/82 364/546  1389/1700 1029/1154 1021/1143 8/1 1 360/546 978/1072 43/71  0.8171 0.8917 0.8933 0.7273  1143/1700  0.6724  1072/1700  0.8188 0.8908 0.9123 0.6098 0.6667  0.6306  0.6593 0.9123 0.605 6  4.4.5 The Augmented Lexicon  Failure analysis then reveals a number of remaining errors due to missing scored words. The following examples show words that are missing from the robust lexicon that are needed to predict the sentence’s opinion. They are marked with the needed polarity.  •  The included lens cap is very loose(-) on the camera.  •  The cool(÷) thing about the ad-2600 is that itplàys a lot of differentfile types.  
We finally top-up the lexicon with these domain-dependent words in order to reduce our error set to those errors of the more rare but more sophisticated contextual type. Accuracies improve accordingly and are reported in Tables 15 and 16. We perform the t test between  the Combined and Augmented lexicon and obtain a p-value of 0.027,  showing another statistically significant increase in accuracy.  30  Table 15: Augmented Lexicon myHX+ Accuracy myhx+ -jnrv myhx+ -jnrvdg Product 235/341 Apex player 0.6892 255/341 0.7478 Canon camera 207/236 0.8771 214/236 0.9068 Creative mp3 521/706 0.7380 575/706 0.8145 Nikon camera 136/159 0.8554 141/159 0.8868 218/258 Nokia cell 0.8450 227/258 0.8798 Average W. Average HL Comus  13 17/1700 13 16/1700  0.8009 0.7747 0.7741  1412/ 1700 14 12/1700  0.8471 0. 83 06 0. 83 06  Table 16: MyHX+ Coverage for HE Corpus Configuration myhx+ -jnrv myhx+ -jnrvdg Accuracies Total 1316/1700 0.7741 1412/1700 0.8306 Nonnull 1019/1211 0.8415 1088/1211 0.8984 Nonzero 971/1128 0.8608 1039/1130 0.9195 Zero 48/83 0.5783 49/81 0.6049 Null 297/489 0.6074 324/489 0.6626 NonzeroReal NonzeroEff Coverage Nonnull 1211/1700 0.7865 Nonzero 1128/1700 0.7865 1130/1700 0.7865  myhx+ -jnrvdge 259/341 218/236 571/706 138/159 226/258  0.7595 0.9237 0.8088 0.8679 0.8760  1412/ 1700 1412/1700  0.8472 0. 8306 0. 8306  -  myhx+ -jnrvdge 1412/1700 1092/1211 1084/1201 8/10 320/489 1039/1130 45/71  0.8306 0.9017 0.9026 0.8000 0.6544 0.9195 0.6338  1201/1700  0.7865  4.5 Contextual Valence Shifters The next stage involves the study of the effectiveness of the valence shifting operations performed by the Hu-Liu algorithm [HLO4], an implementation of some of the shifters suggested by Polanyi and Zaenen [PZO4], and some newly classified valence shifters. Hu-Liu applies polarity shifting logic for negation and handles but-conjunction in a special way. Polanyi and Zaenen suggest the shifting capacity of modals and presupposition. Finally, we discover cases where contrastives, adverbs of excess and sufficiency, and hedging can shift polarity. We also note that the conjunction but is  31  common in reviews but we discover that its usage is complex and varied, and its other usages are more frequent than its negating behavior.  We experiment with various parameters to the shifters. One is the subset of the sentence that is affected, including attention to stop-words that form syntactic boundaries. Another is the operational word sets that are associated with each shifter context. Accuracies improve sequentially as we tune each shifter and apply them to our corpus with our various lexicons.  We begin with refining the negation shifter. By varying the keyword set and the window of activity, we tune these parameters to optimal values. Including the word no helps now with nouns present and a window of six words instead of Hu and Liu’s five optimizes the results. These improvements are summarized in Table 17.  Table 17: MyHX+ with Negation Tuning Configuration myhx+ -jnrv myhx+ -jnrvdg Accuracies Total 1316/1700 0.7741 1425/1700 0.8382 Coverage Nonnull 1211/1700 0.7124 Nonzero 1128/1700 0.6635 1128/1700 0.6635  myhx+ -jnrvdge 1426/1700  0.8388  1201/1700  0.70647  Analyzing the failures from the tuned augmented lexicon experiment, we note the presence of various polarity shifting constructs. One of these is where a comparison is made between the main subject of a sentence and another subject for the sake of contrast. 
Analyzing the failures from the tuned augmented lexicon experiment, we note the presence of various polarity shifting constructs. One of these is where a comparison is made between the main subject of a sentence and another subject for the sake of contrast. Polar words used in the auxiliary clause are usually of opposite polarity to the opinion being expressed toward the main subject. These clauses are often marked with a contrastive keyword like although or despite, so we develop a shifter pattern called a contrastive to adjust sentence polarity in the presence of such clauses. The following examples benefit from modeling the contrastive shifter by inverting the polarity of the lexicon words within the contrastive clause.

• Despite(c) this minor disappointment(-), I highly recommend the Canon G3 to anyone who is serious about digital photography.
• Although(c) I find it more convenient(+) to use 1-touch dialing, this phone does not have voice dialing.

Another polarity shifting construct is found in the use of modality. This is where an evaluation of a subject is suggested if some other condition had been true, thus conveying the opposite opinion in that condition's absence. These constructs are marked with modal words like would and should, and most commonly shift opinion when followed by be or have been. Predicting the polarity of the following sentences fails without modeling the inverting function of modal phrases.

• A spare battery would have been(m) great(+).
• At the very least, a sturdier more protective carrying case would be(m) nice(+).

The lexicon words are labeled with their polarity, which needs to be inverted by the modifying modal phrase.

Sentence polarity can also be shifted by the use of presupposition. This occurs when a polar word is modified by a word that presupposes an expectation that was not met. The following are examples that require an inversion of polarity based on the presence of presuppositional terms.

• It's way less(p) expensive(-) than the iPod.
• The front panel refused(p) to clip in correctly(+), leaving a noticeable gap between the panel and base of the player.
• Forget(p) about the sleek(+) looks if it can't play some of your real dvds.
• The remote is a little hard to(p) understand(+).

We implemented shifter functionality in our algorithm for each of these constructs, and the results are summarized in Tables 18 and 19. We see an increase in accuracy for each construct individually and when used in conjunction with one another (accuracy is 0.868). We also note that although effective opinion tie breaking improves coverage, it does not improve accuracy in these cases. Performing the two-tailed paired t-test on the results of the tuned Augmented lexicon with and without contextual valence shifters yields a p-value of 0.028. This shows another statistically significant increase in accuracy.

Table 18: Augmented Lexicon MyHX+ with Shifters Accuracy
Product         myhx+ -jnrvdgcmp     myhx+ -jnrvdgcmpe
Apex player     266/341   0.7801     268/341   0.7859
Canon camera    215/236   0.9110     218/236   0.9237
Creative mp3    593/706   0.8399     589/706   0.8343
Nikon camera    142/159   0.8931     140/159   0.8805
Nokia cell      236/258   0.9147     231/258   0.8954
Average         0.8678               0.8642
W.Average       1452/1700  0.8541    1446/1700  0.8506
HL Corpus       1452/1700  0.8541    1446/1700  0.8506

Table 19: MyHX+ with Shifters Coverage for HL Corpus
Table 20: HX and myHX with Shifters Coverage for HL Corpus
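The three clause-level shifters above share the same inversion mechanism as negation and differ only in their trigger words and in the span they govern. The following sketch is illustrative only; the word lists and span rules are examples, not the exact sets used in the experiments.

    # Sketch of the clause-level shifters: each trigger inverts the lexicon
    # polarities of the words in the span it governs.
    my @CONTRASTIVE = qw(although though despite);                 # invert within the marked clause
    my @MODAL       = ('would be', 'would have been',
                       'should be', 'should have been');           # invert what follows the modal phrase
    my @PRESUPP     = qw(less refused forget);                      # presupposed expectation not met

    sub invert_span {
        my ($polarities, $from, $to) = @_;
        for my $j ($from .. $to) {
            $polarities->[$j] = -$polarities->[$j] if $polarities->[$j];
        }
    }

    # e.g. a contrastive clause runs from its keyword to the next comma:
    # "Despite this minor disappointment(-), I highly recommend ..." --
    # invert_span() is applied to the tokens between "despite" and the comma,
    # so disappointment(-) no longer counts against the overall sentence.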
We note that accuracies are increasingly improved by the shifters as the lexicon becomes more refined.

There are a number of other kinds of contextual valence shifters that we have discovered and experimented with in the Hu-Liu corpus. One is where the conjunction but, or its synonyms except and however, is used to contrast two subjects. When they are used for this function, they are similar to the class of contrastive shifters we identified, but they follow rather than precede the polar word they are shifting. We found that although these shifters are responsible for a number of failures, the negating function of these conjunctions is only one of their uses. They are even more often used in ways that do not directly shift the polarity of the sentence away from the preceding polar words. Thus they are not by themselves effective shifters in the Hu-Liu corpus, producing more false positives than successful polarity shifts. These sentences are examples of but-conjunctions acting as negating shifters, where the annotated sentence polarity is given at the beginning of the line:

• (-) Nice(+) machines, but(b) I consider their quality pretty low now.
• (+) I was hesitant(-) given the price, but(b) I've been extremely impressed since receiving it.

These sentences are examples where but is not shifting the sentence's polarity and would result in a failed prediction:

• (+) Basic usage is easy(+), but(b) the remote has a lot of buttons that I haven't used.
• (-) A little weighty(-) (9 ounces... no biggie), but(b) otherwise fine.

Another shifter construct responsible for failures is the use of a class of adverbs known as adverbs of excess and sufficiency. These are words like too and enough that express an excess or sufficiency of a quality of a subject that is described by a polar word. They are similar to our presupposition shifter, but rather than negating, these words impose a polarity themselves and remove the otherwise polar content of the polar words they modify. Here is an example containing both kinds of adverbs along with negation:

• (-) The scroll button is overly(x) sensitive(+) at times; not(g) sensitive(+) enough(x) at others.

The word overly imposes a negative polarity while enough imposes a positive polarity. Resolving precedence in the presence of other negating shifters, however, is beyond the scope of our current system, and as this feature is sparse in our data, we omit this shifter presently. It is however another construct deserving further investigation.
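Although this shifter is omitted from the reported runs, a first-pass implementation along the lines of the other shifters might look like the following sketch. It is speculative: the word list, window, and precedence behaviour are illustrative assumptions, not tested settings.

    # Sketch of a possible excess/sufficiency shifter (not used in the reported
    # experiments): the adverb imposes its own polarity on the polar word it
    # modifies instead of inverting it.
    my %EXCESS_POLARITY = (
        'too'    => -1,    # excess of a quality reads as negative
        'overly' => -1,
        'enough' => +1,    # sufficiency reads as positive
    );

    sub apply_excess {
        my ($words, $polarities) = @_;
        for my $i (0 .. $#$words) {
            my ($stem) = split /:/, $words->[$i];
            next unless exists $EXCESS_POLARITY{lc $stem};
            # override the nearest polar neighbour: preceding for "enough",
            # following for "too"/"overly"; precedence against negation is
            # the open problem noted above.
            my $j = (lc $stem eq 'enough') ? $i - 1 : $i + 1;
            next if $j < 0 or $j > $#$words;
            $polarities->[$j] = $EXCESS_POLARITY{lc $stem};
        }
    }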
There are also a number of failures due to the construct known as hedging. This is where the impact of an evaluation is reduced by also making a contrary statement with slightly less evaluative force. This can be noted in the following examples.

• (-) It doesn't have firewire, not(g) a real complaint(-) since most windows users don't generally have firewire cards themselves.
• (-) It could be a little bit bigger, but it's easy(+) to get used to.

Further experiments measuring strength of opinion in addition to direction could benefit from modeling this construct.

4.6 Remaining Language Features Affecting Polarity

Finally we are left with a number of classifiable errors and anomalies. These include language features such as irony and idiom, as well as complex usage and multiple opinions. The GI lexicon has a tag for idioms, and this could form the beginnings of an idiom database that could be used as a component in the polarity classification algorithm. The following are some examples:

• (-) 3/4 of the way through the first disk we played on it (naturally(+) 31 days after purchase) the dvd player froze.
• (+) I bought it for my trip to Buenos Aires, and also used it at the Iguazu Falls, and could not(g) have asked for more perfect(+) performance!
• (-) No games - it has a cool(+) screen - why not use it?
• (+) The small size is perfect(+) for my little hands, but may perhaps be uncomfortable(-) or awkward(-) for a bigger person.

The first sentence has an ironic use of a positive word; the second an idiomatic use of negation that doesn't invert polarity. The third uses a complex structure that acts like an unmarked contrastive. The last sentence expresses multiple conflicting opinions.

4.7 Default Scoring Algorithms

The default score used by Hu-Liu is the previous sentence's score, for a corpus consisting of multi-sentence reviews. To improve the cases where the default score is used, we explore combining our algorithm with the collocation-based log-likelihood algorithm of Yu-Hatzivassiloglou [YH03], which generates real-number scores. This proves to be ineffective for the highly refined augmented lexicon, since the previous sentence default accuracy increases as the nonzero accuracy increases, and this default outperforms the relatively noisy collocation scoring in this case. Nevertheless the collocation algorithm provides a better than random default scoring method for corpora of isolated sentences, where the previous sentence default does not apply.
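The default scoring options discussed in this section can be summarized as a simple cascade. The sketch below illustrates the order of preference only; it is not the exact code, and the tie-breaking details differ per corpus.

    # Sketch: choose a default score when a sentence's lexicon score is null
    # (no lexicon words) or zero (contributions cancel).
    sub default_score {
        my ($prev_score, $colloc_score) = @_;
        # 1. multi-sentence reviews: carry over the previous sentence's score
        return $prev_score   if defined $prev_score   and $prev_score   != 0;
        # 2. otherwise, a real-valued collocation (log-likelihood) score, if any
        return $colloc_score if defined $colloc_score and $colloc_score != 0;
        # 3. last resort: the most frequent class (here arbitrarily positive)
        return 1;
    }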
4.8 Application to Another Domain: Movie Reviews

We next apply our algorithm to another, larger annotated sentence-level polarity test corpus (that of Pang et al. [P+02]) in the new domain of movie reviews. Improvements are shown for most of our developed lexicons and discovered shifter features. We note that there are some important differences between the Hu-Liu corpus and that of Pang et al. One is that since the sentences are all isolated movie review summaries and not sequential sentences about the same subject, we cannot use the previous sentence's score as our default score. Also, since the sentence polarity is not focused around particular feature words, we cannot use something like the effective opinion tie breaking scheme of Hu and Liu. Since there are an equal number of positive and negative sentences in the Pang corpus, the best default score we can give is that of the most frequent class, resulting in 50% accuracy in these cases.

We begin with the Hu-Liu lexicon with WordNet expansion. Results are shown in Table 21.

Table 21: Hu Liu Lexicon with WordNet on Pang Corpus
Configuration   hulexWN -jd           hulexWN -j
Accuracies
Total           5746/10662  0.5389    5771/10662  0.5413
Nonnull         4266/7702   0.5539    4291/7702   0.5571
Nonzero         3635/6440   0.5644    3649/6419   0.5685
Zero            631/1262    0.5000    642/1283    0.5004
Null            1480/2960   0.5000    1480/2960   0.5000
Coverage
Nonnull         7702/10662  0.7224    -
Nonzero         6440/10662  0.6040    6419/10662  0.6020

We then try the HM lexicon with and without WordNet expansion. This dramatically improves accuracy from 0.541 to 0.613. Interestingly, WordNet expansion improves both accuracy (0.602 to 0.613) and coverage (0.621 to 0.792) in this corpus. These results are found in Table 22.

Table 22: HM Lexicon with and without WordNet on Pang Corpus
Table 23: Part of Speech Analysis in Combined and Robust Lexicons on Pang Corpus
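The WordNet expansion used here is the seed-propagation loop implemented in 2-wn-extract-adj.pl and 3-wn-orient-pred.pl in the appendix. A condensed sketch of the idea follows; it omits the sense-limiting and "similar to"/"see also" options, and the helper subroutines merely wrap the querySense/queryWord calls shown in the appendix scripts.

    use WordNet::QueryData;
    my $wn = WordNet::QueryData->new;

    # Repeatedly give unscored adjectives the polarity of a scored synonym, or
    # the opposite polarity of a scored antonym, until the seed set stops growing.
    sub expand_seeds {
        my ($seeds, $candidates) = @_;   # hash refs: word => +/-1, word => undef
        my $grew = 1;
        while ($grew) {
            $grew = 0;
            CAND: foreach my $w (keys %$candidates) {
                foreach my $syn (synonyms($w)) {
                    if (exists $seeds->{$syn}) {
                        $seeds->{$w} = $seeds->{$syn};
                        delete $candidates->{$w}; $grew = 1; next CAND;
                    }
                }
                foreach my $ant (antonyms($w)) {
                    if (exists $seeds->{$ant}) {
                        $seeds->{$w} = -$seeds->{$ant};
                        delete $candidates->{$w}; $grew = 1; next CAND;
                    }
                }
            }
        }
    }

    sub synonyms {
        my ($adj) = @_;                  # adjective stem, e.g. "good"
        my %out;
        foreach my $sense ($wn->querySense("$adj#a")) {
            foreach my $s ($wn->querySense($sense, "syns")) {
                my ($w) = split /#/, $s;
                $out{$w} = 1;
            }
        }
        return keys %out;
    }

    sub antonyms {
        my ($adj) = @_;
        my %out;
        foreach my $sense ($wn->querySense("$adj#a")) {
            foreach my $a ($wn->queryWord($sense, "ants")) {
                my ($w) = split /#/, $a;
                $out{$w} = 1;
            }
        }
        return keys %out;
    }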
We then examine the negation shifter using the robust lexicon, and these results are reported in Table 24.

Table 24: Robust Lexicon with Negation Shifter
Configuration   myhx -jdg             myhx -jnrvdg
Accuracies
Total           6526/10662  0.6121    6664/10662  0.6250
Nonnull         4684/6977   0.6714    5794/8923   0.6493
Nonzero         4221/6051   0.6976    5071/7478   0.6781
Zero            463/926     0.5000    723/1445    0.5004
Null            1842/3685   0.4999    870/1739    0.5003
Coverage
Nonnull         6977/10662  0.6544    8923/10662  0.8369
Nonzero         6051/10662  0.5675    7478/10662  0.7014

We move on to the augmented lexicon and find that it extends to some degree to this new domain, showing further improved accuracies overall. Likewise, we find that the rest of our shifters improve accuracies in isolation and in conjunction. Since WordNet was found to be effective in this corpus, we expand the augmented lexicon and obtain still more accurate results. These results are found in Tables 25, 26, and 27.

Table 25: Augmented Lexicon on Pang
Configuration   myhx+ -jnrv           myhx+ -jnrvdg
Accuracies
Total           6611/10662  0.6201    6668/10662  0.6254
Nonnull         5757/8955   0.6429    5814/8955   0.6493
Nonzero         5065/7570   0.6691    5091/7509   0.6780
Zero            692/1385    0.4996    723/1446    0.5000
Null            854/1707    0.5003    854/1707    0.5003
Coverage
Nonnull         8955/10662  0.8399    -
Nonzero         7570/10662  0.7100    7509/10662  0.7043

Tables 26 and 27: remaining shifter and WordNet-expanded configurations on the Pang corpus.
We have found that our accuracy increased with each of our various improvements. It went from 0.54 to 0.61 with the change from the simple seed set expanded by WordNet to the intelligent lexicon. Accuracy further increased to 0.63 with the addition of the other parts of speech to the lexicon, along with some refinements. It rose to 0.64 with the application of the other shifters along with WordNet expansion. Finally it reached 0.66 with the application of the log-likelihood expansion.

4.9 Discussion of Results

In this section we summarize the main steps that led to improvements in the algorithm. We also discuss approaches that failed to improve accuracy, such as WordNet expansion for the Hu-Liu corpus, effective opinion tie breaking in some cases, and but-conjunction for effective opinion in some cases.

Through the process of failure analysis, we iteratively discover the next most important feature to model after each subsequent improvement. Since each step involves dealing with the next most common source of failure, we make the largest jumps in accuracy first, and then find more and more subtle refinements. The first important finding was that each of the pieces of the original Hu-Liu algorithm was quite sensitive to the initial seed set and the full lexicon used to score the sentences. The WordNet expansion algorithm as it was described added a lot of noise to the polarity word set and hence introduced a lot of errors. Also, the effective opinion tie breaking was inconsistent, as it helped in some product domains and not others, and overall was often less accurate than the previous sentence default.

Moving to the intelligent HM lexicon reduced the noise in the data and increased accuracy, and led to the next important finding. This was the validation that nouns, adverbs and verbs are important carriers of polarity in addition to adjectives. Adding those words from the GI lexicon increased coverage and accuracy, and after pruning the list and adding to it in the next two phases of analysis, we were able to expose a range of more subtle contextual constructs used in the expression of opinions. There are a number of contextual valence shifters used to modify the polarity inherent in the lexical items of a sentence. A number of these types of shifters, such as negation, contrastives, modals, presuppositions, and but-conjunctions, invert the polarity of the lexical items they operate on, sometimes inverting the same word's polarity multiple times when they are present together. There are still other shifters, like adverbs of excess and sufficiency and hedging, that force polarity one way or the other, or reduce the strength of lexical polarity. For each of our main improvements to the algorithm, including lexicon refinement, parts of speech expansion, and contextual valence shifter adjustment, we were able to show a statistically significant increase in accuracy.

Like the Hu corpus, the Pang corpus shows increasing accuracy of prediction as we improve the lexicon and apply contextual valence shifters. There are however notable differences. One example is the relative domain dependence of lexicon words. A word like "complex" is generally negative for consumer products like cameras and dvd players but positive for movie plots and themes. This suggests polar words exist on a scale of broadness of application, where words like "annoying" and "enjoyable" are more universally applicable, while words like "complex" and "simple" have domain dependent polarity.

Because there are always cases where the predicted score is null or zero, default scoring algorithms are important. Effective opinion tie breaking is possible when there are feature words in the corpus but is found to be only marginally effective. Using the score of the previous sentence works well when the corpus consists of paragraphs of evaluative text rather than isolated sentences, and is best with a refined intelligent lexicon. Default scoring using log-likelihood lexical scores for non-seed words, based on their collocation frequency with seed words, is the next best algorithm and the only option for isolated sentence corpora like the Pang corpus.
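Such collocation-based scores for non-seed words can be produced from the counts gathered by the appendix scripts (2-log-lex-count.pl and 3-log-word-stats.pl). The formula below is only an illustrative approximation of the idea; the exact smoothing and normalization follow Yu and Hatzivassiloglou [YH03] and are implemented in 4-log-word-scores.pl.

    # Illustrative approximation: score a non-seed word by comparing how often
    # it co-occurs with positive vs. negative seed words (the pcc/ncc counts of
    # 3-log-word-stats.pl), normalized by the overall seed frequencies.
    sub colloc_score {
        my ($pcc, $ncc, $total_pos, $total_neg) = @_;
        my $eps = 0.5;                      # smoothing for zero counts
        return log( (($pcc + $eps) / $total_pos)
                  / (($ncc + $eps) / $total_neg) );
    }
    # Positive values suggest positive polarity, negative values negative;
    # the magnitude can serve as a confidence when used as a default score.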
5 Conclusions and Further Work

Understanding evaluative language is an important task in natural language processing with widespread applications. We have focused on the task of sentence level opinion classification in the hope of discovering and classifying the basic building blocks of evaluative discourse. Starting with a collection of existing algorithms, we have built a flexible framework capable of classifying opinions with accuracy exceeding current state-of-the-art results. We have discovered many of the important factors in performing opinion classification, including the parts of speech that carry polarity, the range of polarity attached to the different uses and senses of words, and some of the linguistic constructs that shift the polarity contribution that words make to a sentence.

There are a number of other algorithms and ideas suggested in the literature that were not applied to sentence level polarity prediction but that we feel could be integrated into our system as improvement tools. One technique is the semantic orientation equation (called SO-A, semantic orientation by association) of Turney and Littman [TL03]. This is based on the idea of pointwise mutual information, and works in a way very analogous to the log-likelihood equation of Yu and Hatzivassiloglou [YH03]. It however uses web search hits with a proximity operator instead of corpus collocation to compute a polarity score. The work was originally done using AltaVista's NEAR operator, but that operator has since become defunct (upon Yahoo's acquisition of AltaVista in 2003). Turney has since developed a Beowulf cluster containing a database of a terabyte of web pages and an associated query language that includes a proximity operator, so SO-A can now be computed using this static approximation of the internet. This could be used as an alternate method for computing lexical polarity scores, or as a method to cross-check or filter out scores computed by other methods.
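For concreteness, the SO-A idea in its PMI form can be sketched roughly as follows. The helpers $hits and $hits_near are hypothetical code references standing in for whatever hit-count and proximity-query interface is available (originally the AltaVista NEAR operator, now Turney's corpus query language), and the paradigm words are examples only.

    # Rough sketch of SO-A via pointwise mutual information (after Turney and
    # Littman [TL03]): a word's orientation is estimated from how strongly it
    # associates with positive vs. negative paradigm words under a proximity query.
    sub so_pmi {
        my ($word, $hits, $hits_near) = @_;   # hypothetical count helpers
        my @pos = qw(good nice excellent);     # example paradigm words
        my @neg = qw(bad nasty poor);
        my ($assoc_p, $assoc_n, $freq_p, $freq_n) = (0.01, 0.01, 0.01, 0.01);
        for my $s (@pos) { $assoc_p += $hits_near->($word, $s); $freq_p += $hits->($s); }
        for my $s (@neg) { $assoc_n += $hits_near->($word, $s); $freq_n += $hits->($s); }
        return log( ($assoc_p * $freq_n) / ($assoc_n * $freq_p) ) / log(2);
    }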
To improve the accuracy produced by the Yu and Hatzivassiloglou log-likelihood collocation algorithm, we would like to try augmenting our corpus with other datasets to increase the amount of collocation that will occur. This should produce a larger number of more accurately scored polarity words.

The work of Riloff and her colleagues [R+03] suggests a method for extracting linguistic patterns automatically. They have used a number of unsupervised machine learning algorithms, such as AutoSlog-TS, MetaBoot, and Basilisk, that implement the technique they call "extraction pattern bootstrapping". They use these algorithms to discover patterns indicative of subjectivity, but we would like to explore the possibility of applying these or similar techniques to discovering patterns to be used in polarity prediction. These might include patterns that carry polarity or ones that shift the polarity of the constituent polar lexical items. We will attempt to classify them based on linguistic phenomena, or possibly use them to propose new linguistic models of evaluative language.

To evaluate the patterns, we will explore the idea of measuring their consistency and strength using metrics of polarity like the log-likelihood and SO-A equations. One idea for doing this is to use the SO-A equation on a series of n-grams constructed by taking a given pattern and substituting in different lexicon words. By doing this we would like to be able to rank the patterns by consistency (how consistently they shift the polarity in the same direction) and strength (how far they shift the polarity).

To optimize the way we use our various lexical and contextual classifiers, we would like to experiment with boosting. Specifying our polarity features as classifiers for AdaBoost [FS96] would allow us to optimize how they are used on our dataset for classification.

We provide a first pass of various shifters in our implementation and prove their mechanics, but the sets of marker words and phrases in each shifter category are subject to expansion. This could be achieved through further rounds of failure analysis on new corpora from other domains.

We would also like to investigate but-conjunction, adverbs of excess and sufficiency, and hedging in more detail. But-conjunction is a common form in evaluative text, so finding ways to distinguish its shifter usage would be very helpful. Modeling adverbs of excess and sufficiency and hedging would add new layers to the system involving precedence and strength that could improve accuracies further. Finally, we would like to investigate more complex rhetorical devices such as idiom and irony. Starting with the General Inquirer, we would like to compile a database of idiomatic patterns that can affect polarity, and add this component to our system.

We have established a framework of tools and algorithms for opinion classification and a methodology for expanding this framework. With further research, the precision of this system will continue to grow, as will its applicability to the task of understanding evaluative text.

References

[BW99] Rebecca Bruce and Janyce Wiebe. 1999. Decomposable modeling in natural language processing. Computational Linguistics, 25(2).

[CV04] Rudy Cilibrasi and Paul Vitanyi. 2004. Automatic meaning discovery using Google. http://xxx.lanl.gov/abs/cs.CL/0412098.

[C95] Paul Cohen. 1995. Empirical Methods for Artificial Intelligence. MIT Press, Cambridge, MA.

[E+00] Gerard Escudero, Lluis Marquez, and German Rigau. 2000. Boosting applied to word sense disambiguation. In Proceedings of the 12th European Conference on Machine Learning, pages 129-141.

[FS96] Yoav Freund and Robert E. Schapire. 1996. Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference, pages 148-156.

[G+03] Andrew Gordon, Abe Kazemzadeh, Anish Nair, and Milena Petrova. 2003.
Recognizing expressions of commonsense psychology in English text. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL-03), pages 208-215.

[HM97] Vasileios Hatzivassiloglou and Kathy McKeown. 1997. Predicting the semantic orientation of adjectives. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL-97), pages 174-181.

[HL04] Hu, M., and Liu, B. 2004. Mining Opinion Features in Customer Reviews. In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI 2004).

[HT00] Susan Hunston and Geoff Thompson (Eds.). 2000. Evaluation in Text: Authorial Stance and the Construction of Discourse. Oxford University Press.

[JM00] Daniel Jurafsky and James H. Martin. 2000. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall.

[M00] Lluis Marquez. 2000. Machine learning and natural language processing. LSI-00-45-R. Departament de Llenguatges i Sistemes Informatics (LSI), Universitat Politecnica de Catalunya (UPC), Barcelona, Spain.

[M03] J. R. Martin. 2003. Introduction. Text 23(2), pages 171-181. Walter de Gruyter.

[M+90] George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine J. Miller. 1990. Introduction to WordNet: An on-line lexical database. International Journal of Lexicography, 3(4):235-244.

[RM98] Rosamund Moon. 1998. Fixed Expressions and Idioms in English: A Corpus-Based Approach. Oxford University Press.

[N00] NLProcessor Text Analysis Toolkit. 2000. http://www.infogistics.com/textanalysis.html

[P+02] Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP-2002), pages 79-86.

[PZ04] Livia Polanyi and Annie Zaenen. 2004. Contextual valence shifters. In Proceedings of the 2004 AAAI Spring Symposium on Exploring Attitude and Affect in Text: Theories and Applications, pages 114-119.

[R+03] Ellen Riloff, Janyce Wiebe, and Theresa Wilson. 2003. Learning subjective nouns using extraction pattern bootstrapping. In Proceedings of the 7th Conference on Natural Language Learning (CoNLL-2003), pages 25-32.

[S02] Robert E. Schapire. 2002. The boosting approach to machine learning: an overview. In Proceedings of the MSRI Workshop on Nonlinear Estimation and Classification, Berkeley, CA.

[S04] Stanford Natural Language Processing Group. 2004. Stanford Tagger. http://nlp.stanford.edu/software/tagger.shtml

[S+66] Philip J. Stone, Dexter C. Dunphy, Marshall S. Smith, and Daniel M. Ogilvie. 1966. The General Inquirer: a computer approach to content analysis. M.I.T. studies in comparative politics. MIT Press, Cambridge, MA.

[TH00] Geoff Thompson and Susan Hunston. 2000. Evaluation: An Introduction. In Susan Hunston and Geoff Thompson (Eds.), Evaluation in Text: Authorial Stance and the Construction of Discourse, pages 1-27. Oxford University Press.

[TL03] P. Turney and M. Littman. 2003. Measuring praise and criticism: Inference of semantic orientation from association. ACM Transactions on Information Systems (TOIS), 21(4):315-346.

[W00] Janyce Wiebe. 2000. Learning subjective adjectives from corpora. In Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI-2000), pages 735-740.

[W+01] Janyce Wiebe, Rebecca Bruce, Matthew Bell, Melanie Martin, and Theresa Wilson. 2001.
A corpus study of evaluative and speculative language. In Proceedings of the 2nd ACL SIGdial Workshop on Discourse and Dialogue.

[W+99] J. Wiebe, R. Bruce, and T. O'Hara. 1999. Development and use of a gold standard data set for subjectivity classifications. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL-99), pages 246-253, University of Maryland.

[WW03] Theresa Wilson and Janyce Wiebe. 2003. Annotating opinions in the world press. In Proceedings of the 4th ACL SIGdial Workshop on Discourse and Dialogue (SIGdial-03), pages 13-22.

[YN04] J. Yi, T. Nasukawa, R. Bunescu, and W. Niblack. 2003. Sentiment analyzer: Extracting sentiments about a given topic using natural language processing techniques. In Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM-2003).

[YH03] Hong Yu and Vasileios Hatzivassiloglou. 2003. Towards answering opinion questions: Separating facts from opinions and identifying the polarity of opinion sentences. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP-2003), pages 129-136.

Appendix - Project Code

These scripts correspond to the framework presented in Chapter 3. There is one preprocessing script for the NLProcessor (0-hu-preproc.pl) and there are two post-processing scripts for the Stanford Tagger (0-pang-prepend-stanford.pl followed by 0-post-stanford.pl). The tagged xml corpus is then formatted with 1-proc-corp.pl. At this point there is the option to expand the lexicon using the WordNet algorithm (scripts wn-init.pl, 2-wn-extract-adj.pl, and 3-wn-orient-pred.pl) or the log-likelihood algorithm (scripts 2-log-lex-count.pl, 3-log-word-stats.pl, and 4-log-word-scores.pl). Finally, the formatted corpus is scored using 5-score-corp.pl (possibly across multiple corpora with run-5-score-corp.pl). Further usage notes are found in the script comments.

#! /home/bin/perl
#123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789
###filename: 0-hu-preproc.pl
###author: Adam Longton, longton@cs.ubc.ca
#*****************************************************************
#description, inputs and output:
#takes a corpus file from the 'customer review data' of Hu and Liu '04.
#converts from native format to xml markup, for input to the NLProcessor
#pos-tagger and chunker.
#*****************************************************************

use strict;
use warnings;

my $usage = "usage: perl $0 <corpusfile>\n";
die $usage if ($#ARGV < 0);
my $line;

#get input file
my $corpfile = shift;
open(CORP, $corpfile) or die "file error: $usage";

print "<FILE>\n";
while ($line = <CORP>){
    chomp $line;
    #clean up ampersands and less-thans to be valid xml input
    #$line =~ s/&([^\#])/&\#38;$1/g;   #--don't need cuz not in corpus
    #$line =~ s/</&\#60;/g;
    if ( $line =~ /(.*)\#\#(.+)/ ){            #a sentence
        print "<S";
        if ($1 eq ""){ print ">$2</S>\n"; }
        else { print " info=\"$1\">$2</S>\n"; }
    }
    elsif ( $line =~ /^\[t\] ?(.*)/ ){         #a review title
        print "<TITLE>$1</TITLE>\n";
    }
    #else do nothing, just drop it
}
print "</FILE>\n";
close(CORP);
###END

#!
/home/bin/perl #123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ###filename: 0-pang-prepend-stanford.pl ###author: Adam Longton, longton@cs.ubc.ca #**** ****************** *** ********** * * *********** *** ****** * * ******  #description, inputs and output: #takes a corpus file from the pangi-lee movie review snippet/sentence polarity #corpus and prepends it with score[+1]## (or score[-ll##) for input into 0-post-stanford #script. #  *****************************************************************  use strict; use warnings; my $usage = “usage: perl $0 <corpusfile>\n”; die $usage if ($#ARGV < 0); my $line; #get input file ny $corpfile = shift; open(CORP, $corpfile) or die “file error: $usage’; while ($line = <CORP>)( chomp $line;  52  #clean up ampersands and less-thans to be valid xml input s/&([’\#))/&\#38;$1/g; #$line *not in corpus #$line =- s/</&\#60;/g; #for pang, rather than xml codas, usa own symbols: $lina =— s/\</LT/g; $line =— s/\>/GT/g; #could also remove non IJTF-8 chars here by char ascii # limit maybe print “scoret+l]##$line\n”; # for positiva sentences #print “score[-l]##$line\n”; #for negative setences  close(CORP); ##END  #! /home/bin/perl #123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ###filename: 0-post-stamford.pl ###author: Adam Longton, lomgton@cs.ubc.ca ***************** * ******** * * * ***************** **** * **************  #description, inputs and output: #takes a corpus file in the format of the ‘customer review data’ of Hu and Liu ‘04 that has been #stanford pos-taggad. #converts from native plus stanford pos-tagged with ‘/‘ separators to #xml markup (with ‘j separators), for input to phase 1 script.  use strict; use warnings; my $usage = “usage: pen $0 <corpusfile>\n”; die Susage if ($#ARGV < 0); my Slime; #get input file my $corpfila = shift; opan(CORP, $corpfila) or die “file error: $usage”; #denive corp name from filename: $corpfile =— my $corp = $1; my $sid = 0; my Stid = 0; my $info; my $sent; print “<FILE fid=\”$corp\”>\n”; while ($line = <CORP>)( chomp Slime; #don’t clean up ampersands and less-thsns to be valid xml input #cuz already done in preproc #Sline =— s/&([\#])/&\#38;$l/g; --not in corpus #$line =— sI</&\#60;/g; if  #a sentence ( $line = /(.*)\#\#(.÷)/ ){ $info = $1; $sent = $2; $sid++; #switch / to (note that it assumes lines end w/ a space) $sent =— s/\/([’\/1+) /_$l /g; —  primt “<S sid=\”$corp:$sid\””; if ($info eq ““){print “>$sent</S>\n”;} info=\”$imfo\”>$sant</S>\m”; else (print elsif ( $lime =— /A\[t\] ?).*)/ )( #a review title $tid++; print “<TITLE tid=\”$corp:$tid\”>$l</TITLE>\m”;  53  #else do nothing, just drop it else( print STDERR “$line\n”;  print “</FILE>\n”; close (CORP); ##END  #! /home/bin/perl #123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ###filename: 1-proc-corp.pl ###author: Adam Longton, longton@cs.ubc.ca **********  ***  #description,  * * *  inputs and output:  #this script is phase 1 of a multi-phase architecture that implements #the sentence-level sentiment-polarity calculating scheme of #[Hu and Liu 20041 #input: a corpus of POS-taged #hu and liu’s ‘customer review #0-hu-preproc.pl to convert it #by NLProcessor. (post proc’d #doesnt meed be done again.(  semtences. more precisely, a file from data’ corpus 2004, preprocessed by into XML, and pos-taggimg being dome for sent ids too i think, only so postag  #the command to pos-tag it looks like,  for the apex.xml corpus,  bin/nlp.sh -q P.*/FILr -qs .*/Su #cat .7.. 
/hu-corpus/apex.xml .*/W -p “\{#U_\(C\}” #-xml_f let -show_tags I bim/sgdelmarkup -q 4 > epex-tagged.xml 4 #the tagset is a modified penn treebank set. this script generalizes them #in this way: N’->N, V->V, J’->J, amd R*_>R umless it’s RP (particle). 4 #output: to standard output, each lime is a clause, of form: #<ann>, 0,0 (,<stem>: <pos>) * .  #that is, a polarity annotation <amm> which cam have the values #+ for only positive, - for only negative, and m for mixed opimioms, or #else a 0 for unammotated amd presumably neutral opinion, this assumptiom #is mot always true tho because they are subject targeted.. if a positive #semtimemt is being expressed but not toward amy particular feature them #it might be lablled neutral but positive words will want to score it #positively. am example is the sentence ‘awesome_JJ !_.‘ foumd #im the corpus. 4 we also have two more fields: semtence id and they are thrown into subfields of field 0 sep’d by semicolons. #graded score. #other fields continued: #2 intially zero slots for pstv and ngtv #lexicom-word counts (to be filled in later if using log-likelihood), and the word:pos pairs #for that clause. #There are option-flags for choosing which poe’s you want to filter for. 4-n -v -j -r for noun verb adjective adverb, respectively. #example: 4 #input line from cat.xml: 4<8 sid=”cat:23’ info=”cat[+21, dog[-i-3]”>The_BT cat_MN and_cc #especially_RB the_DT dog_MN are_yB very_RB very_RB good_JJ 4 #command-line execution: #>l-hu-proc-corp.pl -vjr cat.xml  ..  </8>  54  #output line: #cat:23;+;2.5,O,O,especially:R,are:V,very:R,very:R,good:J use strict; use warnings; my Susage = “usage: perl $0 <xmlcorpusfile>\n”; die Susage if (5#ARGV < 0); #process imfile my Scorpfile = shift; open(CORP, Scorpfile) or die file error: Susage”; #variables my Slime; my @limeArr; my $sent; my @outArr; my Sword; my Spos; my Spair; my $imfo; my @infoArr; my @annArr; my $sum; my Sm; # eff opinion - need the info field for ph5 scoring my @infoOutArr; my $number = 0; my Sroumded = 0; #proc file lime im lime out.. while($lime = <CORP>)( /<g[”>]* info=\([A\n>]+)\fl/ )( #if an ammotated semt, if( Slime Simfo = $1; -  @outArr= (0, 0, 0) @annArr= (0, 0, 0) #process bimary annotation if( Simfo =— /\R÷/ (C if( Simfo =— /\[—/ )( #mixed $anmArr[l]=’m’; else{ SemnArr[lJ=’+’;  ) #positive  elsif( Sinfo =— /\[—/ )( #megative $ammArr[l]=’-’; # other option is to take sign of graded ann #only proceed for + amd - (ignore m and 0, #cases where mo score provided by hu-liu.. if( 5annArr[l] eq ‘÷‘ SannArr[l] eq ‘-‘ if( Slime =— /<g(”>]*>\s*([’<3*)<\/g>/ $semt = 51;  0 including the two ){  (C  ##do delimiter switching here:*************** Ssent =— s/\,/CM/g; $semt s/\:/CL/g; $sent =— s/\;/SC/g; @limeArr  =  split / /,$semt;  #process other xml attrib (sid) if( Slime =— /<S[>]* sid=\”([”\”>]+(\”[ )( SanmArr{01 = $1;  #also put the info attrib into the output, for eff opinion scoring in pht. # split on commas, remove leading spaces, replace other whitespace with am It umderscore, join with coloms. clean up that occuremce of “;“ in It anomalous “at&#38;t” feature word. 
@imfoOutArr = split /,/,Simfo; for( my Si = 0; Si <= SitimfoOutArr; Si++ ){ #remove leading spaces SimfooutArr[5i] =— Itreplace other whitespace with underscore SinfooutArr[5i] =— #put “&“ ‘s hack in  55  SinfoOutArr[5i] =— s/&#38;/&/g; Sotherwise should remove any remaimimg semicoloms, but ummecessary for huliu corpus my SimfoOutStr = join ‘:‘,@imfoOutArr; SoutArr[2] = “FEATURES=” SimfoOutStr; #process graded annotation usimg imfo attribute @imfoArr = split /\[/,Simfo; #if( Simfo !— /“\[/ )( <--for safety but ummecessary shift @imfoArr; Spop off imitial part of string fall other parts start with a number  U Sm = 0; Ssum = 0; foreach my Schumk ( @imfoArr (( if( substr(Schumk,0,l) eq ‘÷‘ Ssum += substr(Schumk,l,l); 5m++; elsif( substr(Schumk,O,l) eq Ssum -= substr(Schumk,l,l);  if(Sm>O)( Smumber = Ssum/5m; Sroumded = sprimtf(”%.3f”, SamnArr[21 = Sroumded;  )(  ‘-‘  Smumber);  #imsert anmArr imto field zero of outArr SoutArr[O] = join ‘;‘,@amnArr; #process semtemce foreach Spair (@limeArr)( #extract words of valid pos if(Spair =— /({Fj÷)_([.%_]+)/ (f #disallows amd Sword = $1; Spos = $2; if( $pos =- /ANN*/ )( #matches NN[SP(PS)]? #if(SoptsHasht’m’fl( push @outArr, (Sword U —  —.  cases  elsif( Spos =- IVB./ (C #matches VB[DGNPZ]? #if(SoptsHash( ‘v’}( C push @outArr, (Sword “:V”); U elsif( $pos = /JJ*/ ){ #matches JJ[RS]? #if(SoptsHash{’j’fl( push @outArr, (Sword  elsif( Spos =- /‘RB.*/ (C #matches RB[RS]? #if(SoptsHash{’r’}) C push @outArr, (Sword “:R” (; #1 elsif( Spos =— I”RP.*/ (C #matches particle RP push @outArr, (Sword “:P”); else( push @outArr,  (Sword  “:Spos”  if(5#outArr > 2)( ###primt to outfile omly if momempty primt join(’, ,@outArr(;  56  print “\n”;  close (CORP); ###END  #! /home/bin/perl #123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ###filename: wn-init.pl ###author: Adam Longton, longton@cs.ubc.ca # this script inits some required environment variables for 3-wn-orient-pred.pl # paths depend on particular experimental environment ‘setenv WNHOME /cs/public/gemeric/lib/pkg/WordNet-2.l/’; #location of WordNet package ‘setenv PERLSLIB /.autofs/homes/ubccshome/l/longton/adam/proj/wordnet/’; # location of WordNet-QueryData-l.46 (WordNet perl interface) #END  #! /hone/hin/perl #123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ###filename: 2-wn-extract-adj .pl ###author: Adfl Longton, longton@cs.ubc.ca 4 this script is the first of two used to expand an adjective lexicon using wordnet. one extracts 4 the adjectives from the corpus.  #input: a corpus of stemmed and POS-tagged sentences, #of the output of l-proc-corp.pl  this  in the format  #example: #input line from lout.txt: #cat:23;+;2.5,O,O,cats:N,are:V,nice:J,very:R,very:R,good:J,funny:J S #command-line execution: #>2-hu-extract-adj.pl lout.txt > 2out-adjlist.txt #output: #fumny:J #good:J #nice : J use strict; use warnings; my $usage = “usage: perl $0 <corpusfile>\n”; die $usage if ($#ARGV < 0); my $corpfile = shift; open(CORP, Scorpfile) or die “file error: $usage”; #process infile, output an adj-list my Slime; my %adjHash = 0; my @lineArr; my Sword; my $adjCoumt = 0; my $wordCoumt = 0; while )Slime = <CORP>)( chomp $line; @limeArr = split /,/,$line; for)my $i3;$i<$#lineArr;$i++) ( Sword = $lineArr[$i]; $wordcount++; #throw adjectives into a hash  57  #val=count for reference if( Sword =— /\:J! 
(C $adjhash{ $word}÷+; $adjcount++;  close (CORP); my $disCount = scalar keys %adjl-Iash; #print adjs (for input into wordnet script foreach Sword (sort keys %edjHash) print “$word\n”;  (ph3()  print STDERR “Nun Distinct Adjs = $disCount\m”; print STONER “Nun Adjs = Sadjcount\n”; print STONER “Nun Words = $wordcount\n”; ###END  #! /hone/bin/perl #123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ###filenane: 3-wn-orient-pred.pl ###author: Adam Longton, longton@cs.ubc.ca # this script is the second of two used to expand an adjective lexicon using wordnet. this one does the wordnet # expnasion given the adjectives from the corpus.  #input:pstv and ngtv word lexicons and an adj-list #output: an (expanded( wordscore file, #of those fron pstv file given score ÷1, #adj list that were synonyms or antonyns #according to wordnet. in these cases a fan ant the opposite (if +1 then -1 for  (2out(.  this means a list of adjs consisting ngtv -1, and those adjs fron the of known psv and ngv words syn has the same score as its root and eg(.  #exanple: #input: #pv. txt: #good:J #quick:J #nv. txt: #bad:J #2out-adj list.txt: #fast : J #good:J #sl ow : J #nice:J #nonwerd : J #connand-line execution: #>3-wn-orient-pred.pl pv.txt nv.txt 2out-adjlist.txt  >  3out.txt  #output: #bad:J, -l #fast:J, 1 #good:J, 1 #nice:J, 1 #quick:J, 1 #slow:J, -l * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *  4* **setup÷proc options*** **** *** ** * * **** ** ****** ******** * ***** *** *** **** use strict; use warnings;  58  #wordnet stuff use WordNet: :QueryData; my $wn = WordNet: :QueryData->new; my $usage = “usage: pen $0 [-sa#] die $usage if ($#ARGV < 2); my %optsHash = (‘s’,O, ‘a’,O, n ,0); #process options -s : include ‘similar to’ words -a : include ‘see also’ words search only # -#, where # ±5 1-9 # -# is stored as n, where value, #also a count value  <pstvfile> <ngtvfile> <adjfile>\n”; #flags, default all off  senses l-# rather than being just a boolean,  is  my $optsStr = my $opts; my @optsArr; if($#ARGV > 2){ $opts = shift; $optsStr = $opts; $opts = /“-/ or die “file error: $usage”; if($opts eq ‘-‘((die “file error: $usage”;) $opts = substr($opts,l); if($opts = /Vsal-9]/){die “file error: $usage”;} @optsArr = split I/,$opts; foreach my $c (eoptsArr) ( if( $c =- /[l—9]/ )( $optsHash{ n else( $optsHash($c)=l;  #***file  processing  setup* **** ***** * ** * *** **************** * *** ************ **  #process infiles my $pfile = shift; my $nfile = shift; my $adjfile = shift; open(PSTV, $pfile) or die “file error: $usage”; open(NGTV, $nfile) or die “file error: $usage”; open(ADJ, $adjfile) or die “file error: $usage”; # append the outputfiles with the same suffix found in the adjfile (ie which corpus was it) my $corpTag = it) $adjfile =- /2out(—.+)\.txt/ ){ $corpTag = $1; open(SEEDY, “>seedyAdjs$corpTag.txt”) or die “error trying to open filel for writing”; open(ADDED, “>addedAdjs$corpTag.txt”) or die “error open file2 for writing”; open(SCORED,”>scoredAdjs$corpTag.txt”) or die “error open file3 for writing”; open(UNUSED, “>unusedSeeds$corpTag.txt”) or die “error trying to open filel for writing”; # put the command in the added file (which now also has the via words) my $commandAndArgs = “perl $0 $optsStr $pfile $nfile $adjtile”; print ADDED “command: $commandAndArgs\n”; #hashes to hold seed words (pv + nv) and unscored adjs my %seedHash = 0; my %adjHash = 0; #hash to hold adjs found in orig seeds my %seedyAdjs 
= 0; #hash to hold adjs with synonyms in (current) seeds list my %addedAdjs = 0; #mar08 hash to hold which synonym of an added word caused it to be added my %addedViaHash = 0; #aug2- hold unused seeds my %unusedSeeds = 0;  59  #hashes to hold already generated syns and ants arrays for words from adj list my %synHash = 0; my %antHash = 0; #***file processing*********** **************************** ************** my $line; while ($line = <PSTV>)( chomp $line; $seedHash$line}=l; close(PSTV); while ($line = <NGTV>) chomp $line; $seedHash($line}=-l; # make a copy of orig seed hash for compare at end.. my %origSeedHash = %seedHash; close (NGTV); while ($line = <ADJ>)( chomp $line; $adjHash{$line}=undef; close(ADJ); my @adjs = sort keys %adjHash; my $s = scalar adjs; # to see what’s getting cut, what’s getting used print ““, “START adjlist(N=$s): “, join(”, “, adjs),  “\n\n”;  #my step prune seeds from adjs foreach my $seed (keys %seedHash) if( exists($adjHash($seed}) )( delete ($adjHash$seed)); #but add it to holder hash with corrsp orientation $seedyAdjs{$seed} = $seedHash{$seed); —  else( $unusedSeeds($seed}  =  $seedHash($seed);  my $uCount = scalar keys %unusedSeeds; print “\nunused seeds count: $uCount\n°; foreach my $seed (sort keys %unusedSeeds){ print UNUSED “$seed, $unusedSeeds{$seed)\n”;  @adjs = sort keys %adjHash; $s = scaler @adjs; print “‘, “PRUNED (seedless) adjlist(N=$s):  “,  join)”,  “,  @adjs),  “\n\n”;  #HL’ 5 algorithm* * * * ** * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * my $sizel; my $size2; my $iter = 1; do C $sizel = scaler keys %seedHash; oriemtationSearch(\%adjHash, \%seedHash, \%addedAdjs, \%addedViaHash); $size2 = scalar keys %seedHash; print “, “XX($iter)XX sizel=$sizel and size2=$size2\n\n”; $iter-f--I-; while($sizel $size2);  print ““, “\nEND (unused) %adjHash), “\n\m”;  adjlist(N=”  ,  scaler keys %adjHash,  “):  “,  join(”,  “,  sort keys  #***output* *** * ******** * ** ***** **** * * * * * * ******* ** *** * ****** *** ************  my $sCoumt = scalar keys %seedyAdjs; print “\nseedy adj count: $sCount\n”; foreach my $adj (sort keys %seedyAdjs)  60  print SEEDY “$adj , $seedyAdjs($adj )\n”;  my $aCount = scalar keys %addedAdjs; print “\nadded adj count: $aCount\n”; foreach my $adj (sort keys %addedAdjs(( print ADDED “Sadj,$addedAdjs($adj), \t\t\t$addedViaHash{$adj}\n”;  foreach my $adj (keys %seedyAdjs( $addedAdj s { $adj) = $seedyAdj s ($adj };  SaCount = scalar keys %addedAdjs; print “\ntotal scored adj count: $aCount\n\n”; #print output that will be used for scoring: foreach my $adj (sort keys %addedAdjs(( print SCORED “$adj,$addedAdjs($adj3\n”; ###END MAIN section*******************************  * *** * * ******************  S * * *subroutines* * * * * *** * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * #note we prune first adjs that are seeds. then no need to #check for existence, instead, when one is added to seeds, remove it #from adjs. sub orientationsearch{ my ($adjRef, $seedRef, SaddedRef, $addedViaRef( = my @synonyms = ((; my Santonyms = (H my Sword; my $found; foreach my Sadj (keys %($adjReffl{ $found = 0; ((get synonyms ((synonyms = getSynonyms($adj); print ““, “--grabbed synonyms of $adj\n”; ((for each, try to find it in seedlist. if do, then add orig word ((with same polarity, and remove it from adjlist. 
LOOP: foreach my $syn (@synonyns( if( exists(SseedRef->($syn}( (C $seedRef->($adj 3=$seedRef->f$syn}; delete ($adjRef->($adj H; ((and add it to added hash $addedRef->{$adj )=$seedRef->($syn); ((mar08 and add it and its via to the addedVia hash $addedViaRef->(Sadj)=”via syn: $syn”; $found = 1; print ““, “-0-0-0-0-added $adj to seedslist via $sym\n”; last LOOP;  ((if not found yet, check antonyns.. if then, add w/ opp polarity. $found (C if( ((get antonyns ((antonyns = getAntonyns($adj(; print ““, “--grabbed antonyms of $adj\n”; LOOP: foreach my Sent (@antonyns( if( exists($seedRef—>{$ant}( (( $seedRef->($adj)= (-l( * SseedRef->{$ant}; delete($adjRef->{Sadj }(; ((add it to added hash $addedRef->{$adj}= (-l( * $seedRef->($ant}; ((add it and its via to the addedVia hash $addedViaRef->($adj}=”via anto: $ant”; print “, “-0-0-0-0-added $adj to seedslist via $ant\n”; last LOOP;  61  print  ““,  “\n”;  #takes a mysyntax word like good:J, #turns it into a wordnet:querydata word, like good#a. restrict in and out to #non-phrasal (single word) lexemes. #optimization: only generate syn(and ant) lists at most once sub getSynonyms( my ($adj) = #unless we’ve already generated the synonyms on a prey iteration.. unless( exists($synHash($adj}) )( my $word; my $w; my @senses = () my %syns = 0; my @currSyns = 0; my @currSims = 0; my @currAlsos = 0; #convert into wordnet syntax $adj /(t”\:]+)\:J/; $word = $1 .  #generate all synonyms: use synset, ‘similar to’ synsets, #and ‘see also synsets. @senses = $wn->querySense($word); #chop sense list down to ‘n’ values if that option is nonzero if( $optsHash(’n’} ){ if( $#senses >= $optsHash’n’} ) splice @senses, $optsHash{’n}; #chop from n onward  foreach my $sense (@senses)( @currSyns = $wn->querySense($sense, ‘syns”); foreach my $syn (@currSyns) { $syn =- /“(V\4t]+)\4t/; #grab the word before the first ‘#‘ = $1; #if not multiword lex, add it #expect only for multis, but be safe unless( $w = I[_ 1/ )( $syns(”$w:J” }=undef; —  if) $optsHash’s} )( @currSims = $wn->querySense($sense, “sin”); ‘similar to’ synsets:\n”; #print “ foreach my $sim (@currSims){ @currSyns = $wn->querySense($sim, “syns”); #print “ $sim: “, join(”, “, @currSyns), ‘\n”; foreach my $sym (@currSyns) { $syn =— /“([\#]+)\#/; #grab word before first ‘#‘ $w = $1; #if not multiword lex, add it unless ( $w = / [_ ] / ) ( #expect only —‘ but be safe $syns”$w:J”=undef;  if( $optsHash(’a’} )( @currAlsos = $wn->querySense($sense, “also”); #print “ ‘see also’ synsets:\n”; foreach my $also (@currAlsos){ @currSyns = $wn->querySense)$also, “syns”); #print “ $also: “, join(”, “, currSyns), ‘\n’; foreach my $syn (@currSyns) ( $syn = /“([“\#]-‘-)\#/; #grab word before first $w = $1; #if not multiword lex, add it  ‘#‘  62  unless( $w =- I[_ 1/ )( $syns(”$w:J”)=undef;  #expect only  —,  but be safe  #add our new synlist to the adj entry of synHash $synHash($adj)=[sort keys %syns]; synonyms of $adj: “, join)’, “, $synHash($adj}}), print ““, #now we have the synonyms in %syns, return @$syriHash{$adj));  “\n;  return them (the keys)  sub getAntonyms{ my ($adj) = #unless we’ve already generated the synonyms on a prey iteration.. unless( exists($antHash{$adj)) ){ my $word; my $w; my @senses = 0; my %ants = 0; my @currAnts = 0; my @currSyns = 0; #convert into wordnet syntax  I([\]+)\:J/;  $adj $word  =  $1  .  
#gen all senses @senses = $wn->querySense($word); #aug3—chop sense list down to n values if that Option is nonzero if( $optsHash{’n’} )( if( $#senses >= $optsHash{’n’) ){ splice Psenses, $optsHash(’n’}; #chop from n onward  foreach my $sense (senses)( @currAnts = $wn->queryword($sense, ants”) #print “‘antonym synsets:\n”; foreach my $ant (@currAnts) @currSyns = $wn->querySense($ant, “syns”); #print ‘ $ant: “, join(”, “, @currSyns), “\n”; foreach my $syn (@currSyns) { $syn = /“([“\#J+)\#/; #grab the word before the first ‘#‘ $w = $1; #if not multiword lex, add it unless( $w = for multis, but be sate #expect only /[_ 1/ )( $ants{”$w:J’)=undef; —  #add our new synlist to the adj entry of synHash $antHash($adj}=[sort keys %ants]; print ““, “antonyrns of $adj: “, join(’, “, {$antHash($adj)}), #now we have the antonyms in %ants, return @($antHash{$adj } };  “\n;  return them (the keys)  ###END  #! /home/bin/perl #123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ###filename: 2—log-lex-count.pl ###author: Adam Longton, longtoncs.ubc.ca  63  # this script is 1 of 3 that do the log-likelihood lexicon expansion. #*****************************************************************  #description, inputs and Output: #input: a positive lexicon, a negative lexicon, and a corpus of stemmed #and POS-tagged sentences, in the format #of the output of phase 1 script #the pstv and ngtv lexicons are lists of word P05 pairs, one per #line (see eg.) # #there is an option, -n, that will only print lines with at least #one nonzero count. (this presumably reduces the data set by a large #fraction, down to clauses with only first generation collocations #with lexicon words.) 4 #output: to standard output, each line is a clause, of form: #<clause-id>, <PC>, <nc> ( <stem>, <pos>) + #that is, a clause id, pstv lexicon-word count, ngtv lex-word-count, #and the stem/pos pairs for that clause. ,  #example: #say you have these input files: #corpus .txt: #23,0, O,be:V,very:R,very:R, good:J,poor:J,nice:J #24,0, 0,Amy:N,be:V, funny:J #poS .txt: #excellent :J #good:J #nice:J #neg. txt: #poor : J #bad:J #nasty:J # #command-line execution: #>2-log-lex-count.pl -n posv.txt negv.txt corpus.txt #output: #23,2,l,be:V,very:R,very:R,good:J,poor:J,nice:J ******** ** * **** *** *** * ***** ** ***** *****************  use strict; use warnings; my $usage = “usage: perl $0 [-n] <pstvfile> <ngtvfile> <corpustile>\n”; die $usage if ($#ARGV < 2); ###process option my $zl; #default is to print limes with double zero counts my $opts; if($#ARGV > 2){ $opts = shift; $opts = /-n-t-/ or die “file error: $usage”; $z0; #meaning double zero is not permitted #process infiles my $pfile = shift; my $nfile = shift; my $corpfile = shift; open(PSTV, $pfile) or die “file error: $usage”; open(NGTV, $nfile) or die “file error: $usage”; open(CORP, $corpfile) or die “file error: $usage”; #hashes to hold lexicons- store items as strings in sane form as #input, ie “stem:pos” (with the colon), this poe info is needed to #(possibly) distinguish between homonyms of different poe. # #may incorporate sense number too if WSD is done in future... 
my %pHash = 0;  64  my %nHash = 0; my Slime; while (Slime = <PSTV>)( chomp Slime; SpHash(Sline) =umdef; close (PSTV) while (Slime = <NGTV>(( chomp Slime; SmHash(5 lime) =umdef; close (NGTV); #comstruct outlimes on the fly from corpus inlimes my Spc; my Smc; my @lineArr; while (Slime = <CORP>)( #reset current counts Spc=O, Snc=O; chomp Slime; @limeArr = split /,/,Slime; for (my 5i3;5i<5#lineArr;5i++) C Spc++ if( exists(SpHash(SlineArr[5i])) Smc++ if( exists(SnHash(SlineArr[5i)}) SlimeArr[lJ=Spc; #update counts SlimeArr[2)=Snc; Spc>O if(5z Snc>OY( #consider mom-zero option print joim(’, ‘,@lineArr(; primt “\n”;  close (CORP(; ###END  #1 /home/bin/perl #123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ###filemame: 3-log-word-stats.pl ###author: Adam Lomgton, lomgtom@cs.ubc.ca # this script is 2 of 3 that do the log-likelihood lexicon expansiom.  #descriptiom, imputs amd output: #imput: the output of phase 2 (see its comments(. it is a file #of clauses with counts of pstv and mgtv lexicon word occurences. #the options are just like phase 1 they allow you to filter for #specific parts of speech. this is a separate operation from the #filterimg in phase 1. here it is so you cam generate a stats file #om only those pos’s you’re interested in (presumably the ones that you #predict will carry polarity(. -  #output: to standard output each lime is stats of a word, of form: #<stem> : <pos>, <ic>, <pic>, <mic>, <pcc>, <ncc> #where #ic = imstance coumt, the number of occurences of that word in pstv #or ngtv comtexts. a pstv (mgtv( ‘comtext’ is a clause that #has at least ome pstv (mgtv) lexicon word in it. #pic = pstv instance count, number of occurences of that word in pstv #contexts #nic = same as pic, but for ngtv contexts #pcc = pstv collocation count, the number of pstv lexicon words #that co-occur in clauses with the word #ncc = same as pcc, but for mgtv lex words #note for the special case of the same word appearing multiple #tines in a clause, each instance is processed separately, so #imstances and collocations all get multiply counted in these cases -  65  #(even for the same instances of lexicon words). #the output will be in alphabetical order. # #the first four lines are totals for the 4 parts of speech #example: #say you have this input tile: #corpus. 
txt:
#23,2,1,be:V,very:R,very:R,good:J,poor:J,nice:J
#24,1,0,Amy:N,be:V,exceptionally:R,funny:J,nice:J
#
#command-line execution:
#>3-log-word-stats.pl corpus.txt
#
#output:
#N,1,1,0,1,0
#V,2,2,1,3,1
#J,5,5,3,8,3
#R,3,3,2,5,2
#Amy:N,1,1,0,1,0
#be:V,2,2,1,3,1
#exceptionally:R,1,1,0,1,0
#funny:J,1,1,0,1,0
#good:J,1,1,1,2,1
#nice:J,2,2,1,3,1
#poor:J,1,1,1,2,1
#very:R,2,2,2,4,2
#*****************************************************************

use strict;
use warnings;

##for now don't impl onlyfile
my $usage = "usage: perl $0 -nvjr <corpusfile>\n";
die $usage if ($#ARGV < 0);

my %optsHash = ('n',1, 'v',1, 'j',1, 'r',1); #flags, default all on

###process options
my $opts;
my @optsArr;
if($#ARGV > 0){
    $opts = shift;
    $opts =~ /^-/ or die "file error: $usage";
    if($opts eq '-'){die "file error: $usage";}
    $opts = substr($opts,1);
    if($opts =~ /[^nvjr]/){die "file error: $usage";}
    @optsArr = split //,$opts;
    %optsHash = ('n',0, 'v',0, 'j',0, 'r',0); #reset
    foreach my $c (@optsArr){
        $optsHash{$c}=1;
    }
}

#process infiles
my $line;

#process corpus input file
my $corpfile = shift;
open(CORP, $corpfile) or die "file error: $usage";

#word hash - keys are stem:pos strings (just like input) and
#values are refs to anon arrays of
#size 5, for ic,pic,nic,pcc,ncc counts
my %wHash = ();

#totals hash
my %tHash = ('N'=>[0,0,0,0,0],
             'V'=>[0,0,0,0,0],
             'J'=>[0,0,0,0,0],
             'R'=>[0,0,0,0,0]);

#generate and update word hash entries from corpus inlines and
#keep running global totals counts..
my $pc; my $nc;
my @lineArr;
my $pos;
while ($line = <CORP>){
    chomp $line;
    @lineArr = split /,/,$line;
    $pc=$lineArr[1];
    $nc=$lineArr[2];
    for (my $i=3;$i<=$#lineArr;$i++) {
        $pos = substr($lineArr[$i],-1,1);
        ##added may30--need pos-filtering here, not in step 1
        if( $pos eq 'N' && $optsHash{'n'} ||
            $pos eq 'V' && $optsHash{'v'} ||
            $pos eq 'J' && $optsHash{'j'} ||
            $pos eq 'R' && $optsHash{'r'} ){

            unless( exists($wHash{$lineArr[$i]}) ){
                #initialize the anon count array for curr word
                $wHash{$lineArr[$i]}=[0,0,0,0,0];
            }
            $wHash{$lineArr[$i]}->[0]++;
            $wHash{$lineArr[$i]}->[1]++ if $pc;
            $wHash{$lineArr[$i]}->[2]++ if $nc;
            $wHash{$lineArr[$i]}->[3] += $pc;
            $wHash{$lineArr[$i]}->[4] += $nc;

            #update running totals
            #$pos = substr($lineArr[$i],-1,1);
            $tHash{$pos}->[0]++;
            $tHash{$pos}->[1]++ if $pc;
            $tHash{$pos}->[2]++ if $nc;
            $tHash{$pos}->[3] += $pc;
            $tHash{$pos}->[4] += $nc;
        }
    }
}
close(CORP);

#print totals as first 4 lines of output
foreach my $pos ('N', 'V', 'J', 'R'){
    print $pos . ',';
    print join(',', @{$tHash{$pos}});
    print "\n";
}

#print the words with their counts, in lexical order
foreach my $word (sort keys %wHash){
    print $word . ',';
    print join(',', @{$wHash{$word}});
    print "\n";
}

###END


#! /home/bin/perl
#123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789
###filename: 4-log-word-scores.pl
###author: Adam Longton, longton@cs.ubc.ca
# this script is 3 of 3 that do the log-likelihood lexicon expansion.
#*****************************************************************

#description, inputs and output:
#input: the output of phase 3 (see its comments). it is a file
#of lines of 6 comma separated fields. they are stem:pos entries with
#5 various polarity word collocation counts.
#output: word, score pairs (score defined by the [yuHatz03] eqn)
#
#example:
#
#input file, word-stats.txt:
#N,1,1,0,1,0
#V,2,2,1,3,1
#J,5,5,3,8,3
#R,3,3,2,5,2
#Amy:N,1,1,0,1,0
#be:V,2,2,1,3,1
#exceptionally:R,1,1,0,1,0
#funny:J,1,1,0,1,0
#good:J,1,1,1,2,1
#nice:J,2,2,1,3,1
#poor:J,1,1,1,2,1
#very:R,2,2,2,4,2
#
#command-line execution:
#>4-log-word-scores.pl word-stats.txt
#
#output:
#Amy:N, 0
#be:V, 0
#exceptionally:R, 0.310154928303839
#funny:J, 0.211309093667207
#good:J, -0.376477571234912
#nice:J, -0.040005334613699
#poor:J, -0.376477571234912
#very:R, -0.200670695462151
#*****************************************************************

use strict;
use warnings;

my $usage = "usage: perl $0 <word-stats-file>\n";
die $usage if ($#ARGV < 0);

#define epsilon for smoothing (for zero counts)
my $ep = 0.5;

#process infile
my $wordfile = shift;
open(WFILE, $wordfile) or die "file error: $usage";

my $word; my $ic; my $pic; my $nic; my $pcc; my $ncc;
my $score;
my $line;
my $pos;
my @tempArr;

#grab totals info from 1st 4 lines.
my %tHash = ('N'=>[], 'V'=>[], 'J'=>[], 'R'=>[]);
for(my $i=0;$i<4;$i++){
    $line = <WFILE>;
    chomp $line;
    @tempArr = split /,/,$line;
    $pos = shift @tempArr;
    for(my $j=0;$j<5;$j++){
        push( @{$tHash{$pos}}, $tempArr[$j] );
    }
}

#precalc global score offsets for each pos. tack it onto the
#anon array in tHash. note that the offset is +totalncc-totalpcc
my $os; my $tncc; my $tpcc;
foreach my $pos ('N', 'V', 'J', 'R'){
    $tncc = $tHash{$pos}->[4];
    $tpcc = $tHash{$pos}->[3];
    #these cases shouldn't occur in a full corpus, only here for testing
    #$tncc = 0.5 if $tncc == 0;
    #$tpcc = 0.5 if $tpcc == 0;
    $os = log($tncc + $ep) - log($tpcc + $ep);
    #$os = 0; ###tmp
    push @{$tHash{$pos}}, $os;
    #print $pos . ', ' . $os . ', ' . exp($os) . "\n";
}

my $expscore;
#print the words with their scores, in lexical order
while ($line = <WFILE>){
    chomp $line;
    ($word,$ic,$pic,$nic,$pcc,$ncc) = split /,/,$line;
    $pos = substr($word,-1,1); #last char
    $score = log($pcc + $ep) - log($ncc + $ep) + $tHash{$pos}->[5];
    $expscore = sprintf( "%.5f", exp($score) );
    print "$word, $score, $expscore, $ic, $pic, $nic, $pcc, $ncc\n";
}
close(WFILE);

###END


#! /home/bin/perl
#123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789
###filename: 5-score-corp.pl
###author: Adam Longton, longton@cs.ubc.ca
# this script takes a scored lexicon and a formatted, tagged, annotated corpus
# and predicts the polarity of the sentences in the corpus.
# It has options j n v r p m c d g e and b.
#  j n v r - are the parts of speech you want to score in the corpus
#  p - "presupposition" context shifter
#  m - "modal" context shifter
#  c - "contrastive" context shifter
#  d - "deflag" option. these are "stop" words that signal a break in the negation
#      context window; basically it provides a coarse estimate of syntactic boundaries,
#      as far as negation words are concerned. if a stop word comes after a negation
#      word, it "deflags" this negation word, reducing its effect to nothing.
#  g - "negation" context shifter
#  e - effective opinion processing (tie breaking and optionally but-logic)
#  b - "but" context shifter or effective opinion but-logic (toggled in code)
#*****************************************************************

use strict;
use warnings;

my $usage = "usage: perl $0 -jnrvdgbcmpxe <scorefile> <corpusfile> [<negWindow>] [g<addNegWord>] [c<addConWord>] [def<defaultScoreCode>]\n";
my $minArgs = 3; # opts letters after the "-" are mandatory. this simplifies the optional params like negWindow.
my $rninLastlndexOfArgv = $miriArgs - 1; die $usage if ($#ARGV < $rninLastlndexOfArg-v); my %optsHash = (‘n’,l, ‘v’,l, ‘‘,l, ‘r’,l, ‘g’,l, ‘b’,l, ‘d’,l, ‘c’,l, ‘m’,l, ‘p’,l, ‘x’,l, ‘e’,l); #process options my $optsStr = my $opts; my @optsArr; $opts = shift; $optsStr = $opts; $opts =- /“-/ or die “file error:  $usage”;  69  if($opts eq -‘)(die “file error: $usage”;) $opts = substr($opts,l); file error: $usage”;} if($opts =- /[nvjrgbdcmpxe]/) (die @optsArr = split //,$opts; %optsHash = (‘n,O, ‘v’,O, ‘j’,O, ‘r’,O, ‘g’,O, ‘b’,O, ‘d’,O, ‘c’,O, ‘m’,O, ‘p’,O, ‘x’,O, ‘e’,O); #reset foreach my $ch (@optsArr) $optsHash{$ch}=l;  #get input files and params #throw word scores file into a hash (word/score as key/value) my $scorefile = shift; open(SCOR, $scorefile) or die “file error: $usage”; my%wHash= 0; my @lineArr; my @annArr; my $line; while ($line = <SCOR>)( chomp $line; @lineArr split /,/,$line; #only add cand scoredwords (for nvjr subsets) if) &isCandidateString($lineArr[O]) )( $wHash($lineArr[O] )=$lineArr[1];  close(SCOR); # process other params: my $corpfile = shift; open(CORP, $corpfile) or die “file error:  $usage”;  #append the outputfile with the same suffix found in the input corpus file my $corpTag = if) $corpfile =— /lout(—.+)\.txt/ )( $corpTag = $1;  #some params of overrides. my $negWindow = my $butWindow = sentence...) my $conWindow =  the corp scoring loop further down, moved up to 6; #5 is better for j only 9999; #for advanced but as negation,  here for optional  inf window (ie to beginning of  8;  * optional overrides my $negWinStr = my $negWinArg = if($#ARGV >= O)( #means we have further args we haven’t shifted Out yet $negWindow = shift; $negWindow = /“\d-i-$/ or die “file error: $usage”; #must be an posy integer $negWinArg = $negWindow; $negWinStr = “-$negwindow”; my $addNegWordStr = my $addNegWordArg = if($#ARGV >=O)( #optionally add “no” to meg list. hurts adj only expts but helps jnrv. use 0/1 as bools and req prepend with g. $addNegWordArg = shift; $addNegWordArg =- /“g[Ol]$/ or die “file error: $usage, bad arg: $addNegWordArg”; $addNegWordStr = “-$addNegWordArg”; my $addConWordStr = my $addConWordArg = if($#ARGV >=O){ #optionally add “while” to contrastives list, use 0/1 as bools and req prepend with c. $addConWordArg = shift; $addconWordArg =- /“c[Ol]$/ or die “file error: $usage, bad arg: $addConWordArg”; $addConWordStr = “-$addConWordArg”;  70  my $defscorestr = my $defScoreArg = if($#ARGV >=O){ #optionally specify the default scoring algortihm for ties and nulls. 0alwaysNegv, l-alwaysPosv,2-UsePrevsentscore,3-alternate-t-and-. req prepend with def. $defScoreArg = shift; $defscoreArg =— /Adef[0123]$/ or die “file error: $usage, bad arg: $defScoreArg”; $defscorestr = “-$defscoreArg”;  #set up output files open (RES, “>results$(corpTag)${optsStr)$(negwinstr}$(addNegwordstr}$(addconwordStr}$(defScoreStr} t xt”) or die “error trying to open filel for writing”; open(ACC, “>acc.txt”) or die “error trying to open file2 for writing”; .  #reconstruct arguments of this script call and record this in the results output file for reference my $coninandAndArgs = “perl $0 $optsStr $scorefile $corpfile $negWinArg $addNegWordArg $addConWordArg $defScoreArg”; print RES “command: $comnandAndArgs\n”;  #now score each test line by averaging its words that are #in the word score hash, default score is zero (neutral polarity(. 
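# Overview of the scoring pass implemented below:
#  - each corpus line is split into its annotation field, its feature list and its
#    word:POS tokens, and shifter and scored words are tagged inline in the printed output;
#  - a left-to-right pass handles negation (g option): a negation word opens a window of
#    $negWindow tokens, closed early by "deflag" stop words (d option); contrastive (c),
#    modal (m) and presupposition (p) words are tagged for later inversion; candidate
#    words of the enabled parts of speech that occur in the scored lexicon add
#    $negX * their lexicon score to the sentence score;
#  - a second pass flips the scores of words inside a window following a tagged
#    contrastive/modal/presupposition word, and a right-to-left pass treats "but"
#    (b option) as negating the scored words that precede it;
#  - with the e option, a sentence score tied at zero is broken by the "effective
#    opinion", i.e. the score of the lexicon word nearest a product feature;
#  - the sentence score is averaged over its scored words (or over the effective
#    opinions used), and zero or null scores fall back to the default-scoring rule
#    selected by the def argument (always -, always +, previous sentence's polarity,
#    or alternating).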
my $score; my Sn; #number of scored words found in the current sentence #counters my $nSuccNonzero = 0; ny $nSucczero = 0; my $nSuccNull = 0; my $nNonzero = 0; my $nzero = 0; my $nNull = 0; my $nSuccNonZeroEff = 0; my $nNonzeroEff = 0; my $nSucccandNull = 0; my $ncandNull = 0; my $cendFlag = 0; my $cends = #internal flag-switch for using nulls or counting null words as zero. my $usezeros = 0; #needed for formatted number output my $number = 0; my $rounded = 0; #previous binary score holder for default scoring in 0.000 and null cases my $prevScore = ‘+‘; #note the arbitrariness of + vs -  #neg and but flags my $negX; #negetion multiplier 1 or -l my $negcount; my $butX; my $butCount; my $butFlag = 0; my $conX; my $concount; my $conFleg = 0; my %deflagHash = 0; if( $optsHash(’d’} ){ %deflagHesh = (“but”,O, “except”,O, “however”,O, “only”,O, “elthough”,O, “though”,O, “while”,O, “wherees”,O,  my %contrastHash = 0; if( $optsHash(’c’} (C %contrastHash = (“although”,O,”despite”,O,”while”,O(; if($addConwordArg eq “cO”(( %contrastHash = (“although”,O,”despite”,O);  my %nodell-{ash = 0; if( $optsHash{’n’ (C %nodelHash = (“would”,O,”should”,O); #%modalHash = (“would”,O,”should”,O,”could”,O); my %presupHash = 0; if( $optsHash{’p’) (C %presupHesh = (“niss”,O,”forget”,O,”refused”,O,”assumed”,O,”hard”,O,”harder”,O,”less”,O);  71  # to be investigated: my %excessHash = U; if( $optsHash(’x’} H %excesshash = (“overly”,O); #%excessaash = (“overly”,O,”too”,O,”enough”,O); my %megxash = U; if( $optsHash{’g’} (C %megHash = (“not”,O,”never”,O,”n’t”,O,”doesmt”,O, “‘t” ,O,”cannot”,0,”nothing,O,”nor”,O, “dont”,O,”wou ldnt”,O,”no”,O); if($addNegWordArg eq “gO”)( %megHesh = (“not”,O, “never” ,0, “n’t” ,0, “doesmt”, 0, “‘t” ,0, “camnot” , 0,”nothing” ,0, “mor”,O, “dont”,O, “wou ldnt”,O); #redefine w/o no.  my %butHash = (1; if( $optsHash(’b’) (C %butHash = (“but”,O,”except”,O,”however”,O); #to be investigated: yet though whereas despite nevertheless nonetheless  my $negflistance; my $but]Jistance; my $conDistance; # but variables my $advancedBut = 1; #enables “but” shifter for but-as-negation my $butAsNot = 1; # 0 means just exclude don’t invert affected words my $butflefaultFlip = 0; # experimental my $effButEmabled = 0; # used wih effective opinion to do hu-liu but-conjunction logic. if this is 1 $advancedBut must be 0 and vice versa my @wordArr; my $advCon = 1; my $comAsNot = 1; #for the flip def score alg, my $zflip = ‘—‘ my $nflip =  for corps of isolated setences  S loop through the corp file and score each line while ($line = <CORP>)( chomp $line; @lineArr = split I,/,$line; $score = 0; = 0; my SnunEffopimions = 0; Swill be the number of found effective opinions that contributed to the score, for the sake of averaging later. my @predArr = (0,0,0,0); #fourth elem for eff tag, moved inside the while loop SnegX = 1; SnegCount = -1; $butX = 1; Sbutcount = -1; $butFlag = 0; #0 no 1 yes 5conX = 1; $comcount = -1; $conFlag = 0; my Sfeaturestr  =  SlineArr[2];  $candFlag = 0; $cands = &isCandidateString(Sline); if) $camds (C $candFlag = 1; $lineArr[2] = $limeArr(2] “;CANDS\=$cands”; #feature list support .  else {  72  $lineArr[2]  =  $lineArr[2]  .  #**********************compute predicted score #first left to right thru sent, for neg processing. 
#this sec also used for nost tagging (post-neg shifters) for(ny $i3;$i<””$#lineArr;$i++) ( #context shifter processing: #negcount if) $negCount >= 0 ){ #careful with $negCount - -; if) $negCount < 0 (C $negX = 1;  and  >=  #def lagging elsif) &isfleflagWord)$lineArr[$i]) $negX = 1;  <  )  #check for negation words, if so, flip neg flag if) &isNegWord)$lineArr[$i]) (C $negX = -lj $negCount = $negWiodow; ##reset neg window “NEG:$negwindow”; $lineArrl:$i] = $lioeArr[$i) “;“ -  .  #flag but words (no window for default) if( &isButWord($lineArr[$i]) (C if) $i == 3 (C #only set flag for def if at beg of sent $butFlag = 1; $lineArr[$i] = $lineArr[$i] “BUTS”; #special tag “;“ .  -  else( $butFlag = 1; #exluding helps All, hurts J $lineArr{$i] = $lineArr[$i] “BUT”; “;“ .  .  #contrastives if) &isContrastWord($lineArr[$i]) $conFlag = 1; if( $i == 3 (C SlineArr[$i] = $lineArr[$i] elsef $lineArr($i]  =  $lineArr[$i]  (C  .  “;“  -  “CONS”; #special tag  .  “;“  -  “CON”;  #tag nodals if( &isModalword($lineArr[$i]) (C #grab rest of sent, try to pattern natch.. my Srest = join)’ ‘,@lineArr[$i-i-l $#lineArrj(; ..  #note pattern accounts for stuff like: would not have ever been.. #night want other-than R’s, say ‘sort:? of:?’ /A)\S+\:R (*be\:/ if) $rest /A)\S+\:R (*have\:\S+ )\5+\:R (*been\:/ )C $rest $conFlag = 1; #just combine processing with cons for inverting shifters $lineArr[$i] = $lineArr[$i] “MOD”; “;“ .  .  if) &isPresupWord)$lineArr[$i]( ( if) $lineArr)$i] = /Ahard/ (C #hard, harder if) Si < $#lineArr (C /Aj;.\./ if) $lineArr[$i+l] (C SconFleg = 1; $lineArr[$i] = $lineArr[$i] “ORE”; “;“ -  elseC $conFlag = 1; $lineArr[$i]  =  $lineArr{$i]  -  “;“  -  -  “PRE”;  73  if( &isExcessWord($lineArr[$i]) $conFlag = 1; $lineArr($i] = $lineArr[$i]  ) “EXC”;  .  .  #only try to score candidate words (ie, of valid P08) if( &isCandidatestring($lineArr[$il) if( exists($wHash($lineArr[$i]}) (C $number = $negx * $wI-Iash{$lineArr[$i]); #neg $score = $score + $number; $rounded = sprintf(’%.3P, $numher(; $rounded; #if want individual word scores $lineArr[$i) = $lineArr[$i] .  .  appended if( $negX == —l (C $negDistance = $negWindow $lineArr[$i] = $lineArr[$i]  $negCount; “NG0:$negDistance”;  -  .  .  ##contrastives and other inverting shifters if( $advCon && $conFlag (C for (my $i=3;$i<=$#lineArr;$i++) if( $conCount >= 0 (C #careful with >= and $conCount- -; if( $conCount < 0 ){ $conX = 1; eisif( &isoeflagWord($lineArr[$i]( $conX = 1;  #nod if( $lineArr[$i] =- /CON/ II $lineArr[$i] =— /MOD/ II $lineArr[$iJ =— /PRE/ (C $conX = -1; $conCount = $conWindow; $lineArr[$iJ = $lineArr[$i]  .  <  (C  “:$conWindow”;  elsif( $lineArr[$iJ =— /\;-?\d/ (C #is a scored word if( $conx == —l (C $conDistance = Sconwindow $conCount; tICND:$conoistancefl; $lineArr[$i] = $lineArr[$i) #then flip sign of word score,adjust sentence score: @wordArr = split /;/,$lineArr[$i); $wordArr[l] = (-1) * $wordArr[l]; $score += $wordArr[l]; #nullifies original contribution if( $conAsNot (C $score += $wordArr[l]; #othe rdirection elset $wordArr[l] = 0; #annotate it as 0.000 (nullified) -  .  $wordArr[l] $lineArr[$i]  = =  .  sprintf(”%.3f”, $wordArr{l](; join(’;’,@wordArr(;  #now proc right to left for advanced but-as-not scoring.., comes after the Cons section cuz it has weaker precedence. 
if( $advancedBut && $butFlag (C for (my $i=$#lineArr;$i>=3;$i——( C if( $butCount >= 0 (C #csreful with >= and < $butCount - -;  74  if( SbutCount Sbutx = 1;  <  0 )(  if) $lineArr[5i] =— /BU’P/ ){ Sbutx = -1; SbutCount = Sbutwindow; SlineArr[5i] = SlineArr[5i)  .  “:$butWindow”;  elsif) SlineArr[Si] =— /\;—?\d/ ){ #is a scored word if) $butX == —l )( Sbutoistance = SbutWindow - Sbutcount; ;N SlineArr[5i] = SlineArr[Si] “BTD:SbutDistance”; #then flip sign of word score,adjust sentence score: @wordArr = split /;/,SlineArr[Si]; SwordArr[l) = (-1) * SwordArr)l]; Sscore ÷= SwordArr[l]; #nullifies original contribution if) SbutAsNot )( Sscore += SwordArr[l]; } #otherdirection else( SwordArr[l] = 0; #annotate it as 0.000 (nullified) SwordArr[l] SlineArr[5i]  = =  sprintfV’%.3f”, SwordArr[l]); join)’;’,@word.Arr);  apply effective-opinion tie breaking and but-conjunction requested if) Soptsl-{ash(’e’} && Sn > 0 && )Sscore == 0 H )SeffButEnabled && SbutFlag)) #for each feature word, try to find it in the sentence. if found, find the scored word. #add that score to the current score. if) Sfeaturestr =— /FEATURES=).-’-)/ ){ ny SeffScore = 0; #if this line contains a but word, consider the line to begin after the but word. ny 5firstButlndex = 3; if)SbutFlag)( #find the index of the word after the first but word for)ny Sn = 3; Sn <= 5#lineArr; 5n++)( if) &isButWord)SlineArr[5n]) H SfirstButlndex = Sm; last;  logic if )( closest  first  my SfStr = 51; my @featureArr = split /:/,5fstr; Ll: foreach my Sfeature ) @featureArr ){ 5feature =— s/1V\[1+)\[.*/5l/; #throw away the trailing score stuff 5feature =— sI”).+)_5/5l/; #just in case a feature ended with a my @currFeatArr = split /_/,Sfeature; # general case n-word feature, simple case is 1-word )ie this arr is length 1) my 5firstWordlndex if)SbutFlag) ( $firstWordlndex  =  3; # 3 is the index of the first word in the sentence  =  SfirstButlndex  1; )tstart after the first but word  +  #try to find index in sentence of ScurrFeatArrto] ny Si = 5firstWordlndex; L2: for) ; Si <= 5#lineArr; 5i++ ){ 5lineArr[5i] =— /“)[‘:]+):/; #grab word my Sword = $1; if) &stemNatch) ScurrFeatArr[0], Sword ) (C last L2 if) 5#currFeatArr == 0 ); #1-word feature found #if number words left in the feature > number words left in sentence: if) 5#currFeatArr > )5#lineArr Si) (C Si = 5#lineArr + 1; #fall off the end of the sentence -  75  last L2; #else attempt to match the whole phrase: 12: for( my 5j = 1; 5j <= 5#currFeatArr; Si++ H /([A:]+):/; #grab word SlineArr(5i-’-SiJ Sword = $1; next L2 unless( &stemMatch( ScurrFeatArr[Si], Sword ( last L2;  );  #n-word featura found  next Ll if( Si > 5#lineArr ); #feature not found #otherwise find its effective opinion (ie the score of the nearest scored search left and right, word). 
#tracking score of mm distamce scored-word, search left: my SleftDist = 0; my SleftEffOpimiom = 0; 1; Si >= SfirstWordlndex; Si—— H for( my Si = Si /[“:]+:[:J+;(—?[\dJ+\.{\d]--)/ )( #if word is scored, if( SlineArr[Si] grab score SleftEffOpinion = 51; SleftDist = Si Si; last; —  —  my SrightDist = 0; my SrightEffOpinion = 0; for( my Si = Si + 5#currFeatArr + 1; Si <= 5#lineArr; Si---- ){ if( SlineArr[5i] =— /‘[:]+:[:]+;(—?[\d]+\.[\d]+)/ )( #if word is scored, grab scqre SrightsffOpinion Srightflist = Si last;  unless( SleftDist  =  $1;  (Si  -  +  5#currFeatArr);  0 && SrightDist  ==  0 (C #if an effective opinion was  ==  0  =  0; 0 && ( SleftDist SrightEffopinion;  =  SleftEffopinion;  ==  found my SeffOpinion if( SrightDist Seffopinion elset SeffOpinion  = >  SrightDist  <=  SleftDist (  )  (  SeffScore += SeffOpinion; my Srounded = sprimtf(”%.3f”, SeffOpinion(; #we found an eff opinion and added it to our sentence score. mark the feature with it: for( my Si = Si; Si Si <= 5#currFeatArr; Si++ (( #should be bare (no “;‘(, but iust in case it was also scored or a marked shifter: my @wArr = split /;/,SlineArr[5i1; my Smarker = Si == 5#currFeatArr (( if( Si Smarker = Smarker ‘ : Srounded”; SnumEffOpinions++; —  -  .  push @wArr, Smarker; SlineArr[5i1 = ioim  “;  “,@wArr;  * update the line score with the effScore 0 (C if( Seffscore if($butFlag( C SpredArr(31 = “EFB:SmumEffopioions”; determined by the effective-opinion-but rule elset SpredArr[3] = “EFF:SmumEffopinions”; become nonzero via effective opinion tie breaking  #mark the sentemce if the score was  #mark the sentence if the score has  76  $score = $effScore; elsif($butFlag && $score != -1; #if no eff opinion found past the but word, we invert the $score originally found score $predArr[3] = “EFN:$numEffOpinions”; #mark the sentence if the score has was flipped because of an effective-opinion-free but-clause ‘=  #**********************set score, ##grab annarr @annArr = split /;/,$lineArr[Oj;  outcome and counts  #********null case if($n == O)( $predArr[2] = “null”; $predArr{l] = &nullflefaultScoreQ; #cmp to ann if( $annArr[l] eq ‘+‘ ){ if( $predArr[l] eq H-’ ){ $predArr[O] = “SUCCESS”; $nSuccNuil++; if( $cendFlag ){$nSuccCandNull++;} else( $predArrtOj = “FAIL”; $nNul 1 + +; if( $candFlag )f$nCandNull+÷;3 elsif( $annArr[l] eq ‘-‘ )( if( $predArr[lj eq ‘-‘ $predArr[O] = “SUCCESS”; $nSuccNull++; if( $candFlag ){$nSuccCandNull++;} else{ $predArr[O]  = “FAIL”;  $nNull++; if( $candFlag ){$nCandNull+-i-;) else{ #won’t happen udless have m and 0 anns $predArr[0] = “IGNORE”;  else{ #***********nonnull case, ie n>0 if( $numEffOpinions > 0 )( $score = $score/$numEffOpinions; else{ $score = $score/$n; # average, but could also use raw score $rounded = sprintf(”%.3f”, #set float score $predArr[2] = $rounded;  $score);  #********zero case — tie if( $predArr[2] == 0 ){ $predArr[l] = &zeroDefaultScore; #cnp to ann if( $annArr[l] eq ‘+‘ )( if( $predArr[l] eq ‘+‘ )f $predArr[0] = “SUCCESS”; $nSucczero++; else(  77  $predArr[OJ  FAIL;  =  $nZero++; elsif( $annArr[l] eq ‘-‘ H if( $predArr[l] eq ‘H $predArr[O] = “SUCCESS”; $nSucczero-l-+; elset $predArr[O)  “FAIL”;  =  $nzero++; elsef #won’t happen unless have m and 0 anns $predArr[0] = “IGNORE”;  #**********nonzero case else{ #+ or -? 
if( $predArr[2] > 0 ){ $predArrtl] = elsit( $predArr2] $predArr[lj =  <  0  else{ #unexpected print STOERR “ERROR,  )(  sent score was $pradArr[2]\n”;  #cmp to ann if( $annArr[l] eq ‘+‘ )( if( $predArr[l] eq ‘+‘ H $predArr[0] = “SUCCESS”; $nSuccNonlero++; if( $nunEffOpinions > 0 ){ $nSuccNonzeroEff++;  elset $predArr[0]  =  “FAIL”;  $nNonEero++; if( SnunEffOpinions > 0 )( $nNonzeroEff-l-+;  elsif( $annArr[l] eq ‘-‘ H if( $predArr[l] eq ‘-‘ H $predArr[0) = “SUCCESS”; $nSuccNonzero++; it( $numEffOpinions > 0 H $nSuccNonEeroEff++;  else( $predArr[0]  =  “FAIL”;  $nNonzero++; if( $nunEffOpinions > 0 H $nNonzeroEff+-t-;  else( #won’t happen unless have $predArr[0] = “IGNORE”;  iii  and 0 anna  #record current predicted binary score as prevScore $prevScore = $predArr[l];  78  #put predArr into lineArr[l] (lineArr[21 has FEAT and C1ND into) $lineArr[l] = join(;,@predArr); #reassemble line and print print join(, ,@lineArr); print close (CORP); #  *************************************************************************  #now stats info at end #composite counts my $nSuccTotal = $nSuccNonZero + $nSuccZero + $nSuccNu1l my $nTotal = $nNonZero + $nZero + $nNull; my $nSuccNonNull = $nSuccNonZero + $nSuccZero; my $nNonNull = $nNonZero + $nZero; my $nSuccCand = $nSuccNonZero + $nSuccZero + $nSuccCandNull; my $nCand = $nNonZero + $nZero + $nCandNull; my $nSuccNonZeroReal = $nSuccNonZero $nSuccNonZeroEtf; my $nNonZeroReal = $nNonZero $nNonZeroEtf; -  -  #accurac ies my $accTotal = 0; my $accNonNull = 0; my $accNonzero = 0; my $accZero = 0; my $accNull = 0; if($nTotal 0){$accTotal = sprintt(’%.Sf”, $nSuccTotal/$nTotal);} it($nNonNull 0){$accNonNull = sprintt(”%.5f”, $nSuccNonNull/$nNonNull);) if ($nNonZero 0) {$accNonZero = sprintf (‘% 5f’, $nSuccNonZero/$nNonZero);) if($nZero 0) ($accZero = sprintf(’%.5t’, $nSuccZero/$nZero) ;) if($nNull 0)($accNull = sprintf(%.5t”, $nSuccNull/$nNull) ;) my $accCamd = 0; my $acccandNull = 0; if($nCand != O)($accCand = sprintf(’%5f’, $nSuccCand/$nCamd);} if($nCandNull 0)($accCandJTull = sprintf(’%.5f”, $nSuccCandNull/$nCandNull);} my $accNonZeroReal = 0; my $accNonZeroEff = 0; if ($nNonZeroReal != 0) ($accNonZeroReal = sprintf ( ‘% 5f’, $nSuccNonZeroReal/$nNonZeroReal) ; } it ($nNonZeroEff != 0) ($accNonZeroEff = sprintf ( “% 5f’, $nSuccNonZeroEff/$nNonzeroEff); #compute coverages my $cvgNonNull = 0; my $cvgNonZero = 0; my $cvgCand = 0; if($nTotal != 0)( $cvgNonNull = sprintt(’%.Sf”, $nNonNull!$nTotal); $cvgNonZero = sprintf C Sf”, $nNonZero/$nTotal); $cvgCand = sprintf(°% 5f”, $nCand/$nTotal); *  print “nSuccTotal/nTotal \t= $nSuccTotal/$nTotal \t= $accTotal\n”; print nSuccCand/nCand \t= $nSuccCand/$nCand \t= $accCand\n”; print “nSuccNonNull/nNonNull \t= $nSuccNonNull/$nNonNull \t= $accNonNull\n”; print “nSuccNonZero/nNonZero \t= $nSuccNonZero/$nNonZero \t= $accNonZero\n”; print “nSuccZero/nZero \t= $nSuccZero/$nZero \t= $accZero\n’; print “nSuccNull/nNull \t= $nSuccNull/$nNull \t= $accNull\n”; print “nSuccCandNull/nCandNull = $nsuccCandNull/$ncandNull \t= $accCandNull\n”; if( $optsHash(’e’} )( print “\n”; print nSuccNonZeroReal /nNonzeroReal \ t= $nSuccNonZeroReal/ $nNonZeroReal \ t= $accNonZeroReal \n”; print “nSuccNonZeroEff/nNonZeroEtf \t= $nSuccNonZeroEft/$nNonZeroEtf \t= $accNonZeroEff\n”; print “\ncoverage:\n”; print nNonZero/nTotal print ‘nNonNull/nTotal print “ncand/nTotal  $nNonZero/$nTotal = $cvgNonZero\n”; $nNonNull/$nTotal = $cvgNonNull\n”; = $nCand/$nTotal = $cvgcand\n”; = =  print RES “nSuccTotal/nTotal \t= $nSuccTotal/$nTotal \t= $accTotal\n’; print RES 
“nSuccCand/nCand \t= $nSuccCand/$nCand \t= $accCand\n’; print RES “nSuccNonNull/nNonNull \t= $mSuccNonNull/$nNonNull \t= $accNonNull\n”;  79  print RES “nSuccNonZero/nNonZero \t= SnSuccNonzero/SnNonZero \t= SaccNonZero\n”; print RES “nSuccZero/nZero \t= SnSucczero/$nzero \t= Sacczero\n”; print RES “nSuccNull/nNull \t= SnSuccNull/5nNull \t= SaccNull\n”; print RES “nSuccCandNull/nCanclJNTull = SnSuccCandNull/SnCandNull \t= SaccCandNull\n”; if( SoptsHash{’e’} ){ print RES “\n”; print RES “nSuccNonZeroReal/nNonZeroReal \t= SnSuccNonzeroReal/SnNonzeroReal \t= SaccNonZeroReal \n”; print RES “nSuccNonZeroEff/nNonzeroEff \t= SnSuccNonZeroEff/SnNonZeroEff \t= SaccNonzeroEff\n”; print print print print  RES RES RES RES  “\ncoverage:\n”; “nNonZero/nTotal “nNonNull/nTotal “nCand/nTotal  = = =  SnNonZero/SnTotal = ScvgNonzero\n”; SnNonNull/SnTotal = ScvgNonNull\n”; SnCand/SnTotal = ScvgCand\n”;  print ACC “SaccTotal\n”; close (RES); close (ACC); ##############end nain section#######################lt###4t##############4t#  sten, word) neans is sten the sten of word #stemMatch( #do sinple plurals and ing (for eg “playing”) sub stemMatch{ ny )Ssten, Sword) = return 1 if) Ssten eq Sword ); #else attenpt to pull off plural and then conpare again /A))5/ if) Sword = )( return 1 if) Ssten eq $1 ); /A) if) Sword +)es5/ )( return 1 if) Sstem eq 51 );  if) Sword =— /“).+)ies5/ )( return 1 if) Sstem eq )51  “y”)  );  if) Sword =- /“).÷)ing5/ (C return 1 if) Sstem eq Sl ); return 0;  sub zerogefaultScore{ return &defaultScore)”z”); sub nulloefaultScore{ return &defaultScore) ‘n”); sub defaultScore( my SinputStr  =  shift;  if)SdefScoreArg ne ““(C if)SdefScoreArg eq “def 0”)) return if)SdefScoreArg eq “defl”)) return ‘+‘; #def 2 let fall through if)SdefScoreArg eq “def3”)) if)SinputStr eq “z”)) if) Szflip eq ‘+‘ )) Szflip = ‘—‘ return Szflip; else) Szflip = ‘+‘ return Szflip;  80  # else n if( $nflip eq ‘+‘ $nflip = -, return $nf lip;  )(  else{ $nflip = return Snflip;  if( SbutDefaultFlip && SbutFlag ){ #flip prey score if( $prevscore eq ‘+‘ ){return ‘-‘;} elsefreturn ‘+‘; else( return $prevscore;  sub isCandidateString( #applies to both sentences and single words ny Ss = shift; my Sans = if( (SoptsHash(’j’} && (5s =- /\:J/)) H Sans = Sans if( (Soptsl-{ash(’n’} && Sans = Sans “N’;  ($s  =  /\:N/))  H  if( ($optsNash(’v’} && Sans = Sans  ($s  =-  /\:V/))  )(  if( ($optsHash(’r’) && Sans = Sans  (Ss  =—  /\:R/))  H  return Sans;  sub isNegWord{ ny $s = shift; /A([A\.])\:/; #grab word Ss my Sw = $1; if( exists(SmegHash(5w)) ){ return 1;  else{ return 0;  sub isButWord{ my Ss = shift; /N[\:]+)\:/; #grab word my 5w = 51; if( exists($butHash(Sw}) ){ return 1; elseC return 0;  sub isueflagWord( my Ss = shift; Ss /N[’\:]+)\:/; #grab word my $w = $1; if( exists(Sdeflagl-Iash{5w)) H return 1; =—  else{ return 0;  81  sub isContrastWord my $s = shift; /([\:]+)\:/; #grab word my $w = $1; if( exists($contrastHash{$w)) )( return 1; else{ return 0;  sub isModalWord{ my $s = shift; $s = /NV\:]+)\:/; #grab word my $w = $1; if( exists($modalHash($w)) >1 return 1; else{ return 0;  sub isPresupWord( my $s = shift;  /“([“\:J+)\:/; #grab word $s my $w = $1; if( exists($presupHash{$w)) ){ return 1; else return 0;  sub isExcessWord( my $5 = shift; $s =— /“([“\:l+)\:/; #grab word my $w = $1; if( exists($excessHash($w)) )f return 1; elsef return 0;  ###END  #! 
/home/bin/perl
#123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789
###filename: run-5-score-corp.pl
###author: Adam Longton, longton@cs.ubc.ca
# this script runs 5-score-corp.pl across multiple corpora and summarizes the results.
# if WordNet expansion was used, switch on the commented out lines tagged with
# "# use this for WordNet version"

use strict;
use warnings;

my $usage = "usage: perl $0 opts scoredLexiconFile\n"; #this is for non WN, use the scored lex file directly.
#my $usage = "usage: perl $0 opts pathOfScoredCandsFilesFromPh3\n"; # use this for WordNet version
die $usage if ($#ARGV < 0);

my $opts = "";
if($#ARGV > 0){
    $opts = shift;
}
my $lex = shift;

# these paths to be set up for a particular experimental environment
my $bline = "/.autofs/homes/ubccshome/l/longton/adam/proj";
my $scripts = "$bline/exp/scripts";
my $here = "$bline/exp"; #experiment root

my @corps = ("apex","canon","creative","nikon","nokia");
my $line;
my $accavg = 0; my $accadd = 0; my $n = 0;
my $succSum = 0; my $totalSum = 0; my $fracAvg = 0;

open(ACCS, ">accs$opts.txt") or die "error: couldn't open accs$opts.txt\n";
print ACCS "command: perl $0 $opts $lex\n"; # track how the accs output file was generated

my $corp;
foreach $corp (@corps){
    `perl $scripts/5-score-corp.pl $opts $lex $here/corp/lout-$corp.txt > tmp5out$corp$opts.txt`;
    #`perl $scripts/5-score-corp.pl $opts $lex/scoredAdjs-$corp.txt $here/corp/lout-$corp.txt > 5out-$corp$opts.txt`; # use this for WordNet version

    open(ACC,"acc.txt") or die "error: couldn't open acc.txt for $corp\n";
    $line = <ACC>;
    close(ACC);
    chomp $line;
    $accadd += $line;
    $n++;

    open(RES,"results-$corp$opts.txt") or die "error: couldn't open results-$corp$opts.txt\n";
    $line = <RES>;
    $line = <RES>; #second line has the acc
    close(RES);
    chomp $line;
    print ACCS "$corp: $line\n";
    `rm acc.txt`;

    # add up the nSuccs and the n's to get a weighted fractional avg acc too:
    if($line =~ m{(\d+)/(\d+)}){
        $succSum += $1;
        $totalSum += $2;
    }
}

if($n>0){$accavg = $accadd/$n;}
if($totalSum>0){$fracAvg = sprintf("%.7f", $succSum/$totalSum);} #jy08

print ACCS "Avg acc: $accavg, weighted: $succSum/$totalSum = $fracAvg\n";

# also run on concatenation of the subcorpora for comparison
$corp = "hu";
`perl $scripts/5-score-corp.pl $opts $lex $here/corp/lout-$corp.txt > tmp5out$corp$opts.txt`;
#`perl $scripts/5-score-corp.pl $opts $lex/scoredAdjs-$corp.txt $here/corp/lout-$corp.txt > 5out-$corp$opts.txt`; # use this for WordNet version
open(RES, "results-$corp$opts.txt") or die "error: couldn't open results-$corp$opts.txt\n";
$line = <RES>;
$line = <RES>;
close(RES);
chomp $line;
print ACCS "$corp: $line\n";
`rm acc.txt`;

close(ACCS);
`rm tmp5*`; #cleanup all the big scored sentence files
###END
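For reference, a hypothetical invocation of this driver script (the option string and the scored-lexicon filename below are illustrative only, not taken from the experiments):

perl run-5-score-corp.pl -jnrvgdce lexicon-scored.txt

The option string is passed through to 5-score-corp.pl, so it selects the parts of speech to score (j, n, r, v) and the contextual shifters to enable (here negation, deflag stop words, contrastives, and effective-opinion tie breaking); the driver then writes the per-corpus accuracies, their average over the five product-review subcorpora, and the result on the concatenated hu corpus to the accs output file.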
