UBC Theses and Dissertations

The development and evaluation of a new performance appraisal and training evaluation instrument: the behaviour description index

Schwind, Hermann F.


The major objective of this study was to develop and validate a new behaviour-oriented appraisal instrument with increased information content and improved rating characteristics as compared to Behaviorally Anchored Rating Scales. The study was conducted sequentially, beginning with a review of the literature on the problems of performance appraisal and training evaluation and on the nature and characteristics of different performance evaluation methods, followed by several developmental phases in which critical incidents were generated and the instruments were created. The research concluded with field studies to establish the construct validity of the instruments and to determine their rating characteristics when applied in an organizational environment. The literature reviews revealed two major issues in the areas of performance appraisal and training evaluation: criterion and methodology. The methodology problem is open to much controversy, but on the criterion issue the tendency is to favour multiple and behaviour-oriented criteria. After a focal job for which an appraisal instrument had to be developed was determined, workshops with five supporting organizations were organized. Each workshop was attended by five job incumbents, five superiors and five subordinates. The purpose of these meetings was to develop critical incidents, or samples of effective or ineffective job behaviour as observed by peers, superiors and subordinates, so as to cover every aspect of the job. The collected items were edited to conform to proper English and to avoid redundancies. The items were then listed in random order and submitted to judges (expert job incumbents) who made decisions on the validity of the items and the job dimension or category to which each item belonged. Only items on which 80% of the judges agreed were retained.
Since the remaining item pool was still too large to be submitted to an inter-organizational body of judges, a panel of experts made up of training managers of the participating organizations selected 159 items according to agreed-upon criteria (lowest standard deviation and 100% agreement among judges) to ensure that only the best items were chosen. A list of these items was submitted to 200 judges of each participating organization. The judges were asked to decide whether each item was a valid sample of a job incumbent's job behaviour, and to rate it on a 1 to 7 scale as to the degree of effectiveness of the job behaviour it described. Items were retained when 80% of all judges and 60% of the judges of individual organizations agreed on their validity and their standard deviation did not exceed 1.5. One hundred and twenty items (or 75%) were retained. Two types of instruments were developed from the item pool: a seven-point behaviourally anchored rating scale (BARS) and a twenty-item behaviour description index with Yes/No/Uncertain responses (BDI). A third instrument, a seven-point graphic rating scale (GRS), was based on the official performance appraisal forms of the participating organizations. To assess the validity of the respective instruments, two field studies were undertaken. Two groups of superiors (N₁ = 31, N₂ = 42) rated their subordinates using all three instruments. The construct validity of the instruments was assessed through the Campbell-Fiske [1959] multitrait-multimethod matrix, an analysis of variance, and correlations with (relatively) independent performance criteria. All instruments showed significant convergent validity but only BARS and BDI demonstrated significant discriminant validity. Correlations with the independent criteria of performance were highest for the BDI.
A second goal of the field studies was to compare the three instruments on psychometric characteristics such as halo, leniency, central tendency, reliability, information content, and rater preference. The results indicate that in most comparisons the BDI demonstrated superiority. However, it has not yet been validated for use in industry.
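
The item-retention rule stated in the abstract (agreement of at least 80% of all judges, at least 60% of the judges within each organization, and a standard deviation of the effectiveness ratings not exceeding 1.5) can be illustrated with a short sketch. This is not the author's procedure as implemented; the function name `retain_item`, the data layout, and the use of the sample (rather than population) standard deviation are assumptions made only for illustration.

```python
from statistics import stdev


def retain_item(votes_by_org, ratings,
                overall_min=0.80, per_org_min=0.60, max_sd=1.5):
    """Apply the three retention thresholds from the abstract to one item.

    votes_by_org: dict mapping an organization name to a list of booleans,
                  True when a judge considered the item a valid sample of
                  job behaviour (hypothetical layout).
    ratings:      list of 1-to-7 effectiveness ratings given by the judges.
    """
    # Share of all judges, pooled across organizations, who agreed.
    all_votes = [v for votes in votes_by_org.values() for v in votes]
    overall_agreement = sum(all_votes) / len(all_votes)

    # Every individual organization must also reach its own threshold.
    per_org_ok = all(
        sum(votes) / len(votes) >= per_org_min
        for votes in votes_by_org.values()
    )

    # Assumption: sample standard deviation; the abstract does not say which.
    spread = stdev(ratings)

    return (overall_agreement >= overall_min
            and per_org_ok
            and spread <= max_sd)


if __name__ == "__main__":
    votes = {"Org A": [True] * 9 + [False], "Org B": [True] * 8 + [False] * 2}
    print(retain_item(votes, [5, 6, 5, 6, 5, 6]))
```

Keeping the pooled and per-organization checks separate matters: an item can pass the pooled 80% threshold while a single organization falls below 60%, in which case the rule as stated would drop it.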

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.