Open Collections

UBC Theses and Dissertations

Factor analysis processing of inductively coupled plasma optical emission spectra recorded using a photodiode… Wirsz, Douglas Franklin 1990

FACTOR ANALYSIS PROCESSING OF INDUCTIVELY COUPLED PLASMA OPTICAL EMISSION SPECTRA RECORDED USING A PHOTODIODE ARRAY SPECTROMETER

by

DOUGLAS FRANKLIN WIRSZ

B.Sc.(Hons.), Simon Fraser University, 1982
M.Sc., University of British Columbia, 1985

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES, Department of Chemistry

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
June 1990
© Douglas Franklin Wirsz, 1990

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Chemistry
The University of British Columbia
Vancouver, Canada
Date: Sept. 20, 1990

Abstract

Multivariate methods have been developed to assist in the interpretation of multiwavelength spectral data. When carrying out an elemental analysis by optical emission spectrometry, the analyst is faced with a choice of many spectral lines on which to base the analysis. The choice of a suitable line depends upon the other components in the sample, as some lines may suffer from spectral interferences, depending upon the nature of the sample matrix. The selection of the most suitable line for the determination of a desired component is conventionally accomplished by consulting tables of spectral emissions, and selecting a line on the basis of freedom from spectral overlap, taking into consideration the bandwidth of the spectrometer.
Unfortunately, many weaker lines are not tabulated, although they may nevertheless interfere if the interfering element is at a high enough concentration. As well, emissions from sample-specific species, such as molecular species, will not be tabulated. As the possibility that a spectral line will cause a significant interference depends upon interferent concentration, the analyst has previously required detailed knowledge of the nature of other components in a sample. The methods developed in this thesis facilitate the determination of a desired component in binary, ternary, or more complex mixtures without prior knowledge of the nature of any interfering components. This automatic line selection allows the matrix-dependent tailoring of the lines chosen to the element or elements of interest. Factor analysis determines those wavelengths where an unidentified interferent contributes to the measured intensity. A multivariate analysis based on selected wavelengths gives the concentrations of all desired components while avoiding error in these concentrations due to interferents. Using a related method, the best analytical line for the determination of a specified analyte is selected from a set of lines on the basis of the least interference in a particular sample matrix, by several cycles of mathematical analysis. Unsuitable lines are rejected in the first few cycles, the best lines being retained until last. A multivariate analysis after each cycle provides an updated estimate of the analyte concentration. As with other methods in this thesis, this process is performed without reference to spectral tables. Application of these methods to a recently developed high resolution photodiode array based polychromator system is also discussed. This has consequences for the design and selection of spectral masks to assist in multielement analysis on this spectrometer.
In the course of development of these multivariate methods, the need for improved dynamic range in the photodiode array was seen. An algorithm for the generation of dynamic range enhanced photodiode array spectra has been developed and implemented.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements

Chapter 1: Introduction
1.1 Overview
1.2 History
1.2.1 Emission Spectroscopy
1.2.1.1 Elemental Spectral Fingerprints
1.2.1.2 Extending the Senses Beyond the Visible
1.2.1.3 Advances in Wavelength Dispersion
1.2.1.4 Development of Atomic Emission Spectroscopy
1.2.2 Multivariate Analysis
1.2.2.1 Origins of Factor Analysis
1.2.2.2 Applications
1.3 Multivariate Spectroscopy
1.3.1 Early Research
1.3.2 Expansion to Many Disciplines
1.3.3 Chemometrics
1.4 Multivariate Methods in Atomic Emission Spectroscopy

Chapter 2: Instrumentation
2.1 Introduction
2.2 The ICP as a Spectral Source
2.2.1 Fundamental Processes in the ICP
2.2.2 Spectral Interferences
2.2.2.1 Argon and Concomitant Background
2.2.2.2 Continuum Radiation
2.2.2.3 Spectral Interferences and Resolution
2.2.2.3.1 Spectral Bandpass
2.2.2.3.2 Line Broadening
2.2.3 Line Selection
2.2.3.1 Selection Criteria and Wavelength Tables
2.3 The ICP Spectrometer
2.3.1 RF Generators and Stable Plasmas
2.3.2 Sample Introduction
2.3.2.1 Nebulizers
2.3.2.2 Spray Chambers
2.3.3 Torch Design
2.3.4 Mono/polychromators
2.3.5 Detectors
2.3.5.1 Photomultiplier Tubes
2.3.5.2 Photodiode Arrays

Chapter 3: Advances in Photodiode Array Detection
3.1 Evaluation of Photodiode Arrays
3.1.1 Design Differences
3.1.2 Performance
3.1.2.1 Sources of Noise
3.1.2.2 Precision for a 4096 Diode Array
3.1.2.3 Wavelength Registration
3.1.2.4 Dark Current Response
3.2 Dynamic Range Enhancement of Photodiode Array Spectra
3.2.1 Collection of Sample Spectra
3.2.2 Algorithm
3.2.3 Comparison of Normal and Enhanced Spectra
3.2.4 Effect of Saturation and DC Offset Parameters
3.2.5 Advantages
3.3 A Unique Spectrometer Design

Chapter 4: Factor Analysis
4.1 Introduction
4.2 Requirements for Successful Factor Analyses
4.2.1 Spectral Overlap
4.2.2 Dynamic Range
4.2.3 Solutions for Multivariate Equations
4.3 Preparation of Data in a Form Suitable for Factor Analysis
4.4 Eigenvectors and Eigenvalues
4.5 Determination of the Number of Factors
4.6 Target Testing Factor Analysis

Chapter 5: Unidentified Components and Residual Vectors
5.1 Sources of Unidentified Components
5.2 Rejection of Diodes with Interferences
5.3 Problems in the Modelling of Unidentified Components
5.4 A Solution for Unidentified Components
5.5 Regeneration of Interferent Spectra
5.6 Summary

Chapter 6: Automatic Matrix-Dependent Wavelength Selection with Multi-Line Detection
6.1 Survey of Single Element Spectra from an ICP Spectrometer
6.2 Spectral Overlap
6.3 Data Pre-processing
6.3.1 Weighting
6.3.1.1 Justification for the Use of Weighting Schemes
6.3.1.2 Evaluation of Suitable Weighting Functions
6.3.1.2.1 Least Squares Fit with 100 diodes
6.3.1.2.2 Factor Analysis with 4096 Diodes
6.3.2 Spectral Smoothing
6.3.2.1 Smoothing by Fourier Transformation and Apodization
6.3.2.2 Smoothing using a Five Point Average
6.3.3 Thresholds
6.3.4 Effect of Slit Width
6.3.5 Recommendations for Data Pre-processing
6.4 Experimental Data Collection
6.4.1 Selection of Suitable Analytes and Wavelengths
6.4.2 Criteria for the Selection of Spectral Windows
6.4.3 Generation of Simulated Mixtures
6.5 Application of Methods
6.5.1 Single Element Determinations in a Binary Mixture
6.5.1.1 Co in a Binary Mixture
6.5.1.2 Fe in a Binary Mixture
6.5.2 Single Element Determinations in a Ternary Mixture
6.5.2.1 La in a Ternary Mixture
6.5.2.2 Co in a Ternary Mixture
6.5.2.3 Fe in a Ternary Mixture
6.5.3 Simultaneous Multielement Determinations
6.5.3.1 Co and Fe in the presence of an interferent
6.5.4 Regeneration of Spectra of Unidentified Interferents
6.6 Automatic Line Selection
6.7 Summary

Chapter 7: Integrated Spectral Windows and the Complexities of Spectra
7.1 Determination of a Single Component
7.2 Simultaneous Determination of Several Components
7.3 Spectral Windows vs. Individual Diodes
7.4 Initial Selection of Suitable Diodes for Further Analysis
7.5 The First Cycle of Factor Analysis
7.6 Subsequent Cycles of Factor Analysis
7.7 Summary

Chapter 8: Multivariate Analysis of Mixtures Which Exhibit Spectral Overlap
8.1 Introduction
8.2 Comparison of Spectral Overlap in Real and Simulated Spectra
8.3 Experimental Data Collection
8.3.1 Instrumentation
8.3.2 Criteria for the Selection of Spectral Windows
8.3.3 Selection of Suitable Analytes
8.3.4 Preparation of Analyte Solutions
8.3.5 Identification of Analyte Lines
8.4 Application of Methods
8.4.1 Single Element Determinations
8.4.1.1 Co in the Presence of Interferents
8.4.1.2 Fe in the Presence of Interferents
8.4.1.3 Cr in the Presence of Interferents
8.4.1.4 La in the Presence of Interferents
8.5 Summary

Chapter 9: Automated Analytical Line Selection for a Unique Polychromator Design
9.1 Introduction - A Unique Spectrometer Design
9.1.1 Polychromator
9.1.1.1 Echelle Dispersion
9.1.1.2 Wavelength Preselection
9.1.1.3 Wavelength and Diode Position
9.1.2 Masks
9.1.2.1 Scanning Masks
9.1.2.2 Fixed Masks
9.1.2.2.1 Single Element Masks
9.1.2.2.2 Multielement Masks
9.2 Spectral Features
9.2.1 Buddy Lines
9.2.2 Leak Lines
9.2.2.1 Mask Elements
9.2.2.2 Other Elements
9.2.3 Order Overlap: An Additional Spectral Interference
9.3 Performance of the Photodiode Array
9.3.1 Sensitivity
9.3.2 Resolution
9.3.3 Instrumental Limitations
9.4 Data Pre-processing
9.4.1 Experimental Data Collection
9.4.2 Selection of Suitable Spectral Windows
9.5 Application of Methods
9.5.1 Determination of Co
9.5.2 Comparison with a Univariate Analysis
9.6 Summary

Chapter 10: Conclusions
References
Appendix 1: APL as a Development Language for Multivariate Spectroscopic Applications
Appendix 2: Listing of APL programs used in this study

List of Tables

Table 6.1. Major Spectral Features of Survey Spectra
Table 6.2. Effect of the Weighting Function X^N on Precision
Table 6.3. Concentrations of Co and Fe used for Data Pre-processing Evaluation
Table 6.4. Weighting with the Function X^N. Standard Deviation for the Determination of Co
Table 6.5. Weighting with the Function X^N. Standard Deviation for the Determination of Fe
Table 6.6. Weighting with the Function 1.1exp(X). Standard Deviation for the Determinations of Co and Fe
Table 6.7. Smoothing with a Five Point Average. Standard Deviation for the Determinations of Co and Fe
Table 6.8. Factor Analysis for several thresholds and weightings. Mean and Standard Deviation for Determinations of Co and Fe
Table 6.9. Multivariate analysis of Co in a binary mixture of Co and Fe
Table 6.10. Results of a multivariate analysis on the 235 best diodes for Co in binary mixtures of Co and Fe
Table 6.11. Multivariate analysis of Fe in a binary mixture of Co and Fe
Table 6.12. Results of a multivariate analysis on the 56 best diodes for Fe in binary mixtures of Co and Fe
Table 6.13. Results of a multivariate analysis on 4096 diodes for La, Co and Fe in ternary mixtures
Table 6.14. Multivariate analysis of La in a ternary mixture of Co, Fe and La
Table 6.15. Results of a multivariate analysis on the 670 best diodes for La in ternary mixtures of La, Co and Fe
Table 6.16. Multivariate analysis of Co in a ternary mixture of Co, Fe and La
Table 6.17. Results of a multivariate analysis on the 70 best diodes for Co in ternary mixtures of La, Co and Fe
Table 6.18. Multivariate analysis of Fe in a ternary mixture of Co, Fe and La
Table 6.19. Comparison of results using 27 and 20 diodes to determine Fe in ternary solutions containing Co, Fe and La
Table 6.20. Intermediate results in the determination of Co and Fe in the presence of La interferent; 295 diodes weighted by the Co and Fe spectra cubed
Table 6.21. Multivariate analysis of Co and Fe in a ternary mixture of Co, Fe and La
Table 6.22. Final results for the determination of Co and Fe in the presence of La interferent; 44 diodes weighted by the Co and Fe spectra cubed
Table 6.23. Diodes remaining after the rejection of those with high deviations in the residual vector, and the intensity contributions for the residual, 500 ppm of Co, 500 ppm of Fe and 500 ppm of La
Table 8.1. Wavelengths of analyte lines used
Table 9.1. Prominent lines of Co used in analysis
Table 9.2. Prominent leak lines of Co through a Co mask
Table 9.3. Comparison of Multivariate vs. Univariate Results

List of Figures

Figure 2.1. Major Components of an Inductively Coupled Plasma Optical Emission Spectrometer.
Figure 2.2. Meinhard nebulizer geometry and associated spray chamber.
Figure 2.3. Cross sectional view of an ICP torch.
Figure 2.4. Monochromators and Polychromators. a) A simple Czerny-Turner mount monochromator for the isolation of a single wavelength. b) A direct reader polychromator. c) A sequential scanning monochromator. d) A polychromator employing a photodiode array detector.
Figure 3.1. A set of ten 30 diode regions surrounding each of 10 prominent lines in a 100 ppm Co aqueous solution. a) Mean intensities, averaged over ten spectra. b) Standard deviation of the diode intensities displayed in (a). c) Mean/Standard deviation for the same diodes.
Figure 3.2. Mean/Standard deviation vs. Mean for 300 diode intensities in the regions surrounding each of ten prominent lines in a 100 ppm Co aqueous solution.
Figure 3.3. Dependence of standard deviation on the number of diodes integrated across a peak.
Figure 3.4. Relationship between wavelength and diode position on a 4096 pixel photodiode array.
Figure 3.5. Difference between the actual wavelength and the wavelength predicted from a linear function of wavelength vs. photodiode array position.
Figure 3.6. Background signal (including dark current and DC offset) for a dark uncooled photodiode array, as a function of integration time: (a) up to 0.88 seconds; (b) up to 2.6 seconds.
Figure 3.7. Schematic of the data acquisition and dynamic range enhancement algorithm. A number of single spectra (A) at set integration times are collected and stored in a data matrix (B). A set of measurements at different integration times (C) for each separate diode is taken from the data matrix, and the largest unsaturated measurement is found. This optimum measurement is scaled and becomes the measurement at that diode in the enhanced spectrum (D).
Figure 3.8. Photodiode array spectrum of 5000 ppm cobalt centered on 341.0 nm. a) Normal. b) Dynamic range enhanced.
Figure 3.9. Photodiode array spectrum of 5000 ppm cobalt centered on 341.0 nm. Vertical scale 10 times that in Figure 2.11. a) Normal. b) Dynamic range enhanced.
Figure 3.10. Photodiode array spectrum of 5000 ppm cobalt centered on 341.0 nm. Vertical scale 100 times that in Figure 2.11. a) Normal. b) Dynamic range enhanced.
Figure 3.11. Dynamic range enhanced photodiode array spectrum of 5000 ppm cobalt centered on 341.0 nm. Vertical scale 1000 times that in Figure 2.11. At this scale expansion only noise is seen in the normal spectrum (not shown), while very weak spectral features can clearly be seen in the dynamic range enhanced spectrum.
Figure 3.12. Photodiode array spectrum of 5000 ppm cobalt; strong peaks between diodes 1100 to 1600. a) Normal spectrum. b) Enhanced spectrum.
Figure 3.13. Photodiode array spectrum of 5000 ppm cobalt; weak peaks between diodes 2600 to 3100. a) Normal spectrum. b) Enhanced spectrum.
Figure 3.14. Photodiode array spectrum of 5000 ppm cobalt; very weak peaks between diodes 3500 to 4000. a) Normal spectrum. b) Enhanced spectrum.
Figure 3.15. Photodiode array spectrum of 5000 ppm cobalt; a comparison using a logarithmic intensity axis. a) Normal spectrum. b) Enhanced spectrum.
Figure 4.1. Spectral overlap is inevitable when hundreds of emission lines may be detected for each element. The spectra shown span only 40 nm, and each represents only a single element, but because of the number of emission lines in each spectrum, the probability of a direct spectral overlap is high. Line wings of intense lines may extend several nanometers, further increasing the probability of partial overlap and background shifts. (a) 500 ppm of Fe; (b) 500 ppm of Co; and (c) 500 ppm of La.
Figure 4.2. The structure of a typical data matrix. Each entry represents an intensity measured at a specific diode (e.g. D3), for a specific sample spectrum (e.g. S2).
Figure 4.3. Conversion of a multiwavelength spectrum into a vector in multidimensional space.
Figure 4.4. Unique Axes for each element in a multidimensional space.
Figure 4.5. Example of the derivation of eigenvectors for a two dimensional space where two components are present, resulting in non-zero variance in two orthogonal directions. The first eigenvector spans the greatest possible variance in one dimension, and the second eigenvector spans the maximum of the remaining variance in an orthogonal dimension.
Figure 4.6. Evaluation of a test vector. The two dimensional plane represents the plane defined by the vectors of two pure components. If the test vector fits favorably into the plane, it is accepted. If the fit is poor, the test vector is rejected.
Figure 4.7. Projection of the sample spectrum onto the component axes found by target testing. The displacement along each axis is proportional to the concentration of that component.
Figure 4.8. Schematic representation of the major steps in a factor analysis.
Figure 5.1. a) Estimation of concentrations ([E1], [E2]) of two identified components (E1, E2) in a mixture. b) Ambiguous estimation of the contribution of an unknown component to a mixture spectrum when only one of two components is identified.
Figure 6.1. The complex structure of emission spectra makes the interpretation of mixtures a difficult problem. Partial or complete spectral overlap of analyte lines with lines of other concomitant elements is a common occurrence. The spectra of the above three elements are representative of these difficulties. (A) 500 ppm of Fe; (B) 500 ppm of Co; and (C) 500 ppm of La.
Figure 6.2. Comparison of known and recovered spectra: (A) spectrum for 500 ppm of Co; (B) residual spectrum recovered from spectral mixtures of Fe and an unidentified component (scaled by 4.26); and (C) difference spectrum.
Figure 7.1. The number of diodes above a chosen threshold for a line will vary with the intensity and width of the peak.
Figure 7.2. A typical residual spectrum. Each of the ten bars represents a prediction of the sum of all interferent intensities over five diodes.
Figure 8.1. As each line is sequentially rejected, the concentration of Co (in ppm) is recalculated in a multivariate analysis using the remaining lines.
Figure 8.2. (a) The ten most intense lines for the spectrum of Co in the region of 340 nm. Each of the ten bars represents the sum of the five diodes with the greatest intensity for each line. (b) The actual sum of all interferent intensities for the ten Co lines. (c) An estimate of the sum of all interferent intensities for the ten Co lines, as obtained from the residual spectrum. (d) The ratio of the signal (Co spectrum) to the actual sum of the interferents. (e) The ratio of the signal (Co spectrum) to the estimated sum of the interferents, obtained from the residual spectrum.
Figure 8.3. As each line is sequentially rejected, the concentration of Fe (in ppm) is recalculated in a multivariate analysis using the remaining lines.
Figure 8.4. (a) The ten most intense lines for the spectrum of Fe in the region of 340 nm. Each of the ten bars represents the sum of the five diodes with the greatest intensity for each line. (b) The actual sum of all interferent intensities for the ten Fe lines. (c) An estimate of the sum of all interferent intensities for the ten Fe lines, as obtained from the residual spectrum. (d) The ratio of the signal (Fe spectrum) to the actual sum of the interferents. (e) The ratio of the signal (Fe spectrum) to the estimated sum of the interferents, obtained from the residual spectrum.
As each line is sequentially rejected, the concentration of Cr (in ppm) is recalculated in a multivariate analysis using the remaining lines.
(a) The ten most intense lines for the spectrum of Cr in the region of 340 nm. Each of the ten bars represents the sum of the five diodes with the greatest intensity for each line. (b) The actual sum of all interferent intensities for the ten Cr lines. (c) An estimate of the sum of all interferent intensities for the ten Cr lines, as obtained from the residual spectrum. (d) The ratio of the signal (Cr spectrum) to the actual sum of the interferents. (e) The ratio of the signal (Cr spectrum) to the estimated sum of the interferents, obtained from the residual spectrum. The residual mean (squares) and the residual standard deviation (diamonds) for the determination of Cr drop as the lines with the worst interferences are sequentially rejected.
As each line is sequentially rejected, the concentration of La (in ppm) is recalculated in a multivariate analysis using the remaining lines.
(a) The ten most intense lines for the spectrum of La in the region of 340 nm. Each of the ten bars represents the sum of the five diodes with the greatest intensity for each line. (b) The actual sum of all interferent intensities for the ten La lines. (c) An estimate of the sum of all interferent intensities for the ten La lines, as obtained from the residual spectrum. (d) The ratio of the signal (La spectrum) to the actual sum of the interferents. (e) The ratio of the signal (La spectrum) to the estimated sum of the interferents, obtained from the residual spectrum. (f) The residual mean (squares) and the residual standard deviation (diamonds) for the determination of La drop as the lines with the worst interferences are sequentially rejected.
Figure 9.1. Schematic diagram of an echelle spectrometer with wavelength preselection and photodiode array detection. Light from a source is imaged first on an entrance slit. A low dispersion concave grating (G1) disperses the light from the slit, and images a spectrum across the face of a mask. Slots cut on the mask allow only a narrow window of wavelengths to pass through. Several slots may be cut in the mask to provide several independent wavelength windows, although only one is shown here for clarity. The dispersed light which passes the mask is recollimated by a concave mirror (M2). The wavelength windows are recombined by an "undispersing" grating (G2). The resulting beam of collimated quasi-white light is passed to a high resolution echelle grating. Of the many overlapping orders from the echelle grating, only one (or two at most) for any wavelength will hit the final mirror. This final concave mirror (M2) images high resolution windows of spectral information (selected by the mask slots) onto a 1024 pixel photodiode array.
Figure 9.2. Narrow wavelength windows selected by a spectral mask are imaged on a photodiode array. Four slots (I-IV) are shown in this example. The echelle grating produces a high resolution image of the narrow range of wavelengths which passed through a slot. The position of a spectral line on the array will depend upon its wavelength, and upon the order which falls on the array. The orders corresponding to each window in this example are shown on the right axis. Windows I and III partially overlap when they fall on the photodiode array, illustrating a further type of spectral overlap, order overlap.
Figure 9.3. 1000 ppm Co in aqueous solution, as seen through a Co mask.
Figure 9.4. Positions of leak lines due to Co which pass through a Co mask.
Figure 9.5. 50 ppm Co in aqueous solution, also containing 9 other possible interferent elements (Cd, Pb, Cu, Cr, Mn, Ni, Fe, Zn, and Y, all 50 ppm).
Figure 9.6. Sensitivity and selectivity of the Leco Plasmarray with photodiode array detection for two extreme cases. (a) As 193.696 nm line. (b) As/Cd pair at 228.802/228.812 nm.
Figure 9.7. 25 ppm Co in aqueous solution, also containing 9 other possible interferent elements (Cd, Pb, Cu, Cr, Mn, Ni, Zn, and Y, all 50 ppm, and Fe 2000 ppm).
Figure 9.8. Spectral regions surrounding prominent Co lines in 1000 ppm Co and a mixture of 10 elements including Co. (a) diodes 1 to 41, standard. (b) diodes 1 to 41, mixture. (c) diodes 185 to 225, standard. (d) diodes 185 to 225, mixture. (e) diodes 259 to 299, standard. (f) diodes 259 to 299, mixture.
Figure 9.9. Spectral regions surrounding prominent Co lines in 1000 ppm Co and a mixture of 10 elements including Co. (a) diodes 372 to 412, standard. (b) diodes 372 to 412, mixture. (c) diodes 425 to 465, standard. (d) diodes 425 to 465, mixture. (e) diodes 573 to 613, standard. (f) diodes 573 to 613, mixture.
Figure 9.10. Determination of Co in the presence of 9 interferents. (a) Residual deviation. (b) Flat background factor deviation. (c) Flat background factor (spoil values).

Acknowledgements

Thanks to my wife Lana for her patience while I was writing this thesis, and to my parents for all their encouragement through many years of university.
I would like to thank Mike Blades for his guidance and supervision throughout my graduate program. Thanks also go to all the graduate students in Mike's research group for their stimulating discussions, and to many others throughout the department who have provided an important insight or spark of inspiration. The personnel in the Mechanical and Electronics shops are thanked for their assistance. The financial support of the Natural Sciences and Engineering Research Council is also gratefully acknowledged.

Chapter 1: Introduction

1.1 Overview

Inductively coupled plasma optical emission spectrometry (ICP-OES) provides a wealth of spectral emission lines for each element, allowing great flexibility in the choice of analytical lines. The spectral tables of Boumans [1] list over 750 potential spectral lines which can be used for elemental analysis. Traditional ICP-OES instruments (sequential slew scanning and simultaneous direct readers) typically utilize only one analytical line for each element to be determined. Choosing an appropriate set of lines for multielement determinations can be very complex. The intensity of an emission line relative to background noise determines the detection limit, but the utility of a particular line for an analyte may also be affected by partial or complete spectral overlap by lines of concomitant elements. Many of these overlaps or coincidences have been documented in line tables such as those of Boumans, but even the most comprehensive tables do not list all the lines of all the elements. Furthermore, the degree of overlap depends on the spectrometer bandpass and, to a lesser degree, on the plasma operating conditions. Thus each new analytical problem requires an ad hoc solution, through so-called "methods development". In this process various options for the choice of analysis lines are studied and, presumably, an optimum set of lines is found.
This procedure includes the collection of spectral scans for all the potentially useful lines, from single-element standards and standard reference materials which have a composition similar to the prospective sample. Useful lines are chosen on the basis of their sensitivity and relative freedom from spectral interferences. For direct-reading multichannel spectrometers this line set is then "hardware programmed" into the instrument, and the instrument then becomes more or less specialized to solve that problem. For sequential slew-scanning spectrometers, the line set is "software programmed" and consequently the instrument is more flexible, solving a wider variety of problems, although at the cost of slower data acquisition. Sequential spectrometers may also allow selection of an alternative line when there is a spectral line overlap with the chosen analysis line. However, even when sequential spectrometers are used, it is too time consuming to examine all the potential choices, for all lines, for all samples.

In recent years there have been many reports on the use of multiplex spectrometers for ICP-OES [1,2,3,4,5,6,7]. These are capable of simultaneously acquiring a "window" of spectral information (i.e. all lines within a chosen wavelength range including their line shapes and adjacent background regions), providing the user with the flexibility of a sequential system and the speed of a direct reader. The two most common multiplex approaches use Fourier transform spectrometers (FTS) [2,3,4] and instruments based on optoelectronic image sensors [5,6,7]. The key feature of multiplex spectrometers is their ability to simultaneously collect data at multiple wavelengths, so that, in principle, the spectrum includes all of the information about the analyte line, interferents, and background provided by the ICP. Two different approaches can be taken to using this spectral information for analytical determinations.
First, since several lines are available for each analyte, all, or selected lines, may be used to construct the calibration curve. Secondly, the utility of each of the analytical lines in the spectral window can be rapidly assessed with respect to sensitivity and relative freedom from interferences to enable choice of the optimum lines for each element in each individual sample. Through the application of factor analysis and related techniques developed in this laboratory, this selection process can be simplified. Furthermore, this method does not require that the identity of interferents be known.

1.2 History

1.2.1 Emission Spectroscopy

Color has always been an important property for describing, labelling, or identifying an item. What we perceive as color is the response of our eyes to a range of wavelengths from about 400 to 700 nm, which we call the visible spectrum. The visible spectrum is but a small part of the electromagnetic spectrum, but this region, along with the near ultraviolet region which extends down to about 150 nm, is extremely important in elemental analysis.

1.2.1.1 Elemental Spectral Fingerprints

The light given off by a material upon heating may offer useful information about its properties, and the evolution of the understanding of these phenomena can be traced through a history of scientific investigation in this area [8,9]. In 1758, Marggraf noticed that characteristic colors were produced in a flame when salts of sodium and potassium were introduced, but there was much more information in this visible spectrum which could not be perceived with the naked eye. In 1822 Herschel used a prism to observe the spectrum of the light coming from a flame and found that there were bright lines and dark spaces. In 1859, Bunsen and Kirchhoff built an instrument which they called a spectroscope, with which they could observe these lines more accurately.
In addition to using a prism to disperse the light into its component wavelengths, they restricted the light incident on the prism to that which would pass through a narrow slit. Several images of the slit would fall upon a flat surface held some distance from the prism. The positions of the images on this surface could be measured by placing a measuring scale beneath the images. The position was found to be determined by the wavelength. They found that each element gave off characteristic lines, at specific wavelengths, independent of any other elements also in the flame. These lines could be seen even when very small quantities of an element were present. They were the first to identify a characteristic "fingerprint" or pattern for each element. When a new pattern of spectral lines was observed in certain mineral samples, new elements were found. Rubidium was discovered in this way, and owes its name to the characteristic red lines which Bunsen and Kirchhoff observed [8,9].

Kirchhoff went on to find that light passed through a gaseous material would be absorbed at the same wavelengths as the material would emit when heated. This was critical in the development of stellar spectroscopy, and provided the foundation for atomic absorption spectroscopy.

1.2.1.2 Extending the Senses Beyond the Visible

Herschel's interest in the spectrum dated back to at least 1800, when he discovered that the temperature rise of a thermometer held near the images of the slit produced by the aforementioned prism was highest not in the middle of the visible spectrum but below the red region. He postulated that "invisible" light existed in this region we now know as the infrared. In 1801, Ritter noted that silver chloride broke down in the presence of light. The blue end of the spectrum was more efficient in accomplishing this breakdown than the red, but a region beyond the violet was even better than the blue region.
He concluded that invisible light existed at the other end of the spectrum, above the violet region, which we now call the ultraviolet. By 1839, Daguerre had developed a photographic process based on Ritter's observation concerning silver salts. In 1841, Talbot independently patented a photographic process which produced negative prints, and drastically reduced exposure time. About 1840, Draper further reduced exposure time, and photographed the solar spectrum, which confirmed that spectral lines extended well into the ultraviolet. Further refinements of photographic plates, followed by development of photomultiplier tubes, and more recently solid state detectors, have shown that these spectral fingerprints can be incredibly complex (due to the large number of possible transitions within the atom).

1.2.1.3 Advances in Wavelength Dispersion

The dispersion offered by a prism was fundamental to the observation of the range of wavelengths which made up the ultraviolet and visible spectrum. However, a prism was not the only means of dispersing light. As early as 1661, Grimaldi had observed that a beam of light which passed through two narrow apertures and fell on a surface was slightly wider, which he attributed to the bending of light near the edges of the aperture, and which he called diffraction. Young further observed separate bands of light at the edges of such an image in 1803, and calculated that a wavelength of less than 1 µm would be responsible for such fringes. Grimaldi also observed colored streaks at the edges, but it was not until 1820 that Fraunhofer used a grating of closely spaced thin wires to generate a spectrum from white light. Ruled gratings were to replace wires as the most effective dispersive element, and by the late 1870's Rowland had prepared gratings on concave metal or glass with almost 6000 lines per centimeter. With higher dispersion came higher resolution of spectra, which allowed Rowland to catalog more than 14,000 lines in the solar spectrum.
1.2.1.4 Development of Inductively Coupled Plasmas in Atomic Emission Spectroscopy

The progress of elemental spectral analysis as a quantitative technique required the development of sources hot enough to completely atomize samples introduced into them. Arcs, sparks and flames evolved as important atomic emission sources. At the same time, studies of the absorption of light by atomic vapor, as observed in the dark lines in the solar spectrum, led to atomic absorption spectroscopy. All of these methods allow quantitative determinations of trace quantities of elements in a sample. No one method has proven to be superior in all cases, and the choice of method depends upon the sample to be analyzed. Chemical interferences, detection limits, and dynamic range must be considered when choosing the most suitable technique for a particular sample. Researchers continued to seek more universally applicable techniques, which could handle a wider variety of samples, offer freedom from chemical interferences, and achieve better detection limits.

Plasmas (hot ionized gases) were hotter than previous sources (8000 K compared to 2000 K). The higher temperatures were more effective in atomizing samples, making plasmas less prone to chemical interferences, and they were therefore investigated as one possible avenue of development. As early as the 1940's it was possible to sustain a radio frequency generated plasma at atmospheric pressure. By the early 1960's a stable argon plasma was generated in a tube with flowing argon [10]. In 1965, Wendt and Fassel [11] and Greenfield et al. [12] initiated studies of the analytical performance of inductively coupled plasmas. An improved torch design included three concentric tubes, with a spiraling argon flow in the outer tube to lend stability to the plasma. Aqueous samples were aspirated into the center of the plasma through a central tube, and emission was observed in a tail above the plasma.
These design improvements resulted in better detection limits, and are still in use in current ICP systems. By the end of the 1960's detection limits compared favorably with other elemental analysis methods. By the mid 1970's the first commercial ICP's became available, and these began to take advantage of the multielement capabilities of the ICP by incorporating direct reading polychromators. In its second decade the ICP has developed into a powerful analytical source which is the mainstay of many analytical laboratories. Working curves are linear over several orders of magnitude, and the presence of fewer chemical interferences than other emission sources makes it the preferred method for elemental analysis.

1.2.2 Multivariate Analysis

The world is inherently multivariate. Processes seldom occur one at a time. Many independent and interdependent processes simultaneously influence observed events. One of the goals of science is to carefully analyze such events, and isolate and quantify the many factors which are their cause. The interdependency of these processes cannot be ignored.

Univariate models are frequently used in science, where the response of a system is measured as a single variable is changed. The success of univariate models is due to their conceptual simplicity, and the relative ease with which calculations may be carried out. However, models which assume that only one variable changes at a time are often not sufficient. The response of a system to simultaneous changes in several variables may be far more complex, and an understanding of the interactions between the variables is more difficult to grasp. With the increasing availability of powerful microcomputers, and their ability to process larger volumes of numerical data, it is possible to handle simultaneous changes in several variables and to find correlations between them. This can lead to a better understanding of a system, and an improved, multivariate model.
1.2.2.1 Origins of Factor Analysis

Factor analysis is a statistical technique which was originally developed to assist in the interpretation of psychological data. Early researchers such as Spearman, Pearson, Garnett and Thurstone attempted to mathematically model psychological theories of human ability and behavior [13]. In the early 1900's, Pearson proposed a method of principal axes. It was, however, computationally complex, and could not be carried out by hand except for simple cases. Spearman developed a two factor theory that required a single general factor and several other specific factors. This theory was not always sufficient to describe more complex psychological tests, so Garnett developed theories requiring multiple factors, extracted from a correlation matrix. Thurstone extracted these factors using the centroid method, which was developed as an approximation to the principal axes method but required less computation. Thurstone's most important contribution was the recognition that the number of factors may be determined from the rank of the correlation matrix.

There exist an infinite number of solutions which will yield factors which describe the original correlation matrix; the centroid method produced just one of these. Mathematically, all solutions are equally valid, so the most suitable solution must be chosen by other criteria. The principal axes proposed by Pearson produced a set of factors where each successive factor accounted for a maximum of the remaining variance, and this was preferred due to its statistical simplicity [13]. The main drawback to this method was that it required extensive computations. This disadvantage has been overcome by the use of computers, and this is now the method of choice. Further interpretation of the data required that these factors fit some kind of scientific model. The nature of the model has led to many variations in methods, target factor analysis being just one example.
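As a minimal sketch of the principal axes idea described above, the following Python fragment extracts factors from a small correlation matrix one at a time, so that each successive factor accounts for the largest share of the remaining variance. The correlation values, the function names and the use of power iteration with deflation are illustrative assumptions rather than the procedure used in this thesis; a practical calculation would normally rely on a library eigensolver.

```python
import math

def dominant_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    n = len(matrix)
    # Non-uniform starting vector, so it is unlikely to be orthogonal to any factor.
    vec = [float(i + 1) for i in range(n)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm < 1e-14:  # nothing left to extract
            break
        vec = [x / norm for x in product]
    # Rayleigh quotient gives the eigenvalue for the converged vector.
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n))
    return value, vec

def principal_factors(matrix):
    """Eigenvalues in extraction order: extract a factor, deflate, repeat."""
    n = len(matrix)
    work = [row[:] for row in matrix]
    eigenvalues = []
    for _ in range(n):
        value, vec = dominant_eigenpair(work)
        eigenvalues.append(value)
        for i in range(n):
            for j in range(n):
                work[i][j] -= value * vec[i] * vec[j]  # remove the extracted factor
    return eigenvalues

# Hypothetical correlation matrix: variables 1 and 2 are strongly correlated,
# variable 3 is independent of both.
correlation = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

eigenvalues = principal_factors(correlation)
explained = [value / sum(eigenvalues) for value in eigenvalues]
for index, (value, share) in enumerate(zip(eigenvalues, explained), start=1):
    print(f"factor {index}: eigenvalue {value:.3f}, share of variance {share:.3f}")
```

The eigenvalues come out in decreasing order, which is the property that distinguishes the principal axes solution from the other mathematically valid factorings mentioned above.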
1.2.2.2 Applications

Factor analysis has been used extensively in many fields. In economics it has been used to develop economic equations, evaluate systems performance, and influence investment decisions. In sociology it has been used to evaluate census data, and in political studies it has been used to look at the relationships between such variables as government policy, international trade, and gross national product. There have been other applications in the areas of medicine, meteorology, geology, communications, and architecture [13]. In the last twenty years, applications in the physical sciences have increased dramatically, and these developments will be discussed in more detail in the next section.

1.3 Multivariate Spectroscopy

1.3.1 Early Research

Although multivariate methods had been used in many other fields, as discussed in the previous section, it was not until the mid 1960's that they were applied to spectroscopy. Blackburn [14] adopted a least squares solution for the determination of the number of components giving rise to a gamma ray spectrum. When an adequate number of standards were incorporated into a multivariate fit to account for all components present, excellent fits were achieved, finding certain elements even in the presence of severe spectral overlaps. Due to the number of channels of information involved, the matrix solution could only be calculated by computer, and limitations on memory and processor speed discouraged further applications at that time.

As computers became more readily available, with greater memory and speed, a number of researchers further investigated the basics of multivariate analysis in spectroscopic applications. It was shown that factor analysis could give results consistent with univariate methods, with the possibility of greater accuracy [15]. Many different terminologies arose to describe the same basic operations.
Matrix rank analysis [16] was based on the idea that the rank of a noise free data matrix would exactly equal the number of components present, regardless of the number of spectra, as long as the total number of spectra was greater than the number of components. Since no laboratory data is entirely noise free, indicators of the number of significant factors were evaluated, such as the square root of the variance of each eigenvector, and the chi-squared value.

1.3.2 Expansion to Many Disciplines

Multivariate analysis was applied in a wide variety of fields within spectroscopy and chemical analysis. Early applications included gamma ray spectroscopy [14], UV/VIS absorbance [15], X-ray fluorescence spectroscopy [17], and Fourier transform IR spectroscopy of polymers [18]. The increase in applications of these methods continues today in many diverse fields of study, with important contributions in many areas. This diversity can lead to a lack of communication among researchers in different fields, resulting in advances in one field which may not become widely known in another. It is therefore important that researchers using multivariate methods in one discipline remain aware of the advances made by their colleagues elsewhere.

1.3.3 Chemometrics

The use of multivariate methods of analysis in science has only recently been recognized as its own distinct discipline. The term "Chemometrics" was coined in 1972 by Kowalski and Wold [19] to encompass the use of statistical methods in Chemistry. In recent years, special sessions on Chemometrics have taken place at many of the largest conferences in spectroscopy [20,21]. A number of new journals exclusively covering this area have appeared [22,23], and established journals covering all areas of analytical chemistry [24,25,26] have devoted special issues to Chemometrics.
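The matrix rank idea described in Section 1.3.1 can be illustrated with a short Python sketch. It builds noise free "spectra" as mixtures of two hypothetical component spectra and shows that the rank of the data matrix is two whether three or five spectra are included. The component profiles, the concentrations, and the use of Gaussian elimination with a fixed tolerance are assumptions made for illustration only; with real, noisy data a significance test on the factors is needed instead of a fixed tolerance.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix by Gaussian elimination with partial pivoting."""
    work = [[float(x) for x in row] for row in rows]
    n_rows, n_cols = len(work), len(work[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Choose the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(work[r][col]))
        if abs(work[pivot][col]) <= tol:
            continue  # column carries no new independent information
        work[rank], work[pivot] = work[pivot], work[rank]
        for r in range(rank + 1, n_rows):
            factor = work[r][col] / work[rank][col]
            for c in range(col, n_cols):
                work[r][c] -= factor * work[rank][c]
        rank += 1
    return rank

# Hypothetical pure-component spectra measured at four wavelength channels.
component_a = [1.0, 4.0, 1.0, 0.0]
component_b = [0.0, 1.0, 3.0, 1.0]

# Hypothetical concentrations (a, b) for five mixtures.
concentrations = [(1, 0), (0, 1), (1, 1), (2, 1), (1, 3)]

# Each row of the data matrix is one noise free mixture spectrum.
spectra = [
    [ca * a + cb * b for a, b in zip(component_a, component_b)]
    for ca, cb in concentrations
]

rank_three_spectra = matrix_rank(spectra[:3])
rank_five_spectra = matrix_rank(spectra)
print(rank_three_spectra, rank_five_spectra)
```

Adding further mixtures of the same two components adds rows to the matrix but does not raise its rank, which is why the rank can serve as a count of the components present.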
1.4 Multivariate Methods in Atomic Emission Spectroscopy

Spectral overlap and unidentified interferences can be a serious problem in atomic emission spectroscopy when analyzing samples in complex chemical matrices. Chemometric methods are being developed, and it will be many years before the field can be considered a mature discipline. These multivariate methods show great promise in assisting with the interpretation of complex mixtures of components, especially with multiwavelength data. In the chapters which follow, a number of new methods are developed, and their application to modern problems in spectral analysis is described.

Chapter 2 Instrumentation

2.1 Introduction

Instrumentation for elemental analysis consists of several components. A spectral source is required, which acts as an atom reservoir. The sample is introduced into the source and becomes equilibrated with its surroundings. Excitation and/or photon absorption and/or photon emission takes place within this atom reservoir. The wavelength dependence of these phenomena is observed with the combination of a dispersive apparatus and a detector. The quantitative information thus collected is processed to obtain the desired information on elemental composition. The following sections will discuss these concepts with respect to inductively coupled plasma atomic emission spectrometry.

2.2 The ICP as a Spectral Source

A spectral source for analytical atomic spectroscopy can be evaluated by comparing the features available with those a spectroscopist would like in an "ideal source" [27]. An ideal source should:
1) encode information on all elements, from parts per trillion (ppt, 1 in 10^12 by weight) through percent concentrations;
2) contain analytical information as simple, linear functions of the elements and their concentrations, without interferences;
3) have high precision and accuracy;
4) rapidly process samples in a wide variety of forms with little sample preparation.
The inductively coupled plasma (ICP) performs favorably when measured against these criteria. Its performance in each of the above categories will be briefly compared with two other commonly used spectroscopic techniques [28].

Flame atomic absorption (AA) is a well established method. It suffers from some interferences, but these may usually be identified. Measures can be taken to compensate for the interference, using correction factors. Problems occur with refractory elements (B, V, Ta, and W), Mo and alkaline earths, since they are not completely dissociated in the flame. P, S and the halogens are not easily determined, since their resonance lines are in the far UV. For the remaining elements, detection limits are in the range of 1-100 parts per billion (ppb, 1 in 10^9). Response may only be linear over two orders of magnitude. Since this is an absorption method, a narrow line source (hollow cathode lamp) is required for each element, and only one element may be analyzed at a time.

Because of the higher temperatures possible in the ICP, complete dissociation of refractory elements is achieved, and analysis is possible. Overall, detection limits are comparable to Flame AA. The higher degree of excitation also generates more lines, increasing the possibility of spectral overlap. A greater number of lines are available for each element, and care must be taken in selecting a line with minimal interferences. Matrix-dependent line selection may be carried out using methods outlined in this thesis. As an emission method, the ICP is capable of simultaneous multielement analysis, resulting in faster throughput of samples. Simultaneous encoding of multielement information is perhaps the single most important feature of the ICP. Another important feature is the linearity of response over a range of up to six orders of magnitude.

Graphite furnace atomic absorption (GFAA) has detection limits from 10 to 100 times better than FAA or ICP.
GFAA suffers from many chemical interferences but the use of platforms, high quality graphite, matrix modifiers and Zeeman background correction can compensate for most of these. Response is linear over just two orders of magnitude. Sample throughput is slow, and only one element may be determined at a time. Nevertheless, if the lowest detection limits are required, GFAA may be preferred.

Inductively coupled plasma-mass spectrometry (ICP-MS) is a hyphenated method where the ICP is used as an ion source for a mass spectrometer. ICP-MS maintains many of the advantages of ICP optical emission spectroscopy. The analyte information is encoded as a mass spectrum, which supplies additional information in the form of isotopic abundances for the elements of interest. Its detection limits are lower than those of the ICP (as an optical emission source), approaching or surpassing those of GFAA. This method shows much promise for future development.

Precision for all methods can be better than 1% (except for GFAA, 5%), with accuracy more dependent upon sampling and preparation than behavior in the source. Samples are commonly aspirated into the plasma (ICP) or flame (FAA) as an aqueous aerosol, and are placed in the graphite furnace (GFAA) as an aqueous solution. Difficulties may arise from incomplete digestion of samples in acid solution.

To briefly summarize, all three methods have high precision (1%). The ICP has detection limits similar to Flame AA (1-100 ppb), which are sufficient for most analyses. Better detection limits (ppt) can be achieved with GFAA, but only for selected elements, one element at a time. Only the ICP can simultaneously encode information on all the elements, with a linear response of up to six orders of magnitude. The ICP suffers from fewer chemical interferences than Flame AA or GFAA.
Although the ICP does not fully satisfy the requirements for an ideal source, it has important advantages over other methods, and has become a widely used source for elemental analysis in the laboratory.

2.2.1 Fundamental Processes in the ICP

When a sample enters the plasma as an aerosol droplet, it undergoes a number of changes before its presence is finally detected by an emission or absorption at a particular wavelength. An understanding of the microscopic processes an analyte undergoes in the source can assist in explaining what is observed macroscopically by a detector.

When the aerosol droplet first enters the plasma, it undergoes rapid heating, resulting in the evaporation of the solvent, and leaving behind a salt particle. The efficiency of the desolvation step depends upon the amount of energy available in the plasma, and the efficiency of heat transfer.

The salt particle is further heated, and undergoes a phase transition from solid to gas. For this phase transition to occur, one must overcome the lattice energy of the solid. Again this depends upon the amount of energy available in the plasma. Desolvation and vaporization are more rapidly achieved with higher power plasmas.

Once the analyte salt is in the gas phase, it undergoes dissociation. This produces free atoms in the plasma. If the source is sufficiently hot, as is the case with a plasma, this will be followed by ionization. The high temperature of the plasma ensures that dissociation is complete, but the degree of ionization will vary depending upon the ionization potential of the atom. The degree of ionization can vary from element to element, but is generally high (>99% for Ca (I.P. 6.113 eV), 94% for Mg (I.P. 7.646 eV), 86% for Cd (I.P. 8.993 eV), and 69% for Zn (I.P. 9.394 eV) at a power of 1.25 kW [29]). The ratio of ions to atoms is governed by the Saha relationship.
$$\frac{n_i}{n_a} = \frac{1}{n_e}\, 2\, \frac{g_i}{g_a} \left( \frac{2 \pi m_e k T_i}{h^2} \right)^{3/2} \exp\left( -\frac{E_i}{k T_i} \right)$$

where:
n_i/n_a = ratio of number densities of ions and atoms
n_e = free electron number density
g_i/g_a = statistical weights of ion and atom levels
h = Planck's constant
k = Boltzmann's constant
m_e = electron mass
E_i = ionization energy
T_i = ionization temperature

The presence of large quantities of easily ionizable elements such as Ca has been observed to result in analyte signal enhancements. Although this was originally attributed to shifts in the ionization equilibrium, studies indicate that the observation of an analyte signal enhancement or suppression is a function of spatial position [30], and is likely due to increased collisional excitation.

The atoms and ions produced are originally in their ground states, but are thermally promoted to a number of excited states. The distribution of atoms in each energy level is governed by the Boltzmann distribution.

$$\frac{n_p}{n_q} = \frac{g_p}{g_q} \exp\left( -\frac{E_p - E_q}{k T_{exc}} \right)$$

where:
n_p/n_q = ratio of atom (or ion) populations for states p and q
g_p/g_q = ratio of statistical g values for states p and q
E_p, E_q = excitation energies of states p and q
T_exc = excitation temperature of the species

If local thermal equilibrium is assumed, then the temperature in each of the above equations will be the same; however, this is not usually the case. Further discussion is beyond the scope of this thesis, but detailed studies have been carried out [29,31,32].

As the number of excited states with significant populations is large, many emission lines are observed for each element. The number of lines in a typical spectrum makes ICP spectra appear complex. Attempts have been made to model emission spectra from fundamental parameters such as transition probabilities, ion/atom ratios, and non-linear level populations [33]. A comparison of the synthetic spectra with real spectra is impressive, and with continued improvement in the understanding of these fundamental processes, the model will no doubt be further refined.
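The Saha and Boltzmann relationships of this section can be evaluated numerically, as in the Python sketch below. The ionization potentials are those quoted above; the temperature of 7500 K, the electron number density of 10^21 m^-3, the unit statistical weight ratio, and all function names are assumptions chosen only for illustration, so the printed percentages should not be expected to reproduce the measured values cited from [29].

```python
import math

# Physical constants in SI units (CODATA values).
BOLTZMANN = 1.380649e-23          # J/K
PLANCK = 6.62607015e-34           # J s
ELECTRON_MASS = 9.1093837015e-31  # kg
EV = 1.602176634e-19              # J per electron volt

def saha_ion_atom_ratio(ionization_ev, temperature_k, electron_density_m3,
                        weight_ratio=1.0):
    """Ion/atom number density ratio from the Saha relationship."""
    thermal = (2.0 * math.pi * ELECTRON_MASS * BOLTZMANN * temperature_k
               / PLANCK ** 2) ** 1.5
    exponential = math.exp(-ionization_ev * EV / (BOLTZMANN * temperature_k))
    return (1.0 / electron_density_m3) * 2.0 * weight_ratio * thermal * exponential

def degree_of_ionization(ion_atom_ratio):
    """Fraction of the element present as ions, n_i / (n_i + n_a)."""
    return ion_atom_ratio / (1.0 + ion_atom_ratio)

def boltzmann_population_ratio(g_p, g_q, e_p_ev, e_q_ev, temperature_k):
    """Population ratio n_p/n_q of two levels from the Boltzmann distribution."""
    return (g_p / g_q) * math.exp(-(e_p_ev - e_q_ev) * EV
                                  / (BOLTZMANN * temperature_k))

# Assumed plasma conditions (illustrative, not measured values).
ASSUMED_TEMPERATURE_K = 7500.0
ASSUMED_ELECTRON_DENSITY_M3 = 1.0e21

# Ionization potentials in eV, as quoted in the text.
IONIZATION_EV = {"Ca": 6.113, "Mg": 7.646, "Cd": 8.993, "Zn": 9.394}

ion_fraction = {
    element: degree_of_ionization(
        saha_ion_atom_ratio(energy, ASSUMED_TEMPERATURE_K,
                            ASSUMED_ELECTRON_DENSITY_M3))
    for element, energy in IONIZATION_EV.items()
}
for element, fraction in ion_fraction.items():
    print(f"{element}: {100.0 * fraction:.1f}% ionized")

# Hypothetical excited level 4 eV above the ground state, with g_p/g_q = 3.
excited_to_ground = boltzmann_population_ratio(3.0, 1.0, 4.0, 0.0,
                                               ASSUMED_TEMPERATURE_K)
print(f"excited/ground population ratio: {excited_to_ground:.2e}")
```

Under these assumptions the calculated degree of ionization falls as the ionization potential rises, which is the trend shown by the values quoted in the text.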
2.2.2 Spectral Interferences

2.2.2.1 Argon and Concomitant Background

The background spectrum from an ICP can show considerable structure, as well as a more uniform continuum intensity. The continuum is due to ion-electron recombination processes, so it is proportional to the electron number density. Variations in operational parameters such as forward power and gas flow rates can alter the electron number density and thus change the continuum intensity. Aspiration of water or other solvents also alters the electron density.

Structure in the background is due in part to over 200 argon lines which have been observed in the plasma, with the most intense lines above 400 nm. These are all atom lines (resulting from transitions within the neutral atom), and no ion lines (resulting from transitions within the singly charged cation) have been observed.

As aqueous solutions are most commonly aspirated into the plasma, hydrogen lines are also observed. These lines can be up to 1 nm wide, due to Doppler (thermal) and Stark (charge) broadening processes in the plasma. In fact, one accepted method of electron density measurement in the ICP involves a determination of the half width of the Hβ line at 486.13 nm.

N₂ may also be present in the plasma as a result of atmospheric entrainment or from dissolved gas in the aqueous solvent. Small amounts of other impurities are sometimes present, such as C (as CO₂ in the argon gas) and Si (from the quartz torch). Molecular species such as OH, NH and NO produce rotational/vibrational structure in the background.

The contributions from all these constituents are dependent on operating conditions, and provide the observed background. Fortunately, a compromise set of operating conditions which minimizes these interferences can be found, and the background may then be treated as a constant from one sample to the next.
2.2.2.2 Continuum Radiation

Another source of spectral interference results from the recombination of ions with free electrons to produce radiation over a wide range. This shifts the level of the background under other analyte lines. An example of this is seen in concentrated solutions containing Mg [34]. As this contribution to the continuum is proportional to analyte concentration, the use of a blank with a matrix composition similar to the sample is sufficient in many cases to adjust for this problem.

2.2.2.3 Spectral Interferences and Resolution

Spectral interferences are closely related to the ability to resolve closely spaced lines. Two lines are considered resolved by the Rayleigh criterion if the first diffraction minimum of one line falls on the diffraction maximum of the second line. This corresponds to a 19% valley between adjacent peaks of equal intensity. Resolution is dependent upon the spectral bandpass of the monochromator used. Partial overlaps from wings of adjacent intense lines also constitute an interference [35].

As the ICP produces many lines, there are a number of unlisted (and as yet unidentified) lines, which may cause a partial or complete spectral overlap with an analyte line of interest. Since these lines are untabulated, reference to spectral emission tables cannot assist in compensating for this type of interference. Multivariate methods can offer new techniques for detecting these interferences.

2.2.2.3.1 Spectral Bandpass

Spectral overlap can be partial, in which case increased resolution may solve the problem, but if the overlap is the result of an exact wavelength coincidence, resolution is not possible. A spectral coincidence is much more likely to occur if the matrix contains elements with line rich spectra such as Fe at high concentrations.
In these cases, the contribution of the interferent to the measured intensity may be subtracted if its contribution can be determined (by comparison to a standard Fe solution with the same high concentration for example). For single line analyses, increased resolution is preferred to isolate the analyte line of interest.

The resolution of an instrument is defined by its spectral bandpass. A high resolution instrument has a spectral bandpass of 0.01 nm or less, determined by the reciprocal linear dispersion of the grating, the physical size of the monochromator (and therefore the focal length within the instrument), and the widths of its entrance and exit slits. Boumans' tables of spectral interferences [1] are a useful source of information on possible interferences for different values of spectral bandpass.

2.2.2.3.2 Line Broadening

As the instrumental resolution improves, eventually the resolution of closely spaced lines no longer improves. At this point the broadening of the spectral line due to a number of other physical processes is dominant. In the ICP, Doppler, Stark, and collisional broadening processes are of greatest importance. The width of a line is usually specified by its full width at half maximum (FWHM).

Doppler broadening is the result of the random thermal motion of the analyte. It is proportional to the wavelength, and to (T/M)^(1/2), where T is the temperature in Kelvin, and M is the atomic mass. Doppler broadening can contribute from 0.001 to 0.01 nm to the observed line width, and since the velocity distribution is Gaussian, it does not contribute significantly to the line wings.

Stark broadening is the result of interactions between charged species. Perturbations of the orbitals result in widely broadened, asymmetric lines. This line broadening is dependent on the number density of charged species, and is most often observed for Ar and H lines.
The line wings of these lines may extend outward more than 1 nm, but, at least in the case of Ar lines, compensation is possible by subtraction of a blank spectrum from the sample spectrum. Stark broadening of other components (for example Mg at high concentrations) can be more difficult to compensate for due to the complex dependence on ion number density.

Collisional broadening is the result of collisions of analyte atoms with neutral Ar atoms. Although the FWHM is increased only slightly, the line wings are enhanced, resulting in a higher background for adjacent analyte lines.

2.2.3 Line Selection

The ICP spectrum is rich in analyte lines, and there are many factors which must be taken into account when selecting a suitable line for quantitative analysis. A general set of criteria can be applied to the selection of lines regardless of the sample matrix.

2.2.3.1 Selection Criteria and Wavelength Tables

For a line to be of analytical utility at lower concentrations, it must be a prominent line. That is, it must be easily discerned from surrounding spectral features at progressively lower concentrations. A line can be evaluated in terms of its sensitivity and signal-to-noise ratio. In a majority of analyses, the line should be several times larger than the background signal deviation at concentrations of interest. Detection limits are commonly defined so that the signal is equivalent to three times the standard deviation of the background noise. All other factors being equal, the line with the lowest detection limit is preferred.

A second criterion is that the line be free of spectral interferences. As discussed in the previous section, this is a function of concomitant elements, and the spectral bandpass of the monochromator system. If the separation between the analyte line and an interfering line is less than twice the spectral bandpass [34], another line should be considered.
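The two selection criteria above lend themselves to a short numerical sketch. The Python fragment below estimates a detection limit as the concentration giving a signal of three times the standard deviation of the background, and flags an interfering line that lies closer to the analyte line than twice the spectral bandpass. The blank readings, the calibration slope, the wavelengths, and the function names are hypothetical values introduced only for illustration.

```python
import statistics

def detection_limit(blank_readings, sensitivity):
    """Concentration whose signal equals 3x the standard deviation of the blank.

    sensitivity is the calibration slope (signal per unit concentration).
    """
    background_noise = statistics.stdev(blank_readings)
    return 3.0 * background_noise / sensitivity

def needs_alternative_line(analyte_nm, interferent_nm, bandpass_nm):
    """True if the line separation is less than twice the spectral bandpass."""
    return abs(analyte_nm - interferent_nm) < 2.0 * bandpass_nm

# Hypothetical repeated background readings (counts) at the analyte wavelength.
blank = [100.0, 102.0, 98.0, 101.0, 99.0]
# Hypothetical calibration slope, in counts per ppm.
slope = 50.0

limit_ppm = detection_limit(blank, slope)
print(f"estimated detection limit: {limit_ppm:.3f} ppm")

# Illustrative line pairs evaluated with a 0.01 nm spectral bandpass.
too_close = needs_alternative_line(228.802, 228.812, 0.01)
well_separated = needs_alternative_line(228.802, 228.900, 0.01)
print(too_close, well_separated)
```

When several candidate lines pass the separation check, the one with the lowest estimated detection limit would be preferred, following the first criterion.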
In many cases where a single line is to be used for the analysis (a univariate analysis), interelement corrections, where the computed intensity due to a known concentration of interferent is subtracted from the observed analyte signal, must be applied to account for contributions from weak interfering lines. In cases where all the prominent lines suffer from spectral interferences, which is likely to occur when using a low resolution monochromator, an interelement correction is a necessity. Examination of samples with a similar matrix, but different concentrations of analyte and interferent can give the analyst a better view of the performance to be expected in the presence of interferents. If the line is subsequently chosen, the expected contribution to the overall intensity from the major interferents may be estimated, and subtracted from the analyte intensity in the sample spectrum.

Prominent lines in the ICP and possible interferences (with their magnitudes) for a variety of spectral bandpasses have been tabulated in such references as Boumans' "Line Coincidence Tables for Inductively Coupled Plasma Atomic Emission Spectrometry" [1]. General tables of atomic emission lines such as the "MIT Wavelength Tables" [36] are also useful in the evaluation of possible interferences, although relative line intensities may differ from those seen in the ICP since these tables were derived for an arc source, where excitation mechanisms may differ.

The compilation of an exhaustive listing of all possible interferences to all possible lines, under all possible operating conditions, is a daunting task. As an alternative, a method to extract information on interferences by multivariate analysis of entire spectral regions holds promise. This approach is evaluated in detail in later chapters of this thesis.

2.3 The ICP Spectrometer

A functional ICP spectrometer consists of several components, as illustrated in Figure 2.1.
These components are:
1) A radio frequency generator to supply energy to sustain the plasma;
2) A sample introduction system, such as analyte in an aqueous aerosol flow;
3) A torch to confine the plasma, and to align and combine appropriate sample and supporting gas flows;
4) A mono/polychromator system to disperse the emitted light into its component wavelengths; and
5) A detector to measure the intensity of the emitted light.
Each of these components will be discussed briefly.

Figure 2.1. Major Components of an Inductively Coupled Plasma Optical Emission Spectrometer. [Diagram labels: Plasma, RF Generator, 1:1 Imaging Lens, Photodiode Array, Data Acquisition Board, PC Computer, Data Display Program, Automatic Matching Network, Polychromator (Czerny-Turner), Drain, Spray Chamber, Nebulizer, Aqueous Analyte Solution, Flow Meters.]

2.3.1 RF Generators and Stable Plasmas

The plasma is sustained by inductive coupling of radio frequency (RF) energy supplied by an RF generator. These generators typically operate at powers from 1 to 2 kilowatts, at an assigned Industrial, Scientific and Medical (ISM) frequency of 27.125 MHz. These generators are based on designs which have been used successfully in broadcast transmitters and Amateur Radio transceivers and follow two basic circuits. The first is the free running generator, where the oscillation frequency is determined by a positive feedback loop. It is labelled free running since the frequency may shift slightly to adjust for changes in impedance in the plasma. A Colpitts oscillator circuit is commonly used since it has the greatest frequency stability of several similar circuits. In the second type of oscillator, the frequency is precisely controlled by a piezoelectric crystal. As the frequency cannot drift to accommodate changes in impedance of the plasma load, an impedance matching circuit is required. The plasma is initiated by seeding the argon flow with free electrons using a Tesla coil.
These electrons then pass through the region of the induction coil, and are accelerated by the rapidly changing electromagnetic field, colliding with neutral atoms and producing more charged particles, and then a stable plasma. During the plasma initiation, the impedance of the plasma may vary greatly, so the impedance matching circuit is usually automated to track this rapid change until the plasma is stable. Once formed, the plasma is generally stable, and a final manual tuning step can reduce the reflected power for the most effective coupling of power into the plasma. The impedance changes only slightly with the introduction of analyte aerosol, and may be easily tracked automatically or manually if required.

2.3.2 Sample Introduction

Introduction of the sample into the plasma has been accomplished in a number of ways. Solids are introduced by direct solid sample insertion [37], electrothermal vaporization [38], and ablation by arcs, sparks, or lasers [39,40,41]. Gaseous samples such as volatile hydrides may be introduced directly, as well as gas flows from gas chromatographs [42]. The most common method of sample introduction is as an aqueous aerosol. This is no doubt due to the relative ease with which aqueous solutions may be prepared, and the simplicity of aerosol generation.

2.3.2.1 Nebulizers

Aerosols are produced by pneumatic nebulization in a high velocity gas stream, or by ultrasonic nebulization. Although ultrasonic nebulization offers better detection limits than pneumatic nebulization, its greater cost has limited its general use. For pneumatic nebulizers, the Meinhard concentric nebulizer and the MAK high pressure nebulizer are both in common use and will be described in more detail. The Meinhard concentric nebulizer consists of two concentric tubes terminating in a constricted end (Figure 2.2). The sample solution is drawn up the central tube by the flow of argon in the outer tube. Operating pressures are typically near 30 psi. At the venturi end, the liquid sample is exposed to a high velocity gas jet which breaks it into large drops. These large drops are blown into the shape of hollow "cups", which fall apart to form a range of sizes of smaller drops, constituting the aerosol. The MAK nebulizer is based on the design of a cross flow nebulizer, with orthogonal gas and sample tubes. In conventional cross flow nebulizers, relative movement and vibration of the two tubes can decrease precision. In the MAK nebulizer these tubes are made of thick walled glass to minimize movement and vibration. A higher pressure of 200 psi is used, which allows more concentrated solutions to be used than is possible with the Meinhard concentric nebulizer.

Figure 2.2. [Nebulizer diagram; caption illegible in the source.]

2.3.2.2 Spray Chambers

In both nebulization systems described above, a range of drop sizes is produced in the aerosol. In order to ensure complete desolvation, volatilization, and atomization in the plasma, the aerosol droplets should not be more than 10 µm in diameter. However, droplets up to 100 µm in diameter may be present. The spray chamber serves as a size filter to pass the fine droplets to the plasma, while the coarser droplets go to the drain. The predominant separation effects are turbulent and inertial deposition, and gravitational settling. Over 95% of the droplets are removed in this manner, leading to an overall nebulizer/spray chamber efficiency of less than 5%. Most of the solution is therefore drained rather than aspirated into the plasma.

2.3.3 Torch Design

The torch contains the argon gas flow, and allows the aspiration of the sample directly into the central portion of the plasma. Figure 2.3 shows a typical torch, which consists of three concentric quartz tubes, and contains three independent gas flows. The main argon flow passes through the region between the outermost two tubes.
It enters the torch off center, producing a spiral flow up the outer sleeve. When the argon reaches the region of the load coils it is heated by RF coupling. The spiral motion lends stability to the plasma, assists in keeping the hot plasma from coming in contact with the torch walls, and provides a rough cylindrical symmetry to the plasma. The main argon gas has the greatest flow rate of the three gases, usually in the range of 10-15 L/min. The narrow central tube introduces the nebulizer gas flow to the center of the plasma. Analyte in aqueous aerosol form enters the plasma via this path. Nebulizer flow rates are typically 1 L/min. This flow is sufficient to punch a hole in the plasma, resulting in a doughnut shaped plasma, with analyte confined to a central channel. By the time analyte has passed through the plasma, it has been desolvated, atomized, and excited, and emission may be viewed in the tail flame area approximately 16 mm above the load coil. An auxiliary flow of argon gas of approximately 1 L/min is sometimes used between the outer main argon flow and the nebulizer gas. This serves to raise the plasma slightly, and avoids contact with (and possible melting of) the central nebulizer tube.

Figure 2.3. Cross sectional view of an ICP torch. [Diagram labels: Analytical Zone, Load Coil, Central Analyte Channel, Main flow, Auxiliary flow, Nebulizer flow.]

2.3.4 Mono/polychromators

The function of a monochromator is to separate the light given off by the emission source into its component wavelengths. In the simplest configuration of a monochromator, a single entrance slit (25-100 µm in width) allows a narrow sample of light to enter an otherwise dark enclosure. The most common geometry is based on a Czerny-Turner mount, where the light is collimated by a mirror, diffracted by a grating, and refocused on a single exit slit (Figure 2.4a).
This allows the isolation of a single wavelength (in practice a narrow range of wavelengths, depending on the resolution of the monochromator). A photodetector at the exit slit measures the relative intensity of the desired wavelength. In multielement analysis, more than one wavelength is of interest, and a polychromator is used with several exit slits and photodetectors in the exit focal plane. This is called a direct reader (Figure 2.4b), since it can measure several wavelength intensities simultaneously. The selection of wavelengths is limited by the linear dispersion in the exit focal plane, and the physical size of the photodetectors (commonly photomultiplier tubes), which limits how close two monitored wavelengths may be.

Figure 2.4. Monochromators and Polychromators. a) A simple Czerny-Turner mount monochromator for the isolation of a single wavelength. b) A direct reader polychromator. c) A sequential scanning monochromator. d) A polychromator employing a photodiode array detector.

As with the monochromator, the resolution of the polychromator is a function of the dispersion of the grating, and the size of the polychromator. All wavelengths are incident on the grating at the same angle, but will be diffracted at slightly different angles. This dispersion can be measured as a change in angle as a function of change in wavelength ($d\theta/d\lambda$), but is more commonly described with reference to linear distance along the exit focal plane ($d\lambda/dl$). This is called reciprocal linear dispersion, and is measured in units of nm/mm. For a given angular dispersion, the reciprocal linear dispersion will depend on the path length to the focal plane. A greater path length will give greater separation. Dispersion is increased by increasing the number of lines per mm on the grating, and by using higher orders. Holographic gratings use a large number of lines per mm (e.g.
2400), while echelle gratings use higher orders to achieve greater dispersion, and consequently, better resolution. Another approach to multielement analysis is used by a sequential scanning spectrometer, where the orientation of the grating can be altered to change the angle of incidence. This permits the intensity at a selected series of wavelengths to be measured in sequence (Figure 2.4c). Wavelength registration is a concern in scanning systems, where the mechanical movement must be precise enough to reproducibly return the scanning system to the same position, within the resolution of the monochromator system. A more versatile detection system uses a continuous array of optoelectronic image sensors (photodiode arrays, charge coupled devices, vidicons) located at the exit focal plane. This gives a spectral window, with information not only on line intensities, but also on line shapes and widths, and background regions. A common configuration with a one dimensional detector is a Czerny-Turner mount with a photodiode array in the exit focal plane (Figure 2.4d). A trade off exists between resolution and wavelength coverage. The limited number of pixels allows either a high resolution narrow window (2 nm) or progressively lower resolution with proportionally greater wavelength coverage, as a function of the grating used. Two dimensional detectors such as vidicon tubes and CCD detectors (similar to those found in a solid state TV camera) have also been used. Echelle gratings give very high dispersion, but cover the entire wavelength range by overlapping many orders in one dimension. If a low resolution dispersing element is placed orthogonal to the dispersion plane of the echelle grating, the spectrum is spread out in two dimensions, and may be detected as a two-dimensional image in the exit plane.
Another kind of echelle based spectrometer, which uses an intermediate spectral mask to isolate selected windows rather than a second dispersive element, will be described later in this thesis.

2.3.5 Detectors

2.3.5.1 Photomultiplier Tubes

Photomultiplier tubes (PMTs) are vacuum tube devices which provide high gain for the detection of weak emission sources. A PMT consists of a cathode and a series of anodes (called dynodes) at progressively higher voltages. When a photon strikes the cathode, a free electron is produced by the photoelectric effect. This electron is accelerated to the first dynode (which is positive relative to the cathode) where it strikes and causes secondary emission of several more electrons. These electrons are accelerated towards the next dynode. Each successive dynode multiplies the number of electrons. A single photoelectron leaving the cathode may produce up to $10^{8}$ electrons at the last dynode. This current may then be amplified and converted to a voltage for measurement. PMTs have a low dark current, which is due to thermionic emission from the cathode, and this may be further reduced by cooling. The PMT also provides a wide dynamic range (typically $10^{6}$). The spectral response of a PMT will depend on the photocathode material and window material, but PMTs are available which provide maximum sensitivity between 200 and 500 nm, with sensitivity remaining high (within a factor of 2 of the maximum) well below 200 nm and well above 600 nm [43]. Each PMT is limited to the measurement of a single wavelength. In direct reader systems, several PMTs are positioned in the exit focal plane, with each capable of measuring an intensity at a single wavelength. Limitations such as the physical size of the PMT and the reciprocal linear dispersion of the polychromator reduce the selection of wavelengths to those which are well separated in the exit focal plane.
Direct readers employing more than 30 PMTs have been used in commercial instruments for simultaneous multielement analysis.

2.3.5.2 Photodiode Arrays

A linear photodiode array (PDA) is a solid state detector composed of a large number of equally spaced photodiodes. Each photodiode is sensitive to electromagnetic radiation in the range from 190 to 1100 nm, making it ideal as a detector in ultraviolet and visible spectroscopy. The PDA incorporates the photodetecting elements and readout electronics in one integrated circuit. Arrays are commercially available (Reticon, Sunnyvale, CA) for spectroscopic applications and consist of 128 to 4096 detecting elements (pixels). The detecting elements are spaced on 25 µm centers, with apertures of 2.5 mm. This yields an aspect ratio of 100:1 which is comparable to entrance slits in conventional polychromators. Each detecting element is a semiconductor diode with a reverse biased p-n junction. Prior to a detection cycle, the diode is charged in the manner of a capacitor. When light falls upon this junction, the photons induce a discharge proportional to the intensity of the incident light. The photodiode array may be left for a selected period of time (milliseconds to seconds) to allow integration of the incident light over the entire time period. At the end of this integration period, the photodiode is recharged to its full capacity, and the amount of discharge may be determined through the associated readout electronics [44]. At excessively long integration times, the photodiode array may be completely discharged by the incident light, and this is termed saturation. In addition to discharge through photon generated charge carriers, the photodiodes will also slowly discharge due to thermally generated charge carriers. This thermal discharge is termed dark current, and may be reduced by cooling the array.
Roughly every 6 °C decrease in temperature halves the dark current, and arrays are typically cooled by Peltier coolers to -40 °C. At this level of cooling, integration times in excess of 100 seconds are possible without dark current saturation, and this permits the integration of weak emission signals. In the following chapter, the photodiode arrays used in this study are evaluated and algorithms to improve dynamic range are developed.

Chapter 3
Advances in Photodiode Array Detection

3.1 Evaluation of Photodiode Arrays

Three different photodiode arrays were used during the course of this research, as arrays with superior performance were introduced to the laboratory. The performance of each was evaluated.

3.1.1 Design Differences

All three arrays were of the same basic design, but differed in length and pixel to pixel separation. All three were manufactured by Reticon. The first two arrays used contained 4096 and 2048 photodiodes respectively, and customized electronic hardware was manufactured "in house" by the departmental electronics shop to read out the array and control integration time. This was coupled with a data acquisition package (R. C. Electronics) and a custom software interface described elsewhere in this thesis. The third photodiode array in use was supplied with all hardware and software as part of a complete spectrometer package from Leco Instruments, and is discussed in further detail in Chapter 8.

3.1.2 Performance

As the majority of the data discussed in this thesis was collected on the system incorporating the 4096 pixel photodiode array, the noise characteristics and wavelength registration of this particular array are now discussed in greater detail. The same general arguments apply to the 2048 pixel photodiode array, except that the figures of merit for noise were generally lower.

3.1.2.1 Sources of Noise

The major sources of noise in a photodiode array detector system can be classified as white noise and flicker noise.
White noise is more commonly referred to as shot noise. This arises from statistical variations in the signal as a result of the random emission of photons. The power spectrum of this type of noise shows equal power at all frequencies. An example of this kind of noise is thermal noise in electrical circuits. The standard deviation is proportional to the square root of the product of Boltzmann's constant and temperature ($\sigma \propto (kT)^{1/2}$). The thermal noise in the detector can be greatly reduced by cooling the array, but is still measurable as a dark current, or signal in the absence of light. The signal-to-noise ratio for shot noise is proportional to the square root of the signal itself. The second type of noise is flicker noise, which has a power spectrum which follows an inverse frequency relationship ($\sigma \propto 1/f$). This is a significant source of noise in analytical spectrometry. In the ICP it is a result of variations in the source intensity. The signal to noise ratio for flicker noise is proportional to the signal.

3.1.2.2 Precision for a 4096 Diode Array

Ten replicate spectra of a 100 ppm aqueous solution of Co were collected, and 30 diode regions around each of the ten most prominent lines were selected for further evaluation. These ten 30 diode regions are seen in Figure 3.1a, with the lines centered on position 15, 45, 75, etc. The standard deviation of these regions is seen in Figure 3.1b. The large response at position 30 is not related to the Co data, but is a result of noisy background in that region of the array. If the mean spectrum is divided by the standard deviation spectrum, a parameter related to the signal-to-noise ratio is obtained, and is seen in Figure 3.1c. A plot of this mean/standard deviation spectrum vs. the mean for each of the 300 diodes is seen in Figure 3.2. If the noise in the spectra was due only to a constant background term, this graph would show a linear relationship with the mean/standard deviation increasing in proportion to the mean signal. If the noise in the spectra was due to a term which is proportional to signal, a horizontal response proportional to the inverse of the standard deviation would be observed. As can be seen in the graph, the actual case is somewhere between these two extremes, indicating contributions from both types of noise.

Figure 3.1. A set of ten 30 diode regions surrounding each of 10 prominent lines in a 100 ppm Co aqueous solution. a) Mean intensities, averaged over ten spectra. b) Standard deviation of the diode intensities displayed in (a). c) Mean/Standard deviation for the same diodes. [Horizontal axis of each panel: Row Number, 0 to 300.]

Figure 3.2. Mean/Standard deviation vs. Mean for 300 diode intensities in the regions surrounding each of ten prominent lines in a 100 ppm Co aqueous solution. 30 diodes * 10 lines = 300 points. [Horizontal axis: Mean.]

The effect of summing several diodes across a peak on the standard deviation of the result was investigated (Figure 3.3). Values were calculated for the summing of 1 through 15 diodes across a peak, and the standard deviation of the integrated intensity was calculated. If 1 through 7 diodes are summed across a single peak, the standard deviation remains unchanged. There is therefore an advantage to summing across the peak since the magnitude of the integrated signal will increase with the number of diodes summed, while the standard deviation does not increase. If more than 7 diodes are summed, the standard deviation increases roughly linearly with the number of diodes. Examination of the peak profile shows that a typical peak extends across about 7 diodes (for this particular 4096 pixel array). Thus integrating the signal over up to 7 diodes adds more signal. Beyond these 7 diodes, additional diodes add only noise, and therefore the standard deviation increases while the magnitude of the integrated signal does not.

Figure 3.3. Dependence of standard deviation on the number of diodes integrated across a peak. [Plot title: Optimum Number of Diodes for Integrated Intensity. Horizontal axis: Number of Diode Intensities Summed, 0 to 16.]

In summary, the sources of noise in the ICP are complex, with components proportional to the signal, as well as a constant noise term. The summation of a signal across several diodes is superior to the measurement of the intensity at a single diode, since the summed intensity is greater, but the standard deviation is unchanged from the single diode measurement.

3.1.2.3 Wavelength Registration

A spectrum of Fe in aqueous solution (FeSO4, 2000 ppm) was collected and the wavelengths of 40 prominent lines across a wavelength range of 40 nm were identified. These 40 lines appeared at positions spread across the entire array. The relationship between diode position and wavelength was then investigated. The position of each iron line was assigned to the nearest tenth of a diode by interpolation of the intensity data in the region of the peak maximum. This interpolation was done by manual inspection of the distribution of intensity across the line profile. For example, a symmetric intensity distribution across seven diodes, such as 25, 40, 90, 160, 90, 40, 25, would place the peak maximum on the fourth diode. A distribution such as 20, 75, 140, 140, 75, 20 would place the peak maximum at 3.5 diodes. Positions of peak maxima for nonsymmetric distributions were estimated to the nearest 0.1 diode between these two limiting cases. A plot of wavelength vs. diode number (Figure 3.4) shows a high degree of linearity.
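The peak positions in this work were assigned by manual inspection. As a hedged illustration only, an intensity-weighted centroid is one simple automated stand-in that reproduces the two limiting cases quoted above. The function name `peak_position` is hypothetical, and the sketch assumes the intensities have already had any background removed.

```python
def peak_position(intensities, first_diode=1):
    """Intensity-weighted centroid of a line profile, in diode units.

    This is an assumed substitute for the manual interpolation described in
    the text, not the procedure actually used. `first_diode` is the diode
    number of the first value in `intensities`.
    """
    total = sum(intensities)
    if total <= 0:
        raise ValueError("profile must contain positive net intensity")
    weighted = sum((first_diode + i) * value
                   for i, value in enumerate(intensities))
    return weighted / total


# The two symmetric profiles quoted in the text.
odd_profile = peak_position([25, 40, 90, 160, 90, 40, 25])
even_profile = peak_position([20, 75, 140, 140, 75, 20])
```

For these two profiles the centroid falls on the fourth diode and at 3.5 diodes respectively, matching the manual assignments. For asymmetric profiles a centroid and a visually judged maximum need not agree, so the result would be rounded to the nearest 0.1 diode and treated as an estimate.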
Using a linear fit to the data, wavelengths could be predicted from the diode position with an accuracy of 0.05 nm or better. There is a systematic deviation at either end of the array as a result of the geometry of the polychromator and the curvature of the exit focal plane. The relationship between the wavelength error and the diode position is given in Figure 3.5. A second order polynomial fit to this data gave much better wavelength prediction accuracy, to within 0.005 nm. Thus for an accurate registration of wavelength as a function of the position on the photodiode array, a second order polynomial fit is recommended. This is in agreement with the findings of McGeorge and Salin [45].

Figure 3.4. Relationship between wavelength and diode position on a 4096 pixel photodiode array. [Plot title: PDA Registration.]

Figure 3.5. Difference between the actual wavelength and the wavelength predicted from a linear function of wavelength vs. photodiode array position. [Plot title: Difference Between Actual and Predicted Wavelength for a Photodiode Array. Horizontal axis: Diode Number.]

3.1.2.4 Dark Current Response

Background spectra were collected using an uncooled (room temperature, 20 °C) dark array, and the measured intensity due to the dark current was plotted as a function of integration time as seen in Figure 3.6. The lowest integration time which could be set was one integration unit (corresponding to about 8.8 milliseconds). The array was saturated at approximately 3500 intensity units, and for this reason the longest integration time background collected with the uncooled array was 256 integration units (2.25 seconds). The value at 0 integration units can be ascertained from extrapolation of the graph, and gave a DC offset of 186.43 intensity units. This was a consequence of the readout electronics, and represented a constant shift of the baseline. The array response was quite linear at integration times below 100 units (Figure 3.6a), but some curvature was seen at integration times exceeding 100 integration units (Figure 3.6b). A second order polynomial fit of the dark current response as a function of integration time gave excellent agreement. The response of the uncooled array represented a worst case scenario, and when the array was cooled, the dark current decreased greatly, and a simple linear fit was sufficient to give agreement with the observed data. The dark current can thus be modelled as a linear function of integration time, with a constant DC offset resulting from the readout electronics.

Figure 3.6. Background signal (including dark current and DC offset) for a dark uncooled photodiode array, as a function of integration time: (a) up to 0.88 seconds; (b) up to 2.6 seconds. [Plot title: Background vs Integration Time (dark current, uncooled). Fitted curve: $y = 186.43 + 10.970x - 1.1823 \times 10^{-2} x^{2}$. Horizontal axis: INT (time).]

3.2 Dynamic Range Enhancement of Photodiode Array Spectra

The photodiode array (PDA) as a detection system for inductively coupled plasma spectra has several advantages over conventional detection systems. Sequential slew scanning systems can collect multiwavelength information but must do so one resolution element (wavelength) at a time, and thus relatively long periods of time can be required for data collection. Direct-reading spectrometers collect several channels of information simultaneously, but the choice of channels is limited by considerations of line proximity and the physical size of the detector (usually a photomultiplier tube, PMT), and once lines are chosen, they are inconvenient to change. The PDA, on the other hand, provides simultaneous wavelength coverage over a relatively large spectral window, allowing the choice of the most appropriate wavelength for an analysis to be made after data collection has taken place.
One can also use intensity information across the entire array by applying pattern recognition techniques [46]. The PDA has been previously evaluated in many spectroscopic applications [6]. As a detector for inductively coupled plasma spectra [47], it has been evaluated particularly with respect to dynamic range, detection limits, and signal-to-noise ratio. It was found that the PDA was capable of detection limits comparable to those of a PMT for wavelengths greater than 230 nm. A disadvantage of the PDA as a detector is its limited dynamic range, which covers only three orders of magnitude, compared with six for a PMT. The sensitivity of the PDA can be varied by changing the time during which the array integrates incident light intensity. In the case of intense lines, a short integration time is appropriate to prevent saturation, and in the case of a spectrum consisting only of weak lines, a longer integration time may be used. A problem arises because of this limited dynamic range when both strong and weak lines are present in the same spectral window. An integration time long enough to give good signal levels for weak peaks will cause the strong peaks to saturate, with important intensity information thereby being lost. The data acquisition system currently in place in the laboratory makes use of computer-controlled variable integration times to obtain a single wide-dynamic-range spectrum, which is linear over five orders of magnitude. These enhanced spectra have a dynamic range comparable to that obtained with a PMT, as well as the advantage of simultaneous acquisition of intensity data at 4096 (or 2048) discrete wavelengths.

3.2.1 Collection of Sample Spectra

A standard solution of 5000 ppm Co was prepared from Co(NO3)2·6H2O. An inductively coupled plasma unit as described in later chapters was used to collect Co spectra, which were then processed to provide samples of normal and dynamic range enhanced spectra.
3.2.2 Algorithm

The algorithm employed to produce a single dynamic range enhanced spectrum is outlined in Figure 3.7. The data collection software is run through a series of six spectral collection cycles, each with a different integration time. The spectrum obtained from each cycle is placed in a different column of a 4096 by 6 data array. Each column thus represents a normal photodiode array spectrum at a different integration time. The integration times routinely used are 1, 4, 16, 64, 256, and 1024 units, with the longest integration time representing approximately 9 seconds, and the others proportionally shorter times. Separate data arrays are stored in memory for the foreground and background spectra. A saturation threshold, representing the ceiling above which an intensity measurement may be considered saturated, can be preset within the program. One can reset this parameter without leaving the program to adjust for changes in the performance of the array. A DC offset parameter is required to compensate for the nonzero value added to any reading taken from the PDA as a result of the readout electronics. This parameter can also be reset within the program. This parameter is subtracted from all values read from the array as part of the computation of the enhanced spectrum. One can examine, individually, each row of the data array (representing each diode), starting at the highest integration time. If the intensity at the highest integration time is above the saturation threshold, the intensity at the next lower integration time is examined, and the process of stepping to lower integration times is continued until an intensity measurement below the saturation threshold is found. If the diode is saturated at the lowest integration time (only a problem in very concentrated solutions) a warning is given, and the value from the lowest integration time is taken. The chosen intensity is then scaled to its equivalent intensity at an integration time of 16. For example,

$$\text{measured intensity} = \text{PDA reading} - \text{DC offset}: \qquad 1660 = 1780 - 120$$

$$\text{scaled intensity} = \text{measured intensity} \times 16 / \text{integration time}: \qquad 415 = 1660 \times 16/64$$

Figure 3.7. Schematic of the data acquisition and dynamic range enhancement algorithm. A number of single spectra (A) at set integration times are collected and stored in a data matrix (B). A set of measurements at different integration times (C) for each separate diode is taken from the data matrix, and the largest unsaturated measurement is found. This optimum measurement is scaled and becomes the measurement at that diode in the enhanced spectrum (D). [Diagram axis label: Integration Times, up to 1024.]

This calculation is repeated for each diode in the foreground spectrum, and the results are placed in the normal foreground array vector where they can be accessed for plotting, screen display, or saving on disk. It is assumed that any measurements taken in the foreground will be at least as intense as those in the background, since the foreground spectrum is composed of background emission plus analyte emission. Thus a measurement which is not saturated in the foreground at a given integration time will also not be saturated in the background at that same integration time. For this reason the same integration times that are used for the foreground described above are used for the scaling of the background spectrum. A corrected spectrum is also calculated by subtraction of the background from the foreground. The total time required for the acquisition of a complete data matrix of foreground or background spectra for dynamic range enhancement is approximately 50 seconds. The calculation of the enhanced spectra takes less than 10 seconds.

3.2.3 Comparison of Normal and Enhanced Spectra

A normal spectrum and an enhanced spectrum are compared in Figures 3.8 through 3.11.
Each spectrum is plotted with a number of vertical scale expansions so that the differences between them can be emphasized. The normal spectrum is taken with an integration time of 8 (0.07 seconds) to avoid saturation of the most intense peaks. The enhanced spectrum is scaled to an integration time of 64 (0.56 seconds). Thus, peaks in the enhanced spectrum are numerically eight times larger than those in the normal spectrum. In Figures 3.8 through 3.11, the normal spectrum is compared to the dynamic range enhanced spectrum, with the normal spectrum scaled to give peak heights equivalent to those of the enhanced spectrum. The enhanced spectra are seen in Figures 3.8b (plot scale 0.1), 3.9b (plot scale 1.0), 3.10b (plot scale 10.0), and 3.11 (plot scale 100.0). The normal spectra of peak height comparable to those of 3.8b, 3.9b, and 3.10b are seen in Figures 3.8a, 3.9a, and 3.10a.

Figures 3.8a and 3.8b appear to be almost identical, with the only observable difference being the slightly increased noise in the baseline in Figure 3.8a in the region of diodes 3000 to 4000. Figure 3.12 shows the region from diode 1100 to diode 1600 in greater detail. For these intense peaks, the normal and enhanced spectra appear indistinguishable. With the plotter scale expanded by a factor of ten, Figures 3.9a and 3.9b are quite easily distinguished, with much higher baseline noise being evident in Figure 3.9a. With a further tenfold expansion, Figure 3.10a shows intense baseline noise, obscuring many of the weaker spectral features. Figure 3.10b is only beginning to show baseline noise, and as a result spectral peaks are easily distinguished from the noise.

Figure 3.8. Photodiode array spectrum of 5000 ppm cobalt centered on 341.0 nm. a) Normal. b) Dynamic range enhanced.

Figure 3.9. Photodiode array spectrum of 5000 ppm cobalt centered on 341.0 nm. Vertical scale 10 times that in Figure 3.8. a) Normal. b) Dynamic range enhanced.

Figure 3.10. Photodiode array spectrum of 5000 ppm cobalt centered on 341.0 nm. Vertical scale 100 times that in Figure 3.8. a) Normal. b) Dynamic range enhanced.

Figure 3.11. Dynamic range enhanced photodiode array spectrum of 5000 ppm cobalt centered on 341.0 nm. Vertical scale 1000 times that in Figure 3.8. At this scale expansion only noise is seen in the normal spectrum (not shown), while very weak spectral features can clearly be seen in the dynamic range enhanced spectrum.

Figure 3.12. Photodiode array spectrum of 5000 ppm cobalt; strong peaks between diodes 1100 and 1600. a) Normal spectrum. b) Enhanced spectrum.

Figure 3.13 shows the region of weak peaks from diode 2600 to diode 3100. The difference in the level of noise is obvious. Figure 3.14 shows the region of very weak peaks from diode 3500 to diode 4000. In the normal spectrum, all but two of the peaks are entirely lost in the noise, whereas in the enhanced spectrum, many peaks are well above the noise.

A final tenfold plot expansion gives Figure 3.11, for a total expansion of 1000 over the first plot (Figure 3.8b). The noise level in the baseline is now very noticeable, but tolerable, and many very weak spectral features can be seen. A plot at this degree of expansion for the corresponding normal spectrum is not shown, since it would show only baseline noise. By direct observation of the spectral plots, the RMS value of the noise in Figure 3.11 is seen to be greater than that in Figure 3.9a and less than that in Figure 3.10a. The enhanced spectrum thus gives a tenfold to hundredfold improvement in the suppression of the noise floor.
Another comparison of the normal and enhanced spectra can be made by examining a plot using a logarithmic intensity axis (Figure 3.15). The noise floor is approximately two orders of magnitude lower in the enhanced spectrum than in the normal spectrum. Spectral features which are lost completely in the noise in the normal spectrum are clearly defined in the enhanced spectrum.

Figure 3.13. Photodiode array spectrum of 5000 ppm cobalt; weak peaks between diodes 2600 and 3100. a) Normal spectrum. b) Enhanced spectrum.

Figure 3.14. Photodiode array spectrum of 5000 ppm cobalt; very weak peaks between diodes 3500 and 4000. a) Normal spectrum. b) Enhanced spectrum.

Figure 3.15. Photodiode array spectrum of 5000 ppm cobalt; a comparison using a logarithmic intensity axis. a) Normal spectrum. b) Enhanced spectrum.

3.2.4 Effect of Saturation and DC Offset Parameters

The saturation parameter is used as an estimate of the ceiling above which the intensity incident on a single photodiode is sufficient to exceed the linear response range of the photodiode. Some care must be taken to select an appropriate value. If the value chosen is too low, useful spectral information on spectral peaks will be lost, since a value from the next lower integration time (with a poorer signal to noise ratio) is taken instead. If the value chosen is too high, the peak will be cut off at the integration time chosen, and, in the enhanced spectrum, the peak height will be anomalously low. The optimum value of the saturation parameter for the particular data acquisition system can be found by taking a spectrum at a high integration time without cooling the photodiode array; this procedure results in saturation of the entire array.
A value is then chosen which is lower than the lowest numerical value obtained from such a spectrum.

The DC offset parameter is used as an estimate of the floor of the array in the absence of signal or dark current. A value which is too high or too low will result in the improper scaling of data from different integration times. As an example, a small error in the estimation of the DC background can lead to a significant error in the intensity of the scaled signal. A signal of 50, on a correctly estimated DC background of 200, at an integration time of 4, would scale to:

    (250 - 200) x 64 / 4 = 800

If the DC background were incorrectly estimated to be 140, the scaled value would be:

    (250 - 140) x 64 / 4 = 1760

The optimum value for the DC offset parameter can be obtained by taking spectra with the entrance slits closed, at integration times of 1, 2, and 3, with a cooled array, and extrapolating to an integration time of 0. In practice, an average of the values obtained at an integration time of 1 (closed entrance slits, cooled array) may be used.

3.2.5 Advantages

The enhanced spectrum exhibits an improvement in the suppression of the noise floor by a factor of 10 to 100. Considering that the dynamic range of the photodiode array without enhancement is typically less than 10^5, and that of a photomultiplier tube is 10^6 or better, this value represents a significant improvement in the utility of photodiode array spectra in situations requiring a greater dynamic range. With this enhancement the PDA provides a competitive alternative to the PMT, with the added benefit of allowing multiwavelength simultaneous data acquisition. Such spectra are ideal for factor analysis, since the contributions of minor components are not lost in the noise, as is the case with normal PDA spectra. Peak saturation also becomes a rare occurrence, since only the most concentrated analyte solutions will saturate the array at the lowest integration time.
Enhanced spectra have been used in many of the factor analyses described in later chapters. Enhanced spectra have also been collected over the wavelength range of 200 to 450 nm for a number of elements. These spectra provide a spectral survey to assist in the evaluation of the best spectral windows for the determination of several elements in a complex mixture.

3.3 A Unique Spectrometer Design

The performance of a new type of instrument combining the high resolution of an echelle spectrometer with a wavelength preselection capability and photodiode array detection has been previously evaluated by a number of researchers [D5,7,48]. It offers a combination of unique features which can be taken advantage of to their fullest only by using multivariate data analysis methods. This system is evaluated in detail in a subsequent chapter.

Chapter 4 Factor Analysis

4.1 Introduction

In this chapter, a general description of factor analysis is given. Detailed discussion of the line-selection algorithms can be found in following chapters.

4.2 Requirements for Successful Factor Analyses

Factor analysis has proven to be a very powerful method for the extraction of analytical information from complex data sets [46,49,50,51,52,53,54]. Many complex data sets contain contributions from many components, and it is often desirable to qualitatively determine the identity of the components present. Once the identities of the components are known, a quantitative determination may also be desired to provide a concentration in aqueous solution. Factor analysis can provide this information. The requirements for successful factor analyses are now described.

4.2.1 Spectral Overlap

In spectroscopy, spectral mixtures are often represented by overlapping spectra of the pure components (Figure 4.1).
The overall measured intensity at any one wavelength will be a linear combination of contributions from each of the components present, with each contribution being directly proportional to the concentration of the component.

Figure 4.1. Spectral overlap is inevitable when hundreds of emission lines may be detected for each element. The spectra shown span only 40 nm, and each represents only a single element, but because of the number of emission lines in each spectrum, the probability of a direct spectral overlap is high. Line wings of intense lines may extend several nanometers, further increasing the probability of partial overlap and background shifts. (a) 500 ppm of Fe; (b) 500 ppm of Co; and (c) 500 ppm of La. [Plots of intensity against diode number.]

4.2.2 Dynamic Range

For factor analysis methods to be successful, a high degree of linearity over a wide dynamic range, covering a broad concentration range, is required. The inductively coupled plasma is well known for its linear response over several orders of magnitude. This linearity is retained during data acquisition by applying dynamic range enhancement of photodiode array spectra [55], as previously outlined in this thesis.

4.2.3 Solutions for Multivariate Equations

A data set such as that described above may be solved simultaneously for the concentrations of all the components present as long as the spectra of the individual components are known. The simplest solution to such an example would involve a least-squares fit of all the data to the known spectra. Even in such a straightforward situation, there may be advantages to using a factor analysis, for example for the partial removal of random noise [50,56,57].
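The least-squares fit to known spectra mentioned above can be illustrated with a short sketch. The four-diode "spectra" and the function name are invented for illustration; only the linear model (mixture intensity as a sum of pure spectra weighted by concentration) comes from the text.

```python
import numpy as np


def fit_known_spectra(pure_spectra, mixture):
    """Least-squares contribution of each known pure spectrum to a mixture.

    pure_spectra: array of shape (diodes, components)
    mixture:      array of shape (diodes,)
    """
    coeffs, *_ = np.linalg.lstsq(pure_spectra, mixture, rcond=None)
    return coeffs


# Two hypothetical overlapping pure spectra measured at four diodes.
pure = np.array([[1.0, 0.0],
                 [0.6, 0.3],
                 [0.1, 0.9],
                 [0.0, 0.4]])
mixture = pure @ np.array([2.0, 0.5])   # noise-free mixture for clarity
coeffs = fit_known_spectra(pure, mixture)
```

Because there are more diodes than components, the system is over-determined and the fit averages over all diodes, including those where both components emit.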
In many instances, however, the identities of the components are not known a priori, so such a solution is not possible. In fact, even the number of components (or factors) present may at first be uncertain. In these instances more sophisticated multivariate methods may be used.

4.3 Preparation of Data in a Form Suitable for Factor Analysis

Initially, the data are entered into a two-dimensional matrix (Figure 4.2). The number of rows (M) corresponds to the number of detecting elements used, and the number of columns (N) to the number of spectra. A covariance matrix is generated by multiplying the transpose of the initial matrix by the matrix itself. This covariance matrix (now N by N) is subjected to an eigenvector analysis [50], with the resulting eigenvalues and eigenvectors stored for later reference.

Figure 4.2. The structure of a typical data matrix. Each entry represents an intensity measured at a specific diode (e.g. D3) for a specific sample spectrum. [Matrix layout: columns are spectra (S1, S2, ...), rows are diodes (D1, D2, D3, D4, ...), with entries labelled S1,D1; S1,D2; S2,D2; and so on.]

4.4 Eigenvectors and Eigenvalues

The spectrum for any element can be considered as a vector in multidimensional space. Each axis represents the measured intensity at a single wavelength (Figure 4.3). The magnitude of the displacement along each axis is proportional to the measured intensity. Since the line intensities and wavelengths are different for each element, the vector representing the spectrum of each element will point in a characteristic and unique direction in this multidimensional space (Figure 4.4). Thus the number of dimensions spanned by the data set will reflect the number of components (different elemental spectra) present.

In a data set where the number of components is not known, the variance of the data set can be described by the way the data are spanned by a series of orthogonal vectors. The number of orthogonal vectors required to span the data set in multidimensional space will equal the number of elements.
These orthogonal vectors are called eigenvectors. The first eigenvector is derived by finding the direction in which the highest variance is spanned in multidimensional space (Figure 4.5). The second eigenvector is then constrained to be orthogonal to the first, but may be rotated in any of the other directions to span the greatest of the remaining variance. Subsequent eigenvectors are obtained by continuing this process. The total number of eigenvectors generated in this manner is equal to the number of factors present in the original data matrix. Thus, if the initial data matrix contained five identical iron spectra and five identical cobalt spectra, the total number of eigenvectors generated would be two, corresponding to the two factors present, an iron spectrum and a cobalt spectrum.

Figure 4.3. Conversion of a multiwavelength spectrum into a vector in multidimensional space. [Panel title: Representations of Spectra in Intensity Space.]

Figure 4.4. Unique axes for each element in a multidimensional space.

Figure 4.5. Example of the derivation of eigenvectors for a two dimensional space where two components are present, resulting in non-zero variance in two orthogonal directions. The first eigenvector spans the greatest possible variance in one dimension, and the second eigenvector spans the maximum of the remaining variance in an orthogonal dimension.

When dealing with real spectra, no two spectra will be exactly the same, even if they are taken immediately after each other from a very stable source, using a very stable detector, because there is always noise associated with any measurement, no matter how precise. As a result of this random noise, an eigenvector analysis will generate more eigenvectors than the number of factors present. For the example of five iron and five cobalt spectra, a total of ten eigenvectors will be generated.
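The five-iron, five-cobalt example can be reproduced numerically. The sketch below uses random vectors as stand-ins for the two pure spectra (an assumption for illustration; no real Fe or Co data are used) and forms the N by N covariance matrix as the transpose of the data matrix times the data matrix, as described in Section 4.3.

```python
import numpy as np

rng = np.random.default_rng(0)
n_diodes = 200
fe = rng.random(n_diodes)   # hypothetical stand-in for a pure Fe spectrum
co = rng.random(n_diodes)   # hypothetical stand-in for a pure Co spectrum

# Data matrix: rows are diodes (M = 200), columns are spectra (N = 10).
data = np.column_stack([fe] * 5 + [co] * 5)
noisy = data + 0.001 * rng.standard_normal(data.shape)


def eigen_analysis(d):
    """Eigenvalues (descending) and eigenvectors of the covariance matrix d^T d."""
    z = d.T @ d                          # N x N covariance about the origin
    values, vectors = np.linalg.eigh(z)  # eigh returns ascending order
    order = np.argsort(values)[::-1]
    return values[order], vectors[:, order]


vals_clean, _ = eigen_analysis(data)
vals_noisy, _ = eigen_analysis(noisy)

# Number of eigenvalues that are non-negligible relative to the largest one.
n_clean = int(np.sum(vals_clean > 1e-9 * vals_clean[0]))
```

Without noise only two eigenvalues are significant; with noise all ten are non-zero, but the third and later ones are smaller by several orders of magnitude.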
The true number of components must be ascertained, not from the number of eigenvectors, but from their relative magnitudes. The eigenvalues give a measure of the magnitude of the variance spanned by each eigenvector. The first eigenvector spans the largest variance, and will have the greatest eigenvalue. Subsequent eigenvalues are of decreasing magnitude. The eigenvalues commonly exhibit a sudden drop when all the variance due to the real number of components has been spanned, which corresponds to the number of elements present. The remaining, much smaller, variance is normally due to random noise.

A factor analysis is carried out starting with the generation of the eigenvectors representing the variance of the data matrix. The eigenvectors with the largest eigenvalues represent mostly variance due to the components present, and those with smaller eigenvalues represent mostly variance due to random noise. A set of eigenvectors is selected, starting with the largest eigenvector, and adding eigenvectors in order of decreasing magnitude. This selected set of eigenvectors is a set of orthogonal abstract factors. The goal is to select a sufficient number of abstract factors to fit the data within experimental error, and discard the remaining eigenvectors. These abstract factors then define the space within which all the data reside. Alternatively, one can say that the selected abstract factors span all the significant variance in the data set due to real components. The determination of this point separating the real factors from the noise has been the subject of much research.

4.5 Determination of the Number of Factors

Ideally, there will be an obvious difference in magnitude between the eigenvectors which represent real factors (large eigenvalues) and the remaining eigenvectors which represent only random noise (small eigenvalues). The data matrix can then be reconstructed within experimental error by using only these first few eigenvectors.
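Reconstruction from the first few eigenvectors, and the partial noise removal it brings, can be sketched as below. The synthetic two-component data, the noise level, and the variable names are assumptions made for illustration; the noise-free matrix is available here only because the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(1)
n_diodes, n_spectra, n_factors = 200, 10, 2

pure = rng.random((n_diodes, n_factors))    # hypothetical pure spectra
conc = rng.random((n_factors, n_spectra))   # hypothetical concentrations
clean = pure @ conc
noisy = clean + 0.001 * rng.standard_normal(clean.shape)

# Eigenvectors of the N x N covariance matrix, largest eigenvalues first.
values, vectors = np.linalg.eigh(noisy.T @ noisy)
primary = vectors[:, np.argsort(values)[::-1][:n_factors]]

# Reconstruct the data matrix from the retained abstract factors only.
reconstructed = noisy @ primary @ primary.T


def rms(a):
    return float(np.sqrt(np.mean(a ** 2)))


rms_before = rms(noisy - clean)          # error of the raw data
rms_after = rms(reconstructed - clean)   # error after reconstruction
```

Discarding the eight minor eigenvectors removes the part of the noise that lies outside the two-dimensional factor space, so `rms_after` comes out smaller than `rms_before`.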
The selection of an appropriate number of eigenvectors for reconstruction of the data matrix is equivalent to the determination of the number of factors, and thus the number of components (elemental spectra). This determination of the number of factors can be complex, requiring the evaluation of not just one but a number of indicator functions [50,58,59,60,61]. An additional benefit of this approach is the rejection of the eigenvectors that are due to noise, which leads to a reduction in the overall noise in the data matrix.

If too few factors are chosen, the data matrix will be poorly reconstructed, since significant information pertaining to the real components (in one or more of the eigenvectors) has been left out. In multidimensional space, some of the data points will lie significantly outside the defined subspace. If too many factors are chosen, the data will be satisfactorily reproduced, but will not have the optimum amount of noise removed. In practice, the partition between those significant eigenvectors due to real factors and the less significant eigenvectors due to noise can be indistinct. For this reason, a number of parameters and associated indicator functions [50,58,59] are examined, such as a ratio of successive eigenvalues, which can be useful in separating the real factors in a data set from those factors due to noise.

Following the initial generation of eigenvectors and their associated eigenvalues, the real, imbedded and extracted errors are computed for each possible number of components from 1 to N-1. Real error is an estimate of the total error present in the data matrix. Extracted error is that portion of the error which is removed by reconstructing the data matrix with just the first N' eigenvectors (where 1 ≤ N' < N), and imbedded error is that portion of the error which remains in the data matrix even after reconstruction.
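The text does not restate the formulas for these errors at this point. For orientation, the expressions below are the forms commonly given in the factor analysis literature, written in this chapter's notation; they are assumed, not confirmed, to match the definitions in the cited works. Here M is the number of rows (diodes), N the number of columns (spectra) with M > N, N' the number of factors retained, and the λ_j are the eigenvalues of the covariance matrix in decreasing order.

```latex
\begin{align}
\mathrm{RE} &= \left( \frac{\sum_{j=N'+1}^{N} \lambda_j}{M\,(N - N')} \right)^{1/2} \\
\mathrm{IE} &= \mathrm{RE}\,\left( \frac{N'}{N} \right)^{1/2} \\
\mathrm{XE} &= \mathrm{RE}\,\left( \frac{N - N'}{N} \right)^{1/2}
\end{align}
```

Under these definitions the three errors satisfy RE^2 = IE^2 + XE^2, which expresses the split of the real error into a part that remains after reconstruction (IE) and a part that is removed (XE).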
All these indicators reach a relatively constant value when the correct number of components is reached. A second means of determining the number of factors is given by the indicator function (IND) [58,59], which is derived from the real error. The correct number of factors is characterized by a minimum in this function.

The RATIO function represents a ratio of successive eigenvalues. A large drop in eigenvalues occurs when all significant components are spanned by the correct number of abstract factors. The behavior of the RATIO function is such that it will show a peak at the correct number of significant factors. It is not uncommon to see more than one peak. For example, this might occur when there is one spectrally intense component present, with two other less intense components. A peak indicating one factor suggests the presence of a major spectral component, and a second peak indicating three factors suggests the presence of two additional significant components. An example is taken from Table 6.18, where the first four values of the RATIO function for an intermediate result are 1049.1, 1.9, 39.4, and 1.2.

Of the many indicator functions which have been used in this research, the IND function [59] and the RATIO function [46] have been found the most useful. A consensus of these indicator functions must be considered when deciding upon the number of factors present in a particular set of data.

Once the number of components present is determined, the dimensionality of the factor space is reduced to the minimum required to span all the components. This is done by selecting only the N' eigenvectors with the largest eigenvalues. This has the advantage of eliminating some of the noise present in the data set [50], as the largest eigenvectors represent real components, while the smaller eigenvectors, which are discarded, represent noise.
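A short sketch of the two indicator functions follows. It assumes that RATIO at n is the n-th eigenvalue divided by the (n+1)-th, and that IND is the real error divided by the square of the number of discarded eigenvectors, which is the form commonly quoted in the literature; neither formula is spelled out in the text, and the eigenvalues used are invented.

```python
import numpy as np


def ratio_function(eigenvalues):
    """RATIO(n) = lambda_n / lambda_(n+1), for n = 1 .. N-1 (assumed form)."""
    ev = np.asarray(eigenvalues, dtype=float)
    return ev[:-1] / ev[1:]


def ind_function(eigenvalues, n_rows):
    """IND(n) = RE(n) / (N - n)^2, for n = 1 .. N-1 (assumed form)."""
    ev = np.asarray(eigenvalues, dtype=float)
    n_cols = ev.size
    values = []
    for n in range(1, n_cols):
        # Real error from the eigenvalues discarded when n factors are kept.
        real_error = np.sqrt(ev[n:].sum() / (n_rows * (n_cols - n)))
        values.append(real_error / (n_cols - n) ** 2)
    return np.array(values)


# Hypothetical eigenvalues: two real factors followed by eight noise factors.
eigenvalues = [583.0, 83.0, 3.0e-4, 2.8e-4, 2.5e-4,
               2.2e-4, 2.0e-4, 1.8e-4, 1.5e-4, 1.2e-4]

ratio = ratio_function(eigenvalues)
ind = ind_function(eigenvalues, n_rows=200)
factors_by_ratio = int(np.argmax(ratio)) + 1   # position of the RATIO peak
factors_by_ind = int(np.argmin(ind)) + 1       # position of the IND minimum
```

With the RATIO values quoted from Table 6.18 (1049.1, 1.9, 39.4, 1.2), a single `argmax` would report only the first peak, so in practice the whole sequence should be inspected for secondary peaks, as the text describes.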
All spectra are then represented in factor space by coordinates relative to the remaining orthogonal abstract factors, which are equal in number to the total number of components present.

In many applications of factor analysis, the identity of the elements present may not be known. Once the number of factors has been determined, the elemental spectra to which the eigenvectors correspond must be found. This identification is accomplished by target testing factor analysis.

4.6 Target Testing Factor Analysis

In target testing factor analysis, a target test vector is first selected as a likely candidate to represent one of the factors. This test vector will usually be the spectrum of a single element. The fit of the test vector into the abstract factor space is evaluated (Figure 4.6). This evaluation indicates whether the chosen spectrum is in fact present in the data set. A parameter called SPOIL has been defined [50] to assist in determining whether a given test vector is acceptable. Generally, a value of SPOIL less than 3 is acceptable. A value between 3 and 6 is acceptable but less reliable, and a value above 6 is unacceptable. Several target tests are attempted with different possible components. When more than one component must be identified, more than one successful test vector will be found. After completion of the target testing, the number of successful test vectors should equal the previously determined number of components.

The factor analysis continues by projecting each of the spectra in the data set onto the axes representing the components verified by target testing (Figure 4.7). The contribution of each component, as determined from the projection onto that axis, is termed a factor loading. The factor loadings so generated are proportional to the concentrations of these known components. An understanding of this process is facilitated by considering the eigenvectors generated at the beginning of the analysis.
The entire factor analysis process may be represented schematically by Figure 4.8. A multidimensional space (intensity space) is defined by the original M intensity measurements at different wavelengths. The eigenvectors from an eigenvector analysis define a subspace (abstract factor space) where the number of dimensions is determined by the number of components. The original spectra are defined in this new coordinate system of abstract factors. Any vector which is to successfully describe the real spectrum of a component must lie in this subspace within experimental error. Target testing gives an evaluation of the goodness of fit of the test vector to the subspace. In a multielement determination, target factor analysis is continued until the number of test vectors found that have a satisfactory fit to the subspace is sufficient to account for the total number of factors determined to be present. The coordinate system of the subspace is redefined again by these test vectors (concentration space). Projection of the spectral vectors onto these axes gives the concentration of each component.

Figure 4.6. Evaluation of a test vector. The two dimensional plane represents the plane defined by the vectors of two pure components. If the test vector fits favorably into the plane, it is accepted. If the fit is poor, the test vector is rejected.

Figure 4.7. Projection of the sample spectrum onto the component axes found by target testing. The displacement along each axis is proportional to the concentration of that component.

Figure 4.8. Schematic representation of the major steps in a factor analysis. [Diagram: intensity space is converted to abstract factor space by eigenvector analysis, and abstract factor space to concentration space by target testing and projection.]
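The sequence of intensity space, abstract factor space, and concentration space can be sketched numerically. Several simplifications are assumed here: the abstract factor space is obtained by a singular value decomposition of the data matrix (equivalent to, but not the same computation as, the eigenvector analysis of the covariance matrix), a relative misfit is used in place of the SPOIL parameter, which is not computed, and all spectra are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(3)
n_diodes = 200
fe, co, la = (rng.random(n_diodes) for _ in range(3))  # hypothetical pure spectra

# Six mixtures containing only the first two components.
conc = np.array([[1.0, 0.8, 0.6, 0.4, 0.2, 0.0],
                 [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]])
data = np.column_stack([fe, co]) @ conc
data = data + 1e-4 * rng.standard_normal(data.shape)


def abstract_factor_basis(d, n_factors):
    """Orthonormal basis (diodes x n_factors) of the abstract factor space."""
    u, _, _ = np.linalg.svd(d, full_matrices=False)
    return u[:, :n_factors]


def target_test(basis, test_vector):
    """Relative distance of a test vector from the factor space (0 = perfect fit)."""
    predicted = basis @ (basis.T @ test_vector)
    return float(np.linalg.norm(test_vector - predicted)
                 / np.linalg.norm(test_vector))


basis = abstract_factor_basis(data, n_factors=2)
misfit_fe = target_test(basis, fe)   # component present in the data
misfit_la = target_test(basis, la)   # component absent from the data

# Project the spectra onto the accepted test vectors (concentration space).
targets = np.column_stack([fe, co])
loadings, *_ = np.linalg.lstsq(targets, data, rcond=None)
```

The test vector for the absent component lies well outside the subspace and would be rejected, while the loadings for the two accepted vectors reproduce the concentrations used to build the mixtures.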
If all the factors, each representing a different element, cannot be successfully identified, the remaining dimensionality of the subspace may be spanned by generating residual vectors. These residual vectors are orthogonal to each other and to each of the successful test vectors. Projection of the spectral vectors may be carried out as above to give concentrations of components. In the following chapters, residual vectors are shown to be a successful way to deal with unidentified components.

Chapter 5 Unidentified Components and Residual Vectors

5.1 Sources of Unidentified Components

So far, this general overview has dealt only with situations where all the components can be identified by target testing. In many instances one or more unidentified components persist, despite all attempts to identify them by target testing. This might occur due to the lack of a suitable element in the test set. It may also occur due to non-linear interactions, such as matrix effects, which can be modelled as a linear factor [62,63], but for which no suitable test vector exists. The following sections focus on situations where many factors are identified successfully by target testing, but the number of factors still exceeds the number of successful test vectors. Thus, there remain unidentified components, which drastically affect the best fit of the data to the known components. To overcome this problem, a residual vector or vectors, representing the dimensionality of the factor space which is not spanned by the identified components, are generated from the factor analysis.

5.2 Rejection of Diodes with Interferences

For atomic emission spectrometry using a linear photodiode array, a large number (up to 4096) of simultaneous intensity measurements are made. There are approximately 60 [64,65,66] elements which can be analyzed by ICP-OES. Not all of these would be present as components in even the most complex sample, so the data matrix is greatly over-determined.
Thus it is possible to selectively discard those resolution elements (diodes) which may have a large contribution to the residual vector. When these diodes are deleted from the data set, a new factor analysis can be carried out and a new residual vector generated. This process can be continued until all values in the residual vector are below an acceptable threshold; in other words, until the deviations from the known components represented by the residual vector approach the random deviations due to noise. The final determination of the concentrations of the known components can then be carried out with a minimum of interference from any unidentified components.

Essentially, this approach allows the choice of diodes for which there is little or no contribution from unidentified components; that is, the known components may all simultaneously have measurable and overlapping signals at the diodes in question and still be quantitatively determined, as long as the unidentified component has a negligible signal at those diodes. It is of course advantageous to the analyst to attempt to identify as many of the components as possible, since the fewer components there are left to be identified, the lower the possibility that a diode will be rejected as a result of a contribution from an unidentified component. With more diodes available to carry out the final multivariate determination, the results of the analysis may be more accurate, as they are based on a larger number of independent measurements.

5.3 Problems in the Modelling of Unidentified Components

The problem with unidentified components can be more clearly understood if one considers a vector representation of spectra of pure components and mixtures. Each spectrum of a pure component, which in atomic emission spectrometry corresponds to a spectrum of a single element, can be represented by its own unique vector in multi-dimensional space.
Each component of the vector will be one of the many emission intensity measurements from the photodiode array. A vector representing a mixture of two components will lie in a plane defined by the vectors for each of the pure components (Figure 5.1a). The vectors representing the pure components will not be orthogonal, as all intensity measurements will be positive. All vectors must therefore lie in the first quadrant, and the angles between spectral vectors will always be less than 90°. The two vectors representing the pure components are labelled E1 and E2 in Figure 5.1a, and the position of the mixture spectrum is marked. The contribution from each of the components may be determined by projecting the point representing the mixture back onto axes E1 and E2. There is no ambiguity concerning the projections when each component is known. This same process can be extended to as many dimensions as required for the quantification of a multi-component sample.

A problem becomes evident if one now considers a binary mixture where only one component is known (Figure 5.1b). In this instance only the first component has been identified by target testing, and no successful test vector could be found for the second component. It can be seen that there now exists an ambiguity in the appropriate projection of the mixture point onto the axis for the one known component. In case A, the vector for the second component is arbitrarily chosen to be orthogonal to the known component. This solution requires that diode 1 registers a negative intensity for the pure second component, which is not encountered in real emission spectra. On this basis, case A may be rejected as a possibility. Case B is chosen so that no negative intensities occur, and requires that diode 1 has no measurable emission intensity for the second component. This is acceptable, but by no means the only possibility. In case C, diode 1 does in fact register an emission intensity for the second component, and there are a number of possible vectors which could represent the contribution from the second component. This situation becomes rapidly more complicated when there are several identified and unidentified components present. A common restriction placed on emission intensities requires them to be non-negative [67,68,69]. This rejects physically unacceptable solutions (negative emission intensities), but there still remain an infinite number of possible vectors to represent the unidentified component.

Figure 5.1. a) Estimation of concentrations ([E1], [E2]) of two identified components (E1, E2) in a mixture. b) Ambiguous estimation of the contribution of an unknown component to a mixture spectrum when only one of two components is identified.

5.4 A Solution for Unidentified Components

One possible method of dealing with the "unidentified component" problem is to lower the dimensionality of the data used in a multivariate analysis. This can be carried out by selectively removing those diodes which represent the greatest residual variance after all known components have been accounted for. This procedure must be performed in several steps, as many smaller residual variances will be due not to the presence of an unidentified component, but to the poor overall fit of the data to the known components. As the worst offenders (diodes with the largest contribution to the overall residual variance) are removed, the fit of other diodes will improve, and the remaining diodes with significant residual variance will stand out more clearly.

5.5 Regeneration of Interferent Spectra

Once all known components have been quantified, one may return to the original full dimensional data set, and remove the contributions from the known components, leaving only the contributions of the residual vectors in all spectra. If only one unidentified component were present, a single residual vector would represent the actual spectrum of the unidentified component.
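The stepwise rejection of Sections 5.2 and 5.4 and the regeneration of the interferent spectrum in Section 5.5 can be sketched for a single mixture spectrum. This is a simplified stand-in for the procedure in the text: an ordinary least-squares residual replaces the residual vector from a full factor analysis, one diode is removed per pass, and the spectra, noise level, and threshold of six standard deviations are all assumed for illustration.

```python
import numpy as np


def reject_interfered_diodes(known, spectrum, threshold):
    """Refit the known components, removing the worst diode on each pass.

    Returns the coefficients and full-length residual from the last fit,
    plus a boolean mask of the diodes that were kept.
    """
    keep = np.ones(spectrum.shape[0], dtype=bool)
    for _ in range(spectrum.shape[0]):
        coeffs, *_ = np.linalg.lstsq(known[keep], spectrum[keep], rcond=None)
        residual = spectrum - known @ coeffs     # residual at every diode
        magnitude = np.where(keep, np.abs(residual), 0.0)
        worst = int(np.argmax(magnitude))
        if magnitude[worst] <= threshold:        # remaining misfit is at noise level
            break
        keep[worst] = False                      # discard the worst offender
    return coeffs, keep, residual


rng = np.random.default_rng(2)
n_diodes, sigma = 300, 0.001
known = rng.random((n_diodes, 2))                # hypothetical known spectra

# Hypothetical unidentified component with lines at 20 diodes.
interferent = np.zeros(n_diodes)
hit = rng.choice(n_diodes, size=20, replace=False)
interferent[hit] = rng.uniform(0.5, 1.5, size=20)

spectrum = (known @ np.array([1.0, 2.0]) + 0.7 * interferent
            + sigma * rng.standard_normal(n_diodes))

coeffs, keep, residual = reject_interfered_diodes(known, spectrum, 6 * sigma)
```

Removing one diode per pass reflects the point made in Section 5.4: early residuals at clean diodes are inflated by the poor overall fit, so only the worst offender is trusted at each step. The final full-length residual approximates the contribution of the single unidentified component.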
For two or more unidentified components, two or more residual vectors would be left, and would not directly represent the spectra of the unidentified components, but some linear combination of the spectra of these components.

5.6 Summary

The processes described here are illustrated in the following chapter. Individual diodes are rejected on the basis of their magnitudes in the residual vector, and the remaining diodes are subjected to a factor analysis.

Chapter 6
Automatic Matrix-Dependent Wavelength Selection with Multi-Line Detection

6.1 Survey of Single Element Spectra from an ICP Spectrometer

The spectrometer used in this study was capable of simultaneously collecting a 40 nm window of spectral information. The photodiode array contained 4096 photodetectors, which over a wavelength range of 40 nm gave a pixel resolution of 40/4096 ≈ 0.01 nm. The wavelength of the center of the window could be changed by scanning the polychromator to a position anywhere between 200 and 450 nm. This range of wavelengths contains the vast majority of analytical lines used in ICP spectroscopy. Survey spectra from 200 to 450 nm were taken for several elements to determine an optimum window containing the best selection of intense lines for all elements studied. Survey spectra of Ar (for background evaluation), Ni, Cr, Fe, Mn, Cu, Sr, and Co were taken. A hard copy of the spectra was recorded on the plotter. The plotter in use in the laboratory when these spectra were taken produced a compressed spectral plot where 130 mm on the plotter paper represented 30 nm of spectrum. The positions of peaks could be roughly labelled to the nearest nanometer by interpolation. The intensity of each emission line was a function of the operating conditions of the plasma, and the sensitivity of the detector system. The sensitivity decreased rapidly below 250 nm.
Reference to spectral tables would not be expected to give accurate intensity information for this particular system without extensive wavelength dependent sensitivity corrections. It was deemed simpler to examine these spectra empirically, and roughly classify their features by wavelength region and relative intensity for the laboratory system. A strong line was arbitrarily defined as one which was off scale for a standard vertical scaling factor, integration time, and analyte concentration. A very strong line was defined as one which was off scale and in addition had prominent wings, especially noticeable in the logarithmic plots. A medium line was at least half scale, and a weak line was less than 10% of the full scale. On this basis the major features of the spectra collected were briefly summarized (Table 6.1).

One of the goals of this study was to carry out quantitative analytical determinations without reference to spectral tables of line intensities. These determinations were to be done using pure spectra as fingerprints, masks, or models, without knowledge of the origin or exact wavelengths of the lines. For this reason, it was decided to avoid identifying the lines or exact wavelengths used in this study. The identity of these lines was however determined later in this study, to allow comparison with univariate methods. Ar spectra were used only for evaluation of the success of background subtraction in the vicinity of intense Ar lines.

On the basis of these survey spectra, a spectral window centered on 343 nm was chosen. Fe and Co, having many lines in this area, were chosen as the first elements to be dealt with in detail. The lines in this region also represented the most intense lines obtained for these two elements using the photodiode array spectrometer system.

Table 6.1. Major Spectral Features of Survey Spectra

Element   Feature Description             Wavelength Region(s) (nm)
Sr        Strong lines                    336, 345, 415, 428
          Very strong lines               405, 419
Cr        Strong triplets                 357, 425
          Medium clusters                 283, 311
          Many medium-weak lines          Throughout
Ni        Many strong-medium lines        335-358
          Many medium-weak lines          297-311
Cu        Very strong lines               322, 325
          Very weak elsewhere
Co        Medium-strong cluster           336-357
          Approx. 4 medium lines          382-396
Mn        Very strong multiplet           401
          Strong triplets                 292, 257
          Strong line                     342
          Medium cluster                  343-357
          Approx. 3 medium-weak lines     380
          Medium triplet                  279
Fe        Medium clusters                 370-387
          Medium-weak clusters            370-387, 410
          Weak clusters elsewhere

6.2 Spectral Overlap

Spectral interferences are the major reason for the selection of an alternate line in a univariate analysis. Instrumental parameters such as the resolution of the spectrometer, including slit width, can affect the degree of overlap with adjacent lines. Although spectral overlap is a problem with univariate analyses, in multivariate analyses, the contribution of adjacent lines, direct spectral coincidences, line wings, and background can be quantified, so that the analytical information still present in a severely overlapped line is not discarded.

6.3 Data Pre-processing

The advantages of multivariate factor analysis for the interpretation of spectral data have already been discussed. Several methods of data pre-processing were evaluated with respect to their effects on the precision and accuracy of the results of a multivariate analysis. Methods examined included the weighting of data using functions based on single element standard spectra, smoothing of spectra by apodization of the Fourier transformed spectrum or by a five point average, and the setting of a threshold for the selection of diodes, below which there is considered to be only noise. The effect of slit width on factor analyses was also investigated.
6.3.1 Weighting

6.3.1.1 Justification for the Use of Weighting Schemes

A weakness of the factor analysis process becomes apparent when working with spectral data where the majority of the detecting elements are measuring only noise. The noise at a background diode can be 0.1 to 1% of the signal present at a diode where a peak is present. As an example, if one takes a spectrum from a 4096 pixel photodiode array containing 20 peaks, with each peak extending across 10 diodes, then this corresponds to 200 diodes measuring signal and 3896 diodes measuring noise. Each of these 4096 features is considered of equal importance in a factor analysis. The signal at a peak will add as a simple sum of the N pixel intensities. The noise, if random in nature, will add as the square root of the number of pixels. Even though the signal has a net √N advantage over the noise, the much greater number of measurements containing only noise may actually decrease the precision of a determination. This may be compared to the problem of using an exit slit much wider than the spectral peak being observed. The noise from background measurements on either side of the peak, although small for each diode, in total adds to a significant proportion of the observed signal. Thus weak signals are lost most easily in the noise.

Ideally, one would like to give weight only to those diodes which are known to fall under an analyte peak, and ignore the rest. One way to emphasize the signal at diodes with analyte peaks without emphasizing the signal at background diodes is to multiply each diode by a diode specific weighting function. Such a weighting function can be derived from the analyte spectrum itself, or from a sum of several analyte spectra if more than one element is to be determined. The coefficients of the weighting function are the sum of the intensity contributions from a set of standard spectra. The sum is computed separately for each diode, to give a weighting factor for each diode.
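The construction of a diode specific weighting function can be sketched as follows. This is a minimal illustration and not the APL code used in this work: the function names, the `power` argument (which allows the sum to be raised to a power) and the two-diode spectra are assumptions introduced for the example.

```python
def weighting_function(standard_spectra, power=1):
    """Per-diode weight: (sum of the standard intensities at that diode) ** power."""
    n_diodes = len(standard_spectra[0])
    sums = [sum(s[i] for s in standard_spectra) for i in range(n_diodes)]
    return [x ** power for x in sums]


def apply_weights(spectrum, weights):
    """Multiply every diode reading by its own weight."""
    return [v * w for v, w in zip(spectrum, weights)]


# Hypothetical two-diode standards: diode 0 lies on a peak, diode 1 on background.
standard_a = [60.0, 1.0]
standard_b = [40.0, 2.0]
weights = weighting_function([standard_a, standard_b], power=1)

sample = [100.0, 3.0]
weighted = apply_weights(sample, weights)
print(weights, weighted)
```

In this toy case the peak diode is scaled from 100 to 10000 units while the background diode only rises from 3 to 9 units, so the peak dominates the weighted data.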
A diode is automatically given more weight if analyte emission is present at that diode, in proportion to the magnitude of the measured emission. When a sample spectrum is multiplied by this weighting function, the diodes containing analyte peaks automatically become more prominent, while those without analyte peaks remain the same. For example, the analyte signal at a diode measuring 100 intensity units is increased significantly to 10000 units, while another diode measuring a noise signal of 3 intensity units is increased to just 9 units. For this particular example, the ratio of the signal contribution to the noise contribution is increased from 100/3 = 33.3 to 10000/9 = 1111.1. In terms of factor space, the dimensions with analyte information become expanded, while those dimensions without analyte information do not, so factor space is distorted in favor of analyte peaks. The weighting function may be more complex, such as a square, or cube, or an exponential function. These weighting functions were evaluated by observing their effect on the precision of a factor analysis.

6.3.1.2 Evaluation of Suitable Weighting Functions

6.3.1.2.1 Least Squares Fit with 100 diodes

An initial evaluation of the performance of a weighting function was done with a least squares fit over a 100 diode subset of the full spectrum. It was found that target factor analysis gave only a slight margin of improvement over a least squares fit in the cases where all the factors were well characterized (all elements were identified). In cases where all factors are not well characterized, a least squares fit will give erroneous results, since a fit will be forced to an insufficient number of parameters. It is in these cases that factor analysis is most useful. The least squares fit was carried out for 100 diodes containing both a strong Co and a strong Fe peak. In all cases, the concentration of standards was 500 ppm, and all standards were aqueous solutions.
Three solutions of each standard were used. An average of all three standards was used when carrying out a least squares fit. A weighting function which was a simple multiple or power function was used. This is in the form of a weighting factor = X^N, where X equals the peak intensity for a single diode when the averaged Co and Fe spectra are added together, and N is an integer. The results for values of N from 0 (no weighting) to 7 are shown in Table 6.2. A weighting of N=1 is much better than no weighting at all. The standard deviation of a measurement of low Fe concentrations drops from 1.81 ppm with no weighting to 0.94 ppm with N=1 weighting. The standard deviations of 500 ppm Co or Fe remain roughly constant as the value of N is increased. The most drastic improvements at low concentrations are seen for values of N from 1 to 3. Higher values of N do not appear to improve precision further. This suggests that for a least squares fit, weighting with X^1 to X^3 seems to offer the most improvement.

Table 6.2. Effect of the Weighting Function X^N on Precision

                      Standard Deviation
Value of N   Co 500 ppm   Co 0 ppm   Fe 500 ppm   Fe 0 ppm
0            8.36         0.62       4.03         1.81
1            8.27         0.21       4.59         0.94
2            8.38         0.32       5.04         0.61
3            8.38         0.39       5.36         0.45
4            8.35         0.44       5.59         0.37
5            8.30         0.48       5.77         0.32
6            8.23         0.51       5.91         0.28
7            8.15         0.54       6.04         0.24

6.3.1.2.2 Factor Analysis with 4096 Diodes

Weighting was tried again with a more complex analysis (Tables 6.3, 6.4, and 6.5). All 4096 diodes were included, and a factor analysis with target testing was carried out. The standards used as target vectors were an average of the spectra taken for each 500 ppm standard, Co and Fe. The full data set included various combinations of concentrations of Co and Fe (Table 6.3). In one case (5*) an extra four factors were allowed, but did not appear necessary, and the resulting changes in the values obtained were not significant. Other weighting schemes were also tried. A function 1.1exp(X) failed, as seen in Table 6.6.
The exponential weighting scheme does not appear to achieve what is desired, that is, a scheme which decreases spurious responses when the element is not present, but maintains the precision of the results when an element is present.

Table 6.3. Concentrations of Co and Fe used for Data Pre-Processing Evaluation.
Co (ppm) Fe (ppm) 500.00 0.00 500.00 100.00 0.00 1.00 1.00 1.00 1.00 100.00 100.00 100.00

6.3.2 Spectral Smoothing

6.3.2.1 Smoothing by Fourier Transformation and Apodization

In addition to weighting functions, various types of spectral smoothing were attempted to reduce noise in the spectra. The first tried was smoothing by truncation of a Fourier transformed spectrum. A boxcar apodization was used. The spectrum was then transformed back to wavelength space from frequency space. This approach was based on the concept that the fluctuations from diode to diode were noise, and any spectral information would fall across several diodes, not just one. The truncation of the high frequency component would have the effect of smoothing the adjacent diode fluctuations due to noise without affecting the more gradual trends indicative of a real spectral peak. In practice, truncation added spurious periodic fluctuations in the background which were picked up as extra factors in the analysis. The accuracy of the factor analysis results actually decreased. Cutoff values could range from 2047 (no cutoff) to 0 (complete truncation). Values from 2047 to 1200 had little or no effect on the spectra, while values of 750 and below had extreme "ringing" in the vicinity of intense peaks. Thus, attempting to remove the high frequency components did nothing to improve the accuracy, and in fact degraded it when an effect was seen. A simple boxcar apodization does not improve precision, but adds spurious factors to the factor analysis. Perhaps a more gradual cutoff apodization function could be used to reduce the effect of ringing, but this avenue was not further pursued.
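The ringing produced by a boxcar cutoff can be reproduced with a small sketch. It is not the procedure used in this work: it assumes a naive discrete Fourier transform on a hypothetical 32-diode spectrum holding a single sharp line, and the function names and the cutoff convention (keep frequency indices up to `cutoff`, zero the rest) are illustrative choices.

```python
import cmath


def dft(values):
    """Naive discrete Fourier transform (adequate for a short example)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(values))
            for k in range(n)]


def inverse_dft(coeffs):
    """Inverse transform, returning the real part of each point."""
    n = len(coeffs)
    return [(sum(c * cmath.exp(2j * cmath.pi * k * t / n) for k, c in enumerate(coeffs)) / n).real
            for t in range(n)]


def boxcar_smooth(spectrum, cutoff):
    """Zero every frequency component above `cutoff`, then transform back."""
    n = len(spectrum)
    coeffs = dft(spectrum)
    kept = [c if min(k, n - k) <= cutoff else 0 for k, c in enumerate(coeffs)]
    return inverse_dft(kept)


# Hypothetical spectrum: flat background with one intense, narrow line.
spectrum = [0.0] * 32
spectrum[16] = 100.0

unchanged = boxcar_smooth(spectrum, 16)   # every component kept
truncated = boxcar_smooth(spectrum, 4)    # only the lowest frequencies kept

print(round(max(truncated), 2), round(min(truncated), 2))
```

With the low cutoff the line is broadened and reduced in height, and the flat background on either side acquires oscillations that dip below zero. This is the kind of periodic artifact that a factor analysis can pick up as extra factors.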
Table 6.4. Weighting with the Function X^N. Standard Deviation for the Determination of Co

                           [Co]
Value of N   500 ppm   0 ppm   1 ppm   100 ppm
0            6.67      0.23    0.21    3.29
1            5.67      0.12    0.13    3.39
3            5.37      0.12    0.15    3.38
5            6.15      0.13    0.17    3.29
5*           5.96      0.13    0.16    3.31

6.3.2.2 Smoothing using a Five Point Average

A simple 5 point smoothing was attempted next, where each smoothed diode was a weighted sum of the central diode and two diodes on either side. The relative weightings were generated by Pascal's triangle, and for 5 points are 1-4-6-4-1. This appeared to have better success than the Fourier transform approach. The smoothed results are displayed in Table 6.7. The error at the 1 ppm level did not exceed 0.2 ppm, while maintaining the accuracy at higher concentrations of both standards and mixtures. Thus, simple five point smoothing is a useful data enhancement technique, and it is also very straightforward to implement. No previous knowledge of the nature of the data (i.e. noise levels) is required.

6.3.3 Thresholds

Another approach to data enhancement involved the use of a threshold value. Any diode readings above a preset threshold would be retained as useful data, while all readings below the threshold would be considered noise and were thus set to zero.

Table 6.5. Weighting with the Function X^N. Standard Deviation for the Determination of Fe

                           [Fe]
Value of N   500 ppm   0 ppm   1 ppm   100 ppm
0            3.87      0.62    0.13    0.65
1            3.62      0.36    0.10    0.61
3            3.53      0.15    0.10    0.89
5            3.44      0.39    0.07    1.04
5*           3.46      0.27    0.08    1.03

The setting of the threshold will vary depending upon the range of the data, but the discussion which follows illustrates the success or failure of the method, and general indications of whether a chosen threshold is suitable, or too high or low. The results are displayed in Table 6.8. A threshold of 5 units was first examined. At this level the detected concentrations of 100 and 500 ppm of Co or Fe were unaffected, but in samples containing 1 ppm of Co or Fe, none was detected.
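The five point smoothing and the threshold can both be written in a few lines. The sketch below is illustrative only. It assumes that the 1-4-6-4-1 weights are normalized by their sum (16) and that the two diodes at each end of the array are left unsmoothed; neither detail is specified in the text, and the function names and test spectra are hypothetical.

```python
def smooth_five_point(spectrum):
    """Binomial (1-4-6-4-1) smoothing; the two diodes at each end are left as measured."""
    weights = (1, 4, 6, 4, 1)
    smoothed = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        window = spectrum[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(weights, window)) / 16
    return smoothed


def apply_threshold(spectrum, threshold):
    """Keep readings above the threshold; set all others to zero."""
    return [v if v > threshold else 0.0 for v in spectrum]


# A single-diode spike is spread over five diodes with the binomial profile.
spike = [0, 0, 0, 0, 16, 0, 0, 0, 0]
print(smooth_five_point(spike))

# Hypothetical readings: noise, a weak 3-unit peak, and two stronger peaks.
readings = [0.4, 3.0, 7.5, 120.0]
print(apply_threshold(readings, 5))
```

In the second example the weak peak of 3 units falls below the threshold of 5 and is discarded along with the noise, which is the behaviour that makes the choice of threshold critical for the least concentrated components.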
The loss of the 1 ppm signals at a threshold of 5 indicates that the peaks for 1 ppm of either Co or Fe were small enough (less than 5 units) that they were eliminated, losing important analytical information when the threshold was applied. This also indicates that care must be taken in choosing a threshold, since a threshold which is too high will discard analytical data on the least concentrated components. A lower threshold of 2 was next applied. At this level it was obvious that analytical information was still being lost, as concentrations at a 1 ppm level appeared at 0.4 ppm or less. Not all the weak peaks were below this threshold, leading to a partial response at low concentrations, but much analytical information was still lost, so a multivariate analysis found systematically low concentrations. At a threshold of 1, the 1 ppm concentrations are fairly represented, and are only slightly low. This suggests that there is still some loss of analytical information even at this low threshold. Since the A/D converter resolves a difference of 1/8 of one unit (0.125), this effect is not likely to be a sampling artifact. A threshold of 1 was chosen for two further experiments, where weighting was used in addition to the threshold. With a weighting function of X^N with N=1, the precision for the solutions containing 0 ppm of either Co or Fe was improved considerably, with no significant effect on the other concentrations. A weight of N=3 (X^3) offered no further improvement.

Table 6.6. Weighting with the Function 1.1exp(X). Standard Deviation for the Determinations of Co and Fe

                                        [Co] or [Fe]
Weighting                         500 ppm     0 ppm    1 ppm     100 ppm
1.1e^[Co]                         30.88       0.39     0.26      4.37
1.1e^(superscript illegible)      100229.76   884.78   6086.70   2585.50

6.3.4 Effect of Slit Width

To evaluate the effect of increased throughput with reduced resolution on precision and accuracy, a series of sample spectra were collected with a slit width of 200 µm rather than the previous 50 µm.
The results were disappointing, with concentrations at the 1 ppm level varying from 0.2 to 3.6 ppm, even when N=1 weighting was used. Smoothing with a five point average was also tried, but appeared to offer little improvement. Visual inspection of the actual spectra offers some clues to explain this degraded performance. As the slits are widened, the peak shape becomes distorted. Instead of a smooth Gaussian shape (which would be the result of Doppler broadening), the peaks become flat on top, while maintaining sharp sides. This corresponds to the image of a wider entrance slit [70]. All the diodes on the top of the peak measure the same intensity, which results in little increase in magnitude from the case with 50 µm slits. The peaks are wider, but not more intense. The total amount of analytical information may actually be less, as strong peaks widen and adjacent weaker peaks become lost in the shoulders of the strong peaks.

Table 6.7. Smoothing with a Five Point Average. Standard Deviation for the Determinations of Co and Fe
[Co] or [Fe] 500 ppm 0 ppm 1 ppm 100 ppm Co 1.51 0.20 0.18 4.33 0.27 0.22 0.35 0.20 0.23 0.21 0.17 0.12 3.28 3.29 0.91 0.68 Fe

6.3.5 Recommendations for Data Pre-processing

A slit width of 60 µm (actually an instrumental parameter) allows for adequate throughput. While larger slit widths improve throughput, they also degrade resolution. Smoothing with a five point average greatly improves the results and should be done where a complete spectrum is to be used. A simple weighting (X^N) should also be done, at the N=1 to N=3 level. The weight vector should be a composite of all the elements targeted for, at the same concentration for each element. A noise threshold is sometimes useful, but it must be chosen with care, for a threshold which is too high will cut off crucial analytical information and degrade the results. These and related methods have been used in further experiments in this thesis, to assist in extracting analytical information from photodiode array spectra using multivariate analysis.

Table 6.8. Factor Analysis for several thresholds and weightings. Mean and Standard Deviation for Determinations of Co and Fe.

                                   [Co] or [Fe]
Element   500 ppm        0 ppm            1 ppm           100 ppm

Threshold: 5 intensity units. Weighting: n=0.
Co        495.29±6.66    0.015±0.067      -0.098±0.111    98.43±3.21
Fe        502.71±3.87    0.37±0.60        -0.29±0.30      104.12±0.74

Threshold: 2 intensity units. Weighting: n=0.
Co        495.29±6.67    0.014±0.115      0.28±0.15       98.82±3.49
Fe        502.71±3.87    0.34±0.65        0.43±0.06       104.31±0.68

Threshold: 1 intensity unit. Weighting: n=0.
Co        495.29±6.66    -0.013±0.155     0.61±0.14       98.59±3.28
Fe        502.71±3.87    0.40±0.62        0.86±0.08       104.34±0.65

Threshold: 1 intensity unit. Weighting: n=1.
Co        497.11±5.13    0.002±0.044      0.73±0.13       99.58±3.45
Fe        502.24±3.63    -0.009±0.142     0.87±0.07       105.48±0.66

Threshold: 1 intensity unit. Weighting: n=3.
Co        497.52±5.44    -0.004±0.018     0.76±0.16       100.60±3.40
Fe        501.85±3.47    -0.04±0.22       0.79±0.11       105.91±0.86

6.4 Experimental Data Collection

Spectra (Figure 6.1) were taken of pure solutions of 500 ppm of Co, Fe and La, in the region near 375 nm, using the method of dynamic range enhancement of photodiode array spectra [55] to provide wide dynamic range spectra. An inductively coupled plasma unit manufactured by PlasmaTherm (Kresson, NJ, USA) was used, consisting of an HFP-2500E R.F. generator, an AMN-2500E automatic impedance matching network and an APCS-1 automatic power control unit. A Sherritt Gordon (Fort Saskatchewan, AB, Canada) MAK-200 cross-flow nebulizer running at 200 lb in^-2 was used for sample introduction. A plano-convex, fused silica lens with a focal length of 150 mm was used to produce a 1:1 image of the plasma on the entrance slit of the monochromator.
A Schoeffel-McPherson (Acton, MA, USA) Model 2061, 1-m Czerny-Turner monochromator with a Model AH-3264 1200 line mm^-1 holographic grating provided dispersion onto a Reticon (Sunnyvale, CA, USA) Model RL-4096/20 linear photodiode array. The 4096 detecting elements of the array covered a wavelength range of approximately 45.0 nm, giving a resolution of 0.04 nm with a 60 µm entrance slit. The array was cooled to -15 °C with a Melcor (Trenton, NJ, USA) Model CPI4-71-10L Peltier cooler mounted on the back of the array. Readout of the array was carried out with a Reticon Model RL-4096S-3 evaluation board interfaced with an R. C. Electronics (Santa Barbara, CA, USA) Model ISC-16 analog to digital converter installed in a PC-AT compatible computer (Tulsa Computers, Model 1280, Owasso, OK, USA). A complete software package, written in our laboratory in Turbo Pascal, was used for data acquisition, screen display, plotting and other specialized applications. Spectra were stored on 5 1/4 in. floppy disks, and transferred to the campus computer system for processing. All programs used for post-collection processing were written in APL.

Figure 6.1. The complex structure of emission spectra makes the interpretation of mixtures a difficult problem. Partial or complete spectral overlap of analyte lines with lines of other concomitant elements is a common occurrence. The spectra of the above three elements are representative of these difficulties. (A) 500 ppm of Fe; (B) 500 ppm of Co; and (C) 500 ppm of La. [Horizontal axis: diode number.]

6.4.1 Selection of Suitable Analytes and Wavelengths

The knowledge gained from a general survey of several elemental spectra in the previous section was used to select the best wavelength region for a simultaneous multielement factor analysis. As noted above, the center wavelength was 375 nm, which is slightly higher than the wavelength region used for the data pre-processing section.
This wavelength region was chosen to retain the most intense peaks for Co and Fe, while including a number of intense lines for La.

6.4.2 Criteria for the Selection of Spectral Windows

For the determinations which follow, each diode is treated as an independent measurement, giving a total of 4096 spectral windows, each one diode wide, for the initial factor analysis. In subsequent analyses, individual diodes are rejected on the basis of their contribution to the residual vector. This is in contrast to procedures used in the next chapter, where a sum of diodes for each peak is taken before the residual vector is examined and further diodes are rejected. In this second approach, all the diodes for a single peak are removed at the same time. This approach will be discussed in the next chapter.

6.4.3 Generation of Simulated Mixtures

A data matrix consisting of pure Co or Fe spectra and spectra of mixtures of these two components was prepared. The mixtures were generated synthetically by scaling spectra and adding them. The units of intensity were relative, and were derived directly from the read-out of the photodiode array. Gaussian noise of a magnitude equal to that observed in background regions of these spectra was added to all spectra in the data matrix. This produced a set of synthetic standards and mixtures suitable for the evaluation of the performance of multivariate analyses in the presence of interferents.

A data matrix consisting of pure Co, Fe and La standards and mixtures of all three was also produced in the same manner as above. The lines in the La spectrum were much more intense, on average (ten-fold), than those of either Co or Fe in this spectral region. Thus one would expect more severe interference problems, and perhaps problems due to extensive line wings from intense La lines.
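The generation of a synthetic mixture by scaling, adding and applying Gaussian noise can be sketched as follows. The sketch is an illustration under stated assumptions and is not the APL program used here: the function name `make_mixture`, the expression of each amount as a fraction of the 500 ppm standard, and the two-diode spectra are all hypothetical.

```python
import random


def make_mixture(standards, fractions, noise_sd=1.0, rng=None):
    """Scale each standard spectrum, add them, then add Gaussian noise to every diode."""
    rng = rng or random.Random()
    n_diodes = len(standards[0])
    return [
        sum(f * s[i] for s, f in zip(standards, fractions)) + rng.gauss(0.0, noise_sd)
        for i in range(n_diodes)
    ]


# Hypothetical two-diode 500 ppm standards (relative intensity units).
co_500 = [500.0, 0.0]
fe_500 = [0.0, 250.0]

# 100 ppm Co plus 10 ppm Fe, expressed as fractions of the 500 ppm standards.
noise_free = make_mixture([co_500, fe_500], [100 / 500, 10 / 500], noise_sd=0.0)

# The same mixture with noise of about 1 intensity unit, seeded for repeatability.
noisy = make_mixture([co_500, fe_500], [100 / 500, 10 / 500],
                     noise_sd=1.0, rng=random.Random(0))
print(noise_free, noisy)
```

A noise standard deviation of about 1 intensity unit matches the background level quoted for these spectra. Building the mixtures this way assumes that emission intensities are additive and linear in concentration.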
The three-component data matrix allowed further evaluation of the performance of multivariate analyses, with the added complexity of an additional interferent, or a second analyte to be determined simultaneously.

6.5 Application of Methods

A number of situations involving binary and ternary mixtures will now be discussed. Emphasis will be placed on the details of each application with the procedures previously described.

6.5.1 Single Element Determinations in a Binary Mixture

In many circumstances, a determination of the concentration of a single element in a sample is desired. This analysis can be complicated by the presence of a second component. The determination of a single element in the presence of a single interferent can illustrate most clearly a method to overcome these complications.

6.5.1.1 Co in a Binary Mixture

A multivariate analysis solely for Co was carried out, with the second component, Fe, being treated as if it were an unknown interferent. A series of steps was taken to reject diodes; these steps are referred to as cycles to emphasize the recurrent nature of the rejection process. In the first cycle, a 500 ppm standard Co spectrum was passed through a peak-finding program that selected only those diodes which exhibited a signal above a defined threshold. This threshold was chosen as 25 intensity units. In comparison, the noise level in the background regions of the spectrum was approximately 1 unit, and the most intense lines were 22,000 intensity units. This threshold gave a reduced matrix consisting of 403 diodes. A factor analysis was carried out, with Co successfully target tested, and the unknown second component was represented by a residual vector to satisfy the dimensionality of the data matrix. The residual vector was then examined for large peaks, representing a large contribution to the signal at each diode.
It is important to note that the residual vector may span the residual variance of the multi-dimensional space in either of two ways: as a positive spectrum with positive intensity measurements, giving a positive concentration of unknown; or as a negative spectrum giving a negative concentration of unknown. Both are mathematically acceptable in a factor analysis, but from a physical point of view, only the positive spectrum is acceptable. The average value of the residual spectrum is computed. If the average is negative, this indicates that the majority of peaks are negative, and the spectrum itself is negative. This situation can be easily remedied by inverting the residual spectrum (multiplication by -1). The peaks representing contributions from the unknown will then be positive, and will be properly detected by the peak-finding program.

In the second cycle, the threshold was set at 25 units, and 36 diodes of the residual vector were rejected on the basis of large contributions from an unknown component, leaving 367 diodes for the following cycle. In the third cycle, the number of acceptable diodes was reduced to 342, with a threshold of 10 being used as the maximum acceptable deviation in the residual vector. The fourth cycle, with a threshold of 4, left 300 diodes, and the fifth cycle, with a threshold of 2, left 235 diodes. At this point, all remaining deviations in the residual vector were deemed to be due to random noise, and this is supported by the RATIO function, which indicated only one component remaining. The values for different possible numbers of factors and the number of diodes involved in each cycle are summarized in Table 6.9. Further evidence supporting the existence of only one remaining factor by cycle five is given by a consideration of the observed base-line noise in the original spectra. The observed base-line noise and Gaussian noise added to the spectra were all of comparable intensities, with a standard deviation of approximately 1.
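One rejection cycle, including the sign check on the residual vector, can be sketched as below. This is a simplified illustration: the function names, diode numbers and residual values are hypothetical, and in the actual procedure the factor analysis is repeated on the surviving diodes before each new cycle, so that a fresh residual vector is examined every time. Here a single residual is reused only to show the effect of lowering the threshold.

```python
def orient_residual(residual):
    """Invert the residual spectrum if its average value is negative."""
    mean = sum(residual) / len(residual)
    return [-v for v in residual] if mean < 0 else list(residual)


def reject_diodes(diodes, residual, threshold):
    """Keep only the diodes whose residual intensity does not exceed the threshold."""
    return [d for d, r in zip(diodes, residual) if r <= threshold]


# Hypothetical residual over five retained diodes; it came out as a "negative spectrum".
diodes = [10, 11, 12, 13, 14]
residual = orient_residual([-30.0, 1.0, -0.5, -12.0, 0.5])

kept_at_25 = reject_diodes(diodes, residual, 25)
kept_at_10 = reject_diodes(diodes, residual, 10)
print(residual, kept_at_25, kept_at_10)
```

After inversion the interferent peaks at diodes 10 and 13 are positive. The first is rejected at a threshold of 25, and the second only once the threshold is lowered to 10.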
Since the noise has a standard deviation of approximately 1, a threshold of 2 would be expected to eliminate all significant deviations above the ambient noise. In the results from the factor analysis after cycle five, the only spectrum to show a significant false reading for Co was the 500 ppm Fe spectrum. The measured and expected composition of each spectrum is given in Table 6.10.

Table 6.9. Multivariate analysis of Co in a binary mixture of Co and Fe.

Cycle    Diodes at beginning   Diodes                  Diodes      RATIO function
number   of cycle              rejected    Threshold   remaining   (1, 2, 3, ... factors)
0        4096                  #  -        -           4096        3.1, 3807.7, 1.0, 1.0
1        4096                  +  3693     25          403         79.6, 1251.7, 1.1, 1.1
2        403                   *  36       25          367         4395.1, 23.4, 1.1, 1.1
3        367                   *  25       10          342         15713.6, 6.4, 1.1
4        342                   *  42       4           300         49267.0, 2.1, 1.1
5        300                   *  65       2           235         117038.0, 1.0, 1.1

# result with all diodes
+ selection of diodes with intensity in standard above threshold
* rejection of diodes with intensity in residual above threshold

6.5.1.2 Fe in a Binary Mixture

A similar multivariate analysis was carried out on the same spectra for the determination of Fe. The initial selection of diodes with a relative intensity greater than 25 in a 500 ppm Fe spectrum retained 190 diodes. This indicated that the lines for Fe are generally fewer in number and less intense than those of Co in this wavelength range. The second cycle for rejection of diodes had large deviations in the residual vector, and this reduced the number of diodes to 151 at a threshold of 25. The third cycle reduced the number of diodes to 115 at a threshold of 10. The fourth cycle reduced the number of diodes to 78 at a threshold of 4. The fifth cycle reduced the number of diodes to 56, at a threshold of 2. These cycles are summarized in Table 6.11. After these 56 diodes were selected as the best for the determination of Fe in the presence of an unidentified interferent (Co), the number of factors indicated by the RATIO function was only one, so no further reduction of the data was warranted.
All values were within the expected error with the sole exception of the 500 ppm Co spectrum, for which 1 ppm Fe was indicated as present (Table 6.12).

Table 6.10. Results of a multivariate analysis on the 235 best diodes for Co in binary mixtures of Co and Fe.

Co found, ppm   Co expected, ppm   Interferent (Fe) present, ppm
499.94          500.00             0.00
0.38            0.00               500.00
0.06            0.00               100.00
0.01            0.00               10.00
-0.02           0.00               1.00
100.05          100.00             0.00
9.88            10.00              0.00
0.86            1.00               0.00
100.16          100.00             100.00
10.15           10.00              100.00
1.04            1.00               100.00
100.11          100.00             10.00
10.01           10.00              10.00
0.94            1.00               10.00
99.95           100.00             1.00
10.09           10.00              1.00
1.08            1.00               1.00

Table 6.11. Multivariate analysis of Fe in a binary mixture of Co and Fe.

Cycle    Diodes at beginning   Diodes                 Diodes      RATIO function
number   of cycle              rejected   Threshold   remaining   (1,2,3,...factors)
0        4096 #                -          -           4096        3.1, 3807.7, 1.0, 1.0
1        4096 +                3906       25          190         4.6, 13059.3, 1.0, 1.1
2        190 *                 39         25          151         1313.9, 57.9, 1.0, 1.1
3        151 *                 36         10          115         4912.2, 16.5, 1.1, 1.0
4        115 *                 37         4           78          27864.3, 3.2, 1.2, 1.2
5        78 *                  22         2           56          93314.1, 1.3, 1.2, 1.1

# result with all diodes
+ selection of diodes with intensity in standard above threshold
* rejection of diodes with intensity in residual above threshold

Table 6.12. Results of a multivariate analysis on the 56 best diodes for Fe in binary mixtures of Co and Fe.

Fe found, ppm   Fe expected, ppm   Interferent (Co) present, ppm
1.04            0.00               500.00
500.11          500.00             0.00
99.75           100.00             0.00
10.27           10.00              0.00
1.01            1.00               0.00
-0.02           0.00               100.00
0.07            0.00               10.00
-0.02           0.00               1.00
100.33          100.00             100.00
100.41          100.00             10.00
100.00          100.00             1.00
10.10           10.00              100.00
10.15           10.00              10.00
10.01           10.00              1.00
1.02            1.00               100.00
1.15            1.00               10.00
1.11            1.00               1.00

6.5.2 Single Element Determination in a Ternary Mixture

When a factor analysis is carried out with all three components identified (successful target tests carried out), the results found are very close to those expected (Table 6.13). One would hope that comparable accuracy could be obtained in determinations of each individual component, where one or more other components may be unknown. Individual components were determined, treating the other components as unknown interferents, to compare the accuracy with these results.

Table 6.13. Results of a multivariate analysis on 4096 diodes for La, Co and Fe in ternary mixtures.

Co, ppm               Fe, ppm               La, ppm
Found     Expected    Found     Expected    Found     Expected
499.98    500.00      0.12      0.00        0.01      0.00
-0.05     0.00        500.28    500.00      -0.01     0.00
-0.17     0.00        -0.04     0.00        500.02    500.00
0.95      1.00        9.99      10.00       100.00    100.00
9.92      10.00       99.77     100.00      99.99     100.00
100.04    100.00      9.94      10.00       99.99     100.00
10.01     10.00       9.94      10.00       100.00    100.00
1.07      1.00        -0.02     0.00        100.01    100.00
0.10      0.00        0.97      1.00        100.00    100.00
10.07     10.00       0.05      0.00        100.01    100.00
-0.04     0.00        9.84      10.00       100.00    100.00
10.07     10.00       9.80      10.00       10.02     10.00
1.01      1.00        0.25      0.00        10.00     10.00
0.05      0.00        1.15      1.00        10.00     10.00
100.02    100.00      100.19    100.00      0.99      1.00
100.05    100.00      99.97     100.00      10.00     10.00
99.98     100.00      99.97     100.00      100.00    100.00
0.98      1.00        1.04      1.00        1.00      1.00

6.5.2.1 La in a Ternary Mixture

La was determined in the presence of Co and Fe, both of which could be considered minor interferents on the basis of their weaker spectral intensities relative to La.

An initial signal threshold of 25 yielded 973 diodes with sufficient intensity in the La spectrum to be retained for further analysis. When there are three components and only one is identified, the remaining two unknown components must be represented by two residual vectors. The residual vectors do not correspond directly to the two components, but instead are linear combinations of the two pure components. In a factor analysis, the greatest possible amount of residual variance is spanned by the first residual vector. The second residual vector spans only that variance orthogonal to the first which could not be included in the first residual vector. As a result, the first residual vector will always contain a greater proportion of the residual variance than the second. Strictly, the residual vectors should be examined carefully and those diodes which have a significant additional contribution in either residual vector should be rejected. In practice, however, it is usually acceptable to base diode rejection only on the first residual vector, since the rejection process passes through several cycles, and the deviations missed on one cycle will show up in subsequent cycles.

In the second cycle, 85 diodes were rejected which exceeded a threshold of 50 in the first residual vector, leaving 888 diodes. In the third cycle, it was noted that the sum total intensity of the first residual vector (6073) was only slightly higher than that of the second residual vector (4241). It was decided that time could be saved, or perhaps fewer cycles required, by combining the rejected diodes from both residual vectors. With a threshold of 25, 41 diodes were rejected from the first residual vector and 29 diodes from the second residual vector. As some of the same diodes were rejected in both residual vectors, the total number of diodes rejected was 67. This left 821 acceptable diodes at the end of the third cycle. A factor analysis still showed the presence of three components (RATIO values 100601, 3.2, 15.0, 1.1), so the cyclic rejection procedure was repeated. Even with this many diodes remaining, almost all of the residual variance had been removed, and all concentrations for La were within 0.1 ppm of the correct values, with the sole exceptions of 500 ppm of Co and 500 ppm of Fe, which showed 0.3 and 0.1 ppm of La, respectively. One more cycle was carried out with a threshold of 10.
Between the first and second residual vectors, 151 diodes were rejected. These cycles are summarized in Table 6.14. A final factor analysis with 670 diodes (Table 6.15) gave results which were only a slight improvement over the analysis with 821 diodes.

Table 6.14. Multivariate analysis of La in a ternary mixture of Co, Fe and La.

Cycle    Diodes at beginning   Diodes                 Diodes      RATIO function
number   of cycle              rejected   Threshold   remaining   (1,2,3,...factors)
0        4096 #                -          -           4096        94.4, 3.2, 3854.5, 1.0
1        4096 +                3123       25          973         334.5, 8.9, 1520.0, 1.0
2        973 *                 85         50          888         10622.0, 5.7, 75.8, 1.1
3        888 *                 67         25          821         100601.3, 3.2, 15.0, 1.1
4        821 *                 151        10          670         367980.2, 2.7, 5.0, 1.1

# result with all diodes
+ selection of diodes with intensity in standard above threshold
* rejection of diodes with intensity in residual above threshold

6.5.2.2 Co in a Ternary Mixture

Co was determined next in these same mixtures, in order to investigate the results in the presence of a major interferent (La). This analysis is summarized in Table 6.16. Initially, 403 diodes were chosen with a signal above 25 units. The first residual vector had 41 diodes above a threshold of 25 and the second residual vector had 39 above this threshold. When duplicate diodes were removed, 78 diodes were rejected, leaving 327 diodes. Three factors were still evident in the RATIO function (286.7, 15.8, 22.3, 10.5) so further cycles were carried out. The sum of all diodes in the residual vectors indicated that most of the residual variance was in the first residual vector (3387, 415). Therefore, only the first residual vector was used in this cycle to remove a further 52 diodes, at a threshold of 25, leaving 275 diodes. Improvement was seen in the RATIO function, indicating that the unknown components were indeed being extracted, but three were still clearly present (938.1, 5.8, 18.0, 1.0).
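The choice between rejecting on the first residual vector alone and combining both residual vectors can be sketched as follows. The decision in the text was a judgment based on the summed residual intensities (for example 6073 against 4241, where both vectors were used, and 3387 against 415, where only the first was used). The `dominance` cutoff of 3, the use of absolute values in the sums, the function names and the six-diode data are all assumptions made for this illustration.

```python
def flagged(diodes, residual, threshold):
    """Set of retained diodes whose residual intensity exceeds the threshold."""
    return {d for d in diodes if residual[d] > threshold}


def diodes_to_reject(diodes, first, second, threshold, dominance=3.0):
    """Use both residual vectors unless the first clearly dominates the second."""
    total_first = sum(abs(first[d]) for d in diodes)
    total_second = sum(abs(second[d]) for d in diodes)
    rejected = flagged(diodes, first, threshold)
    if total_first < dominance * total_second:
        # Comparable totals: take the union, so shared diodes count only once.
        rejected |= flagged(diodes, second, threshold)
    return rejected


# Hypothetical six-diode residual vectors (invented for illustration).
diodes = list(range(6))
first = [30.0, 2.0, 40.0, 1.0, 0.0, 26.0]
second = [1.0, 28.0, 35.0, 0.0, 2.0, 1.0]
weak_second = [1.0, 3.0, 2.0, 0.0, 1.0, 1.0]

combined = diodes_to_reject(diodes, first, second, 25)
first_only = diodes_to_reject(diodes, first, weak_second, 25)
```

In the first call three diodes are flagged by the first vector and two by the second, but only four are rejected because one diode appears in both sets, which is the same bookkeeping as the 41 + 29 diodes giving 67 rejections in the La determination.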
On the next cycle, the bulk of the residual variance was still associated with the first residual vector (2079, 119), and in fact although there were 99 diodes above a threshold of 10 in the first residual vector, there were only 18 above this threshold in the second residual vector. Again, only diodes rejected by the first residual vector were removed, leaving 176 diodes. Further improvement in the RATIO function was seen (3875.2, 2.1, 16.0, 1.0), and the first residual vector again contained most of the variance (496, 49). At a threshold of 4, 106 diodes were rejected from the first residual vector, leaving 70 diodes. With these 70 best diodes, the RATIO function showed only a small amount of two additional components present (16293.7, 1.7, 8.1, 1.1).

At this point, another method of data manipulation was used which has proved to be very useful. This involves weighting each of the diodes by its importance in the elemental spectrum of interest. Empirically, it has been found that the cube of the spectral intensities provides the best improvement. Thus a weighting vector corresponding to the cube of the intensities in a reference Co spectrum was incorporated in the factor analysis. Using the 70 best diodes an excellent final result was obtained. Table 6.17 compares the concentrations of Co found with the actual values and excellent agreement is seen throughout all spectra in the data matrix.

Table 6.15. Results of a multivariate analysis on the 670 best diodes for La, in ternary mixtures of La, Co and Fe.

La, ppm               Interferent present, ppm
Found     Expected    Co        Fe
0.28      0.00        500.00    0.00
0.11      0.00        0.00      500.00
500.02    500.00      0.00      0.00
100.01    100.00      1.00      10.00
100.02    100.00      10.00     100.00
100.04    100.00      100.00    10.00
100.00    100.00      10.00     10.00
100.01    100.00      1.00      0.00
99.99     100.00      0.00      1.00
100.01    100.00      10.00     0.00
100.00    100.00      0.00      10.00
10.03     10.00       10.00     10.00
10.00     10.00       1.00      0.00
10.00     10.00       0.00      1.00
1.07      1.00        100.00    100.00
10.08     10.00       100.00    100.00
100.08    100.00      100.00    100.00
1.00      1.00        1.00      1.00

Table 6.16. Multivariate analysis of Co in a ternary mixture of Co, Fe and La.

Cycle    Diodes at beginning   Diodes                 Diodes      RATIO function
number   of cycle              rejected   Threshold   remaining   (1,2,3,...factors)
0        4096 #                -          -           4096        94.4, 3.2, 3854.5, 1.0
1        4096 +                3793       25          403         4.6, 78.8, 1215.1, 1.1
2        403 *                 78         25          327         286.7, 15.8, 22.3, 1.1
3        327 *                 52         25          275         938.1, 5.8, 18.0, 1.1
4        275 *                 99         10          176         3875.3, 2.0, 16.0, 1.1
5        176 *                 106        4           70          16293.7, 1.7, 8.1, 1.1

# result with all diodes
+ selection of diodes with intensity in standard above threshold
* rejection of diodes with intensity in residual above threshold

Table 6.17. Results of a multivariate analysis on the 70 best diodes for Co in ternary mixtures of La, Co and Fe.

Co, ppm               Interferent present, ppm
Found     Expected    La        Fe
499.96    500.00      0.00      0.00
0.12      0.00        0.00      500.00
0.20      0.00        500.00    0.00
0.97      1.00        100.00    10.00
10.06     10.00       100.00    100.00
100.23    100.00      100.00    10.00
10.01     10.00       100.00    10.00
1.10      1.00        100.00    0.00
0.22      0.00        100.00    1.00
9.96      10.00       100.00    0.00
0.00      0.00        100.00    10.00
9.93      10.00       10.00     10.00
0.97      1.00        10.00     0.00
-0.02     0.00        10.00     1.00
100.12    100.00      1.00      100.00
100.00    100.00      10.00     100.00
100.12    100.00      100.00    100.00
0.92      1.00        1.00      1.00

6.5.2.3 Fe in a Ternary Mixture

Fe in these same spectra was then determined (Table 6.18), with 190 diodes selected from a standard Fe spectrum having intensities over 25 units. Forty diodes were rejected from the first residual vector as they were over 25 units, leaving 150 diodes. A large contribution from two other components still remained at this point, as indicated by the RATIO function (114.0, 13.4, 42.6, 1.2). On the third cycle, 41 diodes were above a threshold of 25, reducing the number of acceptable diodes to 109. Three factors still remained, but improvement was seen (RATIO values: 1049.1, 1.9, 39.4, 1.2 for one through four possible factors). In the next cycle, 45 diodes were rejected at a threshold of 10 units, leaving 64 diodes. The RATIO function was then 4810.6, 1.2, 17.4 and 1.2. In the next cycle, 12 diodes were rejected from both the first and second residual vectors, at a threshold of 10. The 52 remaining diodes still indicated some contribution from two additional components (RATIO values: 5882.1, 2.1, 6.9, 1.3). In a further cycle the majority of the remaining variance was in the first residual vector, with 25 diodes rejected over a threshold of 4.

Table 6.18. Multivariate analysis of Fe in a ternary mixture of Co, Fe and La.

Cycle    Diodes at beginning   Diodes                 Diodes      RATIO function
number   of cycle              rejected   Threshold   remaining   (1,2,3,...factors)
0        4096 #                -          -           4096        94.4, 3.2, 3854.5, 1.0
1        4096 +                3906       25          190         4.8, 22.4, 575.8, 1.1
2        190 *                 40         25          150         114.0, 13.4, 42.6, 1.2
3        150 *                 41         25          109         1049.1, 1.9, 39.4, 1.2
4        109 *                 45         10          64          4810.6, 1.2, 17.4, 1.2
5        64 *                  12         10          52          5882.1, 2.1, 6.9, 1.3
6        52 *                  25         4           27          15582.6, 3.4, 2.9, 1.3
7        27 *                  7          4           20          57738.2, 1.5, 2.0, 1.4

# result with all diodes
+ selection of diodes with intensity in standard above threshold
* rejection of diodes with intensity in residual above threshold

In the remaining 27 diodes, additional factors were still possibly present, but were not clearly indicated by the RATIO function (15582.6, 3.4, 2.9, 1.3). The residual variances were summed over all 27 remaining diodes. The small totals for the two residual vectors showed that only a small deviation remained (14.4, 33.4). The rejection of 7 additional diodes above a threshold of 4 was carried out, and the resulting RATIO values (57738.2, 1.5, 2.0, 1.4) suggest an improvement, but a comparison of the factor analyses for 27 and 20 diodes (Table 6.19) showed no significant improvement in accuracy. Weighting of the results for 27 and 20 diodes did not show any further improvement, in contrast to the great improvement seen previously in the Co determination.

Table 6.19. Comparison of results using 27 and 20 diodes to determine Fe in ternary solutions containing Co, Fe and La.

Fe found, ppm             Fe expected,   Interferent present, ppm
27 diodes   20 diodes     ppm            La        Co
2.65        1.29          0.00           0.00      500.00
500.00      500.14        500.00         0.00      0.00
5.81        6.28          0.00           500.00    0.00
11.22       11.40         10.00          100.00    1.00
101.00      101.03        100.00         100.00    10.00
11.79       11.51         10.00          100.00    100.00
11.21       11.32         10.00          100.00    10.00
1.17        1.14          0.00           100.00    1.00
2.27        2.45          1.00           100.00    0.00
1.13        1.29          0.00           100.00    10.00
11.00       11.13         10.00          100.00    0.00
9.99        10.06         10.00          10.00     10.00
0.34        0.29          0.00           10.00     1.00
1.41        1.40          1.00           10.00     0.00
100.79      100.48        100.00         1.00      100.00
100.77      100.51        100.00         10.00     100.00
101.75      100.70        100.00         100.00    100.00
0.87        1.02          1.00           1.00      1.00

Careful examination of the spectra of Co, Fe and La over these remaining 27 diodes showed that these are indeed the best diodes to use on the basis of a minimum of intensity present from elements other than Fe, but there is still unfortunately a measurable contribution from the line wings of La at these diodes. Thus, the optimum diodes were found, and the results were the best that could be expected given the interference from the line wings of intense La lines.
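The effect of the cube weighting used in these determinations can be illustrated with a much simpler calculation than the weighted factor analysis of the text. The sketch below is an assumption-laden stand-in: it fits a single analyte by weighted least squares, the function names are hypothetical, and the three-diode spectra are invented. It shows only why weights equal to the cube of the reference intensities suppress interference that falls on weak analyte diodes.

```python
def cube_weights(reference):
    """Weight each diode by the cube of its intensity in the reference spectrum."""
    return [r ** 3 for r in reference]


def weighted_concentration(sample, standard, standard_ppm, weights):
    """Single-analyte weighted least squares: sample ~ (c / standard_ppm) * standard."""
    num = sum(w * s * x for w, s, x in zip(weights, standard, sample))
    den = sum(w * s * s for w, s in zip(weights, standard))
    return standard_ppm * num / den


# Hypothetical 500 ppm standard over three diodes (weak, medium, strong).
standard = [10.0, 100.0, 1000.0]

# Hypothetical sample: 100 ppm of analyte plus an interferent that
# contributes mainly at the weakest analyte diode.
sample = [2.0 + 5.0, 20.0 + 1.0, 200.0 + 0.0]

unweighted = weighted_concentration(sample, standard, 500.0, [1.0, 1.0, 1.0])
weighted = weighted_concentration(sample, standard, 500.0, cube_weights(standard))
```

With these invented numbers the unweighted estimate is high by roughly 0.07 ppm, while the cube-weighted estimate is within 0.001 ppm of 100 ppm. As noted above for Fe, weighting cannot help when the interference sits under the strongest remaining analyte diodes.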
Such interferences from line wings have been previously recognized as a source of error in univariate measurements [35].

6.5.3 Simultaneous Multielement Determinations

6.5.3.1 Co and Fe in the Presence of an Interferent

An analysis was carried out to optimize simultaneously for the determination of both Fe and Co in the presence of the major interferent La. A spectrum of 500 ppm Co was added to a spectrum of 500 ppm Fe, and a threshold of 25 units was used initially to select suitable diodes. 595 diodes were found. The initial residual vector contained 33 diodes above the threshold, which were discarded. On the basis of the remaining 562 diodes, the RATIO function still strongly indicated the presence of three components (3.0, 40.9, 549.7, 1.1). Since the goal of this analysis was to retain exactly two components, representing Co and Fe, and remove the influence of La, a further 118 diodes were rejected at a threshold of 25. With 444 diodes remaining, the RATIO function indicated some improvement (2.7, 249.2, 104.7, 1.1) but further cycles were required. At a threshold of 10, another 149 diodes were rejected. With 295 diodes remaining, the RATIO function indicated continuing improvement (2.6, 1090.8, 29.9, 1.1). A weighting was carried out to evaluate if any further improvement would be seen. The results (Table 6.20) show excellent agreement for Co, but some deviations of the Fe concentrations (approximately 1 ppm) remain. Further cycles were carried out and all cycles are summarized in Table 6.21. After completing 13 cycles, there appeared to be only two factors left, corresponding to Co and Fe, with all interference from La removed. The results for Fe were not as close to the actual values as those for Co, but both were satisfactory. When a weighted factor analysis was carried out, the results were still better for Co than for Fe (Table 6.22).

Table 6.20. Intermediate results in the determination of Co and Fe in the presence of La interferent; 295 diodes weighted by the Co and Fe spectra cubed.

Co, ppm               Fe, ppm               Interferent (La)
Found     Expected    Found     Expected    present, ppm
499.94    500.00      0.11      0.00        0.00
0.14      0.00        500.22    500.00      0.00
0.25      0.00        6.35      0.00        500.00
1.00      1.00        11.46     10.00       100.00
10.07     10.00       101.01    100.00      100.00
100.24    100.00      11.26     10.00       100.00
10.01     10.00       11.24     10.00       100.00
1.11      1.00        1.08      0.00        100.00
0.22      0.00        2.51      1.00        100.00
9.96      10.00       1.25      0.00        100.00
0.01      0.00        11.12     10.00       100.00
9.95      10.00       10.01     10.00       10.00
0.95      1.00        0.31      0.00        10.00
-0.02     0.00        1.56      1.00        10.00
100.11    100.00      100.25    100.00      1.00
100.01    100.00      100.31    100.00      10.00
100.12    100.00      101.42    100.00      100.00
0.88      1.00        0.95      1.00        1.00

Table 6.21. Multivariate analysis of Co and Fe in a ternary mixture of Co, Fe and La.

Cycle    Diodes at beginning   Diodes                 Diodes      RATIO function
number   of cycle              rejected   Threshold   remaining   (1,2,3,...factors)
0        4096 #                -          -           4096        94.4, 3.2, 3854.5, 1.0
1        4096 +                3501       25          595         4.6, 3.2, 21388.2, 1.1
2        595 *                 33         25          562         3.0, 40.9, 549.7, 1.1
3        562 *                 118        25          444         2.7, 249.2, 104.7, 1.1
4        444 *                 149        10          295         2.6, 1090.8, 29.9, 1.1
5        295 *                 22         10          273         2.4, 1418.4, 24.7, 1.1
6        273 *                 76         7           197         2.5, 3056.0, 13.6, 1.1
7        197 *                 81         4           116         2.1, 10513.9, 6.0, 1.1
8        116 *                 12         4           104         3.1, 8776.7, 5.1, 1.1
9        104 *                 6          4           98          3.2, 9692.6, 4.7, 1.1
10       98 *                  33         3           65          3.3, 19152.2, 2.9, 1.1
11       65 *                  7          3           58          6.0, 14681.3, 2.2, 1.2
12       58 *                  10         2.5         48          6.6, 20295.4, 1.6, 1.2
13       48 *                  4          2.5         44          6.9, 24256.5, 1.4, 1.2

# result with all diodes
+ selection of diodes with intensity in standard above threshold
* rejection of diodes with intensity in residual above threshold

Table 6.22. Final results for the determination of Co and Fe in the presence of La interferent; 44 diodes weighted by the Co and Fe spectra cubed.

Co, ppm               Fe, ppm               Interferent (La)
Found     Expected    Found     Expected    present, ppm
499.95    500.00      0.36      0.00        0.00
0.13      0.00        500.15    500.00      0.00
0.17      0.00        2.51      0.00        500.00
0.98      1.00        10.43     10.00       100.00
10.04     10.00       101.04    100.00      100.00
100.22    100.00      11.18     10.00       100.00
9.99      10.00       9.96      10.00       100.00
1.12      1.00        1.10      0.00        100.00
0.20      0.00        0.88      1.00        100.00
9.93      10.00       0.12      0.00        100.00
-0.01     0.00        10.30     10.00       100.00
9.96      10.00       9.81      10.00       10.00
0.94      1.00        -0.09     0.00        10.00
-0.04     0.00        1.88      1.00        10.00
100.12    100.00      100.59    100.00      1.00
100.02    100.00      99.62     100.00      10.00
100.10    100.00      99.74     100.00      100.00
0.88      1.00        0.77      1.00        1.00

6.5.4 Regeneration of Spectra of Unidentified Interferents

For the determination of Fe in the presence of a Co interferent, the full spectrum of the unknown component was recovered to compare it with that of Co. The contribution from Fe in each spectrum was removed, and a factor analysis on the remaining data gave only one factor which was computed as a residual vector. This residual vector very strongly resembles that of Co, except for a scaling factor (Figure 6.2). Thus the spectrum of the interferent can be obtained to assist in determining the nature of an unknown component or components. A single element spectrum will be obtained only when one unknown component remains. If two or more unknown components remain, the residual spectra obtained will be linear combinations of the additional components, and visual identification may be more difficult.

Figure 6.2. Comparison of known and recovered spectra: (A) spectrum for 500 ppm of Co; (B) residual spectrum recovered from spectral mixtures of Fe and an unidentified component (scaled by 4.26); and (C) difference spectrum.

6.6 Automatic Line Selection

The data matrix containing the 44 remaining diodes from the simultaneous analysis of Co and Fe has been reduced to the point where it is feasible to examine the remaining diodes in detail (Table 6.23). An examination of these remaining diodes shows that the program was successful in automatically selecting the best lines to use for the elements of interest to minimize the effect of an interferent. The best diodes for a determination are chosen on the basis of a lack of interferent signal, coincident with an intense peak for the analyte. Several lines may be suitable depending on the nature of the other known analytes and the unknown interferent.

Table 6.23. Diodes remaining after the rejection of those with high deviations in the residual vector, and the intensity contributions for the residual, 500 ppm of Co, 500 ppm of Fe and 500 ppm of La.

         Intensities
Diode    Residual   Co         Fe        La
702      1.73       1.42       326.00    2.50
703      1.63       1.83       262.33    2.00
1033     0.72       20.33      900.00    5.50
1034     -0.48      24.58      784.00    3.00
1035     -3.01      45.75      563.33    1.50
1036     0.75       147.33     231.67    2.00
1037     -1.45      316.67     67.67     0.50
1039     -0.10      824.00     19.75     -0.50
1040     -0.88      759.33     16.33     0.00
1041     0.67       562.00     13.83     1.00
1042     -0.96      234.67     12.92     0.00
1043     -0.09      66.75      11.50     1.00
1070     -3.21      3.00       49.33     -2.50
1071     0.69       3.25       103.00    1.00
1072     -0.79      2.67       239.33    0.50
1073     0.62       7.33       346.33    1.50
1076     0.93       65.75      143.00    2.00
1077     2.54       83.58      66.42     2.50
1078     1.43       74.67      12.33     1.50
1079     -3.27      52.33      9.08      -3.00
1080     -1.80      20.83      7.17      -1.50
1113     1.36       80.91      2.33      3.50
1135     0.96       0.92       25.42     0.50
1430     -2.81      160.00     0.25      -0.50
1434     1.11       135.67     1.75      2.00
1447     2.86       186.00     5.25      2.50
1458     1.13       33.33      1.17      1.00
1478     1.86       75.25      0.33      2.50
2021     0.68       44.00      0.25      0.25
2027     0.14       1717.33    0.08      1.38
2028     -0.08      2151.33    -0.17     0.00
2029     -0.05      2105.33    -0.58     3.00
2030     -0.87      1510.67    0.33      -0.50
2033     1.64       81.92      1.75      2.00
2035     2.04       46.75      0.83      2.13
2064     1.70       561.33     0.00      4.00
2069     2.13       53.42      1.00      3.00
2070     0.87       24.60      1.17      1.00
2088     2.16       26.42      1.25      3.00
2089     0.92       36.33      1.50      0.00
2092     2.55       40.67      2.67      3.00
2093     1.27       28.67      3.17      1.00
2137     0.18       5.17       508.00    2.50
2200     2.27       73.00      1.25      1.00

When several lines are suitable, a multivariate solution may be found, or the analyst may pick one of the lines suggested by the program and proceed with a conventional univariate analysis based on the intensity of a single line. For example, in an analysis where both Co and Fe are determined, the suggested best peaks on the basis of greatest intensity for Fe would be at diodes 1033 and 1034 (900 and 784 intensity units respectively), with an alternative at diode 2137 (508 units), a second alternative at diodes 1071-1073 (103, 239, 346 units) and yet another alternative at diodes 702 and 703 (326 and 262 units).
For the determination of Co, the best line by far occurs at diodes 2027-2030 (1717, 2151, 2105, and 1511 units), with an alternative at diodes 1039-1042 (824, 759, 562 and 235 units), and a second alternative at diode 2064 (561 units). In a univariate analysis of Fe alone, only the peak at diodes 702-703 is free from an interference due to Co. In the determination of Co alone, the peak at diodes 2027-2030 is still the best choice, with diode 2064 also free from other interferences, and available as an alternative.

If a multivariate analysis is carried out for Fe and Co, all the lines suggested above may be used simultaneously. In a three component system (containing Fe, Co and La), the identification of two components (Co and Fe) led to a greater number of suitable diodes for use in a multivariate analysis than are available if each element is determined independently in two univariate analyses. Alternatively, one may obtain correction factors for the overlap of one known component with another, and carry out univariate analyses using any of the suggested lines and applying a correction factor derived from these results. For example, the Fe peak at diode 1035 has an intensity of 563 units for 500 ppm of Fe. The presence of 500 ppm of Co will cause a further increase of 46 units. When this value is scaled to the known concentration of Co, it may be subtracted from the observed Fe line intensity to give a corrected intensity which accounts for spectral overlap with Co.

An analysis in which both Co and Fe are known components yields slightly better results than the analyses in which each is determined separately as the only known component present. Such a result might be expected, as a larger number of diodes, having contributions from both Co and Fe, will be retained in a binary multivariate analysis. In a single element analysis, Fe would be rejected as an interferent to Co, and Co would be rejected as an interferent to Fe.
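The overlap correction described earlier in this section amounts to a single subtraction, sketched below for diode 1035 using the intensities listed in Table 6.23 (563.33 units for 500 ppm of Fe and 45.75 units for 500 ppm of Co). The function name, the assumption of a strictly linear response for both elements, and the sample composition (100 ppm Fe with 200 ppm Co) are hypothetical and serve only to illustrate the arithmetic.

```python
REF_PPM = 500.0          # concentration of the single element reference spectra
FE_AT_1035 = 563.33      # Fe intensity at diode 1035 for 500 ppm Fe (Table 6.23)
CO_AT_1035 = 45.75       # Co intensity at diode 1035 for 500 ppm Co (Table 6.23)


def corrected_intensity(observed, interferent_ppm, overlap_at_ref, ref_ppm=REF_PPM):
    """Subtract the interferent contribution, scaled to its known concentration."""
    return observed - overlap_at_ref * interferent_ppm / ref_ppm


# Hypothetical sample: 100 ppm Fe and 200 ppm Co, assuming linear response.
observed = FE_AT_1035 * 100.0 / REF_PPM + CO_AT_1035 * 200.0 / REF_PPM

corrected = corrected_intensity(observed, interferent_ppm=200.0,
                                overlap_at_ref=CO_AT_1035)
fe_ppm = corrected / FE_AT_1035 * REF_PPM
uncorrected_fe_ppm = observed / FE_AT_1035 * REF_PPM
```

Without the correction the same line would report roughly 116 ppm of Fe for this invented sample, which is why the Co concentration must be known (for example from one of its own interference-free lines) before the correction can be applied.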
Diodes containing analytical information on both Co and Fe would be lost simultaneously. This loss of analytical information is reflected in slightly poorer results.

6.7 Summary

A number of alternatives exist for the analyst dealing with complex mixtures using these methods. If all components can be identified, then a straightforward factor analysis with the full data matrix will lead to excellent results. If one or more components cannot be identified, the data matrix may be reduced so as to remove those diodes for which interference from the unknown component is a problem. A factor analysis using the remaining data can then give a multivariate answer with all the advantages inherent to a multivariate approach.

The best wavelengths (diodes) for analysis of the known or desired components may be selected without prior knowledge of the nature of any interfering components. Automatic, matrix dependent, line selection is therefore possible, tailoring lines chosen to the elements of interest. A univariate analysis can be carried out to quantify each element separately using its best line, as obtained from a single element optimization, where all components other than the element of interest are treated as unknown interferents. If several elements are chosen for optimization, the best peaks for each element will be determined, along with a correction factor, if required, to account for any overlap with one of the other known components. Finally, the spectrum of the interfering component may be reconstructed as a final step in the analysis, to give the analyst one more piece of visual information which may assist in understanding the nature of any observed interferences.

Chapter 7

Integrated Spectral Windows and the Complexities of Spectra

7.1 Determination of a Single Component

In this chapter, a situation will be considered where, regardless of the actual number of components, the data may be reproduced by two factors.
This is accomplished using a data matrix containing several virtually identical standard spectra, and a single sample spectrum. Thus the factor analysis finds only two factors. In the original coordinate system in multidimensional space, these factors can be represented by a spectrum for the standard, and a spectrum for the sample. In a new coordinate system, the two dimensions are spanned by a factor for the element of interest, and a factor for a linear combination of all interferents present in the sample spectrum. Since the data matrix contains several standard spectra of the element of interest, the target test for this one element must be successful. The most useful information obtained is not the confirmation of the presence of this element, but the generation of a residual vector from the variance not covered by that element. The subspaces obtained from the abstract factors are simply two-dimensional planes, where one factor corresponds to the element of interest, and the other factor is a single residual vector generated to be orthogonal to the first factor. The vectors representing all the spectra in the original data matrix are projected onto the factors (real and residual) found by this method. The projection along each factor axis gives the concentration of the element represented by that factor axis.

7.2 Simultaneous Determination of Several Components

This approach may easily be adapted to the determination of several elements simultaneously. The sample vector is projected onto several factor axes in a higher dimensional space, with one factor axis for each of the elements desired and one for a residual spectrum. Such a determination would follow steps analogous to those outlined here for a single element determination. Simultaneous multielement determinations using these methods will be discussed in later chapters.

7.3 Spectral Windows vs. Individual Diodes

Small shifts in the position of a line on a photodiode array can lead to much larger shifts in the relative intensities of neighboring diodes. This problem arises due to the discrete sampling of the line shape by the photodiode array. In an extreme case, a shift of 0.04 diodes can produce a change in neighboring diodes of 30% [71]. This far exceeds the signal standard deviations of the order of 1% commonly seen in atomic emission spectroscopy. Such deviations would also lead to artificially high contributions to the residual spectrum, making the discrimination of interferent signal from random noise more difficult. To overcome this problem, the intensities across an entire spectral window of diodes spanning the line are integrated, rather than using the individual diode intensities. The integrated intensity is a much more stable quantity, since a slight shift of the line on the photodiode array may reduce the measured intensity at one diode, but will raise the intensity at the adjacent diode, so the sum will remain essentially constant.

7.4 Initial Selection of Suitable Diodes for Further Analysis

A preliminary analysis may be made with a data matrix containing diodes representing all the analyte lines (above a threshold) in the spectral window. However, this leads to the selection of a different number of diodes for each determination, which depends on the threshold chosen, the number of lines above the threshold, and their relative intensity for each element within the spectral window. Thus, the selection of an element with many intense lines within the window will lead to an initial matrix containing more diodes than that for an element with only a few intense lines within the window. The number of diodes chosen will depend on the concentration of the standard solution used. Higher concentrations will result in more diodes yielding signals above the threshold.
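The stability argument of Section 7.3 can be checked numerically with a small model. The sketch below assumes a Gaussian line of unit area with a standard deviation of 0.4 diodes, integrated over unit-width diodes; the line shape, width and function name are illustrative assumptions and are not intended to reproduce the 30% figure quoted from [71].

```python
import math


def diode_signals(centre, sigma, diodes):
    """Integrate a unit-area Gaussian line over each unit-width diode."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - centre) / (sigma * math.sqrt(2.0))))
    return [cdf(d + 0.5) - cdf(d - 0.5) for d in diodes]


window = [-2, -1, 0, 1, 2]                    # five-diode window across the line
before = diode_signals(0.00, 0.4, window)     # line centred on the middle diode
after = diode_signals(0.04, 0.4, window)      # same line shifted by 0.04 diodes

# Largest relative change seen at any single diode in the window.
largest_diode_change = max(abs(a - b) / b for a, b in zip(after, before))

# Relative change in the intensity integrated over the whole window.
window_change = abs(sum(after) - sum(before)) / sum(before)
```

In this model the individual diodes on the wings of the line change by well over 5% for a shift of 0.04 diodes, while the windowed sum is unchanged to better than one part in a million, because intensity lost at one diode is gained by its neighbor.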
The number of diodes chosen for each line will also vary since a more intense line will have a wider base, and will cover a larger number of diodes (Figure 7.1). The results from these factor analysis procedures may be biased by the presence of a few elements with very intense lines, at the expense of other elements. For an individual element, the more intense lines dominate because a larger number of diodes is selected, and this again could lead to an undesirable bias.

Figure 7.1. The number of diodes above a chosen threshold for a line will vary with the intensity and width of the peak (panels: 12 diodes; 4 diodes).

In an attempt to avoid these problems, a different method is adopted for the choice of the initial number of diodes. Regardless of the element to be examined, only the ten most intense lines for that element in the spectral window are selected, and for each line the five diodes covering the greatest intensity are chosen. These fifty diodes are the only ones retained for further processing.

The data are entered into a matrix where each column contains the spectral intensities for a single spectrum, at the fifty selected photodiodes in the array. Since the same fifty diodes are chosen for all spectra, each row in this data matrix contains the measured intensity at the same diode for different spectra (Figure 4.2). Ten standard spectra of the element of interest are contained in the matrix. A single spectrum of the sample to be analyzed is also included in the matrix. As mentioned previously, a single spectrum is chosen to facilitate illustration of the methods in use, but the ideas may be extended to accommodate more than one sample spectrum.

7.5 The First Cycle of Factor Analysis

A factor analysis, with target tests for the single element of interest, is performed as specified in previous chapters, and a residual vector is generated. This residual vector is used as a first approximation to the actual spectrum of an interferent in the sample matrix.
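For the two-factor case considered here, the generation of a residual orthogonal to the analyte factor can be pictured as a simple projection. The sketch below is a simplification and not the target-tested factor analysis of the text: it assumes one noise-free standard spectrum in place of the ten standards, and the function names and four-diode spectra are invented for illustration.

```python
def dot(a, b):
    """Inner product of two equal-length spectra."""
    return sum(x * y for x, y in zip(a, b))


def analyte_and_residual(sample, standard):
    """Project the sample onto the standard; what is left over is the residual."""
    scale = dot(sample, standard) / dot(standard, standard)
    residual = [x - scale * s for x, s in zip(sample, standard)]
    return scale, residual


# Hypothetical spectra over four diodes (invented for illustration).
standard = [100.0, 50.0, 10.0, 0.0]        # analyte standard
interferent = [0.0, 0.0, 20.0, 40.0]       # unknown to the analyst
sample = [0.3 * s + i for s, i in zip(standard, interferent)]

scale, residual = analyte_and_residual(sample, standard)
```

With these numbers the scale comes out near 0.316 rather than the true 0.3, because the interferent overlaps the analyte at the third diode, and the residual contains small negative values at the first two diodes. Both effects are the reason the residual is treated only as a first approximation and refined in later cycles.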
If more than one interferent is present in the sample, but only a single residual vector is generated, the residual vector will be an approximation to some linear combination of the spectra of all interferents. The residual vector defines a line in multidimensional space, but the direction must be chosen (from the two possibilities) to be positive. This ensures that the lines in the residual spectrum represent positive spectral intensities, as required in further stages of the data manipulation. If the sum of all intensities is not positive, the entire residual spectrum is inverted (multiplication by -1).

The residual is not an exact representation of the interferent spectrum, but only a mathematical approximation which fits the variance of the data. It is not uncommon for a few negative lines to appear along with a majority of positive lines in a residual spectrum (Figure 7.2). A non-negative restriction must be applied, and this is done by setting all negative values in the residual to zero.

Figure 7.2. A typical residual spectrum. Each of the ten bars represents a prediction of the sum of all interferent intensities over five diodes. (Horizontal axis: diode number.)

A second factor analysis is then done, and two factors are found. The first target vector is the standard spectrum, as before, and the second target vector is the non-negative residual spectrum. This generates a row matrix containing reconstructed spectra for the analyte and the residual, and a column matrix proportional to the concentrations of each in the sample. Multiplication by a previously entered standard concentration yields an estimate of the actual analyte concentration in the sample.

7.6 Subsequent Cycles of Factor Analysis

The next steps involve a cyclic process, where the most interfered-with line is rejected, and a factor analysis as described previously is applied to the remaining diodes. The most interfered-with line is determined by examining the residual.
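The handling of the residual described in Sections 7.5 and 7.6 (choosing the positive direction, zeroing negative values, and identifying the most interfered-with line) can be sketched as follows. The function names and data are hypothetical, each line is given only two diodes here instead of five for brevity, and applying the non-negative restriction before the per-line summation is an assumption about the order of operations.

```python
def orient_and_clip(residual):
    """Invert the residual if its summed intensity is negative, then zero negatives."""
    if sum(residual) < 0:
        residual = [-r for r in residual]
    return [max(r, 0.0) for r in residual]


def line_totals(residual, lines):
    """Sum the residual over the diodes assigned to each line."""
    return {name: sum(residual[d] for d in diodes) for name, diodes in lines.items()}


def most_interfered(residual, lines):
    """Name of the line with the largest summed residual."""
    totals = line_totals(orient_and_clip(residual), lines)
    return max(totals, key=totals.get)


# Hypothetical example: three lines of two diodes each.
lines = {"A": [0, 1], "B": [2, 3], "C": [4, 5]}
residual = [0.2, -0.1, -3.0, -4.0, -0.5, 0.3]   # arrives with the wrong sign

worst = most_interfered(residual, lines)
remaining = {name: d for name, d in lines.items() if name != worst}
```

Repeating this step after each new factor analysis, with the rejected line removed from the data matrix, gives the one-by-one elimination described in the text.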
The signals of all five of the diodes corresponding to each line are summed, giving a residual spectrum as seen in Figure 7.2. The line with the largest value in the residual is rejected by removing all intensity information for the corresponding diodes from the data matrix. The factor analysis of the remaining data generates another residual spectrum, along with an updated estimate of the analyte concentration. This process is continued until all lines have been rejected, one by one, and the best line remains.

7.7 Summary

Three options are available as variations of this approach to analysis.

1) The last line retained may be taken as the best line.
2) The line with the best signal/interference (target/residual) ratio may be used in a single line determination.
3) Once the first few lines have been rejected, most of the lines containing major interferences are successfully removed. The factor analysis using the remaining lines gives a multiline analysis, which in many cases may be preferred to a single line determination.

Chapter 8
Multivariate Analysis of Mixtures Which Exhibit Spectral Overlap

8.1 Introduction

Previous chapters have investigated and developed methods of analysis, and tested these methods using combinations of real single component spectra into synthetic mixtures of several components. This chapter will extend these methods and evaluate the performance of these methods in dealing with physical mixtures of several components.

8.2 Comparison of Spectral Overlap in Real and Simulated Spectra

Spectra generated from combinations of single component spectra (simulated spectra) and spectra of physical mixtures of several components appear very similar when closely compared. One cannot distinguish between a spectrum of a simulated mixture (such as in chapter 5) and a physical mixture (this chapter) when given a spectrum of each on a single sheet of plotter paper.
However, there are likely to be differences between them, particularly in the nature of the noise present, and possibly the structure of the background. These differences could result from interelement effects, which would appear only in physical mixtures. These effects may represent an additional uncertainty, or from the perspective of multivariate analysis, an additional factor, or may blur the determination of the number of factors present.

8.3 Experimental Data Collection

8.3.1 Instrumentation

The experimental apparatus used was identical to that already described in the previous chapter, with wide dynamic range spectra collected using the method of dynamic range enhancement previously described.

8.3.2 Criteria for the Selection of Spectral Windows

The lines to be initially considered in a determination were found by finding the ten most intense lines for the analytes, and selecting a window of diodes to cover the immediate spectral region of each line.

8.3.3 Selection of Suitable Analytes

A number of transition metals were chosen to be analytes and interferents because of the large number of lines present in their spectra, and their abundance in samples of analytical interest such as geological samples. The large number of lines is required for two reasons. First, a large number of alternative selections is then possible in choosing an analyte line. The line selection algorithms would then be challenged by the large number of possibilities, and the final choice or choices of the algorithm could be compared against conventional criteria for line selection (freedom from known interferences). Secondly, for any single element determination, the other elements in the sample would act as interferents, and the larger the number of interferent lines, the greater the possibility of spectral overlap. Again, the performance of the algorithm could be compared to conventional methods, where the wavelength and intensity of the interferences are known.
These elements all have intense clusters of lines between 325 and 365 nm. The photodiode array is most sensitive in this region, but can cover only 50 nm of spectrum simultaneously, hence the need for all elements chosen to have many intense lines within the same wavelength range. Co, Cr, Cu, Fe, Ni, and Mn met these criteria. La was chosen since it also has several intense lines across the spectral region chosen. It provided another analyte with many lines, from which the line selection algorithms could choose the most suitable. It would also act as a severe interference in the determination of other elements.

8.3.4 Preparation of Analyte Solutions

Standard aqueous solutions were prepared from the nitrates of Co, Cu, Ni and La (all 500 ppm), and Cr (375 ppm). The sulfates were used for Fe (2000 ppm) and Mn (500 ppm). A number of mixtures were prepared with combinations of some or all of these elements, with Co at 100 ppm; Cu, Ni, Mn, Fe and La at 50 ppm; and Cr at 37.5 ppm. The results reported in the following sections are for the most complex mixture, containing all seven components.

8.3.5 Identification of Analyte Lines

The wavelengths of each of the analyte lines used in this study are given in Table 8.1. Lines are henceforth referred to by diode number. The photodiode array was calibrated by matching photodiode array intensity maxima with tabulated wavelengths for each of the ten Fe lines [72]. A second order polynomial fit gave the relationship between diode number and wavelength. The wavelength obtained from the polynomial fit was in all cases within 0.005 nm of the tabulated wavelength. For all other lines (10 each for Co, Cr and La) wavelengths were assigned by locating the closest line to the predicted wavelength. In all cases the assignments were unambiguous. As the methods applied in this study are designed to indicate interferences without the need for wavelength tables, the wavelengths of interferent lines have not been tabulated.

Table 8.1 Wavelengths of analyte lines used.

Element   Diode number   Wavelength (nm)   Atom (I) or ion (II) line
Fe         146           364.784           I
Fe         286           363.146           I
Fe         395           361.877           I
Fe         480           360.886           I
Fe         717           358.120           I
Fe         812           357.010           I
Fe         852           356.538           I
Fe        1491           349.058           I
Fe        1620           347.545           I
Fe        1916           344.061           I
Co         818           356.938           I
Co        1156           352.981           I
Co        1357           350.632           I
Co        1391           350.228           I
Co        1632           347.402           I
Co        1807           345.350           I
Co        1844           344.917           I
Co        1892           344.364           I
Co        2158           341.234           I
Co        2219           340.512           I
Cr         510           360.533           I
Cr         612           359.349           I
Cr         738           357.869           I
Cr        1979           343.331           II
Cr        2069           342.274           II
Cr        2188           340.876           II
Cr        2235           340.332           II
Cr        2535           336.805           II
Cr        2601           336.030           II
Cr        2616           335.850           II
La         167           364.542           II
La         309           362.883           II
La        2425           338.091           II
La        2464           337.633           II
La        2735           334.456           II
La        2795           333.749           II
La        3087           330.311           II
La        3405           326.567           II
La        3543           324.935           II
La        3579           324.513           II

8.4 Application of Methods

A Note on Data Presentation

As a consequence of the volume of data available from an analysis, considerable compression into a more suitable form for presentation is required. A number of graphs have been generated, illustrating the methods described in this thesis, and showing the progress made during each cycle of the analysis. Many figures portray spectra as bar graphs, where each bar represents the summed intensity of all the diodes used for any one line. Both real spectra (initial data) and residual spectra (generated as the analysis proceeds) are represented in this manner. The units of intensity for real spectra are derived directly from the read-out of the photodiode array.
These real spectra are normalized to the same scale to give an equivalent dynamic range enhanced spectrum (as described in this thesis) at an integration time of 64 units (corresponding to approximately 0.57 seconds). The units of intensity for residual spectra are arbitrary. Several determinations for different elements will be examined, and graphs illustrating the methods described in this paper will be discussed.

8.4.1 Single Element Determinations

8.4.1.1 Co in the Presence of Interferents

A sample was prepared from the standards containing Co as the analyte (100 ppm), and La, Cu, Ni, Fe, Mn (all 50 ppm) and Cr (37.5 ppm) as interferents. Figure 8.1 is a plot of the calculated concentration of Co as a function of the number of lines rejected, for the sample above. The first point represents the result of a multivariate determination for which all analyte lines are retained (none rejected), so that all interferences are still present. Subsequent points represent the result of a multivariate determination with one, two, or more lines removed from the data set. Lines are removed sequentially on the basis of interference, as indicated in the residual spectrum. For all points, a single residual spectrum is incorporated into the fit to account for the contribution from the unidentified interferents.

In a case where the data are fitted to a single parameter representing the Co spectrum, a large error due to unquantified interferents would be expected. Although the presence of interferents is known, their precise effect on the measured intensity of each line is not. The residual spectrum provides a first approximation to the interferent spectrum. The inclusion of a residual spectrum in the fit produces a better fit for Co by allowing for the variance due to the presence of an interferent.
In other words, the incorporation of a residual spectrum which models the interferences provides an improvement over the representation of a two component system by only a single Co spectrum.

Figure 8.1. As each line is sequentially rejected, the concentration of Co (in ppm) is recalculated in a multivariate analysis using the remaining lines. (Horizontal axis: Number of Lines Rejected, 0 to 10.)

The result obtained for the concentration of Co when a residual is incorporated into the determination is reasonably close to the actual value (100 ppm), even though the interferences have not yet been removed. Following the first cycle, the line with the largest indicated interference is removed from the data set, and another analysis is performed. The calculated concentration drifts considerably over the first few cycles, depending on how well the residual and the Co standard fit the sample spectrum. As the lines with major interferences are removed, the concentration of Co tends towards a consistent value (Figure 8.1). The interpretation of these graphs is complicated by the scatter in the points, an indication of noise in the determination.

Each point on this graph represents the result of a full multivariate determination using all remaining lines after each cycle of factor analysis. Each cycle yields an improvement in the fit for Co as lines with interferences are removed, and less interfered-with lines are retained. However, each subsequent cycle of analysis also used fewer lines, and therefore less information, for the determination. The last line retained may not be the strongest of the ten lines chosen. For the photodiode array used for this paper, the readout noise is high and the signal-to-noise ratio decreases with decreasing line intensity, so an analysis with a weak line would be expected to give a less precise value for Co concentration.
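The cycle behind each point in a plot such as Figure 8.1 (fit, sum the residual over the diodes of each line, reject the line with the largest value, refit) may be sketched as follows. This is an illustration under stated assumptions, not the procedure actually coded for this work: ordinary least squares stands in for the target factor analysis, one standard spectrum replaces the ten standards, and the names `fit_with_residual` and `reject_lines` are hypothetical.

```python
import numpy as np

def fit_with_residual(standard, sample):
    """Return (analyte coefficient, non-negative residual) for one cycle."""
    a = standard @ sample / (standard @ standard)
    residual = sample - a * standard
    if residual.sum() < 0:            # choose the positive direction
        residual = -residual
    residual = np.clip(residual, 0.0, None)
    if not residual.any():            # nothing left to model
        return a, residual
    targets = np.column_stack([standard, residual])
    coeffs, *_ = np.linalg.lstsq(targets, sample, rcond=None)
    return coeffs[0], residual

def reject_lines(standard, sample, std_conc, diodes_per_line=5):
    """Sequentially reject the line with the largest summed residual.

    standard, sample: arrays of length n_lines * diodes_per_line, with the
    diodes of each line stored contiguously.
    Returns a list of (rejected line index, concentration computed from the
    lines still present before that rejection). The last entry names the
    last line retained.
    """
    standard = np.asarray(standard, dtype=float).reshape(-1, diodes_per_line)
    sample = np.asarray(sample, dtype=float).reshape(-1, diodes_per_line)
    remaining = list(range(standard.shape[0]))
    history = []
    while remaining:
        s = standard[remaining].ravel()
        x = sample[remaining].ravel()
        coeff, residual = fit_with_residual(s, x)
        # Sum the residual over the diodes belonging to each line.
        per_line = residual.reshape(-1, diodes_per_line).sum(axis=1)
        worst = remaining[int(np.argmax(per_line))]
        history.append((worst, coeff * std_conc))
        remaining.remove(worst)
    return history
```

Each entry of the returned list corresponds to one point on a concentration versus lines-rejected plot. When several lines show the same summed residual, this sketch rejects the first of them, which is an arbitrary tie-break.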
The optimum result for a multiline determination of Co is likely to be a balance between the removal of most of the interferences, and the removal of useful analytical lines in the same process. Simultaneous analyses of 1-4 analytes in the presence of 1-5 interferents suggest that an optimum exists between these two extremes. The most interfered-with lines must be removed, but each removal also results in a decrease in analytical information for the determination.

Figure 8.2a is a plot of the spectrum of 100 ppm Co, with the sum of the signals from 5 diodes for each line represented by a vertical bar. The diode numbers listed on the horizontal axis refer to the position of the line on the photodiode array, and appear in the order in which they are rejected, from left to right. All the Co lines initially chosen are reasonably intense. The most intense line at diode 1807 is roughly three times as intense as the weakest line at diode 1844. Therefore, the removal of any one line from the data matrix should not remove a large percentage of the useful analytical information.

The question arises as to how many lines need to be removed before all the major interferences may be considered as removed. Figure 8.2b is a plot of the actual sum of all the interferents present. Normally the identity of the interferents would not be known, so a graph such as this would not be available. Since the compositions of the samples in this study are precisely known, the progress of the method can be followed and evaluated. The lines with the largest interference are removed first, and after the first three lines at diodes 1632, 818, and 1807 are removed, contributions from interferents are small.

The residual spectrum models the spectrum of the interferents. Figure 8.2c gives an estimate of the sum of all the contributing interferences as obtained in a residual spectrum from the factor analysis.
The agreement between the actual interferences and the residual is good, with the first two rejected lines at diodes 1632 and 818 clearly indicated. This graph was generated without knowledge of the identity of the interferents, or even the number of interferents present. The close match of the residual to the actual interferents shows that the residual vector can be a reliable indicator of the relative degree of interference for each analytical line. The residual spectrum does not indicate that the line at diode 1807 suffers from severe interference, but it does rank it correctly as the third most interfered-with of the lines, and as such it is the third line to be rejected. The rejection process may then be expected to continue, with the identification of subsequent interferences and the rejection of lines in roughly their order of severity.

Figure 8.2. (a) The ten most intense lines for the spectrum of Co in the region of 340 nm. Each of the ten bars represents the sum of the five diodes with the greatest intensity for each line. (b) The actual sum of all interferent intensities for the ten Co lines. (c) An estimate of the sum of all interferent intensities for the ten Co lines, as obtained from the residual spectrum. (d) The ratio of the signal (Co spectrum) to the actual sum of the interferents. (e) The ratio of the signal (Co spectrum) to the estimated sum of the interferents, obtained from the residual spectrum. (Horizontal axes: Diode Number.)

Figure 8.2d is a plot of the actual signal/interference ratio for the determination of Co in the presence of several interferents. This ratio is obtained by dividing the signals for the analyte (Figure 8.2a) by the summed signals for the interferents (Figure 8.2b). Such a graph would not normally be available since knowledge of the actual spectrum of the interferents is required. Figure 8.2e is a plot of the signal/residual ratio, generated by dividing the analyte signals by the residual signals.
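The division that produces a signal/residual plot can be sketched in a few lines. The function name `target_residual_ratio` is hypothetical, and the treatment of lines whose residual has been set to zero (assigned an infinite ratio here) is an assumption of this sketch rather than something specified in the text.

```python
import numpy as np

def target_residual_ratio(target, residual):
    """Per-line ratio of summed analyte signal to summed residual signal.

    Lines with a zero residual (for example after the non-negative
    restriction) are given an infinite ratio in this sketch.
    """
    target = np.asarray(target, dtype=float)
    residual = np.asarray(residual, dtype=float)
    ratio = np.full(target.shape, np.inf)
    nonzero = residual > 0
    ratio[nonzero] = target[nonzero] / residual[nonzero]
    return ratio
```

Taking the index of the largest ratio then corresponds to the second option of Section 7.7, the line with the best target/residual ratio.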
The information for this graph may be obtained without any knowledge of the nature of the interferents. Examination of this graph suggests that the line with the least interference is the last one retained (at diode 2219), showing that the order of rejection is related to the amount of interference, leaving the best line for last. The second least interfered-with line in this graph is the one at diode 1391. Thus, on the basis of the residual spectra generated, two lines are suggested, the best at diode 2219, and a close second at diode 1391. A comparison of Figure 8.2d with Figure 8.2e reveals that the second best line on the basis of signal/interference ratio (at diode 1391) is a good choice. Therefore, the method was successful in identifying the most useful lines to use for an analysis.

Two approaches may be taken in the interpretation of these data, with the goal being the selection of the best single line. The last line retained may be taken as the best line, as mentioned previously as a first option. Alternatively, the line with the highest signal/interference ratio (target/residual) may be taken as the best line, as mentioned previously as a second option. A third option takes an intermediate multiline result as seen in Figure 8.1. This third option goes beyond the immediate goal for optimized single line selection, and is not discussed here.

In the determination of Co, the same line is indicated by both the first two options. The line suggested was not the best according to the actual signal/interference ratio, but was nevertheless a reasonable choice. Further examination of analyses for other elements will show that when both options do not select the same line, either line suggested is a good choice.

8.4.1.2 Fe in the Presence of Interferents

A sample was prepared from the standards containing Fe as the analyte (50 ppm), with La, Cu, Ni, Mn (all 50 ppm), Co (100 ppm), and Cr (37.5 ppm) as interferents.
Figures 8.3 and 8.4 are a set of graphs which outline the determination of Fe in the presence of these interferents. These graphs may be interpreted in a manner similar to that used for the analogous graphs presented for the determination of Co.

Figure 8.3. As each line is sequentially rejected, the concentration of Fe (in ppm) is recalculated in a multivariate analysis using the remaining lines. (Horizontal axis: Number of Lines Rejected, 0 to 10.)

Figure 8.4. (a) The ten most intense lines for the spectrum of Fe in the region of 340 nm. Each of the ten bars represents the sum of the five diodes with the greatest intensity for each line. (b) The actual sum of all interferent intensities for the ten Fe lines. (c) An estimate of the sum of all interferent intensities for the ten Fe lines, as obtained from the residual spectrum. (d) The ratio of the signal (Fe spectrum) to the actual sum of the interferents. (e) The ratio of the signal (Fe spectrum) to the estimated sum of the interferents, obtained from the residual spectrum. (Horizontal axes: Diode Number.)

There is one major interference at diode 812, which is removed in the first cycle (Figures 8.4b and 8.4c). By either option, the best line is the last line retained at diode 717, as seen in Figures 8.4d and 8.4e. The line chosen is also the best line by several other criteria, having the best actual signal/interference ratio, and the greatest intensity of the Fe lines used in this analysis. There is at least one severe interference at diode 812, but it is easily detected. A number of less severe interferences at the remaining diodes are sequentially eliminated, leaving a clear choice of diode 717 for the best line.

The calculated concentrations of Fe are systematically higher than the actual concentrations in the sample. This is due to slight hydrolysis in the 2000 ppm Fe standard.
The resulting lower concentration in the standard gives a systematically higher apparent concentration in the sample mixture.

8.4.1.3 Cr in the Presence of Interferents

A sample was prepared from the standards, containing Cr as the analyte (37.5 ppm), and La, Cu, Ni, Fe, Mn (all 50 ppm) and Co (100 ppm) as interferents. Figures 8.5 and 8.6 show graphs for the determination of Cr in the presence of these interferents. Examination of these graphs yields some differences from the two previous determinations.

In the plot of estimated Cr concentration vs. number of cycles (Figure 8.5), there is a sudden jump between the fifth and sixth line rejected. Thereafter the values seem to be stable. In Figure 8.6d, the highest signal to interference ratio is for the line centered at diode 612, yet this line is rejected early, in the third cycle. In Figure 8.6c, there is a negative intensity in the residual for the line at diode 738, which seems to be responsible for the sudden jump in concentration in Figure 8.5. Such a negative intensity can be explained by a poor overall fit of the target spectrum (Cr) and residual to the data.

Figure 8.5. As each line is sequentially rejected, the concentration of Cr (in ppm) is recalculated in a multivariate analysis using the remaining lines.

Figure 8.6. (a) The ten most intense lines for the spectrum of Cr in the region of 340 nm. Each of the ten bars represents the sum of the five diodes with the greatest intensity for each line. (b) The actual sum of all interferent intensities for the ten Cr lines. (c) An estimate of the sum of all interferent intensities for the ten Cr lines, as obtained from the residual spectrum. (d) The ratio of the signal (Cr spectrum) to the actual sum of the interferents. (e) The ratio of the signal (Cr spectrum) to the estimated sum of the interferents, obtained from the residual spectrum. (Horizontal axes: Diode Number.)
With the exception of this line, the method performs as expected, selecting the best line (based on signal/interferent ratio) from the remaining four lines. The problem lies in the fact that the two best lines, at diodes 612 and 2535, are rejected before the line at diode 738.

In an attempt to understand this problem better, an additional indicator to track the progress of the analysis was used. If the contributions to the residual are small, as is the case when a fit to the target is very close, the mean value of the residual would be expected to be small, as would the standard deviation of the residual. In Figure 8.7, the values for the residual mean and the residual standard deviation are plotted as a function of the number of lines rejected. The large drop over the first two cycles corresponds to the rejection of the lines at diodes 1979 and 2188, which both have major interferences (Figure 8.6b). These interferences are also noted in the residual (Figure 8.6c). One remaining major interference is seen at diode 510, and it is expected that this line would be rejected next. However, the next line rejected is at diode 612, which is the best line on the basis of signal to interferent intensity. This is accompanied by a minimal drop in the residual mean and standard deviation (Figure 8.7). The next line to be rejected is at diode 510, as would be expected from the actual sum of the interferences (Figure 8.6b), but this produces no significant drop in the residual mean and standard deviation. In fact the mean and standard deviation do not drop until the line at diode 738 is removed.

Figure 8.7. The residual mean (squares) and the residual standard deviation (diamonds) for the determination of Cr drop as the lines with the worst interferences are sequentially rejected. (Horizontal axis: Number of Lines Rejected, 0 to 10.)

For the analyses of Co and Fe, the ten lines chosen were all atom lines.
Of the ten lines chosen for Cr, seven are ion lines, and three (at diodes 510, 612 and 738) are atom lines. It is possible that the anomalous behavior of Cr in this analysis is related to the mixture of atom and ion lines for a single element in the multivariate analysis. All ion line intensities are well correlated since they are produced by the same species. The atom lines should be similarly well correlated. A shift in the ion to atom equilibrium will, however, affect the intensities of atom lines relative to ion lines, so the correlation between ion line and atom line intensities will be lower. Such an effect would cause a factor analysis to give a poorer fit in a one factor determination. Thus the elemental spectrum would be better represented by two factors, one for the atom spectrum and one for the ion spectrum.

8.4.1.4 La in the Presence of Interferents

A sample was prepared from the standards, containing La as the analyte (50 ppm), and Cu, Ni, Fe, Mn (all 50 ppm), Cr (37.5 ppm) and Co (100 ppm) as interferents. A set of graphs for the determination of La in the presence of several interferents is given in Figures 8.8 and 8.9.

For the La lines chosen, there is one major interference, which is removed in the first cycle, and a number of less severe interferences, which are removed successively. The line finally chosen at diode 3543 is a reasonable choice, being the second best on the basis of the actual signal/interference ratio (Figure 8.9d), and within 20% of equalling the best line at diode 2795 in analyte intensity.

Figure 8.8. As each line is sequentially rejected, the concentration of La (in ppm) is recalculated in a multivariate analysis using the remaining lines. (Horizontal axis: Number of Lines Rejected.)

Figure 8.9. (a) The ten most intense lines for the spectrum of La in the region of 340 nm. Each of the ten bars represents the sum of the five diodes with the greatest intensity for each line. (b) The actual sum of all interferent intensities for the ten La lines. (c) An estimate of the sum of all interferent intensities for the ten La lines, as obtained from the residual spectrum. (d) The ratio of the signal (La spectrum) to the actual sum of the interferents. (e) The ratio of the signal (La spectrum) to the estimated sum of the interferents, obtained from the residual spectrum. (f) The residual mean (squares) and the residual standard deviation (diamonds) for the determination of La drop as the lines with the worst interferences are sequentially rejected. (Horizontal axes of the bar graphs: Diode Number.)

The best line, as suggested by the last retained line at diode 3543, is not the same as the line suggested by the target/residual ratio (Figure 8.9e). The target/residual ratio suggests two other lines; the first at diode 3087, and a close second at diode 2735. These are the third and fourth best lines on the basis of the actual signal/interference ratio (Figure 8.9d). Any of these three lines is quite acceptable for a single line determination of La. In this sense the method was successful in identifying a suitable line. However, another likely candidate at diode 2795 was passed over. These four lines are all within a factor of two for signal/interference ratio (Figure 8.9d), and the line at diode 2795 is the third choice when target/residual ratio is considered. These lines are close in signal/interference ratio, so it is to be expected that there would be some randomness in the order of choice.
On the basis of absolute analyte intensity, the line at diode 2795 is the best choice, but the interferences are also greater, so the ratio of signal to interference is probably a better measure of analytical utility.

The ten lines used in the analysis of La were all ion lines, and the analysis behaved as expected. This further supports the hypothesis that the Cr results are affected by the use of a mixture of atom and ion lines.

For the determination of Co, Cr, and La, the difference between actual and calculated concentrations (e.g. Figure 8.1) after all the interferences have been removed is due to the high noise level of the array itself.

8.5 Summary

A method has been described for selecting a suitable line for analysis when there are several lines available, some of which may suffer from spectral interferences. The method is best suited to multiplex spectrometers which provide a window of spectral information. In most cases a suitable line is correctly identified. As no knowledge of the identity of the interferent(s) is required, this method has an advantage over other methods which must make use of spectral tables. These methods should be particularly useful for the selection of suitable analytical lines for a complex sample with untabulated interferences.

Chapter 9
Automated Analytical Line Selection for a Unique Polychromator Design

9.1 Introduction - A Unique Spectrometer Design

A new type of spectrometer (the Leco Plasmarray) combines the high resolution of an echelle spectrometer with wavelength preselection capability and photodiode array detection. Previous studies have evaluated its performance [7,5,48], but a combination of the unique features of this spectrometer with the tools of multivariate analysis has not previously been explored. The unique features which make this spectrometer ideal for multielement analysis will now be discussed.
9.1.1 Polychromator Design

9.1.1.1 Echelle Dispersion

The Leco spectrometer is built around an echelle grating, but unlike other echelle based spectrometers, it does not use cross dispersion to separate the orders. Instead, only certain narrow wavelength ranges are selected using a low resolution predispersion grating and a spectral mask. Narrow slots in a spectral mask can be used to select a few spectral windows of interest. The combination of echelle spectrometer and photodiode array can then view these selected windows with high resolution. The system will now be described in greater detail.

In a conventional echelle spectrometer, the echelle grating provides high dispersion, and thus high resolution, but the light must be cross dispersed into a two dimensional array to avoid order overlap. If the light is focussed on a one dimensional photodiode array without cross dispersion, high resolution is still maintained, but severe order overlaps render the spectra unusable. Over 160 orders may overlap on the array (from order 298 for Re 189.836 nm to order 135 for Eu 420.505 nm). As the wavelength increases, the order which actually falls on the array decreases. All other orders of the same wavelength are lost as stray light inside the polychromator and are absorbed by the dark inner surfaces of the enclosure. For example, a line at 235.028 nm will fall on the array only in the 241st order, while a line at 345.523 nm falls on the array in the 164th order.

The problem is not that the orders are overlapped, but that the amount of spectral information falling on the array is far greater than the number of pixels. If all except the information in a single narrow wavelength range can be filtered out, a high resolution spectrum of a narrow spectral window is obtained. This can be accomplished by preselection in an optical stage previous to the echelle grating.
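The order numbers quoted above are consistent with the grating equation for a fixed viewing geometry, in which the product of order and wavelength is approximately constant for light reaching the array. The short sketch below checks this using only the four wavelength/order pairs given in the text. The grating constant and angles of the instrument are not stated here, so the constant `k` is estimated from those four pairs, and the function name `order_on_array` is hypothetical.

```python
# (wavelength in nm, echelle order on the array), as quoted in the text.
quoted = [(189.836, 298), (235.028, 241), (345.523, 164), (420.505, 135)]

# For a fixed geometry, order * wavelength is roughly constant.
products = [wavelength * order for wavelength, order in quoted]
k = sum(products) / len(products)   # estimated constant, in nm

def order_on_array(wavelength_nm):
    """Estimate which echelle order of a wavelength lands on the array."""
    return round(k / wavelength_nm)

for wavelength, order in quoted:
    print(wavelength, order, order_on_array(wavelength))
```

The four products differ by well under one percent, which is why a single estimated constant reproduces each quoted order. The estimate is only a rough guide: the order that actually reaches the array depends on where the line falls across the width of the array.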
9.1.1.2 Wavelength Preselection

This section of the spectrometer allows light in only a narrow wavelength range to fall on the array, with high dispersion, and makes possible a high resolution narrow wavelength coverage spectrometer. This is advantageous when only a narrow wavelength range is required, as in a single element analysis utilizing a single line. In practice, more than one window may be selected at the same time, since the possibility of order overlap of the lines within only two or three (or even ten) spectral windows is small, and multiwavelength analysis is viable.

Such a wavelength filter is present in the first half of the Leco spectrometer described here. The light is dispersed, selected wavelength windows are chosen from the dispersed spectrum (using a spectral mask), and the light from these windows is undispersed back into a quasi-white light. This quasi-white light may then be treated as if it were the original light from the source, and passed through any sort of wavelength dispersing polychromator, which is an echelle grating system in the case of the Leco spectrometer.

A schematic is shown in Figure 9.1. The source (ICP) is imaged on an entrance slit by a lens. The light which passes through the entrance slit falls on a low dispersion concave grating (G1), called a "predispersion" grating. At the focal plane of the concave grating, the entire spectrum is imaged over a plane approximately 10 cm wide. A mask is placed at this focal plane, and the slots in the mask are cut to allow through only selected narrow windows of spectral information. The quasi-white light is recollimated by a concave mirror (M1) onto a second grating (G2). The second grating undisperses the light, so that a parallel beam of quasi-white light is passed to the next section of the spectrometer. In the next section, the quasi-white light falls on a high dispersion echelle grating.
After dispersion, the dispersed quasi-white light from the echelle grating is imaged on the photodiode array using a second concave mirror (M2), and a lens. Only one, or occasionally two orders will fall on the photodiode array for a given wavelength. Each mask slot results in a spectral window imaged on the array, but the echelle order will be different for each window.

Figure 9.1. Schematic diagram of an echelle spectrometer with wavelength preselection and photodiode array detection. Light from a source is imaged first on an entrance slit. A low dispersion concave grating (G1) disperses the light from the slit, and images a spectrum across the face of a mask. Slots cut on the mask allow only a narrow window of wavelengths to pass through. Several slots may be cut in the mask to provide several independent wavelength windows, although only one is shown here for clarity. The dispersed light which passes the mask is recollimated by a concave mirror (M1). The wavelength windows are recombined by an "undispersing" grating (G2). The resulting beam of collimated quasi-white light is passed to a high resolution echelle grating. Of the many overlapping orders from the echelle grating, only one (or two at most) for any wavelength will hit the final mirror. This final concave mirror (M2) images high resolution windows of spectral information (selected by the mask slots) onto a 1024 pixel photodiode array.

9.1.1.3 Wavelength and Diode Position

Figure 9.2 schematically shows the position of four spectral windows resulting from four mask slots. The photodiode array axis is no longer a linear wavelength axis, as each window has its own wavelength limits, which are in turn determined by slot position in the mask. The wavelength of an observed line on the photodiode array may be determined once the identity of the mask slot through which it passed is known. Knowledge of the order which would fall on the array, and the position of the line on the array indicates the wavelength. A new type of spectral interference, due to the possible overlap of two different spectral windows, is illustrated by the windows from slots I and III. Although they are well separated in wavelength, the orders which fall on the array are also different, and this may result in an overlap of two lines.

Figure 9.2. Narrow wavelength windows selected by a spectral mask are imaged on a photodiode array. Four slots (I-IV) are shown in this example. The echelle grating produces a high resolution image of the narrow range of wavelengths which passed through a slot. The position of a spectral line on the array will depend upon its wavelength, and upon the order which falls on the array. The orders corresponding to each window in this example are shown on the right axis. Windows I and III partially overlap when they fall on the photodiode array, illustrating a further type of spectral overlap, order overlap.

9.1.2 Masks

9.1.2.1 Scanning Mask

A narrow slot in an opaque band (scanning mask) can be moved to select a window over a restricted wavelength range. Only wavelengths within this narrow range pass through the scanning mask to the echelle grating in the latter half of the spectrometer. This wavelength window is imaged in a single order across the entire photodiode array, providing high resolution over a restricted range of wavelengths.

9.1.2.2 Fixed Masks

Rather than a movable scanning mask, a fixed mask with precut slots can be placed at the focal point of the predispersion grating, and several windows may be simultaneously imaged on the array. The vast majority of the spectral information is still filtered out, and only a few narrow wavelength ranges of interest are allowed through to the array.
This allows for the tailoring of a mask to select any combination of wavelength ranges, and provides for full wavelength coverage while retaining high resolution.

9.1.2.2.1 Single Element Masks

Wavelength ranges may be chosen to select several lines for a single element. In this way, ten or more of the most prominent lines for an element can be selected to fall on the array with a minimum of order overlap with other spectral features. The availability of several lines allows the decision on line selection for the data processing steps of an analysis to be made after the data collection is complete. If one line is found to suffer from an interference, another line can be chosen from the remaining lines, without the need to collect another spectrum.

9.1.2.2.2 Multielement Masks

The wavelength ranges may also be chosen to select one line each for a set of different elements. Up to 24 elements have been programmed into one multielement mask. The greater the number of slots in the mask, the greater the possibility that a signal which has passed through another slot will interfere with an analytical line. Care must be taken in the initial selection of a set of lines for a multielement mask, to avoid possible order overlaps. This is discussed in greater detail in the sections to follow.

9.2 Spectral Features

Many of the unique features of this spectrometer design are best illustrated by example. A number of spectra of Co were taken, and evaluated with respect to several parameters. Six spectral analysis windows were defined on the array. Each analysis window was centered on a prominent emission line of Co, and corresponded to the central portion of the spectral windows produced by a Co mask. The maxima of each of these lines fell within three diodes of the predicted position (Table 9.1).

Table 9.1. Prominent lines of Co used in analysis.

Wavelength (nm)   Diode   Order   Predicted Diode   Notes
228.616             19     247          16          buddy line
230.786            205     245         204          buddy line
234.739            279     241         277          primary
238.892            392     237         391          primary
345.350            445     164         442          primary
228.616            592     248         591          primary

9.2.1 Buddy Lines

The 228.616 nm line appears as two separate peaks in the photodiode array spectrum. It appears in the 248th order at diode 592, which is the order designated to fall on the array for this wavelength. It also appears in the 247th order at diode 19. This second image of the 228.616 nm line also falls on the array. The array covers a slightly greater range than that which would be covered by a single order, so it is common to see the next higher, or next lower, order of a line appear near the ends of the array. These N+1 or N-1 order images have been labelled "buddy" lines, since they will always appear in concert with the primary image (Nth order). In some cases the N+1 or N-1 order line will fall just off the array, so no buddy lines will be detected. Only the primary image in the Nth order will be seen. The intensity of the buddy line relative to the primary line will vary, depending upon the order which falls on the array. The buddy line can even be more intense than its primary line, as is the case above with the 230.786 nm line.

9.2.2 Leak Lines

9.2.2.1 Mask Elements

A spectrum of 1000 ppm Co in aqueous solution, with the Co mask in the optical path, is seen in Figure 9.3. The six lines chosen above are the six most intense lines in this spectrum. A number of other, weaker lines are also seen. These are also Co lines, but the mask was not designed specifically to pass their wavelengths. Rather, these lines are close enough in wavelength to the mask lines, and the mask slots are wide enough, that the lines were able to pass through one of the mask slots. Since these lines were not chosen for their analytical utility, but just happened to pass through the Co mask, they are termed "leak" lines.
The most prominent of the leak lines are identified in Table 9.2. These lines are not used further in this analysis.

Table 9.2. Prominent leak lines of Co through a Co mask.

Wavelength (nm)   Order   Predicted Diode   Notes
238.636            237         238
228.352            248         426
238.954            237         428
230.786            246         784          primary line (buddy at diode 204)

Figure 9.3. 1000 ppm Co in aqueous solution, as seen through a Co mask.

Figure 9.4. Positions of leak lines due to Co which pass through a Co mask.

A more detailed summary of possible leak lines and their positions on the array is given in Figure 9.4. The positions of these lines are given by thin vertical bars placed along the diode axis. Primary lines are shown as solid bars, while buddy lines are shown as dashed bars. Above each line, the wavelength of the line and its identity is given.
If the line is the desired line for which the mask slot has been cut, it is identified further as either an atom line (Co 1) or an ion line (Co 2). If the mask was not designed to pass the listed line, it is labelled only as a leak line. The next number gives the order which falls on that position of the array, and finally a predicted diode position is listed.

9.2.2.2 Other Elements

Leak lines from other elements in a complex matrix are a more serious threat to the accuracy of an analysis. An example of this is given in Figure 9.5. A Co mask is used to isolate the Co lines in a mixture of 10 elements. Comparison of this spectrum with that of a pure Co solution (Figure 9.3) shows that several additional lines are present. The most prominent of these are due to Cd (228.802 nm) and Fe (238.863 nm). A number of other interelement leak lines have been labelled with an asterisk, but are not explicitly identified. In addition, it is possible that direct overlap of an interferent line with an analyte line may be present. This is difficult to detect except through multivariate methods, since the only evidence for such an overlap is a slight increase in intensity of one analyte line relative to other analyte lines of the same species.

Figure 9.5. 50 ppm Co in aqueous solution, also containing 9 other possible interferent elements (Cd, Pb, Cu, Cr, Mn, Ni, Fe, Zn, and Y, all 50 ppm).

9.2.3 Order Overlap: An Additional Spectral Interference

The proliferation of leak lines reflects the difficulty of choosing an appropriate position for a mask slot. In a multielement mask, the leak lines from one element may spectrally overlap the primary line of a second element. Overlap of different orders further complicates the situation. For example, the Co leak line at 344.944 nm falls at diode 274 in the 164th order. The Co leak line at 230.902 nm also falls at diode 274, but in the 245th order.
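The coincidence of these two leak lines can be checked against the grating equation. As a rough sketch, assuming both orders share the same angles of incidence and diffraction at the echelle (an idealization, not a description of the actual Leco optics), two lines reach the same diode when the product of order and wavelength is the same. The function name below is illustrative only.

```python
def order_wavelength_product(order, wavelength_nm):
    """Return m * lambda, which fixes the diffraction angle for a given grating geometry."""
    return order * wavelength_nm

# Values quoted in the text for the two Co leak lines at diode 274.
product_164 = order_wavelength_product(164, 344.944)
product_245 = order_wavelength_product(245, 230.902)

# Relative mismatch between the two products.
relative_mismatch = abs(product_164 - product_245) / product_164

print(f"m*lambda, order 164: {product_164:.3f} nm")
print(f"m*lambda, order 245: {product_245:.3f} nm")
print(f"relative mismatch:   {relative_mismatch:.1e}")
```

The two products agree to within a few parts per million, which is consistent with both lines being imaged at the same diode.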
Thus we have a direct spectral overlap of two lines whose wavelengths differ by 114.042 nm. The overlap occurs since the two lines fall at the same diode position on the array, even though they have different wavelengths and are in different orders from the echelle grating. Leak lines of the same element in a single element determination are not generally a problem, since the intensity of a leak line which overlaps a primary line is proportional to that of the primary line, and appears in both standards and samples. However, leak lines of elements other than the analyte can cause serious problems with intensity measurements unless an appropriate correction factor can be found. This is where multivariate methods will be shown to be most useful.

9.3 Performance of the Photodiode Array

The performance of the photodiode array is an important parameter in determining the overall performance of a polychromator system. The 1024 pixel photodiode array supplied with the Leco spectrometer was evaluated with respect to sensitivity and resolution, for comparison to other systems.

9.3.1 Sensitivity

The sensitivity is limited by the thermal noise in the array, which gives a random background signal (dark current). Peltier coolers on the array allow cooling to at least -35°C. The temperature is adjusted by setting a thermostat within a box containing the control electronics for the photodiode array. A lower temperature of -40°C was attempted, but could not be reliably maintained. This was due to seasonal and climatic variations in the temperature of the tap water used to cool the opposite side of the Peltier cooler. The array was cooled to -35°C for all experiments carried out, and the temperature was stable within 0.1°C. For a series of consecutive spectra of dark current, the standard deviation of a single pixel intensity did not exceed ±2 intensity units (out of 16384 for a 14 bit A/D converter).
The sensitivity of a photodiode array is generally less than that of a photomultiplier tube, but by increasing the integration time, comparable sensitivities can be obtained. A second problem with photodiode arrays is that their sensitivity falls off dramatically at shorter wavelengths. This is most apparent with lines in the region below 200 nm. One of the most commonly used lines for As is located at 193.696 nm (Figure 9.6a). Three other lines appear when using an As mask, at 228.812, 234.984, and 278.022 nm. The 193.696 nm line is far less intense relative to other lines on the Leco Plasmarray system than on systems using photomultiplier tubes. This lack of sensitivity may be partially overcome on systems which include an intensified array as part of the detection system.

9.3.2 Resolution

The resolution of the Leco Plasmarray system is superior to most systems currently available as commercial instruments [73]. The ability to resolve the As/Cd pair at 228.812/228.802 nm is an indicator of superior resolution. These two lines appear on the array in the 248th order (Figure 9.6b). A separation of six pixels gives a reciprocal linear dispersion of 1.67 pm/pixel, which agrees with the theoretical resolution expected using this order [7].

Figure 9.6. Sensitivity and selectivity of the Leco Plasmarray with photodiode array detection for two extreme cases. (a) As 193.696 nm line. (b) As/Cd pair at 228.812/228.802 nm.

9.3.3 Instrumental Limitations

As mentioned previously, sensitivity decreases rapidly for wavelengths below 200 nm, at least using the photodiode array detector which was available in the laboratory. In most, if not all, cases a suitable line above 200 nm could be found as a second choice for an element of interest. The dynamic range of the detector was limited to one part in 16384, due to the use of a 14 bit A/D converter.
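The dispersion and dynamic range figures quoted above follow from simple arithmetic, sketched here with illustrative variable names. The As and Cd wavelengths and the six pixel separation are the values given in the text.

```python
# As 228.812 nm and Cd 228.802 nm, separated by six pixels on the array.
as_line_nm = 228.812
cd_line_nm = 228.802
pixel_separation = 6

# Reciprocal linear dispersion in picometres per pixel.
dispersion_pm_per_pixel = (as_line_nm - cd_line_nm) * 1000.0 / pixel_separation

# Number of distinct intensity levels available from a 14 bit A/D converter.
adc_levels = 2 ** 14

print(f"{dispersion_pm_per_pixel:.2f} pm/pixel")
print(f"{adc_levels} levels")
```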
A greater dynamic range is possible if several spectra are taken at different integration times and scaled appropriately before recombination (see Chapter 2). This was not done for the Leco Plasmarray spectra since the source code for the data acquisition system was not available for modification. For the spectra collected in this study, saturation was avoided by monitoring the signal intensity during collection, and adjusting the integration time manually. In some cases, interferent peaks were saturated, but these diodes were rejected in further data analysis stages, and as a result did not compromise the precision of subsequent results.

9.4 Data Pre-processing

9.4.1 Experimental Data Collection

The experimental apparatus for the generation of the plasma and introduction of aerosol sample was identical to that described in previous chapters. The Leco Plasmarray spectrometer was used in place of the previously described optical system for spectral collection.

9.4.2 Selection of Suitable Spectral Windows

Each analysis window was treated separately by factor analysis, with the fit and the success of target testing determining the suitability of the window for further analysis. The positions of the most prominent lines for the desired elements were located on the photodiode array. A further analysis window of diodes was defined to sample a narrow range of wavelengths centered on the line of interest.

9.5 Application of Methods

9.5.1 Determination of Co

A slightly different approach from that used in previous analyses was taken to select the best line for an analysis. The region surrounding each line was evaluated separately by factor analysis, to detect the presence of structured background which would compromise the precision of a univariate or multivariate determination. In the presence of 9 other elements, the background in the region of all Co analyte lines was observed to increase.
This background is most likely due to line wings from intense interferent lines which may be partially or completely outside the chosen spectral analysis windows, as selected by the mask slots. The major interferent was Fe at 2000 ppm. Eight other elements, Cd, Cr, Cu, Mn, Ni, Pb, Zn and Y, were present at 50 ppm. The nature of the background intensity precluded the use of a simple constant background term across the array, since the intensity of the background shifted slowly with array position. Within a single spectral analysis window (peak ± 10 diodes), the background due to line wings of relatively distant lines could be treated as a constant offset factor. A different offset factor was required for each analysis window, since each analysis window came from a different part of the array, and each would suffer a different sum of contributions from line wings. More structured interferences, such as direct or partial spectral overlap, could be accounted for by a residual factor. If this residual factor was too large, then the information in the spectral window could be considered compromised by an interference, and that window would be rejected. In addition, the spoil value for a constant background factor should be small, indicating a good fit. If the spoil value is too high (above six is considered unacceptable, see previous chapters), the background might still have little structure. In this case, a sloped background, such as would be produced if the analyte line were on the shoulder of an intense interferent line, could be suspected.

Figure 9.7 is the spectrum of the above mixture of ten elements, with a Co mask allowing only the selected Co windows to pass through to the array. The spectrum has been scaled to show the Co lines at approximately half the intensity of the same lines in the spectrum of pure Co (Figure 9.3). This was done at the expense of the more intense interferent peaks being off scale.
The spectrum of the mixture is considerably more complicated than the spectrum of a pure Co solution. Figures 9.8 and 9.9 give a more detailed view of each of the six spectral analysis windows, for the pure Co standard solution, and the mixture containing ten elements.

Figure 9.7. 25 ppm Co in aqueous solution, also containing 9 other possible interferent elements (Cd, Pb, Cu, Cr, Mn, Ni, Zn, and Y, all 50 ppm, and Fe 2000 ppm).

Figure 9.8. Spectral regions surrounding prominent Co lines in 1000 ppm Co and a mixture of 10 elements including Co. (a) diodes 1 to 41, standard. (b) diodes 1 to 41, mixture. (c) diodes 185 to 225, standard. (d) diodes 185 to 225, mixture. (e) diodes 259 to 299, standard. (f) diodes 259 to 299, mixture.

Figure 9.9. Spectral regions surrounding prominent Co lines in 1000 ppm Co and a mixture of 10 elements including Co. (a) diodes 372 to 412, standard. (b) diodes 372 to 412, mixture. (c) diodes 425 to 465, standard. (d) diodes 425 to 465, mixture. (e) diodes 573 to 613, standard. (f) diodes 573 to 613, mixture.

Figure 9.10. Determination of Co in the presence of 9 interferents. (a) Residual deviation. (b) Flat background factor deviation. (c) Flat background factor (spoil values). Each panel is plotted against spectral window (labelled 19, 205, 279, 392, 445, and 593).

Figure 9.10 shows three indicators of the suitability of each of these six windows for further analysis. Figure 9.10a gives the standard deviation of all the values in the residual for each of the six analysis windows. The analysis window centered on diode 392 is clearly indicated as the one with the greatest probability of containing an interference to the analyte line. Figure 9.10b is the standard deviation of the flat factor representing a constant background.
The poorest fit, with the largest standard deviation, is again the analysis window at diode 392, which confirms the interference indicated by the residual standard deviation. Figure 9.10c gives the spoil values for each of the six target tests with a flat (constant value) factor, representing the constant background shift. Again in this case an interference is indicated in the analysis window at diode 392. This indication is not as clear as in the previous cases. The analysis window at diode 19 also has a high spoil value. The spoil value thus suggests a priority for the windows, but confirmation is needed from the standard deviation of the flat background factor and the residual factor. Examination of each of the windows in detail (Figures 9.8 and 9.9) shows that the line at diode 392 appears on the shoulder of a much more intense line approximately 15 diodes lower. All six analysis windows show an increased background, but this can be accounted for by a flat factor except in the case of the window at diode 392, where the shift is not constant due to the shoulder of an adjacent iron line.

9.5.2 Comparison with a Univariate Analysis

The ability of factor analysis to identify problems due to interferences can be evaluated by comparing the results for each of the multivariate windows with the results for a conventional univariate analysis where a single line intensity is used for each line. Univariate analysis often corrects for background shifts by assuming that a measurement of intensity off the line (either higher or lower in wavelength) will accurately reflect the level of the background. For this univariate analysis, the background was estimated from a single intensity measurement 20 diodes to one side of the line of interest. In a multivariate analysis, no such correction is required, since a shift in background can be compensated for by inclusion of a flat background factor. The magnitude of this factor is automatically adjusted for the best fit to the data.
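The contrast between the two background treatments can be illustrated with a minimal sketch. It is not the target factor analysis procedure used in this work: ordinary least squares stands in for it, the Gaussian line shape, window width, and intensities are hypothetical, and `fit_window` is an illustrative name.

```python
import numpy as np

def fit_window(window, analyte_profile):
    """Fit a window as (concentration * analyte profile) + flat offset by least squares."""
    design = np.column_stack([analyte_profile, np.ones_like(analyte_profile)])
    coeffs, *_ = np.linalg.lstsq(design, window, rcond=None)
    residual = window - design @ coeffs
    return coeffs[0], coeffs[1], residual.std(ddof=2)

# Hypothetical 41-diode window with a Gaussian line shape of unit height per ppm.
diodes = np.arange(-20, 21, dtype=float)
profile = np.exp(-0.5 * (diodes / 2.0) ** 2)

# Synthetic sample: 25 ppm analyte on a constant background of 3 intensity units.
flat_case = 25.0 * profile + 3.0
conc_flat, offset_flat, resid_flat = fit_window(flat_case, profile)

# Same sample on a sloped background, as on the shoulder of an intense neighbour.
sloped_case = flat_case + 0.2 * diodes
conc_sloped, offset_sloped, resid_sloped = fit_window(sloped_case, profile)

# Univariate estimate: peak intensity minus a single reading 20 diodes lower.
univariate_sloped = sloped_case[20] - sloped_case[0]

print(conc_flat, offset_flat, resid_flat)
print(conc_sloped, resid_sloped, univariate_sloped)
```

In this synthetic case the flat background fit recovers 25 ppm in both windows, but the residual standard deviation rises sharply for the sloped window, flagging it as suspect. The single off-peak correction, by contrast, is biased by the slope and gives no warning.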
A comparison of the univariate and multivariate results is given in Table 9.3. The values for the multivariate analysis were closer to the actual value of 25.0 ppm (as originally prepared by dilution). The univariate analysis for the line at diode 392 gives an unreliable result, since adequate correction could not be made for the background. The multivariate analysis for this line is also the worst of the six lines chosen, but the value does not differ from the actual value as much as would be expected from the magnitude of the interference. Clearly the dynamic background correction in the multivariate analysis (a combination of the use of a flat background factor and a residual vector) is better able to handle such an interference than a univariate analysis. In the univariate analysis there is simply not enough information available from a single analyte and background measurement to characterize and correct for the interference. For the remaining five lines, the independent results are more precise in the multivariate approach than in the univariate case. This is reflected in the standard deviation of the five values (one for each line), which is twice as large in the univariate determination.

Table 9.3. Comparison of Multivariate vs. Univariate Results (ppm Co)

Diode                             Multivariate      Univariate *
 19                               24.08 ± 0.42       22.58 ± 0.59
205                               24.48 ± 0.47       25.22 ± 0.48
279                               22.89 ± 0.45       24.51 ± 0.66
392                               15.59 ± 0.57     -218.94 ± 4.77
445                               23.49 ± 0.28       20.12 ± 0.31
592                               25.27 ± 0.59       22.50 ± 0.59
Average of all except diode 392   24.04 ± 0.91       22.99 ± 2.00

* Univariate results were corrected by subtracting the background measured 20 diodes off peak (on the lower side, except for diode 19, which had to be corrected on the higher side).

Within the treatment of a single line, a multivariate window offers a slight improvement over a univariate determination when no significant interferences are present. When interferences are present, the univariate determination fails.
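The summary row of Table 9.3 can be reproduced from the five individual results for each method. The short check below uses only the tabulated values, read with the first six entries as the multivariate column and the next six as the univariate column; the variable names are illustrative.

```python
from statistics import mean, stdev

# Table 9.3 results (ppm Co) for the five windows other than diode 392.
multivariate = [24.08, 24.48, 22.89, 23.49, 25.27]
univariate = [22.58, 25.22, 24.51, 20.12, 22.50]

# Mean and sample standard deviation across the five lines for each method.
multi_mean, multi_sd = mean(multivariate), stdev(multivariate)
uni_mean, uni_sd = mean(univariate), stdev(univariate)
sd_ratio = uni_sd / multi_sd

print(f"multivariate: {multi_mean:.2f} +/- {multi_sd:.2f}")
print(f"univariate:   {uni_mean:.2f} +/- {uni_sd:.2f}")
print(f"ratio of standard deviations: {sd_ratio:.1f}")
```

The ratio of the two standard deviations comes out slightly above two, in line with the statement that the spread is about twice as large for the univariate results.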
The multivariate determination is also inaccurate, but the presence of an interference is clearly indicated, so that another line may be chosen. When the results from several lines are considered, the multivariate approach offers more consistency in the results for different lines than that found in a univariate determination.

9.6 Summary

A method for the choice of the best line from several, in a unique wavelength multiplexed system, was shown to work successfully, and to indicate those lines which suffer from interferences and should not be used. This allows a sample matrix-specific mask to be designed for a selected element. Even for an analysis based on a single line, a multivariate analysis using a spectral window across the wavelength region surrounding the line was shown to be superior to a conventional analysis using only a single intensity measurement.

Chapter 10

Conclusions

This thesis has reported the development of methods to assist in the interpretation of multiwavelength spectral data. In the course of this research, an algorithm to increase the effective dynamic range of a spectrum collected on a photodiode array has been designed and implemented. The enhanced spectrum exhibited an improvement in the suppression of the noise floor by a factor of 10 to 100, which represented a significant improvement in performance for a photodiode array. With this enhancement the PDA provides a competitive alternative to the PMT, with the added benefit of simultaneous multiwavelength data acquisition.

The main direction of this research has focussed on the development of methods to overcome the problem of unidentified interferents and their effect on quantitative analysis. Factor analysis has proven to be a powerful tool to solve this problem. One method for dealing with unidentified spectral interferences employed the lowering of the dimensionality of the data used in a multivariate analysis.
This involved selective stepwise removal of certain diodes, namely those which represent the greatest residual variance after known components were identified. As the diodes with the largest contribution to the overall residual variance were removed, the fit of other diodes improved, and the remaining diodes with significant residual variance were then identified and removed. This provides the analyst with two alternatives for dealing with complex mixtures. If all components are known, a straightforward factor analysis with the full data matrix will lead to excellent results. If one or more components cannot be identified, the data matrix may be reduced so as to remove those diodes which exhibit interference from the unknown component. The best wavelengths (diodes) for analysis of the known or desired components may be selected without prior knowledge of the identity of interfering components. Automatic, matrix dependent line selection is therefore possible, tailoring the lines chosen to the elements of interest. Using the selected line, a univariate analysis can be carried out to quantify each element separately. Alternatively, several elements may be chosen for optimization, and in this case the best peaks for each element can be determined, along with a correction factor, if required, to account for any overlap with one of the other known components. As a final step, the spectrum of the interfering component may be reconstructed, to further assist the analyst in understanding the nature of any observed interferences.

Another method for the selection of a suitable line for analysis involves the manipulation of spectral windows as an undivided entity. The residual in these cases represents an overall deviation for the entire spectral window in the region of a single line rather than the deviations of individual diodes as in the previous method.
Several cycles of factor analysis are carried out, and at the end of each cycle the progress of the analysis in removing interferences may be evaluated by examination of the residual spectrum and the updated estimate of concentration based on a multivariate analysis of the remaining data. In each cycle, the most interfered-with line is rejected. This cyclic process continues until only one line remains. Three options are available as variations of this approach to analysis.

1) The last line retained may be taken as the best line.
2) The line with the best signal/interference (target/residual) ratio may be used in a single line determination.
3) The first few rejected lines will contain most of the major interference effects. Factor analysis using the remaining lines gives a multiline analysis, which in many cases may be preferred to a single line determination.

A method for the choice of the best line from several in a unique wavelength multiplexed system was shown to work successfully, and to indicate those lines which suffer from interferences and should not be used. This allows a sample matrix-specific mask to be designed for a selected element. Even for an analysis based on a single line, a multivariate analysis using a spectral window across the wavelength region surrounding the line was shown to be superior to a conventional analysis using only a single intensity measurement.

The methods developed in this thesis are best suited to multiplex spectrometers which provide a window of spectral information. As no knowledge of the identity of the interferent(s) is required, these methods have an advantage over other methods which rely solely on spectral tables. They are particularly useful for the selection of suitable analytical lines for a complex sample with untabulated interferences.

It is hoped that other researchers will continue this work, and extend the applications of these methods beyond the domain of emission spectroscopy.
Applications in areas such as mass spectrometry, multiwavelength detection in chromatography, and image sensor manipulation, as well as other fields, should be possible.

References

01) Boumans, P. W. J. M. "Line Coincidence Tables for Inductively Coupled Plasma Atomic Emission Spectrometry"; Pergamon Press: Oxford, 1980.
02) Stubley, E. A.; Horlick, G. Appl. Spectrosc. 1985, 39, 800-804.
03) Stubley, E. A.; Horlick, G. Appl. Spectrosc. 1985, 39, 805-809.
04) Ng, R. C. L.; Horlick, G. Appl. Spectrosc. 1985, 39, 834-840.
05) Karanassios, V.; Horlick, G. Appl. Spectrosc. 1986, 40, 813-821.
06) Talmi, Y. (ed.) "Multichannel Image Sensors", ACS Symposium Series 236, Vol. 2; American Chemical Society: Washington, D.C., 1983.
07) Levy, G. M.; Quaglia, A.; Lazure, R. E.; McGeorge, S. W. Spectrochim. Acta 1987, 42B, 341-351.
08) Leicester, H. M. "The Historical Background of Chemistry"; Dover: New York, 1971.
09) Asimov, I. "Asimov's Biographical Encyclopedia of Science and Technology, 2nd Ed."; Doubleday: New York, 1982.
10) Meyer, G. A. Anal. Chem. 1987, 59, 1345A-1354A.
11) Wendt, R. H.; Fassel, V. A. Anal. Chem. 1965, 37, 920-922.
12) Greenfield, S.; Jones, I. L.; Berry, C. T. Analyst 1964, 89, 713-720.
13) Harman, H. H. "Modern Factor Analysis"; University of Chicago Press: Chicago, 1976.
14) Blackburn, J. A. Anal. Chem. 1965, 37, 1000-1003.
15) Kankare, J. J. Anal. Chem. 1970, 42, 1322-1326.
16) Hugus, Z. Z.; El-Awady, A. A. J. Phys. Chem. 1971, 75, 2954-2957.
17) Kowalski, B. R.; Schatzki, T. F.; Stross, F. H. Anal. Chem. 1972, 44, 2176-2180.
18) Antoon, M. K.; Koenig, J. H.; Koenig, J. L. Appl. Spectrosc. 1977, 31, 518-524.
19) Brereton, R. G. Analyst 1987, 112, 1635-1657.
20) Pittsburgh Conference and Exposition on Analytical Chemistry and Applied Spectroscopy (PITCON).
21) Federation of Analytical Chemistry and Spectroscopy Societies (FACSS).
22) Journal of Chemometrics, John Wiley and Sons Ltd., Baffins Lane, Chichester, Sussex PO19 1UD, England.
23) Journal of Chemometrics and Intelligent Laboratory Systems, Elsevier Science Publishers, P. O. Box 211, 1000 AE Amsterdam, The Netherlands.
24) Special Issue on Multivariate Spectrochemical Analysis, Talanta 1990, 37, January.
25) Chemometrics Special Issue, Anal. Chim. Acta 1987, 191.
26) International Conference on Chemometrics in Analytical Chemistry (Collected Papers), Anal. Chim. Acta 1983, 150(1).
27) Ingle, J. D., Jr.; Crouch, S. R. "Spectrochemical Analysis"; Prentice-Hall: Englewood Cliffs, N.J., 1988.
28) Slavin, W. Anal. Chem. 1986, 58, 589A-597A.
29) Caughlin, B. L. Ph.D. Thesis; University of British Columbia: Vancouver, B.C., 1986.
30) Blades, M. W.; Horlick, G. Spectrochim. Acta 1981, 36B, 881-900.
31) Caughlin, B. L.; Blades, M. W. Spectrochim. Acta 1984, 39B, 1583-1602.
32) Blades, M. W.; Caughlin, B. L. Spectrochim. Acta 1985, 40B, 579-591.
33) Burton, L. L.; Blades, M. W. Spectrochim. Acta 1986, 41B, 1063-1074.
34) Montaser, A.; Golightly, D. W. "Inductively Coupled Plasmas in Analytical Atomic Spectrometry"; VCH Publishers: New York, 1987.
35) Boumans, P. W. J. M.; Vrakking, J. J. A. M. Spectrochim. Acta 1984, 39B, 1291-1305.
36) Harrison, G. R. "Massachusetts Institute of Technology Wavelength Tables"; MIT Press: Cambridge, MA., 1969.
37) Salin, E. R.; Horlick, G. Anal. Chem. 1979, 51, 2284-2286.
38) Kleinman, I.; Svoboda, V. Anal. Chem. 1969, 41, 1029-1033.
39) Kielsohn, J. P.; Deutsch, R. D.; Hieftje, G. M. Appl. Spectrosc. 1983, 37, 101-105.
40) Human, H. G. C.; Scott, R. H.; Oakes, A. R.; West, C. D. Analyst 1976, 101, 265-271.
41) Piepmeier, E. H. (ed.) "Analytical Applications of Lasers, Chemical Analysis Monograph Series No. 87"; Wiley and Sons: New York, 1986.
42) Windsor, D. L.; Denton, M. B. Appl. Spectrosc. 1978, 32, 366-371.
43) Thompson, M.; Walsh, J. N. "Handbook of Inductively Coupled Plasma Spectrometry"; Blackie and Son: London, 1989.
44) Talmi, Y.; Simpson, R. W. Appl. Opt. 1980, 19, 1401-1414.
45) McGeorge, S. W.; Salin, E. D. Anal. Chem. 1985, 57, 2740-2743.
46) Wirsz, D. F.; Blades, M. W. Anal. Chem. 1986, 58, 51-57.
47) McGeorge, S. W.; Salin, E. D. Spectrochim. Acta 1985, 40B, 435-445.
48) Brushwyler, K. R.; Furuta, N.; Hieftje, G. M. Talanta 1990, 37, 23-32.
49) Rummel, R. J. "Applied Factor Analysis"; Northwestern University Press: New York, 1970.
50) Malinowski, E. R.; Howery, D. G. "Factor Analysis in Chemistry"; Wiley: New York, 1980.
51) Lorber, A. Anal. Chim. Acta 1984, 164, 293-297.
52) Malinowski, E. R.; Cox, R. A.; Haldna, U. L. Anal. Chem. 1984, 56, 778-781.
53) Lorber, A. Anal. Chem. 1984, 56, 1004-1010.
54) D'Amboise, M.; Noel, D. Anal. Chim. Acta 1985, 170, 255-264.
55) Wirsz, D. F.; Browne, R. J.; Blades, M. W. Appl. Spectrosc. 1987, 41, 1383-1387.
56) Lorber, A. Anal. Chem. 1984, 56, 1404-1409.
57) Gillette, P. C.; Koenig, J. L. Appl. Spectrosc. 1982, 36, 535-539.
58) Malinowski, E. R. Anal. Chem. 1977, 49, 606-611.
59) Malinowski, E. R. Anal. Chem. 1977, 49, 612-617.
60) Wirsz, D. F.; Blades, M. W. J. Anal. At. Spectrom. 1988, 3, 363-373.
61) Wold, S. Technometrics 1978, 20, 397-405.
62) Thompson, M.; Ramsey, M. H. Analyst 1985, 110, 1413-1422.
63) Ramsey, M. H.; Thompson, M. J. Anal. At. Spectrom. 1986, 1, 185-193.
64) Dahlquist, R. L.; Knoll, J. W. Appl. Spectrosc. 1978, 32, 1-29.
65) Hee, S. H. Q.; Boyle, J. R. Anal. Chem. 1988, 60, 1033-1042.
66) Winge, R. K.; Fassel, V. A.; Kniseley, R. N.; Dekalb, E.; Hass, W. J., Jr. Spectrochim. Acta 1977, 32B, 327-345.
67) Martens, H. Anal. Chim. Acta 1979, 112, 423-442.
68) Spjotvoll, E.; Martens, H.; Volden, R. Technometrics 1982, 24, 173-180.
69) Borgen, O. D.; Kowalski, B. R. Anal. Chim. Acta 1985, 174, 1-26.
70) Burton, L. L.; Blades, M. W. Spectrochim. Acta 1987, 42B, 513-519.
71) McGeorge, S. W.; Salin, E. D. Spectrochim. Acta 1986, 41B, 327-333.
72) Corliss, C. H.; Bozman, W. R. "Experimental Transition Probabilities for Spectral Lines of Seventy Elements", NBS Monograph 53; U. S. Government Printing Office: Washington, D.C., 1962.
73) Mermet, J.-M. J. Anal. At. Spectrom. 1987, 2, 681-686.

Appendix 1. APL as a Development Language for Multivariate Spectroscopic Applications

Summary

APL is a convenient language for rapid development and evaluation of multivariate methods of data analysis. Its predefined vector and matrix operations, and the ability to build more complex programs from these and other operations, make APL particularly suited to spectral analysis. Examples are presented from programs in current use.

Introduction

When new multivariate methods are applied to spectroscopic problems, algorithms must be evaluated, and many programs are written and rewritten during this development process. APL ("A Programming Language") is a useful language for such development and has therefore found a small but important niche among the "mainstream" computing languages. Originally developed in the 1950's as a method of notation for mathematical operators, it was implemented in the 1960's as a real programming language on IBM mainframes, and soon became available as a time shared service. Throughout the 1970's, it enjoyed considerable popularity on university mainframes¹, and in the 1980's, established itself on the IBM PC.

Several APL programs are used in this laboratory for the manipulation of spectra collected from a 4096 segment photodiode array. These include factor analysis methods to determine the number of elements in a multielement sample², and algorithms to select the most suitable line for analysis, on the basis of freedom from spectral interference, without reference to tables of spectral emissions³,⁴. The advantages of APL as a development language will be illustrated through several examples derived from these programs.

Upon beginning an APL session, the user is presented with an operating environment called a workspace. Within this workspace, the user can enter data into variables which may be scalars, vectors, or matrices.
The data can then be manipulated using a large number of predefined mathematical operations. Commands may be entered one line at a time (which is particularly useful when creating or investigating an algorithm) or they may be combined into a program (called a function in APL) which will then carry out the sequence of commands automatically. When proceeding one line at a time, the operating environment is much like that of a calculator. When running as a program, user interaction may be customized with the use of prompts and selection menus. A complete list of defined variables and programs is always available to the user.

Predefined mathematical operations include all those commonly found on scientific calculators, and a large selection of matrix operations (e.g. inversion, multiplication) and array manipulations (e.g. drop rows, rotate columns). This is where APL is at its most powerful. Alphanumeric data may also be entered into arrays, and all array manipulation operations may be used. It should be mentioned that mathematical operations in APL are automatically done in double precision (for the implementations discussed here, and most others), since this is necessary for matrix calculations.

Programs may be written which call other previously defined programs. This allows an increasing hierarchy of complex computations, while the overall structure of the algorithm remains apparent. For example, a line selection program has been developed which contains a number of factor analysis programs, and these in turn contain several eigenvector and array manipulation programs.

In many other languages, it is easy to become lost in a jungle of loops and subroutine calls. The powerful operations available in APL obviate the need for a great deal of indexing and/or looping (loops are rare in APL because they are not needed!).
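As a rough illustration of what these whole-array operations replace, the following Python sketch mimics a few of them with small helper functions. The helper names (transpose, matmul, grade_up, grade_down) are hypothetical and chosen for this illustration only; they are not part of APL or of the programs described in this appendix. The explicit iteration that APL hides from the user is visible inside the helpers.

```python
# Illustrative Python analogues of a few APL whole-array operations.
# All helper names here are assumptions made for this sketch.

def transpose(a):
    """Analogue of $TR: swap rows and columns of a list-of-rows matrix."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Analogue of +.* : ordinary matrix product of two list-of-rows matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def grade_up(v):
    """Analogue of $GU: 1-based positions of the values, lowest to highest."""
    return [i + 1 for i in sorted(range(len(v)), key=lambda i: v[i])]

def grade_down(v):
    """Analogue of $GD: 1-based positions of the values, highest to lowest."""
    return [i + 1 for i in sorted(range(len(v)), key=lambda i: -v[i])]

# A 3 by 2 array of intensities, one row per list entry.
intensity = [[4.3, 2.834], [9.6, 11.6], [0.02, 14.0]]

# Multiply every element by a constant (a single "2 * INTENSITY" in APL).
doubled = [[2 * x for x in row] for row in intensity]

# Transpose of the array multiplied by the array itself (a 2 by 2 result).
covar = matmul(transpose(intensity), intensity)

ranks_up = grade_up([1, 4, 8, 7, 3, 2])
ranks_down = grade_down([1, 4, 8, 7, 3, 2])
```

In APL each of these steps is a single primitive applied to the whole array, so none of the comprehensions above would appear in the source.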
Other languages such as Pascal or Fortran may also require the explicit declaration of variables, and specification of which common variables are to be shared between subroutines. In APL, all variables and intermediate results are available to other programs unless otherwise declared as local variables. Again, this allows for the rapid development of programs. After a program is functioning satisfactorily, some variables may be declared as local, keeping the workspace from becoming crowded with intermediate results.

The reader is encouraged to make comparisons of APL source code with that of other languages (particularly Fortran and Pascal) for performance of the same tasks outlined in this paper. Such a comparison will show that an equivalent or more versatile program can be written in APL using far fewer lines of program code [6]. Books on APL are readily available at university and technical libraries. A short list for further reading is given in the references [6-16].

APL uses several special symbols which are not in the standard ASCII character set. On PCs, these symbols are shown using the graphics display. In the IBM mainframe implementation of APL at the University of British Columbia, these special APL symbols are transliterated into a set of characters compatible with the standard ASCII character set. For example, $RH is used in place of the Greek symbol rho to find the dimensions of a matrix. Such transliterated characters are used in this paper. Table 1 shows a partial list of conventional APL symbols, their transliterated equivalents, and a brief definition.

APL will operate identically on a mainframe (VS APL) or on a PC, the only difference being the symbols displayed. One example of a PC implementation is APL*PLUS, a product of STSC [16]. It requires DOS 2.0 or later, and a minimum of 384K of RAM. A hard disk and a math coprocessor are not required, but are recommended.
The APL character set is generated using the graphics display (monochrome, CGA, EGA, VGA, or others). A version for 80386-based systems (APL*PLUS II) [16] allows increased speed and workspaces up to 15 MB, but requires DOS 3.3 and a minimum of 2 MB of RAM.

Examples of APL operations.

In Table 2, a number of operations in APL are shown. Within the following examples, direct entry from the keyboard is indicated by indentation from the left margin, while results returned by APL are not indented. The format of the listed input and output is identical to that seen by the user on the screen (except for the comments given in brackets).

In line A, the vector INTENSITY is defined to contain six numerical values. Entries in this example are a mixture of integers, and real numbers with and without exponential notation. This variety illustrates the flexibility of input format. Spaces are sufficient to indicate the separation of entries.

In line B, the dimension of the vector INTENSITY is retrieved by the operation $RH (dimension). Many symbols may also be used for operations requiring two operands. In line C, the values in the vector INTENSITY are redimensioned into a 3 by 2 array using $RH.

The contents of a variable may be displayed at any time by typing the variable name, as in line D.

Mathematical operations may be carried out on all values in the array with a single operation. In line E, all values in the array INTENSITY are multiplied by 2. No indexing or looping is required.

Data from specific portions of the matrix may be extracted using $TA (take). The first column is taken in line F, and the values stored in a new variable FIRSTCOLUMN.

In line G, another variable, LASTCOLUMN, is defined and dimensioned, and in line H, this array is appended onto the existing array INTENSITY. Line I retrieves the new dimensions of the array.

Line J shows two matrix operations carried out in a single line of APL. The operation $TR takes the transpose of the matrix INTENSITY.
The operation +.* represents matrix multiplication.

Lines K and L show the use of % (compression) and $C1 (compression[1]), which take from an array only those columns or rows which are flagged with logical 1's. $% (expansion) and $X1 (expansion[1]) allow for the insertion of columns or rows of zeros. In line M, the array INTENSITY has columns of zeros inserted between the first and second, and second and third columns.

A useful means of summing the columns of an array is provided by matrix multiplication by the scalar 1, as in line N.

In many applications, the largest or smallest value in a vector must be found. $GU (grade up) and $GD (grade down) provide these operations, and in addition provide a ranking of all other values in a vector. In line O, the first value in the response is 1, indicating that the lowest value is in the first position in the vector. The position of the second lowest value (6) is given next. In line P, the highest value is indicated to be at position 3, followed by position 4.

Lines Q and R show two more useful operations, $RO (rotation) and $R1 (rotation[1]), which rotate the order of columns or rows in an array. Lastly, line S shows how matrix inversion may be accomplished in a single operation.

These examples by no means include all of the useful operations defined in APL, but show some of the powerful manipulations of data which may be carried out on a single line. Programs are generally very compact. Because of the high density of operations in an APL program, good documentation is required, outlining the purpose of each line.

APL applied to spectroscopic analysis.

A specific example taken from current work is given in Table 3. This is a simplified version of a longer factor analysis program used in our laboratory. The example closely follows the example given in Chapter Five of "Factor Analysis in Chemistry" [17].
This example is limited to the analysis of a system containing only two factors, but the more general program in current use can handle any number of factors. This program will handle any number of features (channels of information), which in our case are spectral intensities at different wavelengths, and has no limitation on the number of spectra which may be entered, except that the amount of data cannot exceed the memory available on the computer. The format of the data is thus flexible and is handled within the APL operating system. This greatly simplifies the input and output of data, in contrast to many other languages, where the format of the data must be explicitly defined.

A comment has been added beside some lines in Table 3, expressing the operation on that line in conventional matrix notation. This illustrates the ease with which matrix notation may be converted into APL code. While very complex operations can be accomplished in a single line, it is good practice to break them up into several lines, and explain each line with a comment. Otherwise, it may be difficult to debug a program or to comprehend the program at a later date.

Example of a spectroscopic analysis program.

The first line of the program generates a covariance matrix from the original data (a matrix of spectral intensities) by multiplying the transpose of the data matrix by itself. A comparable subroutine in Fortran or Pascal would consist of many lines, requiring declaration of array dimensions, and nested loops. The second line checks for the number of spectra present, and stores the result in the variable N for later use. The third line is an example of the use of one program within another program. EIGENR is a program which finds eigenvalues and eigenvectors for a given matrix. Lines 4 and 5 store these eigenvalues and eigenvectors separately in the variables EGVL and EGVC. Line 6 generates an abstract column matrix, and line 7 truncates that matrix after the second factor.
An abstract row matrix is generated in line 8, and a 2 by 2 eigenvalue matrix in lines 9 through 11. Lines 12 and 13 generate transformation vectors for the first and second standard spectra. The complete transformation matrix is produced in line 14. The real row matrix (corresponding to the real spectra of the components in the data matrix) is produced in line 15. Line 16 produces the real column matrix (proportional to the concentrations of the two components in each spectrum). Lines 17 and 18 convert the values from the real column matrix into concentrations of the two components in parts per million.

Table 4 shows the simplicity of the data input. DAT is a matrix of spectral data, where each column represents a single spectrum of a sample or a standard, and contains 4096 intensity measurements. It is produced by catenation of column vectors for each of three Co standards, three Fe standards, and a mixture with an unknown proportion of Co and Fe (line A). The standard spectrum for Co is composed of an average of three Co spectra, and the Fe standard is composed of an average of three Fe spectra (line B). CONC1 and CONC2 are scalars representing the concentration of Co and Fe in the standards in parts per million (line C). Once the data matrix, standards, and their concentrations are stored in these variables, the program is run by typing its name (line D). The results are stored in the variable REALCONC. The first column contains the calculated concentrations for the first standard (Co) and the second column those for the second standard (Fe). Concentrations of approximately 500 ppm Co are listed for the first three spectra (Co standards), 2000 ppm Fe for the next three spectra (Fe standards), and 100 ppm each for Co and Fe in the final mixture spectrum (line E). This agrees with the actual concentrations in the mixture.
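The sequence of matrix steps described above for the two-factor program can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the program used in this work: the function names are hypothetical, the small 4-channel spectra are invented test data, and a simple power iteration with deflation stands in for EIGENR, whose internals are not described in this appendix.

```python
import math

def transpose(a):
    return [list(c) for c in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(r, c)) for c in bt] for r in a]

def top_eigenpairs(z, k, iters=500):
    """Largest k eigenpairs of a symmetric matrix by power iteration with
    deflation (an assumed stand-in for EIGENR)."""
    n = len(z)
    z = [row[:] for row in z]
    values, vectors = [], []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(n)]  # arbitrary start vector
        for _ in range(iters):
            w = [sum(z[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * z[i][j] * v[j] for i in range(n) for j in range(n))
        values.append(lam)
        vectors.append(v)
        for i in range(n):  # remove the factor just found
            for j in range(n):
                z[i][j] -= lam * v[i] * v[j]
    return values, vectors

def inv2(m):
    """Inverse of a 2 by 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def two_factor_tfa(dat, std1, std2, conc1, conc2):
    covar = matmul(transpose(dat), dat)       # [Z] = [D]^T [D]
    egvl, cdgr = top_eigenpairs(covar, 2)     # abstract column matrix, 2 x N
    rdgr = matmul(dat, transpose(cdgr))       # [R] = [D][C]^T, M x 2
    rt = transpose(rdgr)
    # [Ti] = [L]^-1 [R]^T [Si]; [L] is diagonal, so its inverse is a division
    trvc1 = [sum(r * s for r, s in zip(rt[i], std1)) / egvl[i] for i in range(2)]
    trvc2 = [sum(r * s for r, s in zip(rt[i], std2)) / egvl[i] for i in range(2)]
    ttra = [[trvc1[0], trvc2[0]], [trvc1[1], trvc2[1]]]
    colreal = matmul(inv2(ttra), cdgr)        # [CR] = [T]^-1 [C], 2 x N
    n = len(dat[0])
    return [[conc1 * colreal[0][j], conc2 * colreal[1][j]] for j in range(n)]

# Invented 4-channel test spectra: one Co standard (500 ppm), one Fe standard
# (2000 ppm), and a mixture made up as 100 ppm Co plus 100 ppm Fe.
co = [1.0, 0.5, 2.0, 0.0]
fe = [0.0, 3.0, 1.0, 1.0]
mix = [0.2 * c + 0.05 * f for c, f in zip(co, fe)]
dat = transpose([co, fe, mix])                # each column is one spectrum
realconc = two_factor_tfa(dat, co, fe, 500.0, 2000.0)
```

With noise-free test data of this kind, each row of realconc should recover the Co and Fe concentrations of the corresponding spectrum, mirroring the layout of REALCONC in Table 4.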
A longer version of this program (QFACTOR) allows interactive entry of the standards and mixtures, evaluation of any number of components, target testing for any number of possible standard spectra, and generation of residual spectra if all components cannot be identified. Menus and prompts for the user may be added as desired to make the program as interactive and user friendly as required.

The factor analysis program, which reproduces the factor analysis routines in the book "Factor Analysis in Chemistry" [17], has been written in fewer than 150 lines of APL code [2], compared to 2500 lines of Fortran source code [17]. The APL program includes several lines which provide an interactive environment for the user. The user is prompted for names of data vectors and matrices. Printouts of real, extracted and imbedded errors, eigenvalues and the indicator function [17] are given to assist in the determination of the number of factors. Printouts of the spoil value [17], and apparent, root mean square, and real errors are given to assist in the acceptance or rejection of test vectors.

The program QFACTOR has been used and modified extensively over the last three years, and combined with other multivariate analysis procedures. These include, in order of increasing complexity:
i) removal from a data matrix of the intensities measured at a set of specified diodes on a photodiode array (DIODEDEL);
ii) retention of only those spectral intensities in a data matrix corresponding to the specified diodes (DIODEKEEP);
iii) generation of a single residual spectrum, orthogonal to the entered standard spectra, which best spans the residual variance in a data matrix where one or more unidentified interferents are present (QFORTHRESID);
iv) sequential removal of spectral information on lines which have identified interferences, and computation of a multivariate result on the remaining spectral information (AUTOLINE4).
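Items i) and ii) amount to deleting or retaining rows of the data matrix, where each row holds the intensities measured at one diode. A minimal Python sketch of the idea is given below; the function names diode_del and diode_keep and the small test matrix are assumptions made for this illustration and do not reproduce the APL programs themselves.

```python
def diode_del(mat, diodes):
    """Return the matrix with the rows for the listed 1-based diode numbers removed."""
    drop = set(diodes)
    return [row for i, row in enumerate(mat, start=1) if i not in drop]

def diode_keep(mat, diodes):
    """Return only the rows for the listed 1-based diode numbers."""
    keep = set(diodes)
    return [row for i, row in enumerate(mat, start=1) if i in keep]

# Invented data: five diodes (rows) by two spectra (columns).
data = [[10, 11], [20, 21], [30, 31], [40, 41], [50, 51]]

without_peak = diode_del(data, [2, 3])   # discard diodes 2 and 3
only_peak = diode_keep(data, [2, 3])     # retain diodes 2 and 3 only
```

Both operations preserve the original diode order, so the two results together contain every row of the input exactly once.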
A more detailed account of the algorithms involved and their applications has been published elsewhere [3,4].

Conclusion

APL is a convenient language for rapid development and evaluation of multivariate methods as applied to spectral emission data. It is particularly suited to these methods due to its predefined matrix operations, and the ability to build more complex programs from these operations. The freedom from concern about input and output format also assists in the rapid development and assessment of algorithms.

Table 1. Partial List of Transliterated APL Symbols.

Transliteration   APL symbol   Definition
"                 ∇            Define
=                 ←            Assign
$TR               ⍉            Transpose
+.*               +.×          Matrix Multiply
$TA               ↑            Take
$RH               ⍴            Dimension
$/                ⌹            Matrix Divide
*                 ×            Multiply
%                 /            Compress
$C1               ⌿            Compress[1]
$%                \            Expand
$GU               ⍋            Grade Up
$GD               ⍒            Grade Down
$RO               ⌽            Rotate
$R1               ⊖            Rotate[1]

Table 2. Examples of Some APL Operations.

A)      INTENSITY = 4.3 2.834 9.6 11.6 2.0E-02 14      (assign values)
B)      $RH INTENSITY                                   (display dimensions)
6
C)      INTENSITY = (3 2) $RH INTENSITY                 (redimension an array)
D)      INTENSITY                                       (display contents of an array)
4.3 2.834
9.6 11.6
0.02 14
E)      2 * INTENSITY                                   (multiply array contents by a constant)
8.6 5.668
19.2 23.2
0.04 28
F)      FIRSTCOLUMN = (3 1) $TA INTENSITY               (select a column from an array and assign selected column to a new variable)
G)      LASTCOLUMN = (3 2) $RH 5 2 10 4 15 6            (create a new variable)
H)      INTENSITY = INTENSITY,LASTCOLUMN                (append two arrays)
I)      $RH INTENSITY
3 4
J)      COVAR = ($TR INTENSITY) +.* INTENSITY           (take transpose of an array and matrix multiply by itself)
K)      0 1 0 1 % INTENSITY                             (select specified columns)
2.834 2
11.6 4
14 6
L)      0 1 0 $C1 INTENSITY                             (select specified rows)
19.2 23.2 10 4
M)      1 0 1 0 1 1 $% INTENSITY                        (insert columns of zeros into an array)
N)      1 +.* INTENSITY                                 (sum values in each column)
O)      $GU 1 4 8 7 3 2                                 (create a vector which specifies position of lowest to highest values in a vector)
1 6 5 2 4 3
P)      $GD 1 4 8 7 3 2                                 (... or highest to lowest)
3 4 2 5 6 1
Q)      $RO INTENSITY                                   (reverse order of columns in array)
R)      $R1 INTENSITY                                   (reverse order of rows in array)
S)      $/ COVAR                                        (take the inverse of a matrix)

Table 3. Sample Program.

APL Code                                                Matrix Notation
"APLDEMO
[1]  COVAR = ($TR DAT) +.* DAT                          [Z] = [D]^T [D]
[2]  N = 1 $TA ($RH COVAR)
[3]  EIG = EIGENR COVAR
[4]  EGVL = (1,N) $TA EIG
[5]  EGVC = (-N,N) $TA EIG
[6]  CDGR = $TR ((-N,-N) $TA EIG)
[7]  CDGR = (2,-N) $TA CDGR
[8]  RDGR = DAT +.* ($TR CDGR)                          [R] = [D][C]^T
[9]  EGVL1 = EGVL[1;1]
[10] EGVL2 = EGVL[1;2]
[11] LDGR = (2,2) $RH (EGVL1,0,0,EGVL2)
[12] TRVC1 = ($/ LDGR) +.* ($TR RDGR) +.* STD1          [T1] = [L]^-1 [R]^T [S1]
[13] TRVC2 = ($/ LDGR) +.* ($TR RDGR) +.* STD2
[14] TTRA = TRVC1,TRVC2
[15] ROWREAL = RDGR +.* TTRA
[16] COLREAL = ($/ TTRA) +.* CDGR                       [CR] = [T]^-1 [C]
[17] CONF = (2,2) $RH (CONC1,0,0,CONC2)
[18] REALCONC = $TR (CONF +.* COLREAL)

Table 4. Example of a Factor Analysis of a Mixture of Co and Fe.

        DAT = CO1,CO2,CO3,FE1,FE2,FE3,MIX1              (A)
        STD1 = (CO1+CO2+CO3)/3                          (B)
        STD2 = (FE1+FE2+FE3)/3
        CONC1 = 500                                     (C)
        CONC2 = 2000
        APLDEMO                                         (D)
        REALCONC                                        (E)
499.8779658       0.07153907849
500.0070015      -0.05226187709
500.1150328      -0.01927720137
0.01797834787     1999.978709
-0.03169865765    2000.01577
0.01372030979     2000.005521
100.0494902       99.88033359

Note: CO1, CO2, CO3, FE1, FE2, FE3, and MIX1 are arrays which each contain 4096 spectral intensities, collected using a photodiode array, across a wavelength range of approximately 40 nm. They represent three standard solutions of Co, three standard solutions of Fe, and a solution containing a mixture of Co and Fe. CONC1 and CONC2 are scalars.
They contain the concentrations in ppm for Co and Fe respectively in the standard solutions.

Appendix References

1) P. Wallich and G. Zorpette, "What ever happened to APL", IEEE Spectrum, 23 (1986) 17.
2) D. F. Wirsz and M. W. Blades, Anal. Chem., 58 (1986) 51.
3) D. F. Wirsz and M. W. Blades, J. Anal. At. Spectrom., 3 (1988) 363.
4) D. F. Wirsz and M. W. Blades, Talanta, 37 (1990) 39.
5) X. D. Liu and T. Babeliowsky, "APL in a chemical application: processing of spark source mass spectrometry data", Trends in Anal. Chem., 8 (1989) 88.
6) L. Gilman and A. J. Rose, "APL, An Interactive Approach", 3rd Ed., Wiley, New York, NY, 1984.
7) G. Helzer, "Applied Linear Algebra with APL", Little, Brown and Co., Boston, MA, 1983.
8) R. P. Polivka and S. Pakin, "APL: The language and its usage", Prentice-Hall, Englewood Cliffs, NJ, 1975.
9) G. A. Bergquist, "APL advanced techniques and utilities", Zark, Vernon, CN, 1987.
10) L. Gibson, J. S. Levine, and R. C. Metzger, "Application systems in APL: how to build them right", Prentice-Hall, Englewood Cliffs, NJ, 1985.
11) W. R. Le Page, "Applied APL programming", Prentice-Hall, Englewood Cliffs, NJ, 1978.
12) O. I. Franksen, P. Falster, and F. J. Evans, "Qualitative aspects of large scale systems: developing design rules using APL", Springer-Verlag, New York, NY, 1979.
13) A. Smith, "A design handbook for commercial systems", Wiley, New York, NY, 1982.
14) B. Legrand, "Learning and applying APL", Wiley, New York, NY, 1984.
15) A. J. Rose and B. A. Schick, Eds., "APL in practice: what you need to know to install and use successful APL systems and major applications: The Practical APL Conference, Washington, D.C., 9-11 April 1980", Wiley, New York, NY, 1980.
16) STSC Inc., 2115 East Jefferson St., Rockville, MD 20852, U.S.A.
17) E. R. Malinowski and D. G. Howery, "Factor Analysis in Chemistry", Wiley, New York, NY, 1980.

Appendix 2.
Listing of APL programs used in this study.

Note: These programs are listed using the transliterated APL symbols discussed in Appendix 1. Programs are listed alphabetically by program name. All programs (c) Douglas F. Wirsz 1990, except those labelled as originating from the UBC computing center.

"AUTO1024<#>"
" AUTO1024
'ENTER DATA MATRIX'
DATMATRIX=#
'ENTER STANDARDS MATRIX'
STDMATRIX=#
'ENTER CONCENTRATIONS OF STANDARDS'
STDCONC=#
STDCONC=0,STDCONC
STDCONC=1$DRSTDCONC
REALINDEX=INDEX1024
NSTDS=$,STDCONC
NPEAKSDEL=0
NAMEPEAKSDEL=' '
'ENTER INDEX OF DIODES TO BE USED'
PEAKINDEX=#
DKEEP DATMATRIX
DATMATRIX=MODMAT
DKEEP STDMATRIX
STDMATRIX=MODMAT
DKEEP REALINDEX
REALINDEX=MODMAT
QFORTHRESID
RESIDORTH=(0,NSTDS)$DRROWREAL
SIGN=MVSD RESIDORTH
SIGN=(*((1,1)$TASIGN))
RESIDORTH=SIGN*RESIDORTH
RESIDPOS=0$MARESIDORTH
STDMATRIXNORESID=STDMATRIX,RESIDPOS
STDCONCNORESID=STDCONC,1
QFNORESID
RESID=(0,NSTDS)$DRROWREAL
SIGN=MVSD RESID
SIGN
'NO PEAKS REMOVED'
($FM(1$TA($,REALINDEX))),' DIODES LEFT'
CYCLE:' '
SIGN=(*((1,1)$TASIGN))
RESID=SIGN*RESID
OPFSUM RESID
'LAST DIODE IN EACH WINDOW, SUM OF RESIDUAL IN WINDOW'
LASTDIODEOFWINDOW,WINDSUMS
PDFSUM
NAMEPEAKSDEL=NAMEPEAKSDEL,($FM(,((1,1)$TA(((MAXPOS-1),0)$DRLASTDIODEOFWINDOW)))),' '
DDEL DATMATRIX
DATMATRIX=MODMAT
DDEL STDMATRIX
STDMATRIX=MODMAT
DDEL REALINDEX
REALINDEX=MODMAT
NPEAKSDEL=NPEAKSDEL+1
QFORTHRESID
RESIDORTH=(0,NSTDS)$DRROWREAL
SIGN=MVSD RESIDORTH
SIGN=(*((1,1)$TASIGN))
RESIDORTH=SIGN*RESIDORTH
RESIDPOS=0$MARESIDORTH
STDMATRIXNORESID=STDMATRIX,RESIDPOS
STDCONCNORESID=STDCONC,1
QFNORESID
RESID=(0,NSTDS)$DRROWREAL
SIGN=MVSD RESID
'MEAN, VARIANCE, STD DEVIATION OF RESIDUAL'
SIGN
' '
($FMNPEAKSDEL),' PEAKS REMOVED AT DIODES ',NAMEPEAKSDEL
' '
($FM(1$TA($,REALINDEX))),' DIODES LEFT'
' '
'TRY ANOTHER CYCLE] (1=YES)'
QCYCLE=1
$>CYCLE IF(QCYCLE$EQ1)

"AUTO4<#>"
" AUTO4
'ENTER DATA MATRIX'
DATMATRIX=#
'ENTER STANDARDS MATRIX'
STDMATRIX=#
'ENTER CONCENTRATIONS OF STANDARDS'
STDCONC=#
STDCONC=0,STDCONC
STDCONC=1$DRSTDCONC
REALINDEX=INDEX4096
NSTDS=$,STDCONC
NPEAKSDEL=0
NAMEPEAKSDEL=' '
'ENTER INDEX OF DIODES TO BE USED'
PEAKINDEX=#
DKEEP DATMATRIX
DATMATRIX=MODMAT
DKEEP STDMATRIX
STDMATRIX=MODMAT
DKEEP REALINDEX
REALINDEX=MODMAT
QFORTHRESID
RESIDORTH=(0,NSTDS)$DRROWREAL
SIGN=MVSD RESIDORTH
SIGN=(*((1,1)$TASIGN))
RESIDORTH=SIGN*RESIDORTH
RESIDPOS=0$MARESIDORTH
STDMATRIXNORESID=STDMATRIX,RESIDPOS
STDCONCNORESID=STDCONC,1
QFNORESID
RESID=(0,NSTDS)$DRROWREAL
SIGN=MVSD RESID
SIGN
'NO PEAKS REMOVED'
($FM(1$TA($,REALINDEX))),' DIODES LEFT'
CYCLE:' '
SIGN=(*((1,1)$TASIGN))
RESID=SIGN*RESID
OPFSUM RESID
'LAST DIODE IN EACH WINDOW, SUM OF RESIDUAL IN WINDOW'
LASTDIODEOFWINDOW,WINDSUMS
PDFSUM
NAMEPEAKSDEL=NAMEPEAKSDEL,($FM(,((1,1)$TA(((MAXPOS-1),0)$DRLASTDIODEOFWINDOW))))
DDEL DATMATRIX
DATMATRIX=MODMAT
DDEL STDMATRIX
STDMATRIX=MODMAT
DDEL REALINDEX
REALINDEX=MODMAT
NPEAKSDEL=NPEAKSDEL+1
QFORTHRESID
RESIDORTH=(0,NSTDS)$DRROWREAL
SIGN=MVSD RESIDORTH
SIGN=(*((1,1)$TASIGN))
RESIDORTH=SIGN*RESIDORTH
RESIDPOS=0$MARESIDORTH
STDMATRIXNORESID=STDMATRIX,RESIDPOS
STDCONCNORESID=STDCONC,1
QFNORESID
RESID=(0,NSTDS)$DRROWREAL
SIGN=MVSD RESID
'MEAN, VARIANCE, STD DEVIATION OF RESIDUAL'
SIGN
' '
($FMNPEAKSDEL),' PEAKS REMOVED AT DIODES ',NAMEPEAKSDEL
' '
($FM(1$TA($,REALINDEX))),' DIODES LEFT'
' '
'TRY ANOTHER CYCLE] (1=YES)'
QCYCLE=1
$>CYCLE IF(QCYCLE$EQ1)

"DDEL<#>"
" DDEL MAT;MN;M;IND;MI;MNI
MN=$,MAT
M=MN[1]
IND=PEAKINDEX
IND=((1$TA($,IND)),1)$,IND
MNI=$,IND
MI=MNI[1]
I=0
BB=M$,1
LOOP:I=I+1
J=IND[I;1]
BB[J]=0
$>DELETE IF(I$EQMI)
$>LOOP
DELETE:' '
MODMAT=$TR(BB%($TRMAT))
"

"DIODEDEL<#>"
" DIODEDEL;MAT;MN;M;N;INDEX;DIODVEC;VDM;J;I;DIODNUM
'ENTER MATRIX OF DATA'
MAT=#
MN=$,MAT
M=MN[1]
N=MN[2]
INDEX=(M,1)$,($.M)
MODMAT=INDEX,MAT
'ENTER A VECTOR OF DIODE INDICES TO REMOVE'
DIODVEC=#
VDM=$,DIODVEC
VDM=VDM[1]
J=0
L4:J=J+1
DIODNUM=DIODVEC[J;1]
I=1
LOOP1: $* DUMMY LINE
$>REMOVE IF((MODMAT[I;1])$EQDIODNUM)
I=I+1
$>LOOP1
REMOVE: $* DUMMY LINE
MODMAT=$TR(($TR(((I-1),(N+1))$TAMODMAT)),($TR((I,0)$DRMODMAT)))
$>L4 IF(J$LTVDM)
MODMAT=(0,1)$DRMODMAT
'RESULT IN VARIABLE MODMAT'
"

"DIODEKEEP<#>"
" DIODEKEEP;DAT;MN;M;N;DIODTAKE;MND;MD;ND;I;J
'ENTER MATRIX OF DATA'
DAT=#
MN=$,DAT
M=MN[1]
N=MN[2]
'ENTER VECTOR OF DIODE NUMBERS TO SELECT'
DIODTAKE=#
MND=$,DIODTAKE
MD=MND[1]
ND=MND[2]
I=1
REDMAT=(MD,N)$,0
START:J=1
START2:REDMAT[I;J]=DAT[(DIODTAKE[I;1]);J]
J=J+1
$>START2 IF(J$LEN)
I=I+1
$>START IF(I$LEMD)
'RESULTS IN MATRIX REDMAT'
"

"DKEEP<#>"
" DKEEP MAT;MN;M;IND;MI;MNI;BB;I;J
MN=$,MAT
M=MN[1]
IND=PEAKINDEX
IND=((1$TA($,IND)),1)$,IND
MNI=$,IND
MI=MNI[1]
I=0
BB=M$,0
LOOP:I=I+1
J=IND[I;1]
BB[J]=1
$>DELETE IF(I$EQMI)
$>LOOP
DELETE:' '
MODMAT=$TR(BB%($TRMAT))
"

"OPFSUM<#>"
" OPFSUM VEC;MN;M
SUMWINDOWS VEC
MN=$,WINDSUMS
M=MN[1]
VEC=M$,WINDSUMS
MAXPOS=1$TA($GDVEC)

"PDFSUM<#>"
" PDFSUM
REALFLAG=,((1,1)$TA(((MAXPOS-1),0)$DRLASTDIODEOFWINDOW))
INDREAL=REALINDEX
MN=$,INDREAL
M=MN[1]
I=0
LBAD:I=I+1
RF=REALFLAG
$>LBAD IF(RF$NE(INDREAL[I;1]))
PEAKDIODES=I
J=I
$>LUPFIN IF(I$EQM)
LUP:I=I+1
RF=RF+1
$>LUPFIN IF(RF$NE((INDREAL[I;1])))
PEAKDIODES=PEAKDIODES,I
$>LUP IF(I$NEM)
LUPFIN:RF=REALFLAG
I=J
$>LDOWNFIN IF(I$EQ1)
LDOWN:I=I-1
RF=RF-1
$>LDOWNFIN IF(RF$NE((INDREAL[I;1])))
PEAKDIODES=I,PEAKDIODES
$>LDOWN IF(I$NE1)
LDOWNFIN:MP=$,PEAKDIODES
PEAKDIODES=(MP,1)$,PEAKDIODES
PEAKINDEX=PEAKDIODES
"

"PLEXDEX<#>"
" PLEXDEX;COMPLEXITY;DAT;DIOMAX;DIONEXT;DIONUM;DIOOLD;DIOSTACK;I;IDX;INTMAX;INTNEXT;INTSTACK;J;M;MN;MND;N;ND;SPEC;SPECTRUM;SUMINT;NUMPEAKS;PDX;THRESH;MD
$* DOUBLE PEAKS WITH AN INSUFFICIENT VALLEY BETWEEN THEM
$* WILL ONLY SHOW UP WITH THE POSITION AND INTENSITY OF THE MOST INTENSE
$* OF THE TWO PEAKS. THIS LIMITATION MAY BE REMEDIED IN THE FUTURE.
$* AUTHOR: DOUGLAS F. WIRSZ.
'ENTER SPECTRUM'
SPECTRUM=#
$* PEAKFINDING ROUTINE
SPEC=SPECTRUM
MN=$,SPEC
M=MN[1]
N=MN[2]
'ENTER THRESHOLD'
THRESH=#
PDX=0
IDX=$.M
SPEC=M$,SPEC
I=0
P1:I=I+1
$>P3 IF(I$GTM)
$>P2 IF(SPEC[I]$GTTHRESH)
$>P1
P2:PDX=PDX,(IDX[I])
$>P1
P3:PDX=1$DRPDX
'NUMBER OF DIODES ABOVE THRESHOLD = ',$FM($,PDX)
PDX=(($,PDX),1)$,PDX
DIONUM=PDX
$* DIODE SELECTION ROUTINE
DAT=SPECTRUM
MND=$,PDX
MD=MND[1]
ND=MND[2]
I=1
SPECTRUM=(MD,N)$,0
START:J=1
START2:SPECTRUM[I;J]=DAT[(PDX[I;1]);J]
J=J+1
$>START2 IF(J$LEN)
I=I+1
$>START IF(I$LEMD)
MN=$,SPECTRUM
M=MN[1]
SPECTRUM=M$,SPECTRUM
DIONUM=M$,DIONUM
DIOSTACK=0
INTSTACK=0
I=1
L1:INTMAX=SPECTRUM[I]
DIOMAX=DIONUM[I]
L2:DIOOLD=DIONUM[I]
I=I+1
$>FINISH IF(I$EQM)
DIONEXT=DIONUM[I]
INTNEXT=SPECTRUM[I]
$>LABEL1 IF(DIONEXT$NE(DIOOLD+1))
$>L2 IF(INTNEXT$LEINTMAX)
INTMAX=INTNEXT
DIOMAX=DIONEXT
$>L2
LABEL1:DIOSTACK=DIOSTACK,DIOMAX
INTSTACK=INTSTACK,INTMAX
$>L1
FINISH:DIONEXT=DIONUM[I]
INTNEXT=SPECTRUM[I]
$>LABEL3 IF(DIONEXT$NE(DIOOLD+1))
$>LABEL2 IF(INTNEXT$LEINTMAX)
LABEL2:DIOSTACK=DIOSTACK,DIOMAX
INTSTACK=INTSTACK,INTMAX
$>ENDUP
LABEL3:DIOSTACK=DIOSTACK,DIONEXT
INTSTACK=INTSTACK,INTNEXT
ENDUP:PEAKDIODES=1$DRDIOSTACK
PEAKINTEN=1$DRINTSTACK
M=$,PEAKDIODES
PEAKDIODES=(M,1)$,PEAKDIODES
PEAKINTEN=(M,1)$,PEAKINTEN
'PEAK POSITION AND INTENSITY IN VARIABLES PEAKDIODES,PEAKINTEN.'
SUMINT=1+.*PEAKINTEN
NUMPEAKS=1$TA($,PEAKDIODES)
COMPLEXITY=NUMPEAKS*SUMINT
' '
'NUMBER OF PEAKS FOUND = ',$FMNUMPEAKS
'SUM OF PEAK INTENSITIES = ',$FMSUMINT
'COMPLEXITY INDEX = ',$FMCOMPLEXITY
' '
' '
"

"POSPKFIND<#>"
" POSPKFIND;I;INDEX;M;MN;N;SPEC;THRESH
'ENTER TARGET SPECTRUM (OR COMBINATION)'
SPEC=#
MN=$,SPEC
M=MN[1]
N=MN[2]
'ENTER THRESHOLD'
THRESH=#
PEAKINDEX=0
INDEX=$.M
SPEC=M$,SPEC
I=0
L1:I=I+1
$>L3 IF(I$GTM)
$>L2 IF(SPEC[I]$GTTHRESH)
$>L1
L2:PEAKINDEX=PEAKINDEX,(INDEX[I])
$>L1
L3:' '
PEAKINDEX=1$DRPEAKINDEX
($FM($,PEAKINDEX)),' DIODES ABOVE THRESHOLD'
PEAKINDEX=(($,PEAKINDEX),1)$,PEAKINDEX

"QFACTOR18<#>"
" QFACTOR18;AET;AFAC;AX;CDGR;COFF;CONF;CV;DAT;EFL;EGTK;EGVC;EGVI;EGVL;EIG;FACF;I;IE;IND;J;LDGI;LDGR;M;MN;N;NDAT;NMLT;NRMT;NPRM;NAFF;NUFF;NVX;PRVC;Q;QTE;RT;RDGR;RE;REI;REP;RET;RTNM;RWTT;SMFLG;SP;SPT;SUM;TBAR;TRVC;TTRA;TV;VDIF;VX;XE;XTRA;WTFLG;WTVC;WTVN
QWT:'DO YOU WANT WEIGHTING? Y/N'
Q=$#
' '
$>YESWT IF((1$TAQ)$EQ'Y')
$>NOWT IF((1$TAQ)$EQ'N')
$>QWT
YESWT:WTFLG=1
'ENTER WEIGHTING VECTOR'
WTVN=$#
WTVC=$EXWTVN
$>QSMTH
NOWT:WTFLG=0
WTVN=' '
WTVC=1
QSMTH:'DO YOU WANT 5 POINT SMOOTHING? Y/N'
Q=$#
' '
$>YESSM IF((1$TAQ)$EQ'Y')
$>NOSM IF((1$TAQ)$EQ'N')
$>QSMTH
YESSM:SMFLG=1
$>START
NOSM:SMFLG=0
START:'ENTER THE NAME OF YOUR DATA MATRIX'
NDAT=$#
DAT=$EXNDAT
' '
$>BEGINWT IF(SMFLG$EQ0)
DAT=SMTH5 DAT
WTVC=SMTH5 WTVC
BEGINWT:$>BEGIN IF(WTFLG$EQ0)
DAT=WTVC WEIGHT DAT
BEGIN:MN=$,DAT
M=MN[1]
N=MN[2]
CV=($TRDAT)+.*DAT
EIG=EIGENR CV
EGVL=(1,N)$TAEIG
EGVC=(-N,N)$TAEIG
REALERROR:I=0
RE=0
NPRM=0,(N-1)$TA($.N)
L1:I=I+1
EGTK=(0,I)$DREGVL
EGTK=(N-I)$,EGTK
J=0
SUM=0
INL1:J=J+1
SUM=SUM+EGTK[J]
$>INL1 IF(J$LT(N-I))
$* NEXT LINE INSURES +VE SUM, PREVENTS SQRT OF -VE #
REI=((SUM/(M*(N-I)))@2)@0.25
RE=RE,REI
$>L1 IF(I$LT(N-1))
IE=RE*((NPRM/N)@0.5)
XE=RE*(((N-NPRM)/N)@0.5)
IND=RE/((N-NPRM)@2)
RT=(N-1)$TA((N$,EGVL)/(1$RO(N$,EGVL)))
QRE:'FULL ERROR DETERMINATIONS FOR 1 THRU ',($FM(N-1)),' POSSIBLE FACTORS Y/N'
Q=$#
' '
$>PERR IF((1$TAQ)$EQ'Y')
$>SETNUMFAC IF((1$TAQ)$EQ'N')
$>QRE
PERR:'THE EIGENVALUES ARE: '
$TREGVL
' '
'REAL ERRORS:'
1$DRRE
' '
'IMBEDDED ERRORS:'
1$DRIE
' '
'EXTRACTED (RMS) ERRORS:'
1$DRXE
' '
SETNUMFAC:'INDICATOR FUNCTION:'
1$DRIND
' '
'RATIO FUNCTION:'
RT
' '
'SELECT THE NUMBER OF FACTORS TO BE FOUND'
NPRM=#
' '
NUFF=0
FACF=(M,1)$,0
AFAC=(M,1)$,0
NAFF=' '
SP=0
COFF=0
QSHORTERR:'FULL ERROR ANALYSIS OF TARGET TESTING? Y/N'
QTE=$#
' '
$>QSHORTERR IF(((1$TAQTE)$NE'Y')&((1$TAQTE)$NE'N'))
TTRA=(NPRM,NPRM)$,0
CDGR=$TR((-N,-N)$TAEIG)
CDGR=(NPRM,-N)$TACDGR
RDGR=(DAT)+.*($TRCDGR)
DIAG2:I=NPRM
LDGR=(NPRM,NPRM)$,0
LDGI=LDGR
EGVI=1/EGVL
DIAG:LDGR[I;I]=EGVL[1;I]
LDGI[I;I]=EGVI[1;I]
I=I-1
$>DIAG IF I$GT0
$>TESTS
TESTS:'THE NUMBER OF FACTORS TO BE FOUND IS ',($FMNPRM),' WITH ',($FMNUFF),' FACTORS FOUND SO FAR.'
'THE FACTORS FOUND SO FAR ARE: '
NAFF
'ENTER A TEST VECTOR, OR:'
'(1) TO COMPUTE RESIDUAL VECTORS'
'(2) TO RESET THE NUMBER OF FACTORS'
RTNM=$#
' '
$>SETNUMFAC IF(1$TARTNM$EQ'2')
$>XFAC IF(1$TARTNM$EQ'1')
RWTT=$EXRTNM
RWTT=(M,1)$,RWTT
$>SMWARD IF(SMFLG$EQ0)
RWTT=SMTH5 RWTT
SMWARD:$>ONWARD IF(WTFLG$EQ0)
RWTT=WTVC WEIGHT RWTT
ONWARD:TRVC=($/LDGR)+.*($TRRDGR)+.*(RWTT)
PRVC=RDGR+.*TRVC
VDIF=PRVC-RWTT
VDIF=M$,VDIF
VDIF=VDIF@2
SUM=VDIF+.*1
AET=((SUM)/M)@0.5
TV=NPRM$,TRVC
REP=(RE[NPRM+1])*((TV+.*TV)@0.5)
RET=(((AET@2)-(REP@2))@2)@0.25
SPT=RET/REP
$>SHORTERR IF((1$TAQTE)$NE'Y')
'APPARENT ERROR IN THE TEST VECTOR = ',$FM(AET)
'RMS ERROR IN THE PREDICTED VECTOR = ',$FM(REP)
'REAL ERROR IN THE TARGET VECTOR = ',$FM(RET)
SHORTERR:'SPOIL (0-3 GOOD,3-6 FAIR,6+ POOR) = ',$FM(SPT)
' '
ENTERFAC:'ENTER THIS TEST VECTOR AS A REAL FACTOR? 
Y/N' [144] Q=$# [145] ' ' [146] $>TESTS IF((1$TAQ)$EQ'N') [147] $>ADDFAC IF((1$TAQ)$EQ'Y') [14 8] $>ENTERFAC [14 9] ADDFAC:TTRA=TTRA,TRVC [150] NUFF=NUFF+1 [151] FACF=FACF,($EXRTNM) [152] NAFF=NAFF,' ',RTNM [153] SP=SP,SPT [154] 'ENTER CONCENTRATION OF STANDARD USED' [155] '(OR ENTER 1 IF NOT APPLICABLE)' [156] Q=# [157] COFF=COFF,Q [158] $>ALLFACFOUND IF((2*NPRM)$EQ(($,TTRA)[2])) [159] 'THIS TEST VECTOR HAS BEEN ADDED TO THE SET OF REAL FACTORS' [160] ' ' [161] ' ' [162] $>TESTS [163] ALLFACFOUND:' ' [164] ^QFACTOR HAS FOUND ',($FMNPRM),' FACTORS' [165] TTRA=(NPRM,-NPRM)$TATTRA [166] ROWREAL=RDGR+.*TTRA [167] COLREAL=($/TTRA)+.*CDGR [168] I=NPRM [169] COFF=l$DRCOFF [170] COFF=COFF,(NPRM-NUFF)$,1 [171] CONF=(NPRM,NPRM)$,0 [172] DIAG4:CONF[I;I]=COFF[I] [173] 1=1-1 [174] $>DIAG4 IF I$GT0 [175] CONF=CONF+.*COLREAL [176] ' ' [177] 'THE DATA MATRIX CONTAINED:' [178] NDAT [179] ' ' [180] 'THE RATIO FUNCTION WAS:' [181] RT [182] ' ' [183] 'NO WEIGHTING WAS USED' IF(WTFLG$EQ0) [184] 'THE WEIGHT VECTOR WAS:' IF(WTFLG$EQ1) [185] WTVN IF(WTFLG$EQ1) [186] ' ' [187] 'NO 5 POINT SMOOTHING WAS DONE' IF(SMFLG$EQ0) [188] '5 POINT SMOOTHING WAS DONE' IF(SMFLG$EQ1) [189] ' ' [190] 'THE FACTORS FOUND WERE:' [191] NAFF [192] ' ' 229 'SPOIL VALUES FOR THESE FACTORS WERE:' 1$DRSP 'THE CONCENTRATIONS OF THE STANDARDS WERE:' COFF 'THE CONCENTRATIONS OF THESE FACTORS IN EACH SAMPLE WERE:' $TRCONF TBAR=($/TTRA)+.*(LDGI0O.5) EFL=(RE[NPRM+1])*((0+.+(($TRTBAR)@2) )@0.5) 'ESTIMATED ERRORS (FACTOR LOADING ERRORS) WERE:' EFL=NPRM$,EFL COFF*EFL CHANGEFAC:'TRY A DIFFERENT NUMBER OF FACTORS? 
Y/N' Q=$# $>SETNUMFAC IF((1$TAQ)$EQ'Y') $>END IF((1$TAQ)$EQ'N') $ >CHANGEFAC XFAC:'RESIDUAL FACTORS BEING CALCULATED' FACF=(0,1)$DRFACF 1=0 LI1:1=1+1 VX=(0, (1-1))$DR((M,I)$ TAFACF) NVX=VX/((($TRVX)+.*VX)@0.5) AX=NVX J=0 LJ1:J=J+1 -((($TRNVX)+.*((0, (J-l))$DR((M,J)$ TAAFAC)))*((0, (J - l ) ) M,J)$ TAAFAC))) $>LJ1 IF(J$LTI) AX=AX/((($TRAX)+.*AX)@0.5) AFAC=AFAC, AX $>LI1 IF (I$LTNUFF) I=NUFF NRMT=I$, 1 LI2:I=I+1 VX=(0,(1-1))$DR((M,I)$TARDGR) NMLT=((($TRVX)+.*VX)@0.5) NVX=VX/NMLT NRMT=NRMT,(1$,NMLT) AX=NVX J=0 LJ2:J=J+1 230 [241] $>LI2 IF(I$LTNPRM) [242] AFAC=(0,1)$DRAFAC [243] I=NUFF [244] LI3:1=1+1 [245] RWTT=(NRMT[I])*((0,1-1)$DR((M,I)$TAAFAC)) [246] TRVC=($/LDGR)+.*($TRRDGR)+.*(RWTT) [247] .TTRA=TTRA,TRVC [248] $>LI3 IF(I$LTNPRM) [249] XTRA='S' IF((NPRM-NUFF)$NE1) [250] NAFF=NAFF,' ' , ($FM(NPRM-NUFF)) , ' EXTRA FACTOR',XTRA [251] $ >ALLFACFOUND [252] END:'DONE' II "QFNORESID<#>" it QFNORESID;AET;AFAC;CDGR;COFF;CV;DAT;EGVC;EGVL;FACF;I;IE;IND;J;LDG I;LDGR;M;MN;PRVC;Q;RDGR;RE;RET;RT;TRVC;TTRA;TV;VDIF;XE;EFL;EGTK;E GVI;EIG;NAFF;NMLT;NPRM;NRMT;NUFF;NVX;REI;REP;RWTT;SP;SPT;SUM;TEAR ;VX;XTRA [I] DAT=DATMATRIX [2] ' ' [3] BEGIN:MN=$,DAT [4] M=MN[1] [5] N=MN[2] [6] CV=($TRDAT)+.*DAT [7] EIG=EIGENR CV [8] EGVL=(1,N)$TAEIG [9] EGVC=(-N,N)$TAEIG [10] REALERROR:1=0 [II] RE=0 [12] NPRM=0,(N-l)$TA($.N) [13] LI:1=1+1 [14] EGTK=(0,I)$DREGVL [15] EGTK=(N-I)$,EGTK [16] J=0 [17] SUM=0 [18] INL1:J=J+1 [19] SUM=SUM+EGTK[J] [20] $>INL1 IF(J$LT(N-I)) [21] $* NEXT LINE INSURES +VE SUM, PREVENTS SQRT OF _VE # [22] REI=((SUM/(M*(N-I)))@2)@0.25 [23] RE=RE,REI [24] $>L1 IF(I$LT(N-l ) ) [25] IE=RE*((NPRM/N)@0.5) [26] XE=RE*(((N-NPRM)/N)@0•5) [27] IND=RE/((N-NPRM)@2) [28] RT=(N-l)$TA((N$,EGVL)/(l$RO(N$,EGVL))) [29] Q='N' ,[30) ' ' [31] MNS=$,STDMATRIXNORESID 231 [32] NS=MNS[2] [33] NPRM=NS [34] ' ' [35] NUFF=0 [36] FACF=(M,1)$,0 [37] AFAC=(M,1) $, 0 [38] NAFF=' ' [39] SP=0 [40] COFF=0 [41] TTRA=(NPRM,NPRM)$, 0 [42] CDGR=$TR((-N,-N)$TAEIG) [43] CDGR=(NPRM,-N)$TACDGR [44] RDGR=(DAT)+•*($TRCDGR) 
[45] DIAG2:I=NPRM [4 6] LDGR=(NPRM,NPRM)$, 0 [47] LDGI=LDGR [48] EGVI=1/EGVL [49] DIAG:LDGR[I;I]=EGVL[1;I] [50] LDGI[I;I]=EGVI[1;I] [51] 1=1-1 [52] $>DIAG IF I$GT0 [53] TESTS:' ' [54] STD=(0,NUFF)$DRS TDMATRIXNORE SID [55] RWTT=(M,1)$TASTD [56] STDVEC=RWTT [57] ONWARD:TRVC=($/LDGR)+.*($TRRDGR)+.*(RWTT) [58] PRVC=RDGR+.* TRVC [59] VDIF=PRVC-RWTT [60] VDIF=M$,VDIF [61] VDIF=VDIF@2 [62] SUM=VDIF+.*1 [63] AET=((SUM)/M)@0.5 [64] TV=NPRM$,TRVC [65] REP=(RE[NPRM+1])*((TV+.*TV)@0.5) [66] RET=(((AET@2)-(REP@2))@2)@0.25 [67] SPT=RET/REP [68] SHORTERR:' ' [69] Q='Y' [70] $>TESTS IF((1$TAQ)$EQ'N') [71] $>ADDFAC IF((1$TAQ)$EQ'Y') [72] $>ENTERFAC [73] ADDFAC:TTRA=TTRA,TRVC [74] NUFF=NUFF+1 [75] FACF=FACF,STDVEC [76] SP=SP,SPT [77] $>ALLFACFOUND IF((2*NPRM)$EQ(($,TTRA) [2])) [78] $>TESTS [7 9] ALLFACFOUND:' ' [80] 'QFACTOR HAS FOUND ',($FMNPRM),' FACTORS' [81] TTRA=(NPRM,-NPRM)$TATTRA [82] ROWREAL=RDGR+.*TTRA [83] COLREAL=($/TTRA)+.*CDGR 232 [84] I=NPRM [85] COFF=STDCONCNORESID [86] COFF=COFF,(NPRM-NUFF) $, 1 [87] CONF=(NPRM,NPRM)$, 0 [88] DIAG4:C0NF[I;I]=C0FF[I] [89] 1=1-1 [90] $>DIAG4 IF I$GT0 [91] CONF=CONF+.*COLREAL [92] 'THE INDICATOR FUNCTION WAS:' [93] 1$DRIND [94] ' ' [95] 'THE RATIO FUNCTION WAS:' [96] RT [97] ' ' [98] 'RATIO/INDICATOR:' [99] RT/(1$DRIND) [100] ' ' [101] 'SPOIL VALUES FOR THESE FACTORS WERE:' [102] 1$DRSP [103] ' ' [104] 'THE CONCENTRATIONS OF THE STANDARDS WERE:' [105] COFF [106] ' ' [107] 'THE CONCENTRATIONS OF THESE FACTORS IN EACH SAMPLE WERE:' [108] $ TRCONF [109] ' ' [110] TBAR=($/TTRA)+.*(LDGI@0.5) [111] EFL=(RE[NPRM+1])*((0+.+(($TRTBAR)82))@0.5) [112] 'ESTIMATED ERRORS (FACTOR LOADING ERRORS) WERE:' [113] EFL=NPRM$,EFL [114] COFF*EFL [115] ' ' [116] $>END [117] XFAC:' ' [118] FACF=(0,1)$DRFACF [119] 1=0 [120] LI1:1=1+1 [121] VX=(0, (1-1))$DR((M,I)$ TAFACF) [122] NVX=VX/((($TRVX)+.*VX)@0.5) [123] AX=NVX [124] J=0 [125] LJ1:J=J+1 [126] AX=AX-((($TRNVX)+.*((0,(J- l ) )$DR((M,J)$TAAFAC)))*((0,(J- l ) ) $DR( (M, J) $ TAAF AC) ) ) [127] $>LJ1 IF(J$LTI) 
[128] AX=AX/((($TRAX)+.*AX)@0.5) [129] AFAC=AFAC,AX [130] $>LI1 IF(I$LTNUFF) [131] I=NUFF [132] NRMT=I$,1 [133] LI2:I=I+1 233 VX=(0,(1-1))$DR((M,I)$TARDGR) NMLT=((($TRVX)+.*VX)@0.5) NVX=VX/NMLT NRMT=NRMT/(1 $,NMLT) AX=NVX J=0 LJ2:J=J+1 [134] [135] [136] [137] [138] [139] [140] [141] AX=AX-((($TRNVX)+.*((0,(J- l ) )$DR((M,J)$TAAFAC)))*((0,(J- l ) ) $DR( (M, J) $ TAAF AC) ) ) [142] $>LJ2 IF(J$LTI) [143] AX=AX/((($TRAX)+.*AX)@0.5) [144] AFAC=AFAC,AX [145] $>LI2 IF(I$LTNPRM) [146] AFAC=(0/1)$DRAFAC [147] I=NUFF [148] LI3:1=1+1 [149] RWTT=(NRMT[I])*((0/1-1)$DR((M/I)$TAAFAC)) [150] TRVC=($/LDGR)+.*($TRRDGR)+.*(RWTT) [151] TTRA=TTRA/TRVC [152] $>LI3 IF(I$LTNPRM) [153] XTRA='S' IF((NPRM-NUFF)$NE1) [154] NAFF=NAFF/' ($FM(NPRM-NUFF)),' EXTRA FACTOR',XTRA [155] $ >ALLFACFOUND. [156] END:' ' "QFORTHRESID<#>" II QFORTHRE SID;AET;AFAC;CDGR;COFF;CV;DAT;EGVC;EGVL;FACF;I;IE;IND;J;L DGI;LDGR/M;MN;PRVC;Q;RDGR;RE;RET;RT;TRVC;TTRA;TV;VDIF;XE;EFL;EGTK ;EGVI;EIG;NAFF;NMLT;NPRM;NRMT;NUFF;NVX;REI;REP;RWTT;SP;SP T;SUM;TB AR;VX;XTRA [I] DAT=DATMATRIX [2] BEGIN:MN=$,DAT [3] M=MN[1] [4] N=MN[2] [5] CV=($TRDAT)+.*DAT [6] EIG=EIGENR CV [7] EGVL=(1,N)$TAEIG [8] EGVC=(-N,N)$TAEIG [9] REALERROR:1=0 [10] RE=0 [II] NPRM=0,(N-l)$TA($.N) [12] LI:1=1+1 [13] EGTK=(0,1)$DREGVL [14] EGTK=(N-I)$,EGTK [15] J=0 [16] SUM=0 [17] INL1:J=J+1 [18] SUM=SUM+EGTK[J] 234 [19] $>INL1 IF(J$LT(N-I)) [20] $* NEXT LINE INSURES +VE SUM, PREVENTS SQRT OF _VE # [21] REI=((SUM/(M*(N-I)))@2)@0.25 [22] RE=RE,REI [23] $>L1 IF(I$LT(N-l ) ) [24] IE=RE*((NPRM/N)@0.5) [25] XE=RE*(((N-NPRM)/N)@0.5) [26] IND=RE/((N-NPRM)@2) [27] RT=(N-l)$TA((N$,EGVL)/(l$RO(N$,EGVL))) [28] Q='N' [29] MNS=$,STDMATRIX [30] NS=MNS[2] [31] NPRM=NS+1 [32] NUFF=0 [33] FACF=(M,l)$ / 0 [34] AFAC=(M,1) $, 0 [35] NAFF=' ' [36] SP=0 [37] COFF=0 [38] TTRA=(NPRM,NPRM)$,0 [39] CDGR=$TR((-N,-N)$TAEIG) [40] CDGR=(NPRM,-N)$TACDGR [41] RDGR=(DAT)+.*($TRCDGR) [42] DIAG2:I=NPRM [43] LDGR=(NPRM,NPRM)$,0 [44] LDGI=LDGR [45] EGVI=1/EGVL [46] DIAG:LDGR[I;I]=EGVL[1;I] [47] 
LDGI[I;I]=EGVI[1;I]
[48] I=I-1
[49] $>DIAG IF I$GT0
[50] TESTS:' '
[51] $>XFAC IF(NUFF$EQNS)
[52] STD=(0,NUFF)$DRSTDMATRIX
[53] RWTT=(M,1)$TASTD
[54] STDVEC=RWTT
[55] ONWARD:TRVC=($/LDGR)+.*($TRRDGR)+.*(RWTT)
[56] PRVC=RDGR+.*TRVC
[57] VDIF=PRVC-RWTT
[58] VDIF=M$,VDIF
[59] VDIF=VDIF@2
[60] SUM=VDIF+.*1
[61] AET=((SUM)/M)@0.5
[62] TV=NPRM$,TRVC
[63] REP=(RE[NPRM+1])*((TV+.*TV)@0.5)
[64] RET=(((AET@2)-(REP@2))@2)@0.25
[65] SPT=RET/REP
[66] SHORTERR:' '
[67] Q='Y'
[68] $>TESTS IF((1$TAQ)$EQ'N')
[69] $>ADDFAC IF((1$TAQ)$EQ'Y')
[70] $>ENTERFAC
[71] ADDFAC:TTRA=TTRA,TRVC
[72] NUFF=NUFF+1
[73] FACF=FACF,STDVEC
[74] SP=SP,SPT
[75] $>ALLFACFOUND IF((2*NPRM)$EQ(($,TTRA)[2]))
[76] $>TESTS
[77] ALLFACFOUND:' '
[78] TTRA=(NPRM,-NPRM)$TATTRA
[79] ROWREAL=RDGR+.*TTRA
[80] COLREAL=($/TTRA)+.*CDGR
[81] I=NPRM
[82] COFF=STDCONC
[83] COFF=COFF,(NPRM-NUFF)$,1
[84] CONF=(NPRM,NPRM)$,0
[85] DIAG4:CONF[I;I]=COFF[I]
[86] I=I-1
[87] $>DIAG4 IF I$GT0
[88] CONF=CONF+.*COLREAL
[89] TBAR=($/TTRA)+.*(LDGI@0.5)
[90] EFL=(RE[NPRM+1])*((0+.+(($TRTBAR)@2))@0.5)
[91] EFL=NPRM$,EFL
[92] $>END
[93] XFAC:' '
[94] FACF=(0,1)$DRFACF
[95] I=0
[96] LI1:I=I+1
[97] VX=(0,(I-1))$DR((M,I)$TAFACF)
[98] NVX=VX/((($TRVX)+.*VX)@0.5)
[99] AX=NVX
[100] J=0
[101] LJ1:J=J+1
[102] AX=AX-((($TRNVX)+.*((0,(J-1))$DR((M,J)$TAAFAC)))*((0,(J-1))$DR((M,J)$TAAFAC)))
[103] $>LJ1 IF(J$LTI)
[104] AX=AX/((($TRAX)+.*AX)@0.5)
[105] AFAC=AFAC,AX
[106] $>LI1 IF(I$LTNUFF)
[107] I=NUFF
[108] NRMT=I$,1
[109] LI2:I=I+1
[110] VX=(0,(I-1))$DR((M,I)$TARDGR)
[111] NMLT=((($TRVX)+.*VX)@0.5)
[112] NVX=VX/NMLT
[113] NRMT=NRMT,(1$,NMLT)
[114] AX=NVX
[115] J=0
[116] LJ2:J=J+1
[117] AX=AX-((($TRNVX)+.*((0,(J-1))$DR((M,J)$TAAFAC)))*((0,(J-1))$DR((M,J)$TAAFAC)))
[118] $>LJ2 IF(J$LTI)
[119] AX=AX/((($TRAX)+.*AX)@0.5)
[120] AFAC=AFAC,AX
[121] $>LI2 IF(I$LTNPRM)
[122] AFAC=(0,1)$DRAFAC
[123] I=NUFF
[124] LI3:I=I+1
[125] RWTT=(NRMT[I])*((0,I-1)$DR((M,I)$TAAFAC))
[126] TRVC=($/LDGR)+.*($TRRDGR)+.*(RWTT)
[127] TTRA=TTRA,TRVC
[128] $>LI3 IF(I$LTNPRM)
[129] XTRA='S' IF((NPRM-NUFF)$NE1)
[130] NAFF=NAFF,' ',($FM(NPRM-NUFF)),' EXTRA FACTOR',XTRA
[131] $>ALLFACFOUND
[132] END:' '
"SMTH5<#>"
" OUT=SMTH5 DAT;DATM1;DATP1;DATM2;DATP2;DAT;M;N;MN
[1] OUT=(($TR((-2)$RO($TRDAT)))+($TR(2$RO($TRDAT)))+(4*($TR((-1)$RO($TRDAT))))+(4*($TR(1$RO($TRDAT))))+(6*DAT))/16
"
"WEIGHT<#>"
" OUT=WTVC WEIGHT MAT;MN;M;N;I;MATVEC
[1] MN=$,MAT
[2] M=MN[1]
[3] N=MN[2]
[4] OUT=(M,1)$,0
[5] I=0
[6] LI5:I=I+1
[7] MATVEC=(0,(I-1))$DR((M,I)$TAMAT)
[8] MATVEC=WTVC*MATVEC
[9] OUT=OUT,MATVEC
[10] $>LI5 IF(I$LTN)
[11] OUT=(0,1)$DROUT
"
The following programs were used for eigenvector analysis, and were available from the UBC computing center.
"EIGEN<#>" " VEC=EIGEN H;L1;K1;D;DD;Z;WR;WI;IER [1] H=_EQRH2F _EHESSF _EBALAF H [2] $>(IER$GT0)%L100 [3] $>L200,,VEC= SEPR EBBCKF EHBCKF Z 237 [4] [5] L100:VEC=(2 1 , $,WR)$,WR,WI L200:$>0 ii "EIGENR<#>" " VEC=EIGENR H [1] VEC=(EIGEN H)[1;;] n "_EBALAF<#>" " A=_EBALAF OLDA;N;I;J;C;R;S;F;NOCONV;IKl/KL [I] $*TRANSLATED FROM IMSL FORTRAN CODE BY DAN PRECHT, APPL. GROUP, COMPUTING SERVICES DEPT., U. OF ALBERTA [2] K1=($.0)$,1+$,DD=N$,N=($,A=OLDA)[1]+L1=0 [3] L5:$>(1$GT$,IK1=$.K1=K1-1)%L35 [4] $>(&%0$NEJ=(+%$MOA[IK1;IK1])-$MO 1 1 $TRA[IK1;IK1])%L35 [5] $>(K1$EQDD[K1]=J=_1$TA(J$EQ0)%IK1)%L5 [6] A[IK1;J,Kl]=A[IK1;K1,J] [7] $ > L 5 , , A [ J , K l ; ] = A [ K l , J ; ] [8] L35:$>(K1$LTL1)%0,0$,KL=(_1+L1=L1+1)$DRIK1 [9] $>(&%0$NEJ=(+$Cl$MOA[KL;KL])-$MO 1 1 $TRA[KL;KL])%L65 [10] $>(L1$EQDD[L1]=J=_1$TA(J$EQ0)%KL)%L35 [II] A[IK1;J ,LI]=A[IK1;L1,J] [12] $>L35,,A[J,L1;I]=A[L1,J;I=(L1-1)$DR$.N] [13] L65:DD[KL]=1 [14] L75:I=L1+NOCONV=0 [15] L1152:C=(+$Cl$MOA[KL;I])-J=$MO 1 1 $TRA[I;1=(1-1)$DR$.Kl] [16] S=C+R=(+%$MOA[I;KL])-J [17] C=C*256@J=$MA256$@R/16*C [18] $>(&%J=((C+R)/F=16@J)$GES*0.95)%L75*NOCONV [19] DD[I]=DD[I=l$TA($NOJ)%I]*(F=1$TA($NOJ)%F)*NOCONV=l [20] A[I;J]=A[I;J=(Ll-1)$DR$.N]*/F [21] A[IK1;I]=A[IK1;I]*F [22] $>((K1$GEI=I+1),1)%L1152,L75 n "_EBBCKF<#>" " A=_EBBCKF OLDA;I;N;J [1] $*WRITTEN BY DAN PRECHT, APPL. GROUP, COMPUTING SERVICES D E P T . , U . OF ALBERTA TO PERFORM THE FUNCTION [2] $*DESCRIBED IN THE DOCUMENTATION FOR THE IMSL ROUTINE OF THE SAME NAME [3] 1=(Ll-1)$DRKl$TA$.N=l$TA$,A=OLDA [4] A[I; ] = ($TR(N,$,I) $,DD[I] ) *A[I;] [5] DD[I]=I=((L1$LE$.N)&(K1$GE$.N))%$.N [6] DD[D][J?]=J=$.L1-1 [7] DD[DD]DD[J??]=J=K1$DR$.N [8] A[DD;]=A II 238 "_EHBCKF<#>" 11 Z=_EHBCKF OLDZ; I; LM2; KI / LTEMP;M;MA;MP2 ; T [1] $*TRANSLATED FROM IMSL FORTRAN CODE BY DAN PRECHT, APPL. GROUP, COMPUTING SERVICES D E P T . , U . 
OF ALBERTA [2] Z=OLDZ [3] $>(Ll$GTLM2=Kl-2)%0 [4] LTEMP=LM2+KI=L1 [5] L302:$>(0$EQT=/H[MA=M+1;M=LTEMP-KI])%L30 [6] $>(Kl$LTMP2=M+2)%L10 [7] D[I]=H[I=(MP2-1)$DR$.K1;M] [8] L10:$>(MA$GTK1)%L30 [9] Z[I;]=Z[I;]+D[I]$:.*(/T*D[MA])*D[I]+.*Zt1=(MA-1)$DR$.Kl;] [10] L30:$>(LM2$GEKI=KI+1)%L302 it "_EHESSF<#>" " A=_EHESSF OLDA;TOL;N;M;MMl;F;H;G;I;J;R;T [1] $*TRANSLATED FROM IMSL FORTRAN CODE BY DAN PRECHT, APPL. GROUP, COMPUTING SERVICES D E P T . , U . OF ALBERTA [2] D=(N=l$TA$,A=OLDA)$,0 [3] $>((Kl-1)$LTM=1+L1)%0*TOL=0.243087*10@_62 [4] L50:$>(TOL$GTH=+%F*D[I]=F=A[I=$ROMMl$DR$.Kl;MM1=M-1])%L45+G=0 [5] D[M]=F-G=(H@0.5)*(-F$GE0)+0$GTF=_1$TAF [6] A[I;J]=A[I;J]-D[I]$:.*(D[I]+.*A[I;J=MM1$DR$.N])/H=H-F*G [7] A[J;I ]=A[J;I ] - ( (A[J=$ .Kl ; I ]+ .*D[I ] ) /H)$: .*D[I ] [8] L45:A[M;MM1]=G [9] $>(K1$GTM=M+1)%L50 II "_EQRH2F<#>" " H=_EQRH2F OLDH;A;Al;B;EN;I;II;IL;IN;ITS;J;LM1;M;TOL;MM;N;NA;NM1;NN;NORM--N PL; P; Q;R; RA; S; T; T l ; T2; T3; W; WK;X; Y; ZZ [1] $*TRANSLATED FROM IMSL FORTRAN CODE BY DAN PRECHT, APPL. GROUP, COMPUTING SERVICES D E P T . , U . 
OF ALBERTA [2] WK=((J=$MI%3,N) ,N)$,II=($.0)$,1$,,Z=IN$:.$EQIN=$.N=1$TA$,H=OLDH [3] L10:WK[II;I]= 1 1 $TRH[I;(-II)+I=II$DRIN] [4] $>(J$GEII=II+1)%L10 [5] WR=WI=N$,T=IER=0*($,IL=(LM1=L1-1)$DR$.K1)*TOL=0.22204*10@_15 [6] WR[I]= 1 1 $TRH[I;1=((IN$LTL1)$ORIN$GTEN=Kl)%IN] [7] L20:$>(EN$LTL1)%L160+ITS=0 [8] NA=EN-1 [9] $>(EN$EQB=L1)%L40 [10] L25:J=($MO 1 1 $TRH[B;B-1])$LETOL*($MO 1 1 $TRH[B-1;B-1])+$MO 1 1 $TRH[B;BEN+L1-LM1$DR$.NA] 239 [11] B=1$TA(J, 1) %B,L1 [12] L40:X=,H[EN;EN] [13] $>(B$NEEN)%L125 [14] $>L20,(EN=NA), , H[EN;ITS%EN,NA]=((,H[EN;ITS%EN,NA])*ITS% 1 0)+(ITS=1,1$NEE)%T,WI[EN]=0*WR[EN]=X+T [15] L125:Y=,H[NA;NA] [16] R=,H[EN;NA]*H[NA;EN] [17] $>((B$EQNA),ITS$EQ 10 20 30)%L130,L45,L45,L310 [18] S=X+Y [19] $>L55,,Y=(X*Y)-R [20] L45:T=T+X [21] H[I;I]=H[I;I]-X*I$:.$EQI=LM1$DR$.EN [22] S=1.5*Y=,($MOH[EN;NA])+$MOH[NA;EN-2] [23] Y=Y*Y [24] L55:Q=(R= 1 1 $TRH[M+1;M])*(X= 1 1 $TRH[M;M])+(ZZ= 1 1 $TRH[1+M;1+M=NA-$.NB])-S [25] P=(X*(X-S))+Y+R* 1 1 $TRH[M;M+1] [26] P=P*($,P)$,/+$Cl$MOP=(3,$,P)$,P,Q,R* 1 1 $TRH[M+2;M+1] [27] J=(($MO 1 1 $TRH[M;M-1])*+$Cl$MO(0 _1)$DRP[2 3 ;])$LETOL*($MO_l$DRP[1;])*--$MO 1 1 $TRH[M-1;(M=_1$DRM)-1])+($MO_l$DRX)+$MO_l$DRZZ [28] P=,P[;MM=NA-A=M=1$TA(J,1)%M,B] [29] H[I;I-2]=H[I;I-2]*I$:.$NEI=(M+1)$DR$.EN [30] H[I;I-3]=H[I;I-3]*I$:.$NEI=1$DRI [31] S=( (-P[1]$LT0)+P[1]$GE0) * (+%P@2)@0.5 [32] $>(B$EQM)%L75,0$,A1=A,(A+l),(A$NENA)%A+2 . 
[33] $>L75, /H[A;A-1]=-H[A;A-1] [34] L1202:$>(0$EQX=+%$MOP=/H[A1=A/(A+l),(A$NENA)%A+2;A-1])%L120 [35] H[A;A-1]=-X*S=((-P[1]$LT0)+P[1]$GE0)*(+%(P=P/X)@2)@0.5 [36] L75:H[Al;J]=H[Al;J]-(P=(P$:.*P)/S*P[1]=P[1]+S) +.*H[A1;J=(A-l)$DRIN] [37] H[I;A1]=H[I;A1]-H[I=$.$MI%EN,A+3;A1]+.*$TRP [38] Z[IL;A1]=Z[IL;A1]-Z[IL;A1]+.*$TRP [39] L120:$>(NA$GEA=A+1)%L1202 [40] $>L25,ITS=ITS+1 [41] L130:ZZ=($MOQ=R+P*P=0.5*Y-X)@0.5 [42] H[NA;I]=H[NA;I=NA-NA$NE1]*NA$EQ1 [43] H[I;I]=H[I;I]+T*I$:.$EQI=NA,EN [44] $>(Q$LE0*X=X+T)%L150 [45] WR[I]=X+ZZ,-R/ZZ=P+ZZ*(P$GEO)-P$LTWI[I]=0 [4 6] Q=(ZZ,X)/((X*X=,H[EN;NA])+ZZ*ZZ)@0.5 [47] H[I;J]=(2,$,J)$,( ,Q+.*H[I;J]) , ,Q-.*$R1H[I;J=(NA-1)$DRIN] [48] H [ J ; I] =$TR (2, $, J) $, (,H[ J ; I] + . *Q) , , ($ROH [ J=$ .EN; I] ) - . *Q [49] Z[IL;I]=$TR(2,$ ,IL)$, (, Z [IL; I] +. *Q) , , ($ROZ [IL; I]) - . *Q [50] $>L155,H[EN;NA]=0 [51] L150:WR[I]=2$,X+P [52] WI[I]=ZZ,-ZZ [53] L155:$>L20,EN=EN-2 [54] L160:$>((N$EQ1)$OR0$EQNORM=+%+%$MOH*IN$:.$LEIN)%0*NN=2 [55] L2602:P=WR[EN=($.0)$,N+2-NN] [56] $>(0$NEQ=WI[EN])%L205 [57] NA=EN-,H[EN;M=EN]=II=1 240 [58] L2002:W=,H[I;I=EN-II]-P [59] R=,H[I;EN] [60] $>(M$GTNA)%L180 [61] R=R+,H[I;J]+.*H[J=(M-1)$DR$.NA;EN] [62] L180:$>(WI[I]$GE0)%L185 [63] $>L200,(S=R),ZZ=W [64] L185:$>(WI[M=I]$NE0)%L190 [65] $>L2 0 0,H[I;EN]=-R/T=W+TOL*NORM*W$EQ0 [66] L190:X= /H[I;I+1] [67] $>(($MOW)$LE$MOY=,H[I+l;I])%L195 [68] $>L200,H[I;EN]=(-R+X*H[I+1;EN]=ZZ=(S-T*R)/(X*T=Y/W)-ZZ)/W [69] L195:H]I;EN]=(-S+ZZ*H[I+l/EN]=X=(-R-T*S)/X-ZZ*T=W/Y)/Y [70] L200:$>(NA$GEII=II+1)%L2002 [71] $>L260 [72] L205:$>(Q$GT0)%L220 [73] $>(,($MOH[EN;NA])$LE$MOH[M=NA=EN-l;EN])%L210 [74] $>L215,,H[NA;NA,EN]=-((H[EN;EN]-P),Q)/H[EN;NA] [75] L210:H[NA;NA,EN]=((-,H[NA/EN]) , 0) _COMDIV (, H[NA;NA]-P),Q [76] L215:H[EN;NA,EN]= 1 0 [77] $>L260,NA=EN [78] L220:NM1=EN-II=1 [79] L2552:W=,H[I;I=EN-II]-P [80] RA=((,H[I;NA]),0)+,H[I;J]+.*H[J=(M-l)$DR$.EN;EN,NA] [81] $>(WI[I]$GE0)%L230 [82] $>L260 /(S=RA[2]),(R=RA[1]),ZZ=W [83] L230:T3=W,-Q [84] $>(WI[M=I]$GT0)%L235 [85] 
$>L250,T3=(-RA) _COMDIV T3 [86] L235:X=,H[I;I+1] [87] $>((($MOW)+$MOQ)$LE$MOY=,H[I+l;I])%L240 [88] R=R-+%RA*T1=(Y,0) _COMDIVT3 [89] T2 =(R,S=S—%T1*$R0RA) _COMDIV(T1*X)-ZZ,-Q [90] $>L245,T3=(-RA+X*T2) _COMDIV T3 [91] L240:RA=RA-(+%T1*R,S),-%(S,R)*T1=(W, -Q) /Y [92] T2=RA _COMDIV((+%T1*T3)-X),-%(T3=ZZ,Q)*$R0T1 [93] T3=-((+%R,T2*T3),S+-%T3*$ROT2)/Y [94] L245:H[I+1;EN,NA]=T2 [95] L250:H[I;EN,NA]=T3 [96] $>(NM1$GEII=II+1)%L2552 [97] L260:$>(N$GENN=NN+1)%L2602 [98] H[1;1]=(WI[1]$EQ0)+H[1;1]*WI[1]$NE0 [99] NM1=N-I=1 [100] L2702:$>((I$GEL1)&I$LEK1)%L270 [101] Z[I;J]=H[I;J=I$DRIN] [102] L270:$>(NM1$GEI=I+1)%L2702 [103] $>(K1$EQ0)%0*NPL=N+II=L1 [104] L3002:$>(WI[J=NPL-II]$GT0)%L300 [105] J=J,(WI[J]$LT0)%J-l [106] Z[IL;J]=Z[IL;A]+.*H[A=LM1$DR$.M=$MI%K1,1$TAJ;J] [107] L300:$>(N$GEII=II+1)%L3002 [108] J=$MI%3,N*II=1 [109] L305:H[I;I-II]=(H[I;I-II]*I$: .$NEI)+(I$: .$EQI)* 241 (2$,$,I)$ /WK[II;I=II$DRIN] [110] $>(J$GEII=II+1)%L305 [111] $>0 [112] L310:IER=EN [113] Z=(N,N)$,0 "_SEPR<#>" " B=_SEPR A; I; J [1] B=(2, (1 0)+$,A)$,0 [2] B[1;;]=WR /[1] A[;(J=$.$,WI)-I=WI$LT0] [3] B[2/ /I-1]=WI[I-1] / [1] A[;I=I%J] [4] B[2;;I]=WI[I] , [1]( -A[;I] ) [5] B=B[;;$GD((-WR$LT0)+WR$GE0)*(WR@2)-WI@2] II 242 
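For readers without an APL interpreter, the error functions that QFACTOR18, QFNORESID and QFORTHRESID build from the eigenvalues (the variables RE, IE, XE, IND and RT in the listings above) can be restated in Python. This is only an illustrative sketch and not part of the original programs. The function name error_functions and its arguments are assumptions. The eigenvalues are assumed to be supplied already sorted in decreasing order. The absolute value stands in for the listings' square-then-fourth-root step, which guards against a slightly negative residual sum.

```python
import math


def error_functions(eigenvalues, m):
    """Hypothetical helper mirroring RE, IE, XE, IND and RT in the listings.

    eigenvalues: eigenvalues of (transpose of DAT) times DAT, largest first.
    m: number of rows (diodes) in the data matrix DAT.
    """
    n = len(eigenvalues)
    re, ie, xe, ind = [], [], [], []
    for k in range(1, n):  # k = trial number of factors, 1 .. n-1
        # Sum of the eigenvalues left after dropping the first k.
        tail = sum(eigenvalues[k:])
        # abs() plays the role of ((x@2)@0.25) in the listings.
        re_k = math.sqrt(abs(tail / (m * (n - k))))
        re.append(re_k)                           # real error
        ie.append(re_k * math.sqrt(k / n))        # imbedded error
        xe.append(re_k * math.sqrt((n - k) / n))  # extracted error
        ind.append(re_k / (n - k) ** 2)           # indicator function
    # Ratio of each eigenvalue to the next one (assumes no zero eigenvalue).
    rt = [eigenvalues[i] / eigenvalues[i + 1] for i in range(n - 1)]
    return {"RE": re, "IE": ie, "XE": xe, "IND": ind, "RT": rt}
```

Calling error_functions([90.0, 8.0, 2.0], 5) returns one entry per trial number of factors (here 1 and 2). The listings instead keep a leading placeholder 0 in RE and drop it when printing.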

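The SMTH5 function listed above applies the weights 1, 4, 6, 4, 1 (divided by 16) along the first (diode) axis by rotating the data. Because rotation wraps around, points near either end of a spectrum are averaged with points from the opposite end. A minimal Python sketch for a single column follows. It assumes that this circular behaviour is to be reproduced as written. The name smooth5 is hypothetical.

```python
def smooth5(column):
    """Five-point (1, 4, 6, 4, 1)/16 smooth with wrap-around at the ends.

    Hypothetical single-column counterpart of SMTH5; the APL version
    processes every column of the data matrix at once.
    """
    m = len(column)
    return [
        (
            column[(i - 2) % m]
            + 4 * column[(i - 1) % m]
            + 6 * column[i]
            + 4 * column[(i + 1) % m]
            + column[(i + 2) % m]
        )
        / 16
        for i in range(m)
    ]
```

A constant spectrum is returned unchanged, since the weights sum to 16. If wrap-around at the ends of the array is not wanted, the first and last two points would need separate handling, which the original listing does not do.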