LIKELIHOOD BASED RECONSTRUCTION ALGORITHM FOR FINDING SHORT TRACKS WITHIN THE T2K EXPERIMENT'S FINE-GRAINED DETECTORS by Mark John McCarthy B.Sc., Queen's University, 2012 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (PHYSICS) THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver) December 2014 © Mark John McCarthy, 2014i Abstract The T2K experiment is a long baseline neutrino experiment. Rates of different types of neutrino interactions are compared at a near detector and a detector 295 km from the neutrino source which enables neutrino oscillation and cross-section measurements. Understanding the activity near the neutrino interaction vertex in the near detector is crucial to identifying how a neutrino interacted. Current reconstruction algorithms for T2K's Fine Grained Detectors (FGD) have a decreased sensitivity to short tracks (<200mm) such as those created by low energy protons and pions. A new likelihood based reconstruction algorithm entitled FLikFit (FGD Likelihood Fitter) has been written that uses information from the deposited charge and the position of the hits within the FGDs in an attempt to tag these particles. It traces a predicted path of the particle through the detector and calculates expected kinetic energy and path length values which are fed to an existing probability distribution function to obtain a likelihood for each scintillator bar. The fitter is seen to calculate likelihoods effectively and proves to be adept at hypothesis testing, but further development is required to improve the FLikFit's ability to converge. ii Preface This thesis is based on the T2K experiment, run by an international collaboration of fellow scientists and researchers. Although the writing of this thesis is entirely the authour's own work, the primary work of the authour is closely integrated with the experiments, software, designs, and data created by others. Chapters 1 and 2 describe the relevant work performed virtually entirely by others that provide a context for the fitter that has been written. I have played only minor roles in maintaining the experiment and in the development of ElecSim, our electronics simulation software as briefly described in section 2.2. Chapters 3-5 describe the FGD likelihood fitter. The initial framework and ideas behind this fitter came from M. Wilking. He had created the PDFs for muons, software for recreating these PDFs, and had written a fitter for straight tracks before the project was abandoned. Once I continued the work on this project, much of the pre-existing code was rewritten or expanded upon. The muon PDFs were revised and recreated, the particle propagation code was altered to allow for curvature in a magnetic field, and all the other features mentioned in chapter 3 were added by myself. While I received supervision from S. Oser and occasional consultations with A. Hillairet, M. Scott, M. Wilking, and T. Lindner, all the coding and testing of this fitter was entirely my own work. iii Table of Contents Abstract ..................................................................................................................................i Preface.................................................................................................................................. ii Table of Contents ................................................................................................................ iii List of Tables ........................................................................................................................ v List of Figures ...................................................................................................................... vi Glossary and List of Abbreviations ................................................................................. xiii Acknowledgements ............................................................................................................ xiv Dedication ........................................................................................................................... xv Chapter 1: Introduction ....................................................................................................... 1 1.1 Neutrinos .................................................................................................................1 1.2 Science goals of T2K ...............................................................................................3 1.3 T2K overview ..........................................................................................................4 1.4 J-PARC Beamline ....................................................................................................5 1.5 ND280 .....................................................................................................................6 1.6 Super-Kamiokande ..................................................................................................9 1.7 The Fine Grained Detectors ................................................................................... 10 Chapter 2: ND280 Reconstruction ..................................................................................... 17 2.1 What is Reconstruction? ........................................................................................ 17 2.2 ND280 Software and Data Processing Overview .................................................... 17 2.3 Reconstruction prior to FLikFit .............................................................................. 19 2.3.1 TPC and ECAL reconstruction ....................................................................... 19 iv 2.3.2 Kalman Filters................................................................................................ 20 2.3.3 Isorecon ......................................................................................................... 21 Chapter 3: FGD Likelihood Fitter (FLikFit) ..................................................................... 24 3.1 Philosophy and concept of FLikFit ......................................................................... 24 3.2 Creation of PDFs ................................................................................................... 27 3.3 Form of PDFs ........................................................................................................ 29 3.4 Particle Propagation ............................................................................................... 34 3.5 Fitting algorithm and considerations ...................................................................... 36 3.6 Deweighting .......................................................................................................... 38 3.7 Penalty terms ......................................................................................................... 40 3.8 Multiple Particle fits .............................................................................................. 41 3.9 Seeding .................................................................................................................. 46 Chapter 4: Performance/Results ........................................................................................ 48 4.1 Displacements from seed ....................................................................................... 48 4.2 Likelihood surfaces ................................................................................................ 53 4.3 Hypothesis testing .................................................................................................. 62 Chapter 5: Conclusion ........................................................................................................ 69 5.1 FLikFit's potential as a useful reconstruction algorithm .......................................... 69 5.2 Proton seeding ....................................................................................................... 69 5.3 Potential avenues for improvements ....................................................................... 71 5.4 Final Comments ..................................................................................................... 74 Bibliography ....................................................................................................................... 75 v List of Tables Table 1.1 Table of neutrino oscillation parameters ..................................................................2 vi List of Figures Figure 1.1 The T2K neutrino beam is created in J-PARC on the east coast of Japan. 280 metres downstream is the ND280 detector suite. On the west side of Japan lies Super-Kamiokande, the far detector. Not to scale.[6] ..............................................................................4 Figure 1.2 The T2K energy spectrum of the muon neutrino beam. This is for an off-axis angle of 2.5° and a horn current of 250 kA, which corresponds to much of the accumulated data so far.[6] ...........................................................................................................................................6 Figure 1.3 An exploded view schematic of ND280. The neutrino beam is traverses the detector from left to right. The two FGDs (in green) located between the three TPCs (in orange) constitute the tracker. Upstream of the tracker lies the P0D (in turquoise). The P0D a and tracker is surrounded by the ECal in all directions except on the most upstream side. The SMRD lies within the magnet yoke which encircles the rest of the ND280 detectors.[6] .................................7 Figure 1.4 The energy deposited by various particles as a function of the distance traversed through the upstream FGD. Monte Carlo simulation. It can be seen that protons can be distinguished from other particles.[12] ........................................................................................ 11 Figure 1.5 The most upstream FGD (FGD1) during installation.[16]....................................... 12 Figure 1.6 A cross-section of a typical bar with distance measurements shown in mm. The titanium oxide coating is seen in white surrounding the grey scintillator. The hole in the centre is occupied by the wavelength shifting fibre. The bars extend 1.8m perpendicular to this cross-section.[12] ................................................................................................................................. 13 Figure 1.7 A typical MPPC. The area covered by the sensor is 1.3 x 1.3 mm2.[12] ................. 15 vii Figure 2.1 Coordinates used by the Radon transform. The origin of the new coordinate system for isorecon was chosen to be the same as the origin for the ND280 standard coordinate system, which lies just upstream and slightly offset from the centre of FGD1.[20] ................................... 22 Figure 2.2 The efficiency of various isorecon algorithms at finding a track. SBCAT is the cellular automaton algorithm. Flexible Radon is a slightly altered version of the Radon transform algorithm that was ultimately used in isorecon. Note the drop in efficiency for tracks less than 200 mm long.[20] ........................................................................................................................ 23 Figure 3.1 Distribution of the stopping power parameter of equation 3.2 for muons. Muons with a parameter larger than the small dip at 0.82 typically stop within the scintillator bar. Protons have a similar distribution............................................................................................. 29 Figure 3.2 An example normalized histogram of the charge in a given kinetic energy and path length bin for non-stopping muons. Although the x-axis is displayed as charge per path length, since all events are within a given path length bin, the path lengths for all events are similar and the only significant variation comes from the charge. The red line is the fitted curve to this histogram. For computational reasons, the value of this curve has been set to zero in bins with no entries, as can be seen. .............................................................................................................. 31 Figure 3.3 The parameter P0 as defined in equation 3.3 for all histograms within an example kinetic energy bin. This example is for non-stopping muons. This parameter was fit with a constant, however other parameters were fit with more complicated functions. Given errors are those reported by the previous fits. ............................................................................................ 32 Figure 3.4 The parameters of the fit along path length are next fit along kinetic energy as seen. This plot shows the fit for all non-stopping muons. Given errors are those reported by the previous fits. ............................................................................................................................. 33 viii Figure 3.5 An example of the plot of the negative log likelihood surface with the two angles varied and all other variables held at the truth. The dashed lines represent the true values and can be seen to coincide with the largest likelihood, or as seen above, the lowest negative log likelihood. Note that there is little correlation between the two angles. ...................................... 37 Figure 3.6 A plot of the likelihood surface in θ and φ as conventionally defined in spherical coordinates. Dashed line represents true values. This plot is from an older version of the software but the correlations between θ and φ are still apparent. The top crescent represents constant θxz and the bottom crescent represents constant θyz. Note the large correlation between the two parameters. ................................................................................................................................ 38 Figure 3.7 The number of hits traversed by the projected proton path relative to the total number of hits for true protons within the FGD shown as a function of distance from the vertex. The projected path is based on true vertex values. ..................................................................... 39 Figure 3.8 The number of hits traversed by the projected muon path relative to the total number of hits for true muons within the FGD shown as a function of distance from the vertex. The projected path is based on true vertex values. ..................................................................... 40 Figure 3.9 A display of many PDFs overlaid atop one another, all of which have an energy deposit of 5 MeV. Solid curves are protons, dotted curves are muons. Each colour represents a different initial kinetic energy from 40 MeV to 400 MeV in steps of 40 MeV. The solid black curve is for 40 MeV protons. ..................................................................................................... 44 Figure 3.10 The means of the Gaussian part of many PDFs displayed as a function of the initial kinetic energy of the particle with path lengths adjusted so that all particles lose 10 MeV of energy. The red line represents muons, the blue line represents protons. ................................ 45 ix Figure 3.11 The sigmas of the Gaussian part of many PDFs displayed as a function of the initial kinetic energy of the particle with path lengths adjusted so that all particles lose 10 MeV of energy. The red line represents muons, the blue line represents protons. ................................ 46 Figure 4.1 Effect of displacing the seed in the x-direction for a single muon fit. All values are relative to the true vertex position. An ideal fit would show a horizontal line at zero displacement between the reconstructed and true vertexes. A fit which returns the seed would show the diagonal line seen where the displacement of the seed from the truth is similar to the displacement of the reconstructed vertex from the truth. ............................................................ 49 Figure 4.2 Effect of displacing the seed in the x-direction for a single proton fit. All values are relative to the true vertex position. An ideal fit would show a horizontal line at zero displacement between the reconstructed and true vertexes. A fit which returns the seed would show the diagonal line seen where the displacement of the seed from the truth is similar to the displacement of the reconstructed vertex from the truth. ............................................................ 50 Figure 4.3 Effect of displacing the seed of the kinetic energy at the vertex for a single muon fit. All values are relative to the true kinetic energy. An ideal fit would show a horizontal line at zero change between the reconstructed and true values. A fit which returns the seed would show the diagonal line seen where the change of the seed energy from the true energy is similar to the change of the reconstructed energy from the true energy. .......................................................... 51 Figure 4.4 Effect of displacing the seed of the kinetic energy at the vertex for a single proton fit. All values are relative to the true kinetic energy. An ideal fit would show a horizontal line at zero change between the reconstructed and true values. A fit which returns the seed would show the diagonal line seen where the change of the seed energy from the true energy is similar to the change of the reconstructed energy from the true energy. .......................................................... 52 x Figure 4.5 The event display for the likelihood surfaces in figures 4.7 - 4.12 as shown on an x-z projection with the cross-sections of half the FGD bars shown. The horizontal layers are not shown but should be understood to exist between the vertical layers and with the long axes of the bars along the x-direction. Blue crosses represent the true proton positions with the particle starting at the green star and coming to rest at the yellow star. Green triangles represent the isorecon reconstruction which was used as the seed for this event. The magenta triangles and the line connecting them represent the final reconstructed trajectory for this event. The bars with deposited charge have large black dots in their centres. ............................................................. 54 Figure 4.6 The event display for the likelihood surfaces in figures 4.7 - 4.12 as shown on an y-z projection with the cross-sections of half the FGD bars shown. The vertical layers are not shown but should be understood to exist between the horizontal layers and with the long axes of the bars along the y-direction. Blue crosses represent the true proton positions with the particle starting at the green star and coming to rest at the yellow star. Green triangles represent the isorecon reconstruction which was used as the seed for this event. The magenta triangles and the line connecting them represent the final reconstructed trajectory for this event. The bars with deposited charge have large black dots in their centres. ............................................................. 55 Figure 4.7 The negative log likelihood surface when the x parameter of the vertex is varied. The seed, reconstructed, and true values are shown by the vertical solid blue line, dashed red line, and dashed black line respectively. .................................................................................... 56 Figure 4.8 The negative log likelihood surface when the y parameter of the vertex is varied. The seed, reconstructed, and true values are shown by the vertical solid blue line, dashed red line, and dashed black line respectively. .................................................................................... 57 xi Figure 4.9 The negative log likelihood surface when the z parameter of the vertex is varied. The seed, reconstructed, and true values are shown by the vertical solid blue line, dashed red line, and dashed black line respectively. .................................................................................... 58 Figure 4.10 The negative log likelihood surface when the θxz parameter of the particle at the vertex is varied. The seed, reconstructed, and true values are shown by the vertical solid blue line, dashed red line, and dashed black line respectively. ........................................................... 59 Figure 4.11 The negative log likelihood surface when the θyz parameter of the particle at the vertex is varied. The seed, reconstructed, and true values are shown by the vertical solid blue line, dashed red line, and dashed black line respectively. ........................................................... 60 Figure 4.12 The negative log likelihood surface when the kinetic energy of the particle at the vertex is varied. The seed, reconstructed, and true values are shown by the vertical solid blue line, dashed red line, and dashed black line respectively. ........................................................... 61 Figure 4.13 Likelihood ratio for single proton tracks fit with both a muon and proton hypothesis. Tracks considered more likely to be a proton are found on the right hand side. Of 3850 events, the fitter correctly concluded identified the proton 81% of the time. ...................... 63 Figure 4.14 Likelihood ratio for single muon tracks fit with both a muon and proton hypothesis. Tracks considered more likely to be a muon are found on the left hand side. Of 4911 events, the fitter correctly concluded identified the muon 74% of the time. ............................... 64 Figure 4.15 The likelihood ratio between the proton hypothesis and the muon hypothesis. The red curve had true protons, the blue curve had true muons. Both curves are normalized. ........... 65 Figure 4.16 Likelihood ratio for an event with a true long muon and a true short proton track fit with both a muon hypothesis and a muon + proton hypothesis. Tracks considered more likely xii to include a proton are found on the right hand side. Of 5000 events, the fitter correctly concluded that there was a proton 73% of the time. ................................................................... 66 Figure 4.17 Likelihood ratio for a single lone muon track fit with both a muon hypothesis and a muon + proton hypothesis. Tracks considered more likely to include a proton are found on the right hand side. Of 4995 events, 85% correctly conclude that there is no add additional proton. 67 Figure 4.18 The likelihood ratio of the hypothesis including the short proton and the hypothesis not including the proton. There existed a true proton in creating the blue histogram, there was no true proton for the red histogram. Both curves are normalized............................... 68 Figure 5.1 The track matched from the TPC is in green with the node used as the seed for FLikFit denoted by the large red star. Hit bars have black circles in their centres. The black line and magenta triangles represent nodes created by FLikFit's final reconstruction. Blue and black crosses denote the true path of the proton and muon respectively. ............................................. 71 xiii Glossary and List of Abbreviations FGD - Fine Grained Detector TPC - Time Projection Chamber ECal - Electromagnetic Calorimeter P0D - π0 detector INGRID - Interactive Neutrino GRID SMRD - Side Muon Range Detector ND280 - T2K's near detectors, situated 280m from the beam target. KE - Kinetic energy PL - Path length FLikFit - FGD Likelihood Fitter SBCAT - Scintillating bar cellular automaton tracking. Often referred to as the cellular automaton algorithm. Isorecon - FGD Isolated Reconstruction ROOT - A C++ based object oriented program and library developed by CERN and used heavily in experimental particle physics xiv Acknowledgements I offer my gratitude to the faculty, staff and my fellow students at UBC, who have given me the tools to develop into the physicist I hope to be, and who have given me the inspiration that I might actually attain this hope. I owe thanks to Graham Reid, whose kind support, enduring friendship, and consistent ability to outsmart me has motivated me to continue through the difficult times of this degree and to never be satisfied with anything but success. T2K has been a wonderful institution to be a part of and I have certainly been enriched from my experiences with this collaboration. I thank Thomas Lindner, Mark Scott, and Michael Wilking for their wisdom and patience both in helping me in the work described in this thesis and in many other tasks. I am particularly thankful for Scott Oser for his ingenuity and inspiration academically, and his understanding and persistence otherwise. His help and support have been instrumental over the course of this degree and I cannot understate my gratitude. Special thanks are owed to my family, who have unfalteringly supported me throughout my years of education. xv Dedication To my family and friends from afar 1 Chapter 1: Introduction 1.1 Neutrinos Neutrinos are elementary particles in the lepton family. Interactions with neutrinos are rare as neutrinos are without electric charge and colour charge making them immune to the electromagnetic and strong forces. Only through the weak and gravitational forces do neutrinos interact. It is known from studies of tritium beta decay that the neutrino rest mass is incredibly small (less than 2 eV at 95% confidence)[1], however this is only an upper limit and the actual masses could be much less. There are three types or flavours of neutrino: electron neutrinos, muon neutrinos, and tau neutrinos, so named by the lepton family they belong to. During the late 1990's and early 2000's it was confirmed that these neutrinos can change from one flavour to another during transit through a process known as oscillation[2][3]. Oscillation arises because the neutrino interaction eigenstates are different from the neutrino mass eigenstates. These two sets of eigenstates are not independent from one another, but are instead related by the Pontecorvo–Maki–Nakagawa–Sakata (PMNS) mixing matrix: )sin()cos(10000001000000100000012/2/12121212131313132323232332121nmnmnmnmiiiiPNMSPNMSesceecssccesesccsscUUcpcp (1.1) 2 where ν1,2,3 are the mass eigenstates and the θ's represent the neutrino mixing angles. The α's are only significant should the neutrinos be Majorana particles and do not play a role in oscillation. The most recent mixing parameters are listed in table 1.1: Parameter Best fit (+/- 1σ) Table 1.1 Current neutrino mixing parameters and uncertainties. Δm2 = Δm312 - Δm212/2 for the normal hierarchy, Δm2 = Δm322 - Δm212/2 for the inverted hierarchy. Bracketed values are for the inverted hierarchy.[4] The confirmation of neutrino oscillation enforces that at least two of the three neutrino mass eigenstates have non-zero masses. The mass differences have been measured to a reasonable degree, also listed in table 1.1. There is uncertainty in the ordering of the masses with two possible orders, usually called the normal hierarchy (NH) where ν3 is the most massive and the inverted hierarchy (IH) where ν3 is the least massive. In both hierarchies, m1 < m2. There is much interest in the value of θ23 and whether it is at maximal mixing value of 45°, and if not, which octant it lies within. 3 A non-zero δcp parameter would signify that neutrinos violate CP symmetry, an important theoretical symmetry which has implications in leptogenesis and cosmology. CP symmetry is the statement that physics should be the same under the combination of charge conjugation symmetry in which all particles are switched with their anti-particles and vice versa, and parity symmetry in which all particles are mirrored. This asymmetry was discovered in the hadronic sector but does not exist to the degree required to explain several cosmological questions. Several theoretical models propose that CP violation in the lepton sector could provide the additional asymmetry required to explain why the universe is predominantly comprised of matter and not antimatter. The flavour content of a beam can be predicted as a function of the neutrino energy and distance. For T2K, the most relevant formulae are: (1.2) where Δm2=Δm322 for the normal hierarchy and Δm2=Δm132 for the inverted hierarchy. Electron appearance from a muon neutrino beam can be calculated using: (1.3) Both these equations contain higher order terms which can be included but are dominated by the terms shown. 1.2 Science goals of T2K The T2K experiment has three main goals: - To observe the appearance of electron neutrinos in a muon neutrino beam, 4 - A precise measurement of θ23 using muon neutrino disappearance, - To be sensitive to the δcp parameter of the PMNS mixing matrix, and - To probe neutrino-nucleus interaction physics. Many improvements have already been made toward all of these goals, the first of which has already been achieved! [5] 1.3 T2K overview To make the measurements required to understand the problems posed in the previous section, T2K is comprised of three main elements: the beamline at J-PARC in Tokai, Ibaraki, Japan, the near detectors designed to characterize the beam (ND280), and the far detector called Super-Kamiokande in Kamioka, Gifu, Japan. Figure 1.1 The T2K neutrino beam is created in J-PARC on the east coast of Japan. 280 metres downstream is the ND280 detector suite. On the west side of Japan lies Super-Kamiokande, the far detector. Not to scale.[6] In the simplest view, a muon neutrino beam is produced in the beamline with as few non-muon-type neutrinos as possible. As the beam passes the near detector, multiple measurements are made to characterize the beam profile and to better understand the neutrino interaction cross-5 sections on various elements. Upon reaching the far detector, the contents of the beam are again measured to characterize the oscillation of the neutrinos in flight, providing us with detailed information on several of the PMNS mixing parameters. The neutrino beam produced at J-PARC is purposefully aimed 2.5° away from Kamioka and ND280. Although this decreases the total neutrino flux, it narrows the energy distribution of these neutrinos to be more closely centred at the energy causing maximum νμ → νe oscillation over 295 km (commonly referred to as the oscillation maximum). 1.4 J-PARC Beamline The J-PARC accelerator is comprised of three segments, a linear accelerator (LINAC), a rapid cycling synchrotron (RCS), and the main ring synchrotron. Protons are accelerated through each of these stages to 30 GeV before being extracted from the main ring to the primary neutrino beamline. In the primary beamline, the protons are turned 2.5 degrees off-axis from Kamioka and focused. After this, the protons enter the secondary beamline where the beam position is measured using the OTR detector[7] before they hit a graphite target. From this interaction many different particles are produced, however only the charged pions are of interest. Three magnetic focusing horns are used to collect and focus positive pions produced by the collision. The pions are then sent down a 96 metre steel tunnel known as the decay volume. Within this volume, a positive pion will decay primarily into antimuons and muon neutrinos, the latter of which are the focus of the experiment. To study antineutrinos, the polarity of the magnetic focusing horns is reversed focusing negative pions which ultimately produce muons and muon antineutrinos. 6 Figure 1.2 The T2K energy spectrum of the muon neutrino beam. This is for an off-axis angle of 2.5° and a horn current of 250 kA, which corresponds to much of the accumulated data so far.[6] These neutrinos are measured to have an energy peak around 700 MeV with a range of approximately 300 MeV to a few GeV. The remaining hadrons and muons then reach the beam dump, a wall of cooled graphite designed to stop the leftover particles. Muons of momenta higher than approximately 5 GeV/c can penetrate the beam dump where they are measured by arrays of ionization chambers, silicon PIN photodiodes and nuclear emulsion known collectively as the muon monitor. The muon monitor[8] along with the INGRID detector[9] to be mentioned later are designed to study the beam direction, energy, and profile. 1.5 ND280 ND280 is a name applied to a suite of detectors positioned 280 m from the target designed to characterize the neutrino beam before oscillation can occur. 7 Figure 1.3 An exploded view schematic of ND280. The neutrino beam is traverses the detector from left to right. The two FGDs (in green) located between the three TPCs (in orange) constitute the tracker. Upstream of the tracker lies the P0D (in turquoise). The P0D a and tracker is surrounded by the ECal in all directions except on the most upstream side. The SMRD lies within the magnet yoke which encircles the rest of the ND280 detectors.[6] The INGRID subdetector is an independent detector that uses neutrino interactions in iron to perform a high statistics measurement of the beam profile and flux. Measurements are taken using extruded scintillator bars with a central wavelength shifting fibre attached to a multi-pixel photon counter (MPPC). The beam centre is determined to within 10 cm on a daily basis. INGRID is the only detector beyond the main beamline that is not off axis.[9] 8 The remainder of the ND280 detectors are placed within a 0.2 T magnetic field oriented horizontally and perpendicular to the beam direction (the positive x direction as defined in figure 1.3). The magnet coils to produce this field are in fact reused coils from the UA1 experiment. The pi-zero detector (P0D) is a detector designed specifically to collect measurements of neutrino interactions producing a π0. The detector is comprised of several water filled bags placed between extruded scintillating fibres and lead sheets. The water bags can be filled or emptied allowing a subtraction method for determining water target cross-sections. Accurate measurements of neutrino cross-sections on water is particularly important to make a comparison with the water Čerenkov far detector.[10] The time projection chambers (TPCs) are part of the ND280 tracking system. These are gas filled detectors operating under a large electric field. Passing charged particles ionize the gas and this charge is collected and amplified within the micromegas modules situated along the readout plane of the detector. Any ionization within the micromegas causes the freed electrons to be accelerated to high enough speeds to further ionize other atoms in the gas causing a cascade that can be measured. By comparing the time of charge collection with the hit times in other detectors, the TPCs can determine how long ago the primary ionization occurred. Knowing the drift speed of the charge through the TPC, a full three dimensional reconstruction of the track can be performed. One particularly important function the TPCs provide is a measurement of the curvature of the track which can be directly translated into the magnitude of the momentum of the particle.[11] The fine grained detectors (FGDs) complete the ND280 tracker. They are comprised of thousands (5760 for the upstream FGD, 2688 for the downstream FGD) of long extruded scintillator bars with wavelength shifting fibres arranged down the long axis of the bar. The bars 9 are grouped into layers oriented with their long axes alternating horizontal and vertical in directions perpendicular to the beam. Further discussion of the workings of the FGDs can be found in section 1.7.[12] The electromagnetic calorimeters (ECals) are useful for detecting photons and distinguishing pions and muons. Additionally, they complement the inner detectors by providing information on particles that may have escaped or entered the tracker region. The ECALs are made of many extruded scintillator bars with lead absorber sheets between layers.[13] The side range muon detectors (SMRD) are designed to detect muons that may have escaped from interactions within the inner detectors, along with tagging cosmic muons entering ND280. They are scintillator sheets with an S-shaped wavelength shifting fibre along the interior that is read out by MPPCs. These scintillator modules are placed in gaps within the magnet yoke.[14] 1.6 Super-Kamiokande Super-Kamiokande[15] is a large Čerenkov water detector approximately 1 km underground in Kamioka, Gifu. The detector houses nearly 13000 photomultiplier tubes in a cylindrical vessel designed to pick up the Čerenkov rings from electrons and muons produced by neutrinos as they travel through the water. The detector was originally designed to measure atmospheric and solar neutrinos and has already been in use since 1996 making its properties well understood. Neutrino interactions within Super-Kamiokande are virtually all within the purified water, so understanding neutrino cross-sections on water (namely oxygen) is necessary to compare the observed neutrino flux with that measured at ND280. 10 1.7 The Fine Grained Detectors The FGDs were designed to measure various charged current interactions. The most common is the charged-current quasi-elastic (CCQE) interaction: (1.4) where l denotes a lepton flavour (electron, muon, or tau). When the beam is in antineutrino mode, the reaction becomes: (1.5) By measuring the momentum and energy of the outgoing lepton, the energy of the neutrino can be extrapolated since the incident neutrino direction is assumed to be from the beamline. However, to obtain an oscillation measurement to the sensitivities desired by the T2K experiment, various other charged current interactions must also be understood. Of particular importance is the charged current interaction producing a pion (CC1π+/-): (1.6) or in antineutrino mode: (1.7) where N denotes a nucleon. This interaction occurs via a delta resonance. Another important background is the charged current and neutral current interactions producing a neutral pion (CC1π0 and NC1π0 respectively), but it is unlikely the π0 will be detected within an FGD. The P0D and ECal are relied upon to provide measurements of these backgrounds. For the CC1π interactions, the outgoing pion must also be measured to obtain information about the incident neutrino. To do this, the FGD must be sensitive to all tracks found around the vertex as missing a pion or a proton could result in a misinterpretation of the neutrino interaction. plnl nlpl NlNl11 To accomplish these goals, the FGD has been designed to be thin enough to allow for muons to escape into a TPC where the momentum can be more accurately measured while also having a large mass to increase the neutrino interaction rate. The pions and protons that are also produced in the various neutrino interactions are also detected and can be distinguished from one another by various means including their energy loss (dE/dx). Figure 1.4 The energy deposited by various particles as a function of the distance traversed through the upstream FGD. Monte Carlo simulation. It can be seen that protons can be distinguished from other particles.[12] 12 Figure 1.5 The most upstream FGD (FGD1) during installation.[16] There are two FGDs placed between the three TPCs. An FGD is comprised of thousands of thin and long scintillating plastic bars arrayed in layers with the long axes of the bars perpendicular to the beam direction. Each bar is 9.61 cm × 9.61 cm × (194× 9.61 cm) and has a wavelength shifting fibre situated along the bar's central axis. The outer edges of the bar are coated in TiO2 to improve the reflection of the light off the sides and to increase the chance that the light enters the fibre. The usage of the TiO2 reduces the probability of light traversing from 13 one bar to another to 0.5%. The coating of the bars causes the scintillator to have a rounded square cross-section as can be seen in figure 1.6. There are 192 bars per layer and two layers per module; one layer is oriented with the long axis of the bars horizontal and the other vertical. The FGD closest to the beam source is known as FGD1 and comprises of 15 modules with little else beyond glue and the air gaps between each module. The downstream FGD has 7 modules that alternate with 6 layers of water 2.5 cm thick held at sub-atmospheric pressure. By comparing the rates of various charged current interactions on the upstream and downstream FGD's, one can determine the relative cross-sections of these interactions on water. This is important when determining the number of expected interactions at Super-Kamiokande where the target mass is water. As a particle traverses a bar within the FGD, the energy deposit within the polystyrene scintillator is partially transformed into scintillation light. The energy deposited within a bar by a traversing particle is follows the Landau distribution. It is useful to keep in mind that the Landau Figure 1.6 A cross-section of a typical bar with distance measurements shown in mm. The titanium oxide coating is seen in white surrounding the grey scintillator. The hole in the centre is occupied by the wavelength shifting fibre. The bars extend 1.8m perpendicular to this cross-section.[12] 14 distribution approximates a Gaussian distribution when the thickness of the material being traversed by the particle is not very small. The scintillation light produced is linearly related to the energy deposit except when there is a large energy loss rate. This is known as Birk's law. The amount of scintillation light is modified by the factor: (1.8) where B is the material dependent Birk's constant. The wavelength shifting fibre serves multiple purposes. As implied by the name, these fibres increase the wavelength of the incident photons improving the collection efficiency by the MPPCs. In addition, the fibres act as waveguides, having a much lower light attenuation compared to the scintillating plastic bars themselves. Lastly, the fibres help to focus the light from the bar into a small area which can be covered by the relatively small (1.3mm x 1.3mm) MPPCs. The MPPCs are only on one end of the fibre so the opposite end has been silvered to increase reflectivity. The fibres are not entirely without attenuation and similar energy deposits in the detector can cause varying amounts of charge collected by the MPPCs. This has been studied and the attenuation as a function of the distance to the MPPC is detailed in equation 1.9 below: (1.9) where I(x) is the remaining intensity of the light relative to some normalization I0, x is the distance from the MPPC to the source of the scintillation light, D is the length of the bar, A is the length of the fibre that extends beyond the edge of the bar toward the MPPC, and M, S, and L are constants which have been experimentally tuned. The first bracketed section of the equation 15 describes the inefficiency of light collection from the very edges of the fibre while the second bracket describes the overall attenuation along the length of the fibre. Each bar is read out by a multi-pixel photon counter (MPPC) placed at the end of the wavelength shifting fiber while the other end is coated in reflective aluminum. Each of the MPPC's 667 pixels act as nearly independent avalanche photodiodes. The charge produced by each pixel is independent of the light entering the pixel and thus the electric output from the sensor is easily digitized. The sensor has to be calibrated for variations between each bar and for the possibility of saturation of the MPPC when the number of incident photons is large enough that the likelihood of hitting the same pixel twice becomes significant. Additionally, the pulse height per pixel fire is sensitive to temperature, which also must be accounted and calibrated for. Occasionally the firing of one pixel will cause a nearby pixel to fire in a process known as cross-talk. Lastly, sometimes when a pixel fires there is enough remaining charge by the time the pixel is ready to fire again that a second pulse is created. This process is known as after-pulsing. While the temperature dependency can largely be dealt with through calibration, saturation, cross-talk, and after-pulsing can complicate event reconstruction. Figure 1.7 A typical MPPC. The area covered by the sensor is 1.3 x 1.3 mm2.[12] 16 The design of the FGDs does provide a few less common challenges compared to other detectors with regards to reconstruction. The usage of many independent bars means that tracks through the FGD become a series of discrete hits with a position spacing greater than many gas detectors, for example. In addition, the orientation of the bars can greatly inhibit reconstruction of tracks that travel at high angles relative to the beam's direction as large amounts of the track may lie within one bar. To be able to reconstruct such high angle tracks, the FGDs benefit from the information from other detectors, particularly the ECALs. Another weakness of the FGD's is their decreased ability to measure a track's curvature due to both the discrete nature of the bars and the high probability for scatters within the large mass of the FGDs. This is the primary reason for surrounding each FGD with a TPC that can measure the momenta of particles from their curvature much more effectively. Lastly, unless there is a significant change in the energy deposit of a particle as it traverses the FGD, the direction of the particle along a track is very difficult to determine, further complicating track reconstruction. None of these challenges are insurmountable, but they must be kept in mind when creating a reconstruction algorithm within the FGDs. 17 Chapter 2: ND280 Reconstruction 2.1 What is Reconstruction? In the general sense, reconstruction is the act of taking our input measurements from the MPPCs and attempting to conclude which particles traversed our detector and what kinematic properties they had. Since many similar but different particle trajectories could cause the same detector response, reconstruction typically involves evaluating different hypotheses in the attempt to determine what is most likely. This is done either through some form of pattern recognition or through a likelihood fitting process where the probability of each hypothesis to produce the desired detector response is evaluated until an optimal hypothesis is found. 2.2 ND280 Software and Data Processing Overview To process the data received from ND280, there are several steps involved. Firstly data which is recorded by the detectors is saved in a format readable by ROOT[17], a C++ program and library developed by CERN which is designed for usage in particle physics data. The data, which comprises of simply the hit information in the case of the FGDs, is first passed through a series of calibrations. The calibrations include processing and separating pulses from the MPPCs, a bar-to-bar correction for different light yields, several timing calibrations, a correction for the degradation of the scintillator over time, a correction for saturation, a clustering of hit times to further suppress dark noise, and a cut that all hits with less than 5 p.e. are ignored. Then the data is passed through our reconstruction algorithms for the TPCs, ECALs, FGDs, and all remaining detectors in that order. Using the output of all the sub-detector reconstructions, a global ND280 18 reconstruction is performed, refitting all the data together. After this, the data size is reduced by removing unnecessary and redundant information and prepared for analysis. To better understand the data, T2K makes heavy use of a technique known as Monte Carlo (MC) simulations. Using the program Geant4[18], also designed by CERN, particles are simulated and propagated through a virtual detector to imitate what the data is expected to look like. The MC may be a full simulation of the neutrino beam interactions within the detector or of the cosmic ray particles which are commonly seen when the beam is off. Alternatively, this simulation may be a simple recreation of a series of identical particles with an aphysical origin to provide insight into how the detectors ought to behave (known as a "particle gun"). The MC data is then passed to our detector and electronics simulation which interprets the particles generated and the energy they deposited and returns the hit information of what it expects our data should look like for such events. This is an extremely valuable tool which can be compared to actual data to learn more about where we might be making assumptions or incorrectly understanding either our detector or the neutrinos themselves. To model ND280 for usage in reconstruction, the entire detector is broken up into regions within the framework of ROOT's geometry modeling process (hereby referred to as the ROOT geometry). This process models the detector as several thousand small volumes of relatively homogenous material, such as a scintillator bar or a glue gap between FGD layers. To simplify things for faster computation and easier particle tracing, all volumes have been made as rectangular prisms of varying sizes. This ROOT geometry is helpful for propagating hypothesized particle paths through the detector and in the generation of the MC. 19 2.3 Reconstruction prior to FLikFit[19] Events in the FGD typically originate from two sources: immediate tracks caused by the products of a neutrino interaction (usually a muon and occasionally a proton or pion as well) and delayed events caused by the daughters of the primary interaction, such as the Michel electron from a muon decay. We are interested in separating these two types of events since most of the interest is on the primary interaction. To accomplish this, the FGD records data for a relatively long period of time, 10 μs. Hits that occur within 100 ns of the hit preceding it are grouped together into divisions called time bins until a gap of greater than 100 ns between hits is found. Event reconstruction is run on each time bin separately. This form of time binning occurs only in the FGD; other detectors have their own means of distinguishing hits separated by time. 2.3.1 TPC and ECAL reconstruction The first step in the ND280 reconstruction chain is performing reconstruction within the TPC since the results of this reconstruction are used to seed other processes. A TPC hit provides a two dimensional position of the track based on which Micromegas pad collected the charge. There is no direct way to measure the position along the direction of the electric field, however there is a reliable technique to extract this information. Hit times from faster detectors such as the ECAL and the FGD are used in conjunction with the known drift velocity of charge through the TPC gas to provide estimate the distance the charge drifted, providing the third coordinate of the TPC hit. In this way a full three dimensional track can be reconstructed. To create a track from many hits, the cellular automaton algorithm[19] is used, similar to FGD isorecon which will be discussed in section 2.3.3. The way this algorithm functions is by grouping hits together into clusters which are grouped again into tracks. The waveforms 20 produced by the many Micromegas readout planes are clustered together so long as they overlap temporally and are in adjacent positions. A track is formed when the TPC pattern recognition software finds enough overlapping clusters. The algorithm can also connect clusters separated by small spatial gaps provided they match some specifically chosen criteria. Upon creating the track, a measurement of the curvature of the track is performed. This is directly related to the momentum of the particle, regardless of the particle's identity, although the direction of travel cannot be determined without also knowing the sign of the charge of the particle. As a final step, the TPC attempts to identify the particle creating the track. Recording the ionization charge deposited in a region gives a measure of the dE/dx of the particle with the right calibration. Using the momentum as measured by the curvature, the predictions made by the Bethe-Bloch equation are used to identify the particle. 2.3.2 Kalman Filters When a track in the ECal or TPC exit or enter through a side adjacent to the FGD, an attempt is made to match hits within the FGD to that track. A Kalman filter is used to accomplish this. The Kalman filter propagates the track layer by layer through the FGD. Hits are matched based on their position only. The Kalman filter process is as follows. Firstly, the time from the TPC is compared to the hit times in the FGD to ensure that they belong to the same event. The track is propagated one layer into the FGD and a distance chi square is calculated for each hit in that layer. If the nearest unmatched hit is less than a pre-specified chi square then the hit is "matched". This newly matched hit is used to calculate a new track direction through the layer and the chi squares for 21 the unmatched hits are recomputed and this step is repeated. If no more hits within the layer can be matched then the track is propagated into the next layer where the process is repeated. Once more than one consecutive layer is found to have no hits with a low enough chi square to be matched then the Kalman filter terminates. The exception to this is for tracks with small vertical and horizontal components of its direction as the geometry of the FGD enable these tracks to occasionally traverse several layers without creating a hit. After the Kalman filter terminates, the track is refit using all the matched FGD hits and associated TPC or ECal hits to provide a more consistent result. The unmatched hits are not used in this refit. 2.3.3 Isorecon Unmatched hits within the FGD are passed to the FGD isorecon[20]. This reconstruction attempts to fit tracks that never cross into other detectors. First it groups hits in each view using one of two different algorithms. It then will try to create tracks out of these patterns by removing duplicates and excess hits. This step is known as track-cleaning. At this stage there should be tracks created in both views and the next step is to try to merge these tracks into a full 3D track. The first of the two algorithms is the cellular automaton[19]. The process is very similar to the cellular automaton used in the TPC reconstruction. Hits in adjacent layers are grouped to form segments. These segments are connected if they are capable of forming a somewhat straight line. The connected segments are then grouped to form tracks. This algorithm is effective for horizontal tracks, however tracks moving close to perpendicular to the z-direction will traverse fewer layers and are more difficult to reconstruct. 22 The second algorithm is the Radon transform. This is a transform which takes the locations of the hit bars in one projection of position space (x,z or y,z) and maps them to lines in a polar space (r,θ) where r describes the shortest distance between a predefined origin and the line, and θ represents the angle between a predefined axis and that shortest line. A weighting is given to each line based upon how many hits are traversed by the line. In the transformed space, specific regions of (r,θ) are highly weighted, and these translate to groups of similar lines that traverse many hits. A search is performed for the most highly ranked values of (r,θ) that will yield the candidate 2-D projections of tracks when transformed back to position space. A slightly more sophisticated, but ultimately conceptually very similar method was ultimately employed called the Flexible Radon transform where the weighting function was altered to improve the fitting capabilities of isorecon. Figure 2.1 Coordinates used by the Radon transform. The origin of the new coordinate system for isorecon was chosen to be the same as the origin for the ND280 standard coordinate system, which lies just upstream and slightly offset from the centre of FGD1.[20] 23 One drawback of this method is that it creates more hypothesized tracks than should exist. To remedy this, a final step is performed to remove tracks created from two separate but somewhat co-linear parts of tracks in different parts of the detector. Tracks created from segments of a larger curved track are merged and any duplicate tracks are removed. In putting the two tracks together, there is difficulty in matching different views when tracks are too short or when the tracks travel close to perpendicular to the z direction. The reason for both of these is because the particle does not traverse enough layers. One of the primary motivations for the development of FLikFit is to find these shorter tracks that may have been missed by isorecon. Figure 2.2 The efficiency of various isorecon algorithms at finding a track. SBCAT is the cellular automaton algorithm. Flexible Radon is a slightly altered version of the Radon transform algorithm that was ultimately used in isorecon. Note the drop in efficiency for tracks less than 200 mm long.[20] 24 Chapter 3: FGD Likelihood Fitter (FLikFit) 3.1 Philosophy and concept of FLikFit FLikFit is a likelihood fitter designed to use as much information as possible in determining the parameters related to a track within an FGD. Previous reconstruction algorithms for the FGDs use information on the positions of hits to great effect. They do this primarily by a form of pattern recognition; learning what different particles look like and trying to see how well the hits match this pattern. What sets FLikFit apart from these algorithms is the inclusion of the amount of deposited charge in each individual bar in the fitting process. Although it is not implemented, one could easily expand the fit to include the timing of the charge deposits in each bar as well to provide additional information into the fit. However, the timing resolution of hits within an FGD (approximately 2 ns)[12] is already on the order of the likely amount of time required for a particle to traverse the FGD. The addition of this information would likely be of little benefit as a result. FLikFit can test proton and muon hypotheses and can test hypotheses of an unlimited number of particles. It is assumed that all particles originate from the same location, that they do not produce secondary particles, and that they do not scatter. By deweighting hits beyond a certain distance from the seed, one can effectively reduce this problem by removing hits that likely would have not been created had the particle never scattered. Although this means there is much less information for the fitter to work with, one can compensate by constraining the fit with the results of other fitting algorithms that do use those hits. This is further discussed in section 3.6. For any set of parameters describing the initial conditions of the event, a likelihood can be assigned. To describe a track of n particles, 3×n+3 parameters are needed. These are: 25 The three Cartesian coordinates representing the vertex position within the detector (x, y, z). The two angles describing the initial direction of each particle (θxz, θyz). This is requires two parameters per particle. The formulation of these angles take is described in section 3.5. The initial kinetic energy (KE0) of each particle, which is only one parameter per particle. FLikFit, like all fitters, functions by repeatedly testing and updating these parameters in an attempt to find the parameters that return the maximum likelihood. On each iteration, a predicted path is calculated that a particle with those parameters is likely to take, under the assumptions that there is no multiple scattering or fluctuations in energy loss. For every bar that is traversed by this predicted path, the calibrated charge deposited in that bar is compared with pre-determined probability distribution functions (PDFs) for a charge deposit in a bar for a given path length through the scintillator and kinetic energy upon entering that bar. These path lengths and energies are continuously updated as the predicted path is propagated through the detector. To find a likelihood of these parameters describing the initial conditions of a track, the following is evaluated: (3.1) where qi represents the calibrated charge deposit within the ith bar, PL is the path length of the hypothesized track through the bar, and KE is the kinetic energy upon entering the bar. Technically one should include the probability of obtaining a given pl and KE within a bar based upon the initial conditions of the track, but this is assumed to be unity. Although multiple 26 scattering, secondary interactions, and straggling make this assumption quite tenuous beyond a short distance from the vertex, accommodating this additional probability proves problematic. The second product in equation 3.1 can be simplified greatly since the vast majority of bars not traversed by the hypothesized particle will have no charge. The probability of this occurring is essentially unity because the 5 p.e. calibrated charge threshold for the FGD eliminates virtually all dark noise. The remaining product of the untraversed bars that still contain charge become the "penalty term" discussed in section 3.7. In practice, since fitting algorithms are most robust when finding a minimum of a function with values of similar magnitude, the negative of the logarithm of equation 3.1 is taken. This also transforms the products into sums of the logarithms of the probabilities. This process is repeated for every hypothesis of the number or type of particles. When the fitter has found the maximum likelihood for each hypothesis, the hypotheses are compared and the hypothesis with the highest likelihood is chosen and saved. The other hypotheses are saved as alternates which can be used by later reconstruction and analysis if desired. FLikFit calculates this predicted track under the assumption that there is no multiple scattering of the particle. This makes electrons particularly difficult to be reconstructed and no effort has been made to incorporate these particles in the fitter. In addition, muons that travel beyond several FGD bar widths often have scattered more than a bar width from their predicted path. To compensate for this, the likelihood contribution of a bar is deweighted depending on its distance to the seed vertex. In the future, the fit could and likely should be constrained using previous reconstructions. This will improve fits of particles that exit the FGD. Currently the results of other fits are used to seed FLikFit but are not included in the likelihood function itself. 27 3.2 Creation of PDFs The basis behind the likelihoods are the PDFs: descriptions of the probability of a particle with a certain energy upon entering the bar and a path length through the bar to deposit a certain amount of charge in the MPPC. In the language of equation 3.1, this is represented by P(q|KE,pl). By calculating the value of the PDF in each hit bar using the kinetic energy and path length from the predicted track for that particle, and by multiplying these values together as in equation 3.1, one can obtain the likelihood of the entire track. To create the PDFs, knowledge of the true probabilities of charge deposits based on path lengths and kinetic energies is necessary. To accomplish this, Monte Carlo particle gun simulations were used. In using a Monte Carlo method the true path length of the particle though the bar and the true kinetic energy upon entering the bar are known since the entire event is simulated. The number of occurrences for a hit to have a certain charge, kinetic energy, and path length are stored in what is essentially a three dimensional histogram. This histogram is then fit one dimension at a time and normalized to obtain a smooth function which can return a likelihood from any input path length, kinetic energy, and charge. The actual physical effects involved in the charge measured by an MPPC include the Landau distribution energy deposit, the Birk's law scaling, the attenuation scaling (complicated by the reflection of half the scintillation light off the silvered end of the fibre), and the binomial detection efficiency of the MPPCs (complicated by saturation, after-pulsing, and cross-talk). These effects are described to greater detail in section 1.7. Ultimately, although the relationship may be complicated, the charge deposited in a bar theoretically should only depend on the kinetic energy of the particle, the type of particle, the path length through the bar, and the distance from the MPPC that the bar was traversed. This last effect is easily removed by reusing 28 equation 1.9 from the calibration step of the software processing to allow us to account for the attenuation of scintillation light along the length of the bar. In both the creation of the PDFs and in the actual fit, the average between the entry and exit points in the bar are used to calculate the distance to the MPPC sensor. When fitting the PDFs, the true values of the particle are not actually known so the definition used to define whether a particle traversing a bar counts as "stopping" or "non-stopping" is whether the following is true or false respectively: (3.2) This method of separating stopping and non-stopping particles was used instead of analyzing the true stopping point as it is more similar to how the fitter will in practice perceive stopping and non-stopping particles. This particular value was chosen because it was found that the probability of obtaining (dE/dx)*(path length)/(kinetic energy) had a minimum at 0.82 and that the majority of particles beyond this point were truly stopping within or negligibly close to the bar. 29 Figure 3.1 Distribution of the stopping power parameter of equation 3.2 for muons. Muons with a parameter larger than the small dip at 0.82 typically stop within the scintillator bar. Protons have a similar distribution. 3.3 Form of PDFs Although one could fit the full multidimensional charge PDF in all dimensions simultaneously, this is not done due to the complicated nature of the function and the limitations of fitting such large dimensional histograms. To overcome these limitations, the charge per path length is fit for each bin of path length and kinetic energy. The independent variable in the fit is divided by the path length despite already being binned by path length so as to make the fit parameters more similar in each bin. These fit parameters are fit with respect to the path length for each kinetic energy bin. The parameters of this second fit are then fit once more with respect 30 to the kinetic energy. In this way each dimension of the histogram is fit individually and one could reconstruct the whole PDF from only the fit parameters of the third fit. Due to each particle type having different energy losses and interaction modes, a different PDF is created for each particle type. Only muons and protons have been created so far since they are the most common particles to be found in an FGD while still being easy to fit. PDFs for charged pions could be similarly created, but since the mass of the charged pions is very similar to the mass of the muon, in principal the muon PDF could be re-used for charged pions. Theoretically the distribution of charge deposited for given a value of kinetic energy and path length is a Landau distribution. However, after pulsing, cross-talk, saturation, and the conversion of the pulse seen by the MPPC into a single value serve to smear this distribution, making it more normal or Gaussian. We approximate this distribution as Gaussian near the mean and two exponential tails taking over beyond one sigma. This gives five free parameters. (3.3) where the Pi values are various fitting parameters, x is the charge per path length. To fit this function, first initial guesses for the Gaussian mean (P2) and sigma (P3) are found by fitting the distribution with an ordinary Gaussian. After this, the whole function is fit again to acquire the exponential decay parameters (P0 and P4). Lastly the function is normalized providing P1. The function is fit in this manner to ensure a successful fit as fitting all parameters simultaneously cannot easily be done without a precise advance knowledge of a good seed. 31 Figure 3.2 An example normalized histogram of the charge in a given kinetic energy and path length bin for non-stopping muons. Although the x-axis is displayed as charge per path length, since all events are within a given path length bin, the path lengths for all events are similar and the only significant variation comes from the charge. The red line is the fitted curve to this histogram. For computational reasons, the value of this curve has been set to zero in bins with no entries, as can be seen. Each parameter of the above is fit as a function of path length for every kinetic energy bin. The functional forms used in this second fit is chosen to provide the best match for the Monte Carlo data used to fit these parameters vary for each particle and for whether the particle stops within the bar or continues on. 32 Figure 3.3 The parameter P0 as defined in equation 3.3 for all histograms within an example kinetic energy bin. This example is for non-stopping muons. This parameter was fit with a constant, however other parameters were fit with more complicated functions. Given errors are those reported by the previous fits. Each parameter from the second fits are fit as a function of kinetic energy. In the same way, these functions are not consistent between particle types and whether or not the particle stops within the bar. So long as one knows the functional forms of each of these fits and the fit parameters of the third fit, the entire smoothed PDF can be reconstructed. 33 Figure 3.4 The parameters of the fit along path length are next fit along kinetic energy as seen. This plot shows the fit for all non-stopping muons. Given errors are those reported by the previous fits. Due to the drastic change in energy loss per distance travelled (dE/dx) for particles as they stop, separate PDFs have been created for each scenario. This distinction means that an additional PDF was created for each particle type defining the probability that the particle stopped within the bar for a given path length and kinetic energy. When this probability is significantly different from either zero or one, a weighted average of the two PDFs is evaluated: (3.4) In the FGDs, any charge less than 5 photo-electron equivalent units after calibration is ignored. This is to make dark noise in the FGDs essentially negligible. To save time in the Normalizatoin constant of path length fit [mm/p.e.] 34 evaluation of the PDFs and to ensure the PDFs are as smooth as possible, a separate PDF has been created to evaluate the probability of there being no charge deposit in a bar for a given particle kinetic energy and path length. Separate PDFs were created for muons and for protons since they interact very differently. Although creating a separate set of charged pion PDFs could easily be done, given their similar mass to the muon, the muon PDFs could in principle be reused for pion studies. 3.4 Particle Propagation Before being able to evaluate PDFs to obtain a likelihood for a track, it is necessary to predict that track through the detector, keeping track of both the position and energy along the way. Because two or more particles may traverse the same bar requiring special treatment of the way the PDFs are evaluated, FLikFit determines the predicted tracks of all particles it is fitting before determining a likelihood for the bars. The particle's predicted energy loss is calculated after traversing each volume in the ROOT geometry. To accomplish this, a range table[21][22] is used using values as measured for polystyrene since that is what comprises the FGD scintillator, and scaling by density for other materials. This provides an approximation of the Bethe-Bloch dependency on matter. When determining the position of a particle as it enters and leaves a volume, a helical path is used. This is because all the particles being studied are charged and the entirety of ND280 is typically kept in a constant magnetic field. The radius of curvature of the helix is calculated using the momentum the particle had upon entering the volume. As such, it is assumed that the energy loss through the volume is negligible. Since all volumes in the ROOT geometry for the ND280 software are rectangular prisms of various dimensions, one set of calculations can handle 35 all possible scenarios. There is also the capability to assume the magnetic field has been turned off for the detector, in which case a straight line is used to calculate the trajectory. Determining the path length through a volume is important both for determining the energy loss of the particle and the likelihood of obtaining a given charge in the bar. Path length calculations within each volume are performed by taking the helix entry and exit points and drawing a straight line between them. The calculation computes the distance through the bar's coating, scintillator, and fibre hole. When using the path length to determine the likelihood of obtaining a charge, only the distance traversed through the scintillator is used. For determining energy loss, the coating and scintillator path lengths are both used with their respective densities. There is assumed to be no energy loss within the fibre hole because both the fibre hole is mostly air and because it is incredibly unlikely for a particle to spend much time in the fibre holes. Due to the complicated geometry of the bar at the level of scintillator and coating, only a straight line approximation has been used. This minimizes the number of scenarios, if-statements, and consequently computing time compared to doing a full exact calculation using a helix. While a more precise calculation of path length could be performed, the straight line approximation should suffice as the bar width is small and particles are likely to not have enough time to curve significantly within the bar. Each projected path is computed until the energy loss required to reach the far end of a volume in the ROOT geometry is greater than the kinetic energy of the particle. When this occurs, a new path length is computed travelling only as far as the particle's hypothesized stopping point. 36 3.5 Fitting algorithm and considerations There are multiple methods for a fitter to find a minimum for some likelihood surface. FLikFit uses Nelder and Mead's simplex algorithm[23]. This algorithm is unique since it is both commonly used and does not compute the gradient of the likelihood surface. Because a small change in the fitting parameters can cause the track to go through different bars, potentially causing a large change in likelihood, the likelihood surface is very discrete with many jumps and plateaus. A fitting algorithm which takes advantage of the gradient of the likelihood surface, such as the commonly used MIGRAD, has great difficulties overcoming the shape of the likelihood. On the other hand, the simplex algorithm chooses n+1 points in a fit with n dimensions and makes large jumps between these points to evaluate which direction should be taken next, and should therefore be less sensitive to local variability in the likelihood surface. Another consideration when fitting comes from the parameters themselves. Directions can be represented with two parameters, such as the well-known θ and φ used in spherical coordinates. However, due to the geometry of the FGDs as a series of stacked horizontal or vertical bars, to move in a direction along a bar requires changing both of these variables in a specific, coupled way. For the problem of fitting, it is much more useful to use angles such that changing one allows for movement along a vertical bar and changing the other allows for movement along a horizontal bar. In doing this, the correlations in the likelihood surface between the two variables have been removed, which should make it easier for the fitter to navigate to find its minimum. These angles can be thought of as the angle of a line on a projection onto the x-z plane (θxz) and the angle of a line projected onto the y-z plane (θyz). 37 Figure 3.5 An example of the plot of the negative log likelihood surface with the two angles varied and all other variables held at the truth. The dashed lines represent the true values and can be seen to coincide with the largest likelihood, or as seen above, the lowest negative log likelihood. Note that there is little correlation between the two angles. θyz θxz 38 Figure 3.6 A plot of the likelihood surface in θ and φ as conventionally defined in spherical coordinates. Dashed line represents true values. This plot is from an older version of the software but the correlations between θ and φ are still apparent. The top crescent represents constant θxz and the bottom crescent represents constant θyz. Note the large correlation between the two parameters. 3.6 Deweighting Because the particle propagation nor the evaluation of the PDFs take into account multiple scattering, range staggering, nuclear recoils, or particle decays, it can be reasoned that the farther the bar is from the vertex, the less likely it is to be traversed by the projected path. To lessen the influence of these bars, a deweighting has been applied to all bars based upon how distant they are from the vertex position. Proton and muon particle guns within FGD1 were used to evaluate how frequent bars were missed by the projected path and the results are shown in figures 3.7 and 3.8. The dependency of hits being traversed by the projected path on the distance from the vertex ɸ θ 39 were fit by a simple exponential decay. The fit values come from the muon results, but they fit proton values quite well and so the same function is used regardless of the particle identity. The increase in the negative log likelihood from each bar is multiplied by this function so that the farther a bar is from the seed vertex, the less it impacts the final likelihood of the track. Figure 3.7 The number of hits traversed by the projected proton path relative to the total number of hits for true protons within the FGD shown as a function of distance from the vertex. The projected path is based on true vertex values. 40 Figure 3.8 The number of hits traversed by the projected muon path relative to the total number of hits for true muons within the FGD shown as a function of distance from the vertex. The projected path is based on true vertex values. 3.7 Penalty terms Once all the projected particle paths are computed, it may be found that there are bars with charge in them that were not traversed by any path. The only ways that charge can be deposited in a bar that does not lie on the path of a particle is dark noise and if our hypothesis is insufficiently complex. With the 5 p.e. threshold for charge, dark noise is extremely unlikely in the FGDs. The most common reasons for charge to be deposited in a bar is not traversed is delta rays, secondary particles occurring within the time bin, and most often multiple scattering. At this point, the motivation for the size of the penalty term no longer comes from a physical basis but from a computational one. The fitter is more likely to fail if the likelihood surface changes drastically with small changes in the fit parameters. In the design of FLikFit, much 41 effort has been expended to keep the likelihood surfaces as smooth as possible. One way to obtain a smooth likelihood surface is to observe the change in likelihood as the fit parameters are varied such that a particle's trajectory is moved outside a hit bar and to set the penalty term to be similar to the likelihood value as the particle is just barely entering the bar. In other words, the penalty term has been tuned to be similar to the likelihood of a particle with a negligible path length through a bar. Unfortunately this value depends on the charge in the bar and the kinetic energy of the particle entering the bar, but for an approximation, we have set the penalty term to be within an order of magnitude of these values. For bars with no charge in them and that are not on the track, the probability of obtaining no charge is incredibly close to unity and as such, these bars can be ignored from the fit. 3.8 Multiple Particle fits The multiple particle fit functions very similar to the single particle fit. Currently the fitter assumes all particles originate from the same location leading to a total of 3 + 3×(number of particles) fit parameters. The parameters represent the following: three for the vertex position, two for the directions of each particle, and one for the kinetic energy of each particle at the vertex. It is assumed that all particles originate from the same vertex and therefore one vertex position describes the starting point of all tracks. Each particle's hypothesized trajectory is traced through the detector and all the kinetic energies, path lengths, and charges from the traversed bars are stored. If there was only one track in a bar, the likelihood contribution for that bar is evaluated using same method as the one particle fit. When multiple particles traverse the same bar, there are several possible avenues that were considered. Although creating new PDFs would likely lead to the most exact answer, this was 42 undesirable due to the number of dimensions in this PDF and the amount of Monte Carlo required to obtain it. Since the total energy deposit of each particle is independent of the other particles in the bar, one could attempt to perform a convolution of the single particle PDFs, (3.5) where Q is the total charge deposited in the bar, KEi is the initial kinetic energy of the ith particle, and PLi is the projected path length of the ith particle. This is solvable analytically and would only slow down the calculation due to increased calls to the error function. However the following integral is obtained when attempting to generalize to three particles, (3.6) This integral is not solvable analytically and using a numerical method would be either time consuming, inaccurate, or a mixture of both. In addition this method would not account for saturation in the MPPCs, even in the two particle case. An approximation was developed where the energy deposits of all the particles is combined. Then the PDF of a new hypothetical particle is evaluated where this particle has the same energy deposit as the sum of the other particles. For example, if n particles with kinetic energies KE1, KE2, ... , KEn and predicted path lengths PL1, PL2, ... , PLn traverse the same bar, then their predicted energy deposits are calculated using a range table. (3.7) where Range(x) is a function which translates a kinetic energy into the range of a particle starting with that kinetic energy, and Range-1(x) is just the inverse of that function. The final PDF evaluated has the particle type of the particle with the highest energy loss (let's call this Particlej). Then the evaluated initial energy, KE, and path length, PL, are chosen so that the final energy of this particle is the same as the final energy of Particlej but with the same energy loss as Q qQ dqdqkeplqqQPkeplqPkeplqP0 0 1233212221111 ),|(),|(),|(43 the sum of all the particles' energy losses. In this way if there is one particle that dominates the energy deposits, the PDF should look very similar to the PDF that would be evaluated if no other particles existed. Additionally this is a safe way to ensure that a calculation can be performed for any amount of energy deposit without pushing the path length of a non-stopping particle beyond the point where it would have stopped. Formally, (3.8) (3.9) (3.10) To validate whether this approximation would be sufficient, some validation studies were performed. These tests were to check if particles with a variety of different initial kinetic energies and projected path lengths have similar looking PDFs. Birk's effect corrections have been accounted for. It was found that this approximation is reasonably accurate although slightly biased for all but the particles with the largest energy deposits (i.e.: stopping protons). 44 Figure 3.9 A display of many PDFs overlaid atop one another, all of which have an energy deposit of 5 MeV. Solid curves are protons, dotted curves are muons. Each colour represents a different initial kinetic energy from 40 MeV to 400 MeV in steps of 40 MeV. The solid black curve is for 40 MeV protons. 45 Figure 3.10 The means of the Gaussian part of many PDFs displayed as a function of the initial kinetic energy of the particle with path lengths adjusted so that all particles lose 10 MeV of energy. The red line represents muons, the blue line represents protons. 46 Figure 3.11 The sigmas of the Gaussian part of many PDFs displayed as a function of the initial kinetic energy of the particle with path lengths adjusted so that all particles lose 10 MeV of energy. The red line represents muons, the blue line represents protons. 3.9 Seeding Information from other reconstruction algorithms are used to seed the fit. Since the TPC-FGD matching is the most reliable fit, it is used if it is available. In the case that there is no TPC-FGD matched track but there is an ECal-FGD matched track, the latter is used to seed. Failing this, the FGD isorecon is used to seed the fit. Should none of these other reconstructions be available, the fitter will seed with the truth information if it is a Monte Carlo event or skip the event if from actual data. In the future, the fitter could be programmed to make an estimate for the seed simply based on the general locations of the hits, but this has not been implemented. 47 In the case of TPC-FGD and ECal-FGD matched tracks, the node farthest into the FGD is assumed to be where the track begins and is used as the seed. These fits provide position and direction information, but the momentum information only exists within the other detectors. To achieve a reasonable kinetic energy seed, range tables are used to estimate the energy loss between the seed position and where the particle entered the FGD. Doing this requires knowledge of the particle identity so this calculation is performed separately for both protons and muons. 48 Chapter 4: Performance/Results There are many metrics to measure the performance of FLikFit. One of these is to determine the ability of the fitter to reconstruct the true properties of the particle(s) from various seeds. This reveals the ability of the fitter to converge upon a more likely solution. Another metric is to test the likelihoods of different hypotheses on the event to gain understanding in whether the fitter is able to correctly calculate a likelihood that can discern different particle types and numbers of particles. All these tests were performed on Monte Carlo simulated data where the true parameters of the particles were known. In this way a comparison between the true and reconstructed properties or hypotheses is possible. 4.1 Displacements from seed The ability of the fitter to reconstruct the true values depending on the quality of the seed was evaluated by fixing all parameters at their true values with the exception of one parameter which was assigned a random seed. An ideal fitter will have a small difference between the reconstructed value and the true value for a wide variety of seeds. Figures 4.1 through 4.4 demonstrate that although this is seen, it is not the dominant result. In fact, it appears that the fitter is very likely to return a reconstructed value very similar to the seed value, regardless of the quality of the seed. This is evidence of some sort of convergence problem, often caused by an uneven likelihood surface creating many local minima from which the fitter cannot escape in its search for the global minimum. Some solace can be taken in the fact that when the fitter does deviate from the seed, it usually does so in a way that properly fits the particle. 49 Figure 4.1 Effect of displacing the seed in the x-direction for a single muon fit. All values are relative to the true vertex position. An ideal fit would show a horizontal line at zero displacement between the reconstructed and true vertexes. A fit which returns the seed would show the diagonal line seen where the displacement of the seed from the truth is similar to the displacement of the reconstructed vertex from the truth. 50 Figure 4.2 Effect of displacing the seed in the x-direction for a single proton fit. All values are relative to the true vertex position. An ideal fit would show a horizontal line at zero displacement between the reconstructed and true vertexes. A fit which returns the seed would show the diagonal line seen where the displacement of the seed from the truth is similar to the displacement of the reconstructed vertex from the truth. Of 5000 events, 716 muon fits (14.3%) and 662 proton fits (13.2%) that had x position seeds with a difference of greater than 10mm from the true value had a reconstructed value with a smaller than 10mm difference from the truth. This is represented by a horizontal line at zero in figures 4.1 and 4.2 and indicates a good fit. 2733 muon events (54.7%) and 2318 proton events (46.4%) had x position seeds with a difference of greater than 10mm from the true value had a reconstructed value with a smaller than 10mm difference from the seed. This is represented by 51 the diagonal line in figures 4.1 and 4.2 and indicates the fitter was unable to move far from the seed. Figure 4.3 Effect of displacing the seed of the kinetic energy at the vertex for a single muon fit. All values are relative to the true kinetic energy. An ideal fit would show a horizontal line at zero change between the reconstructed and true values. A fit which returns the seed would show the diagonal line seen where the change of the seed energy from the true energy is similar to the change of the reconstructed energy from the true energy. 52 Figure 4.4 Effect of displacing the seed of the kinetic energy at the vertex for a single proton fit. All values are relative to the true kinetic energy. An ideal fit would show a horizontal line at zero change between the reconstructed and true values. A fit which returns the seed would show the diagonal line seen where the change of the seed energy from the true energy is similar to the change of the reconstructed energy from the true energy. Of 5000 events, 95 muon fits (1.9%) and 350 proton fits (7.0%) that had kinetic energy seeds with a difference of greater than 10% from the true value had a reconstructed value with a smaller than 10% difference from the truth. This is represented by a horizontal line at zero in figures 4.3 and 4.4 and indicates a good fit. 3746 muon events (74.9%) and 2580 proton events (51.6%) had kinetic energy seeds with a difference of greater than 10% from the true value had a reconstructed value with a smaller than 10% difference from the seed. This is represented by the diagonal line in figures 4.3 and 4.4 and indicates the fitter was unable to move from the seed. It 53 seems the muon kinetic energy likelihood surfaces is particularly more difficult to fit than the proton kinetic energy likelihood surfaces, although both are not converging as often as desired. 4.2 Likelihood surfaces To understand issues in fitter convergence, it is often useful to know the shape of the likelihood surface. Figures 4.5 - 4.12 show the likelihood surfaces for a typical, fully-contained, proton event. For each plot of the likelihood surfaces, all parameters are held at the true value except the varied parameter shown on the horizontal axis. 54 Figure 4.5 The event display for the likelihood surfaces in figures 4.7 - 4.12 as shown on an x-z projection with the cross-sections of half the FGD bars shown. The horizontal layers are not shown but should be understood to exist between the vertical layers and with the long axes of the bars along the x-direction. Blue crosses represent the true proton positions with the particle starting at the green star and coming to rest at the yellow star. Green triangles represent the isorecon reconstruction which was used as the seed for this event. The magenta triangles and the line connecting them represent the final reconstructed trajectory for this event. The bars with deposited charge have large black dots in their centres. 55 Figure 4.6 The event display for the likelihood surfaces in figures 4.7 - 4.12 as shown on an y-z projection with the cross-sections of half the FGD bars shown. The vertical layers are not shown but should be understood to exist between the horizontal layers and with the long axes of the bars along the y-direction. Blue crosses represent the true proton positions with the particle starting at the green star and coming to rest at the yellow star. Green triangles represent the isorecon reconstruction which was used as the seed for this event. The magenta triangles and the line connecting them represent the final reconstructed trajectory for this event. The bars with deposited charge have large black dots in their centres. 56 Figure 4.7 The negative log likelihood surface when the x parameter of the vertex is varied. The seed, reconstructed, and true values are shown by the vertical solid blue line, dashed red line, and dashed black line respectively. 57 Figure 4.8 The negative log likelihood surface when the y parameter of the vertex is varied. The seed, reconstructed, and true values are shown by the vertical solid blue line, dashed red line, and dashed black line respectively. 58 Figure 4.9 The negative log likelihood surface when the z parameter of the vertex is varied. The seed, reconstructed, and true values are shown by the vertical solid blue line, dashed red line, and dashed black line respectively. 59 Figure 4.10 The negative log likelihood surface when the θxz parameter of the particle at the vertex is varied. The seed, reconstructed, and true values are shown by the vertical solid blue line, dashed red line, and dashed black line respectively. θxz [radians] fsdsfsdf sf[ééfsdklsdjflsdkjf[radians] 60 Figure 4.11 The negative log likelihood surface when the θyz parameter of the particle at the vertex is varied. The seed, reconstructed, and true values are shown by the vertical solid blue line, dashed red line, and dashed black line respectively. θyz[radians] 61 Figure 4.12 The negative log likelihood surface when the kinetic energy of the particle at the vertex is varied. The seed, reconstructed, and true values are shown by the vertical solid blue line, dashed red line, and dashed black line respectively. Figures 4.7 - 4.12 give an impression of the typical likelihood surfaces used by FLikFit. In most cases it is seen that the true value lies at the most likely point, but figures 4.1 - 4.4 indicate that the fitter often has trouble migrating its hypothesis to these values when the seed is displaced. In almost all cases, the likelihood surface, although showing a general trend toward increasing likelihood near the true value (or decreasing negative log likelihood as plotted), has a very uneven surface with a multitude of local minima farther from the true value. The periodic shape of the likelihood surface far from the true value as seen most clearly in figure 4.7 is caused by the varying amount of scintillator in bars without charge traversed by the 62 projected path. This periodic tendency is common to all position parameters when no hit bars are traversed. 4.3 Hypothesis testing The fitter's ability to discern different particle hypotheses was examined by performing fits on the same events multiple times, each with a different hypothesis. Single track muons and protons were evaluated using both a lone proton and lone muon hypothesis. Their likelihoods were compared in figures 4.13 - 4.15. These fits were performed using seeds based on previous fits as described in section 3.9. The best fit likelihoods were compared and it seems evident that most particles are either identified correctly or are too similar for the fitter to distinguish as seen by the large peak near zero. 81% and 74% of true protons and true muons had likelihood ratios in favour of the correct particle identity. Investigation of the events with large likelihood ratios in favour of the incorrect particle reveals that they typically are caused by either poor seeds from the preceding reconstructions or by the fitter failing to converge, usually by blowing up the kinetic energy of the particle. It is not known what causes this problem. 63 Figure 4.13 Likelihood ratio for single proton tracks fit with both a muon and proton hypothesis. Tracks considered more likely to be a proton are found on the right hand side. Of 3850 events, the fitter correctly concluded identified the proton 81% of the time. 64 Figure 4.14 Likelihood ratio for single muon tracks fit with both a muon and proton hypothesis. Tracks considered more likely to be a muon are found on the left hand side. Of 4911 events, the fitter correctly concluded identified the muon 74% of the time. To further the discriminating power of short proton tracks, a decision on which type of particle was observed could be made at a value other than whether the likelihood ratio is exactly unity. Figure 4.15 shows both the likelihood ratios of both cases of the true particle identity overlaid. In this case, it seems there is little discriminating power for the log of the likelihood ratio between -10 and 10, however outside this range FLikFit is fairly adept at identifying the particle. . 65 Figure 4.15 The likelihood ratio between the proton hypothesis and the muon hypothesis. The red curve had true protons, the blue curve had true muons. Both curves are normalized. Since the fitter is expected to run primarily on events consisting of a long muon track with the possibility of a short proton track near the vertex, the fitter's ability to detect that short proton were performed. The muon was seeded using previous reconstructions in all cases. In the case where there is a true proton, the 1 mu + 1 p hypothesis had the proton seeded by its true direction and energy although the vertex was the same as the muon fit. In the case where there was no true proton, this hypothesis applied a random proton direction and gave it a random kinetic energy with a uniform distribution of 0 to 300 MeV. 66 Figure 4.16 Likelihood ratio for an event with a true long muon and a true short proton track fit with both a muon hypothesis and a muon + proton hypothesis. Tracks considered more likely to include a proton are found on the right hand side. Of 5000 events, the fitter correctly concluded that there was a proton 73% of the time. Figure 4.16 indicates that given a good seed, the fitter can correctly identify the proton approximately 73% of the time. This must be taken with some slight trepidation as these fits seeded the proton with true values, but a well-written seeding algorithm to find the best seed for a proton (such as the one described in section 5.2) should not decrease this value greatly. 67 Figure 4.17 Likelihood ratio for a single lone muon track fit with both a muon hypothesis and a muon + proton hypothesis. Tracks considered more likely to include a proton are found on the right hand side. Of 4995 events, 85% correctly conclude that there is no add additional proton. Figure 4.17 indicates that FLikFit is capable of rejecting the possibility of a non-existent proton approximately 85% of the time. Again, a proper proton seeding algorithm will find a seed for the proton direction and energy that would cause the proton track to not cause as large of an impact on the likelihood relative to the random sampling that was used in this study. Regardless, it is clear that FLikFit could be developed into a powerful hypothesis testing tool. To further the discriminating power of short proton tracks, a decision of whether to include the proton could be made at a value other than whether the likelihood ratio is exactly unity. 68 Figure 4.18 shows both the likelihood ratios with and without the additional proton overlaid. One possible choice of cut-off point between the two hypotheses could be placed at (4.1) where, when moving in increasing likelihood ratio favouring the proton, it becomes more likely that the event included a proton than that it did not. The decision of where to place this cut-off should depend on the consequences of a falsely identifying a non-existent proton relative to missing an existing proton. A study of these consequences has not been performed. Figure 4.18 The likelihood ratio of the hypothesis including the short proton and the hypothesis not including the proton. There existed a true proton in creating the blue histogram, there was no true proton for the red histogram. Both curves are normalized. 69 Chapter 5: Conclusion 5.1 FLikFit's potential as a useful reconstruction algorithm This fitter was designed to search for low energy protons within the FGD's. At its present state, it appears to be unable to achieve this goal although it shows some promise that with further development this goal could be reached. Currently the inability of the fitter to navigate away from local minima to find the true minimum seems to be the primary obstacle in accomplishing this task. It appears as though the fits are often falling into the many local minima and becoming stuck, unable to progress toward the true global minimum of the negative log likelihood surface. If this problem was fixed, a proton seeding algorithm could be written and optimized. The ability of the fitter to distinguish protons and muons (figures 4.13 - 4.15) and to distinguish whether there exists any additional protons on a long muon track (figures 4.16 - 4.18) are key elements in the goal of being useful in reconstruction in the FGD. In addition, the fitter is able to reconstruct some events which the previous reconstructions are unable to solve while causing a much smaller number of previously well fit events to become poorly reconstructed. FLikFit could be implemented even in its current state and would cause more good than harm to the reconstruction efficiency of the FGD, although the effect would likely not be very significant. With a very small strain on computational time, FLikFit has the potential to become a useful part of standard FGD reconstruction should the convergence issues be solved. 5.2 Proton seeding The fitter currently lacks a method to search for a seed for previously undetected protons. Several options exist which would need to be compared and optimized. One option is to sample 70 several different directions and kinetic energies and to choose to fit on the option with the highest likelihood. To determine the number of samples required (effectively finding the spacing of a three-dimensional grid of the proton angles and kinetic energy) would require an analysis of how far off a seed can be and still lead to an accurate fit. Since the fitter is seen to stick near the seeds regardless of the quality of that seed, this evaluation cannot be performed until the convergence problem is solved. A second option is to try to aim the proton track in a direction which contains hits that are either not traversed by the muon's predicted path using seed parameters or toward hits that have an amount of charge which is unlikely to be explained by the muon alone. One of the drawbacks of this approach is that the seed from previous reconstructions, especially seeds from either the ECAL or TPC matched tracks, often are positioned in a way that obscures much of the proton track. This is because the Kalman filter algorithm bends the muon track in the direction of nearby hits, even if those hits came from a short-ranged proton. An example of this can be seen in figure 5.1 where the Kalman filter curved its track in the negative x direction to accommodate hits that actually originated from the proton. 71 Figure 5.1 The track matched from the TPC is in green with the node used as the seed for FLikFit denoted by the large red star. Hit bars have black circles in their centres. The black line and magenta triangles represent nodes created by FLikFit's final reconstruction. Blue and black crosses denote the true path of the proton and muon respectively. 5.3 Potential avenues for improvements FLikFit's primary drawback is its inability to find the global minimum and to frequently return the seed which was supplied to it. In the course of this work, the standard simplex algorithm as supplied by the MINUIT package in ROOT was used almost exclusively. MIGRAD, the most common and well optimized fitting algorithm used by MINUIT is unable to handle this problem since it relies on the gradient of the likelihood surface and the surfaces 72 created by FLikFit are far too noisy and step-like to support this technique. However, there are other algorithms that could be tested for their ability to supply the fit. One of the most promising candidates is simulated annealing. In short, this algorithm functions by examining a set of fitting parameters similar to the current set and calculating a probability that it should replace the current parameters with the new ones. This probability depends on the "temperature", a parameter internal to the fit which is decreased at a fixed rate, and the evaluated likelihoods of the two sets of fitting parameters. One of the key strengths of simulated annealing is that the probability that the fit will jump to a new set of parameters is reduced, but not zero, if the new parameters return a lower likelihood than the current ones. This allows the fit to migrate out of local minima more easily. The temperature is decreased quickly if a fast, computationally inexpensive fit is desired, and decreased slowly if a robust and accurate fit is needed. Such an algorithm could work well on the reconstruction problems supplied by FLikFit. Since the PDF's of the fitter are constructed in terms of kinetic energy and path length, it is important to know these parameters well. The path length can become misjudged due to multiple scattering, a nuclear interaction, or a decay, even if the true properties of the particle at the vertex are known. Similarly, the kinetic energy estimate can become inaccurate due to staggering. By deweighting bars farther from the vertex these effects should be minimized. This deweighting could be caused to depend on the seed kinetic energy as well as the seed vertex and upon the particle type, although the difference found was minimal. A reformulation of the likelihoods would cause a great deal of change to the existing software, but could eliminate many of the issues noted above. In its present form, the fitter calculates a likelihood based on the probability of a certain charge in a bar given the vertex 73 parameters . It relates these two through the kinetic energy and path length through each bar as in equation 5.1: (5.1) where ke' and pl' are dummy variables for the kinetic energy and path length respectively. Currently the assumption is that the trajectory and energy of the particle along its path can be uniquely determined from the initial conditions. In the language of the previous equation: (5.2) where δ represents the Dirac delta function and ke and pl are determined by tracing the path of the particle from the vertex until it reaches (or does not reach) the bar in question. There are certainly other ways to parameterize this second probability taking into account that the path a particle follows cannot be completely predicted, and finding a formulation for P(ke',pl'| ) that includes multiple scattering, recoils, etc. is not overly complicated. However it is more difficult to invent a method that does not neglect the information that the particle does need to follow one specific path and therefore the kinetic energy and path length found for all the different bars should be very closely related. One potential issue is kinks and splits in the track. These could be caused by a sharp recoil, some other form of nuclear interaction, or a particle decay. Additional parameters could be added to the fit for every kink or split. For each kink, three additional parameters would be able to describe the new path: the distance along the initial path from the vertex, and two parameters for the new angle taken. Alternatively, something akin to a Markov chain could be implemented to allow the path of the particle to change as it propagates, although this would require substantial changes to the existing fitting function. 74 5.4 Final Comments FLikFit is a fitter designed to solve the problem of finding short tracks near a vertex within the FGD. It employs a likelihood based approach where on every iteration the particle path(s) are projected through the detector and PDFs are evaluated for each hit bar. In its current state, the fitter seems to provide good particle identification and hypothesis testing for previously well fit tracks, and further development will likely be able to improve its ability to converge. In understanding the activity near the neutrino vertex, the type of neutrino interaction can be determined with greater confidence. This sort of development will be a great help to many oscillation and cross-section studies performed by the T2K collaboration. Bibliography [1] E. W. Otten, C. Weinheimer. Neutrino Mass Limit from Tritium Beta Decay. Rept. Prog. Phys. 71:086201. 2008. URL http://arxiv.org/abs/0909.2104 [2] SNO collaboration. Direct Evidence for Neutrino Flavor Transformation from Neutral-Current Interactions in the Sudbury Neutrino Observatory. Phys. Rev. Lett. 89, No. 1, 011301. 2002. [3] Super-Kamiokande Collaboration. Evidence for Oscillation of Atmospheric Neutrinos. Phys. Rev. Lett. 81. 1998. Pages 1562-1567. [4] K.A. Olive et al. (Particle Data Group). Neutrino Mass, Mixing, and Oscillations. Chin. Phys. C, 38, 090001. 2014. URL http://pdg.lbl.gov/2014/reviews/rpp2014-rev-neutrino-mixing.pdf [5] T2K Collaboration. Observation of Electron Neutrino Appearance in a Muon Neutrino Beam. Phys. Rev. Lett. 112, 061802 (2014). URL http://arxiv.org/abs/1311.4750 [6] T2K Collaboration. The T2K Experiment. Nucl. Instrum. Meth. A 659, 106. 2011. doi: 10.1016/j.nima.2011.06.067. URL http://arxiv.org/abs/1106.1238 [7] S. Bhadra et. al. Optical Transition Radiation Monitor for the T2K Experiment. Nucl. Instrum. Meth. A 703, 45. 2013. doi: 10.1016/j.nima.2012.11.044. URL http://arxiv.org/abs/1201.1922 [8] K. Matsuoka et. al. Design and Performance of the Muon Monitor for the T2K Neutrino Oscillation Experiment. Nucl. Instrum. Meth. A 624, 591. 2010. doi: 10.1016/j.nima.2010.09.074. URL http://arxiv.org/abs/1008.4077 [9] T2K collaboration. Measurement of the T2K Neutrino Beam Properties Using the INGRID On-Axis Near Detector. Nucl. Instrum. Meth. A 694, 211. 2012. doi: 10.1016/j.nima.2012.03.023. URL http://arxiv.org/abs/1111.3119 [10] S. Assylbekov et al. The T2K ND280 Off-Axis Pi-Zero Detector. Nucl. Instrum. Meth. A 686, 48. 2012. doi: 10.1016/j.nima.2012.05.028. URL http://arxiv.org/abs/1111.5030 [11] N. Abgrall et al. Time Projection Chambers for the T2K near detectors. Nucl. Instrum. Meth. A 637, 25. 2011. doi: 10.1016/j.nima.2011.02.036. URL http://arxiv.org/abs/1012.0865 [12] P.-A. Amaudruz. The T2K Fine-Grained Detectors. Nucl. Instrum. Meth. A 696, 1. 2012. doi: 10.1016/j.nima.2012.08.020. URL http://arxiv.org/abs/1204.3666 [13] D. Allen. The Electromagnetic Calorimeter for the T2K Near Detector ND280. JINST 8 P10019. 2013. doi: 10.1088/1748-0221/8/10/P10019. URL http://arxiv.org/abs/1308.3445 [14] S. Aoki. The T2K Side Muon Range Detector (SMRD). Nucl. Instrum. Meth. A 698, 135. 2013. doi: 10.1016/j.nima.2012.10.001. URL http://arxiv.org/abs/1206.3553 [15] The Super-Kamiokande collaboration. The Super-Kamiokande Detector. Nucl. Instrum. Meth. A 501, 418-462. 2003. URL http://www-sk.icrr.u-tokyo.ac.jp/sk/pub/sknimpaper.pdf [16] The T2K collaboration. Private communication. [17] R. Brun and F. Rademakers. ROOT - An Object Oriented Data Analysis Framework. Proceedings AIHENP'96 Workshop, Lausanne. Sep. 1996. Nucl. Inst. & Meth. in Phys. Res. A 389, 81-86. 1997. URL http://root.cern.ch/. [18] The Geant4 collaboration. Geant4 - A Simulation Toolkit. Nucl. Inst. & Meth. in Phys. Res. A506, 250-303. 2003. [19] A. Hillairet, A. Izmaylov, B. Jamieson, T. Lindner, D. Roberge, F. Sanchez, A.C. Villanueva, T. Wachala, and L. Whitehead. ND280 Reconstruction. Technical report, T2K Technical note TN-072 (internal), v1. 2011. [20] D. Brook-Roberge. Measurements of Neutrino Interactions on Water using a Fine-Grained Scintillator Detector with Water Targets. University of British Columbia, Ph.D. thesis. 2012. [21] M.J. Berger, J.S. Coursey, M.A. Zucker and J. Chang. Stopping-Power and Range Tables for Electrons, Protons, and Helium Ions. NIST, Physical Measurement Laboratory. 2005. URL http://www.nist.gov/pml/data/star/index.cfm [22] K.A. Olive et al. (Particle Data Group). Atomic and Nuclear Properties of Materials. Chin. Phys. C, 38, 090001. 2014. URL http://pdg.lbl.gov/2014/AtomicNuclearProperties/index.html [23] J. Nelder, R. Mead. "A simplex method for function minimization". Computer Journal 7: 308–313. 1965. doi:10.1093/comjnl/7.4.308.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Likelihood based reconstruction algorithm for finding...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Likelihood based reconstruction algorithm for finding short tracks within the T2K experiment's fine-grained… McCarthy, Mark John 2014
pdf
Page Metadata
Item Metadata
Title | Likelihood based reconstruction algorithm for finding short tracks within the T2K experiment's fine-grained detectors |
Creator |
McCarthy, Mark John |
Publisher | University of British Columbia |
Date Issued | 2014 |
Description | The T2K experiment is a long baseline neutrino experiment. Rates of different types of neutrino interactions are compared at a near detector and a detector 295 km from the neutrino source which enables neutrino oscillation and cross-section measurements. Understanding the activity near the neutrino interaction vertex in the near detector is crucial to identifying how a neutrino interacted. Current reconstruction algorithms for T2K's Fine Grained Detectors (FGD) have a decreased sensitivity to short tracks (<200mm) such as those created by low energy protons and pions. A new likelihood based reconstruction algorithm entitled FLikFit (FGD Likelihood Fitter) has been written that uses information from the deposited charge and the position of the hits within the FGDs in an attempt to tag these particles. It traces a predicted path of the particle through the detector and calculates expected kinetic energy and path length values which are fed to an existing probability distribution function to obtain a likelihood for each scintillator bar. The fitter is seen to calculate likelihoods effectively and proves to be adept at hypothesis testing, but further development is required to improve the FLikFit's ability to converge. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2015-01-07 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivs 2.5 Canada |
DOI | 10.14288/1.0167650 |
URI | http://hdl.handle.net/2429/51854 |
Degree |
Master of Science - MSc |
Program |
Physics |
Affiliation |
Science, Faculty of Physics and Astronomy, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 2015-02 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/2.5/ca/ |
Aggregated Source Repository | DSpace |
Download
- Media
- 24-ubc_2015_february_mccarthy_mark.pdf [ 2.26MB ]
- Metadata
- JSON: 24-1.0167650.json
- JSON-LD: 24-1.0167650-ld.json
- RDF/XML (Pretty): 24-1.0167650-rdf.xml
- RDF/JSON: 24-1.0167650-rdf.json
- Turtle: 24-1.0167650-turtle.txt
- N-Triples: 24-1.0167650-rdf-ntriples.txt
- Original Record: 24-1.0167650-source.json
- Full Text
- 24-1.0167650-fulltext.txt
- Citation
- 24-1.0167650.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.24.1-0167650/manifest