Open Collections

UBC Undergraduate Research

Accessing the ALPs : Reconstructing merged two-photon decays in the Belle II detector Whiteaker, Kelton 2020-05

Full Text


Accessing the ALPs: Reconstructing merged two-photon decays in the Belle II detector

by

Kelton Whiteaker

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

BSc Hons

in

THE FACULTY OF SCIENCE
(Physics and Astronomy)

The University of British Columbia
(Vancouver)

May 2020

© Kelton Whiteaker, 2020

Abstract

Axion-like particles (ALPs) are spin 0 bosons proposed to mediate interactions between dark matter and the Standard Model. The Belle II detector at the SuperKEKB e+e− accelerator in Japan is well-suited to search for an ALP to two-photon decay; however, a large fraction of possible two-photon ALP decays, and a large fraction of Standard Model two-photon decays, remain unresolvable in the detector due to the small opening angle between their daughter photons. This small opening angle makes the Belle II photon detector software interpret the pair of photons as a single photon: this is called a merge. This project will explore various methods of classification, including a machine learning algorithm, to enable Belle II users to manually distinguish between a single photon and two merged photons. The machine learning algorithm presented is able to distinguish the two cases to a high degree of accuracy, but even without its implementation, tools already available at Belle II that can help with this problem are identified.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgments
1 Introduction
    1.1 Motivation
    1.2 The Belle II Experiment
    1.3 This Project
2 Theory and Experimental Setup
    2.1 The Standard Model of Particle Physics
    2.2 Beyond the Standard Model: Axion-Like Particles
    2.3 GEANT and Monte-Carlo Simulation
    2.4 The Belle II Electromagnetic Calorimeter
    2.5 Kinematics of Decays to Two Photons
3 Methods
    3.1 MC Generation and Data Analysis at Belle II
    3.2 Identifying Single-Photon Events
    3.3 Identifying Merged Events
    3.4 Quantifying Performance of a Variable
    3.5 Convolutional Neural Networks
4 Results
    4.1 Classifying Clusters using Statistical Moments
    4.2 Classifying Clusters using a Convolutional Neural Network
5 Discussion
    5.1 Performance of Variables
    5.2 Next Steps
6 Conclusion
Bibliography
A CNN Parameters
B Particle-Generating Script

List of Tables

Table 2.1 Properties of some candidate ECL scintillator materials [16].
Table A.1 Parameter values for the 5 GeV and 7 GeV CNNs.
List of Figures

Figure 1.1 ALP-strahlung: an ALP a, created by a virtual photon γ∗, decays into two photons γγ. By momentum conservation, the single photon travels in the opposite direction of the two ALP photons.
Figure 1.2 In simulation, opening angle distributions of true π0 → γγ decays and detector-reconstructed decays. Only 78% of the simulated pions were reconstructed.
Figure 1.3 Top-down view of SuperKEKB and the Belle II detector [13].
Figure 1.4 3D cross-section of the Belle II detector [4].
Figure 2.1 The particles of the Standard Model and some of their properties.
Figure 2.2 Two a → γγ decay modes [10].
Figure 2.3 When comparing photon fusion and ALP-strahlung at Belle II, (a) more ALPs are produced at all angles by photon fusion than ALP-strahlung, and (b) the longitudinal momentum of ALPs produced by photon fusion is peaked at low values [10].
Figure 2.4 Current ALP mass/coupling constant parameter space constraints in the Belle II detector [10]. "Merged" ALPs have daughter photons that are too close to distinguish in the detector.
Figure 2.5 An electromagnetic particle shower.
Figure 2.6 Side view of the Belle II crystal array, with a single crystal shown.
Figure 2.7 A top-down view of the Belle calorimeter, which was used again in the Belle II detector: barrel (yellow), forward endcap (blue) and backward endcap (green) [24].
Figure 2.8 Reconstruction of ECLClusters from energy deposit information [21]. Clusters are used in data analysis as photon/electron/positron candidates. The splitting in (d) is done by assigning different weights to each crystal, with the new showers only keeping a fraction of the shared crystals' energies.
Figure 2.9 Front and side views of the ECL crystal array. Note that (a) is only an approximate maximum separation distance between two merged photons: not all photons with a smaller separation distance will merge, since they may deposit the majority of their energy elsewhere; conversely, some photons with a larger separation distance may merge as a shared energy deposit may cause a local maximum, as in (b).
Figure 2.10 Rest frame of parent particle in independent simulation.
Figure 2.11 Two rest frame decays and their images in the CMS frame, where z is the direction of the parent particle momentum. Photons 1 and 2 will have similar lab frame momenta in (a), but in (b) photon 1 will carry much less lab frame momentum.
Figure 2.12 Distribution of photon opening angle in 100 000 two-photon decays, from independent simulation.
Figure 2.13 Minimum opening angle in two-photon decays. Each data point is the result of simulating 1000 decays.
Figure 2.14 From simulation of e+e− → ωγ, ω → γπ0, momentum distribution of the true π0 mesons that occurred in simulation for two different opening angles. From Alon Hershenhorn via email on March 10, 2020.
Figure 3.1 Belle II analysis: direct information from the real or simulated detector is saved in raw data files. Events are reconstructed in C++ and useful variables and particle lists are extracted through user-written Python steering scripts [25].
Figure 3.2 Possible events when generating a single photon. Clockwise from top left: the photon doesn't interact; the photon interacts and produces an e+e− pair, one of which interacts and produces a photon by Bremsstrahlung; the photon interacts and produces an e+e− pair and neither daughter interacts.
Figure 3.3 Possible events when generating a single π0. Clockwise from top left: a π0 decay mode that isn't π0 → γγ; a π0 daughter photon interacts and produces an e+e− pair; the π0 decays to two photons that don't pair-produce, but they're too far apart to merge; the π0 decays to two photons that don't pair-produce, and they're close enough to merge.
Figure 3.4 Distributions of two example variables for merged and single-photon clusters. Whether a cluster is merged or not can be determined with 100% accuracy by its Q value, but not by its energy.
Figure 3.5 Distributions and ROC curves of an arbitrary variable X and the ideal variable Q.
Figure 3.6 Architecture of the Convolutional Neural Network used in this project. Output is produced after FC layer 2.
Figure 4.1 Second moment distributions and ROC curves for approximately 25 000 merged photons and 60 000 single photons.
Figure 4.2 Zernike moment 40 distributions and ROC curves for approximately 25 000 merged photons and 60 000 single photons.
Figure 4.3 Lateral moment distributions and ROC curves for approximately 25 000 merged photons and 60 000 single photons.
Figure 4.4 Second, Zernike 40, and lateral moment ROC curves at (a) 5 GeV, (b) 3 GeV, (c) 7 GeV.
Figure 4.5 CNN output distributions and ROC curve for approximately 23 000 merges and 23 000 single photons generated with momentum 5 GeV.
Figure 4.6 CNN output distributions and ROC curve for approximately 36 000 merges and 36 000 single photons generated with momentum 7 GeV.
Figure 4.7 Zoomed-in ROC curve of CNN output for merges and single photons generated with momentum 5 GeV and 7 GeV.
Acknowledgments

I would like to thank my supervisor, Professor Chris Hearty, for the opportunity to work with the Belle II collaboration and for his consistent and always helpful guidance, and Dr. Ewan Hill for his guidance in writing this thesis and navigating KEKCC. I would like to enormously thank Abtin Narimani Charan for letting me modify his CNN program, for giving me resources to make my own CNN, and for spending hours with me to help me understand what I'm doing. Thanks also to Miho and Alon from Chris' Belle II group, who helped me understand essential background information, navigate DESY/KEK, and feel extremely welcome overall. Finally, thanks to Professor Rob Kiefl for his guidance in writing and presenting thesis proposals.

Chapter 1

Introduction

1.1 Motivation

Many known physical processes, outlined in Section 2.1, result in the production of two photons: for example, the π0 → γγ decay and the Higgs decay H → γγ. There are also processes that result in two photons in extensions to the Standard Model of Particle Physics, such as Axion-Like Particles (ALPs), which are described in detail in Section 2.2 and which in part motivate this project. ALPs are a possible mediator between dark matter and the Standard Model, and in the case that they interact with photons, the process of ALP-strahlung shown in Figure 1.1 can result in the production of three photons, two of which have the same parent.

In analyzing two-photon decay events, it is very important that the two photons, referred to as "daughter photons", are well-separated in the detector being used to measure them. At Belle II, if the separation distance between the photons is very small, the photons' signatures in the detector will overlap and Belle II's raw data analysis software will be unable to distinguish them from a single photon. The way in which this error, called "merging", occurs at Belle II will be described in Section 2.4.
The merging of daughter photons makes the task of identifying an ALP more difficult: Section 2.2 explains that a significant portion of ALP parameter space remains inaccessible because of the merging of the ALP's daughter photons.

But merging is not just a problem for ALPs: Figure 1.2 compares the true daughter photon opening angle distribution for simulated π0 → γγ decays at 4 GeV with the opening angles reconstructed by the detector's software. Of the simulated pions, 20 000 (22%) are not reconstructed by the software, and for many of those that remain, the opening angles are reconstructed incorrectly. This shows that a large portion of the two-photon decays that occur are unrecognizable to the detector as two-photon decays. This project aims to build on the Belle II detector's current sensitivity to two-photon decays such that this merging can be identified by the user.

[Figure 1.1 panels: (a) ALP-strahlung Feynman diagram [10]. (b) ALP-strahlung in the Belle II detector [12].]
Figure 1.1: ALP-strahlung: an ALP a, created by a virtual photon γ∗, decays into two photons γγ. By momentum conservation, the single photon travels in the opposite direction of the two ALP photons.

Figure 1.2: In simulation, opening angle distributions of true π0 → γγ decays and detector-reconstructed decays. Only 78% of the simulated pions were reconstructed.

1.2 The Belle II Experiment

The SuperKEKB collider in Tsukuba, Japan collides 7 GeV electrons, e−, with 4 GeV positrons, e+, at a centre-of-mass energy of 10.58 GeV [17]. This collision happens inside the Belle II detector, which surrounds the point of collision, called the interaction point.
The accelerator and the original Belle detector were motivated by precision measurement of CP violation through the production of B mesons, but the Belle II detector is set up for a wide range of measurements.

Figure 1.3: Top-down view of SuperKEKB and the Belle II detector [13].

Low-centre-of-mass-energy e+e− colliders are useful sites to search for new physics because of their lower background compared to hadron colliders: hadrons are composite particles and all evidence indicates that electrons are not, so fewer interactions are possible in one beam crossing, leading to a cleaner environment. Therefore, hadron colliders discover new physics by reaching unprecedented energy scales while lepton colliders do so by measuring known energy scales more precisely. The SuperKEKB collider and Belle II detector are shown in Figure 1.3. The Belle II collaboration spans dozens of countries and depends on the work of hundreds of people.

SuperKEKB is particularly well-suited to new physics searches: its target collision rate is 40 times higher than that of the KEKB collider, which had the highest instantaneous collision rate of its time. SuperKEKB's higher collision rate is due to a higher current of electrons, and to superconducting magnets that focus the e+ and e− beams near the interaction point [13].

This higher collision rate results in a higher number of events: an event is a single e+e− collision and the particles that result from it. The higher number of events gives more data, reducing statistical uncertainty and making measurements more precise.
The fact that the collider produces more of everything also means that less-likely processes, like those sought out in beyond-the-Standard-Model searches, will occur in greater numbers.

Figure 1.4: 3D cross-section of the Belle II detector [4].

The many components of the Belle II detector include [17]: a vertex detector (VXD) around the interaction point (IP) that resolves the collision point of the e+e− beams during charged particle production; a central drift chamber (CDC) that resolves the trajectories of charged particles through a magnetic field that points along the path of the e+e− beams (the curved trajectory in the magnetic field helps resolve charged particle momenta); a two-part particle identification system (PID) around the CDC; a K_L^0 and muon detector (KLM); and an electromagnetic calorimeter (ECL) that resolves the energy and position of photons, electrons, or positrons. Figure 1.4 shows the locations of these components in the detector, which is cylindrical, consisting of a barrel and two endcaps. As the goal of this project is the improvement of two-photon resolution, its focus will be on the ECL.

1.3 This Project

This project aims to develop a tool, available to Belle II users, that can classify an ECL signal as two merged photons or a single photon, with an associated percent certainty. Past work on ECL reconstruction [15, 20], that is, on the code that reconstructs what happened using the raw ECL data, supports the use of machine learning for this goal: both references successfully provided a tool which can verify the origin of a reconstructed cluster and is robust against background.

Upon completion, this project has the potential to make available new regions in ALP parameter space (see Figure 2.4), and Section 2.5 shows that it will greatly increase the amount of available two-photon decay data for low-mass and high-momentum parent particles (like the decay in Figure 1.2). This would make analysis involving the π0 meson, ALP or any other particle which decays into two photons more accurate. Additionally, fully recovering the true opening angle distribution shown in Figure 1.2 would allow the ALP mass to be determined for ALPs with similar momenta: even if energy deposits are indistinguishable, Section 2.5 shows that the opening angle distribution has a very different minimum value for different masses.

Chapter 2

Theory and Experimental Setup

2.1 The Standard Model of Particle Physics

The particles of the Standard Model of Particle Physics and their properties are outlined in Figure 2.1a. The Standard Model comprises all fundamental particles currently known and verified: six leptons, six quarks, four gauge bosons, and the Higgs boson. For all quarks and leptons, there exist antiparticles, which have the opposite electric charge (see footnote 1). Gauge bosons mediate interactions between Standard Model particles. Mesons, like the π+ shown in Figure 2.1b, are quark-antiquark pairs and can have spin 1 (vectors) or spin 0 (scalars).

Many particles, such as the Higgs boson or the π0 and η mesons, can decay into two photons. In the case of the π0, this is its primary decay mode, with 98.8% of π0 decays resulting in two photons [3]. As shown in Section 2.5, depending on the mass, momentum and spin of the parent particle, these two photons may have a very small opening angle. This will make them hard to distinguish from a single photon in many particle detectors.

Footnote 1: This is not a definition; electric charge is just one of many properties that are different for antiparticles.

(a) The particles of the Standard Model of Particle Physics and some of their properties [29].
(b) A baryon (three quarks) and a meson (two quarks).
The baryon shown here is a proton and the meson is a positive pion [27, 28].
Figure 2.1: The particles of the Standard Model and some of their properties.

2.2 Beyond the Standard Model: Axion-Like Particles

Some extensions to the Standard Model make use of light spin 0 particles called axion-like particles (ALPs), which couple to gauge bosons. Many problems in particle physics can be resolved by introducing ALPs in the MeV to GeV range. For example, the muon's experimentally determined anomalous magnetic moment (see footnote 2) disagrees with theoretical prediction by up to 4.0σ but can be explained by the effects of an ALP [19], and in the decay of certain excited nuclei, a larger-than-expected invariant mass of an internal e+e− pair is consistent with the production of a light intermediate boson like the ALP [11]. Central to the motivation of this project, ALPs acting as mediators between dark matter and the Standard Model have the potential to connect Standard Model particles to dark matter particles such that the constraints on their observation are consistent with the estimated amount of dark matter in the universe [9].

For the purposes of this discussion, focus will be on the interaction of an ALP a with Standard Model gauge bosons, particularly the case of coupling to photons. Interactions of ALPs with Standard Model fermions or gluons will not be included as these interactions are already tightly constrained by other searches, whereas probing the coupling of ALPs to photons requires double-photon searches like the ones facilitated by this project [10]. The coupling of ALPs to the weak-force gauge bosons can also be investigated at Belle II, through methods outlined at the end of this section.

Footnote 2: Magnetic moment can be defined in terms of a factor of g, with expected g = 2. The anomalous magnetic moment is a ≡ (g − 2)/2.

The most important coupling constant for this work is g_aγγ, the strength of the ALP-two-photon interaction.
However, since ALPs are considered here as mediators between dark matter and the Standard Model, coupling of an ALP to one or more dark matter particles χ is always possible [10]. When photon coupling dominates, the aγγ interaction determines the lifetime of the ALP [10]:

\tau_a = \frac{64\pi}{g_{a\gamma\gamma}^{2} m_a^{3}}    (2.1)

In the photon-coupling regime, ALPs can be created through ALP-strahlung, shown in Figure 2.2a, and photon fusion, shown in Figure 2.2b.

[Figure 2.2 panels: (a) ALP-strahlung. (b) Photon fusion.]
Figure 2.2: Two a → γγ decay modes [10].

Photon fusion, responsible for the majority of ALPs produced at Belle II (see Figure 2.3a), has a production rate peaked at small ALP momenta (Figure 2.3b) [10]; as described in Section 2.5, this low momentum will result in two photons with a large opening angle. The image of photon fusion in the Belle II detector would be two photons back-to-back in their centre of momentum frame, with an invariant mass equal to the ALP mass and with no longitudinal momentum (see footnote 3) [10]. This would be fully resolved in the detector if not for the fact that for low-mass ALPs, the photon energy is so low that the photons mimic the beam-induced background and are not considered a significant signal by the detector [10].

Footnote 3: The scattered electron and positron typically exit the detector, as their momenta transverse to the direction of collision are small.

[Figure 2.3 panels: (a) ALP production rate at all angles as a function of ALP mass; curves for photon fusion and ALP-strahlung; axes m_a [GeV] and σ [pb]; √s = 10.58 GeV, g_aγγ = 10^-4 GeV^-1. (b) ALP production rate as a function of longitudinal momentum (momentum along the z axis of the cylindrical detector) of the produced ALP; √s = 10.58 GeV, m_a = 1 GeV.]
Figure 2.3: When comparing photon fusion and ALP-strahlung at Belle II, (a) more ALPs are produced at all angles by photon fusion than ALP-strahlung, and (b) the longitudinal momentum of ALPs produced by photon fusion is peaked at low values [10].

The large opening angles of photon fusion mean that the process of concern to this project is ALP-strahlung, as it is more likely to result in two merged photons. When the ALP mass and the coupling g_aγγ are varied independently, the g_aγγ-mass parameter space divides into five cases, as shown in Figure 2.4:

1. Referring to Equation 2.1, an ALP with mass on the order of GeV will decay instantly inside the detector, and the opening angle between photons will be large enough to resolve. These photons are distinguishable in the detector, so this case is marked as "resolved" [10].

2. If an ALP has a lighter mass, m_a < 150 MeV, but the coupling constant g_aγγ is large, the decay will again be instant but the opening angle will be much smaller due to the decreased ALP mass. This leads to the two photons' "merged" reconstruction as a single photon [10].

3. If an ALP has an even lighter mass than in Case 2, with the same coupling constant, then it will still decay inside the detector, though it may not decay instantly and it will be "displaced". The ECL reconstruction software assumes that the origin of a two-photon decay is the e+e− collision point; if it is not, then momentum and separation angle are reconstructed incorrectly [10]. All that can be seen accurately is the single recoil photon, though it "may be possible to search for displaced clusters in the KLM" [10].

4. For g_aγγ ≪ 1 TeV^-1 and for m_a ≪ 1 GeV, the ALP lifetime (Equation 2.1) is large enough that the ALP decays outside the detector [10]. The image of this is a single recoil photon and an "invisible" ALP.

5. If the ALP decays into two dark matter particles, a → χχ, this would look like a single-photon decay and an invisible ALP [10].

[Figure 2.4: axes m_a [GeV] and g_aγγ [GeV^-1]; regions labelled Displaced, Invisible, Merged and Resolved; contours at 0.1 m and 3 m (Belle II lab).]
Figure 2.4: Current ALP mass/coupling constant parameter space constraints in the Belle II detector [10]. "Merged" ALPs have daughter photons that are too close to distinguish in the detector.

Other ALP/Standard Model couplings are possible, such as g_aγZ and g_aγW. An a → γZ decay is forbidden for ALPs with mass below 10 GeV, and ALPs with mass over 10 GeV are not expected in an e+e− collider with E_cm = 10.58 GeV. A Z → γa decay is permitted [10], but the SuperKEKB collider's energy is too low to produce a Z boson. Similarly, a → γW is not possible at SuperKEKB, but an ALP may be produced from a W boson through the radiative "penguin" B decay shown in ref. [14].

This project aims to make available the merged region of parameter space shown in Figure 2.4 by allowing the user to verify, with an associated percent certainty, whether a reconstructed photon was indeed a single photon or really a pair of merged photons.

2.3 GEANT and Monte-Carlo Simulation

The most sophisticated simulations of particle interactions and decays make use of Monte-Carlo (MC) simulations. The Belle II experiment uses GEANT4, which is a toolkit designed to simulate the passage of particles through matter and which has found applications not only in particle physics but also in medical physics, nuclear physics and astrophysics [1].

In MC simulations like GEANT, particles are generated with a given momentum and Particle Data Group (PDG) code (see footnote 4), and are transported through a given material in discrete steps. Because interaction and decay processes are probabilistic, the entire life story of a generated particle can be simulated using the known probabilities of these processes and random number selection.

GEANT simulations represent reality to a high degree of accuracy, and are trusted to make critical calculations in space science, medical physics, radiation protection and nuclear medicine [2].
There is an ongoing effort to improve theiraccuracy, as experimental data provides feedback and GEANT4’s electromagneticand hadronic physics capabilities are extended [2].4The Particle Data group assigns a unique integer to every known particle; see ref. [22] for a listof these codes and the rules by which they are defined.122.4 The Belle II Electromagnetic CalorimeterAs the goal of this project is the improvement of two-photon resolution, its focuswill be on the ECL. However, before describing the ECL, principles of calorimetrymust be outlined.Figure 2.5: An electromagnetic particle shower.When a high-energy photon or charged particle collides with dense matter,it initiates an electromagnetic particle shower, which is how photon, electron orpositron energy is deposited in the calorimeter. Figure 2.5 shows an example of aparticle shower: a photon initiates the production of an e+e− pair, which generateadditional photons by bremsstrahlung5, which engage in more pair production, andso on [16]. If the material is large enough to contain the whole shower, all originalparticle energy will be deposited in it. The amount of material traversed by theshower and the shower radius are respectively dictated by the radiation length X0and the Moliere radius Rm of the material [21]. If the material is a scintillator,like the crystals in the Belle II calorimeter, the energy deposition from the showercauses light emission that can be recorded by photodiodes [16].5Bremsstrahlung is the emission of a photon caused by the deceleration of a charged particle bydeflection with another charged particle.13X0, Rm, scintillation wavelength and cost must all be considered when choos-ing the material of a calorimeter. Ref. [16] gives properties of various scintillators,some of which are shown in Table 2.1. 
CsI(Tl) was used in the original Belle de-tector, and re-used in the Belle II detector, as it had the highest photodiode relativelight output (“RLO”) while retaining a significantly smaller Rm and X0 than itscompetitor, NaI(Tl) [16]. High photodiode light output eliminates the need for anexpensive photomultiplier, while small Rm suppresses merging of particle showersand small X0 reduces production costs by allowing for shorter crystals [16].Table 2.1: Properties of some candidate ECL scintillator materials [16].NaI(Tl) BGO CsI(Tl)Density (g/cm3) 3.67 7.13 4.53X0 (cm) 2.59 1.12 1.85Rm (cm) 4.5 2.4 3.8RLO (photomultiplier) 1.00 0.15 0.40RLO (photodiode) 1.00 0.21 1.37The ECL consists of an array of 8736 CsI(Tl) crystals, arranged non-projectivelyaround the interaction point as in Figure 2.6 such that photons originating at the IPare unlikely to pass between them without detection [21]. The ECL is split intothree parts that are colour-coded in Figure 2.7: barrel (yellow), forward endcap(blue) and backward endcap (green).For each of the 8736 crystals, scintillation light is captured by photodiodes,which produce an electrical current. Crystal-specific calibration constants changecurrent amplitude and time values into energy deposit and time values and storethese in a crystal-specific “ECLCalDigit” [21]. These ECLCalDigit energies arethen used to make “ECLClusters” (or just “clusters”): groups of crystal energieswith a single local maximum. This process is shown in Figure 2.8 and is calledECL reconstruction. Clusters are used in data analysis as particle candidates [21].This project is focused on the case of a merge, in which two photons hit theECL close enough that only one cluster is made, instead of two, as in Figure 2.9b.This is expected when the two photons land within one or two crystals of eachother: in that case, even though they both have high energies, they will form asingle local maximum, leading to a single cluster. 
The goal of this project is to14(a) Non-projective crystal array geom-etry, viewed in the direction of the e+e−beams [21].(b) Approximate dimensions of a single CsI(Tl) crystal [16].Figure 2.6: Side view of the Belle II crystal array, with a single crystal shown.Figure 2.7: A top-down view of the Belle calorimeter, which was used againin the Belle II detector: barrel (yellow), forward endcap (blue) and back-ward endcap (green) [24].15(a) Crystal array visualized as a gridwith intensity values corresponding toECLCalDigit energies.(b) ECLCalDigits with energies≥ 10 MeV are marked.(c) Connected regions are made fromneighbours with energies above a threshold.(d) Each connected region is split intoseparate showers with one local maxi-mum. These separate showers become“ECLClusters”.Figure 2.8: Reconstruction of ECLClusters from energy deposit information[21]. Clusters are used in data analysis as photon/electron/positron can-didates. The splitting in (d) is done by assigning different weights toeach crystal, with the new showers only keeping a fraction of the sharedcrystals’ energies.16identify merging after all this reconstruction has happened.An approximate upper limit on the separation distance between merged pho-tons is shown in Figure 2.9a. Given the crystal dimensions in Figure 2.6, thiscorresponds to ∼17 cm, though this is a rough estimate as crystals are not exactlysquare. Referring to Figure 2.7, ECL crystals are at different distances from theinteraction point in different parts of the detector, so this separation distance willbe more important to consider in analysis than opening angle.(a) The farthest possible distanceby which two photons can be sepa-rated while still occupying neighbour-ing crystals.(b) Simplified model of a merge, wherelight blue represent the total crystal energies af-ter summing the overlap of the green and blueenergy deposits.Figure 2.9: Front and side views of the ECL crystal array. 
Note that (a) is only an approximate maximum separation distance between two merged photons: not all photons with a smaller separation distance will merge, since they may deposit the majority of their energy elsewhere; conversely, some photons with a larger separation distance may merge as a shared energy deposit may cause a local maximum, as in (b).

2.5 Kinematics of Decays to Two Photons

An independent Python simulation was developed that predicts the distribution of measured opening angles in a two-photon decay given the mass and lab-frame momentum of the parent particle.

The reference frames relevant to this problem, and to any particle generated in the Belle II detector, are: the lab frame, in which the e+ and e− collide with asymmetrical momentum and in which all observables are measured; the centre-of-mass (CMS) frame, in which the e+ and e− collide with equal momentum; and the rest frame of the parent particle, in which its two daughter photons have equal and opposite momenta.

(a) Rest frame of parent particle, with z axis in direction of parent particle CMS-frame momentum.
(b) cos(θ1) is equally likely to take on any value between -1 and 1, giving θ1 the distribution shown in red.
Figure 2.10: Rest frame of parent particle in independent simulation.

Figure 2.10 shows the rest frame of the parent particle: θ1 is the angle from the z axis of the first photon γ1, where the z axis is defined as the CMS frame direction of motion of the parent particle. In the rest frame, the daughter opening angle is fixed at 180° by momentum conservation. Since ALPs and π0 mesons are spin 0 particles, the simulation was designed such that the parent particle has spin 0.
This means that for each decay simulated, cos(θ1) is chosen from a uniform distribution between -1 and 1; this leads to the θ1 distribution in Figure 2.10b, which is peaked at π/2.

Figure 2.11 shows the CMS frame opening angles that can result from two particular choices for θ1: θ1 = 90° leads to the minimum possible CMS frame opening angle and equal photon energies, while θ1 close to 180° leads to the maximum possible CMS frame opening angle and one photon with much more of the parent’s energy than the other. The lab frame opening angles are similar to those of the CMS frame, but slightly Lorentz-boosted in the direction of the e+e− beams.

(a) A rest frame decay that results in a small opening angle in the CMS frame.
(b) A rest frame decay that results in a large opening angle in the CMS frame.
Figure 2.11: Two rest frame decays and their images in the CMS frame, where z is the direction of the parent particle momentum. Photons 1 and 2 will have similar lab frame momenta in (a), but in (b) photon 1 will carry much less lab frame momentum.

After the rest frame decay is simulated, the 4-momenta of both photons are Lorentz-transformed to the lab frame and the opening angle between them is calculated using vector identities. The distribution of lab-frame opening angles and the minimum opening angle are returned. The distribution of lab frame opening angles is shown for parent particles of fixed mass 135 MeV in Figure 2.12a and fixed momentum 5 GeV in Figure 2.12b. Given a centre-of-mass energy of 10.58 GeV in every collision at SuperKEKB, 4-5 GeV is roughly the range of momentum expected for a particle coming directly from the interaction point. Note that the p = 5 GeV, m = 0.135 GeV distribution in Figure 2.12 reproduces the true opening angle distribution in Figure 1.2.
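The procedure above can be sketched in a few lines of NumPy. This is not the thesis code: the function name and arguments are hypothetical, and as a simplifying assumption the photons are boosted directly along the parent's lab-frame momentum rather than through the CMS frame, which is enough to obtain the opening angle.

```python
import numpy as np


def opening_angles_deg(mass, momentum, n_decays=1000, seed=0):
    """Lab-frame opening angles (degrees) for a spin-0 parent -> two photons.

    mass and momentum are in GeV (natural units, c = 1). The z axis is
    taken along the parent's lab-frame momentum (an assumption of this
    sketch).
    """
    rng = np.random.default_rng(seed)
    energy = np.hypot(mass, momentum)
    gamma, beta = energy / mass, momentum / energy

    # Spin 0: cos(theta_1) is uniform in the parent rest frame.
    cos_t = rng.uniform(-1.0, 1.0, n_decays)
    sin_t = np.sqrt(1.0 - cos_t ** 2)
    k = mass / 2.0  # energy of each photon in the rest frame

    # Rest-frame momenta: photon 1 = (k sin, 0, k cos), photon 2 = opposite.
    px1, pz1 = k * sin_t, k * cos_t
    px2, pz2 = -px1, -pz1

    # Lorentz boost along z; transverse components are unchanged.
    e1, pz1_lab = gamma * (k + beta * pz1), gamma * (pz1 + beta * k)
    e2, pz2_lab = gamma * (k + beta * pz2), gamma * (pz2 + beta * k)

    # Photons are massless, so |p| = E and cos(angle) = p1.p2 / (E1 E2).
    cos_open = (px1 * px2 + pz1_lab * pz2_lab) / (e1 * e2)
    return np.degrees(np.arccos(np.clip(cos_open, -1.0, 1.0)))


angles = opening_angles_deg(mass=0.135, momentum=5.0, n_decays=100_000)
print(f"minimum opening angle: {angles.min():.2f} deg")
```

A histogram of `angles` (for example with `numpy.histogram`) gives the kind of distribution described above, with most entries just above the minimum.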
Note also that the distributions are peaked at their lowest values, so if the lowest opening angle results in a merge, then a large fraction of information will be lost.

Figure 2.13 shows the dependence of minimum opening angle on parent mass or momentum for the distributions in Figure 2.12: the minimum opening angle of a two-photon decay increases with the parent particle mass and decreases with its momentum, and when the momentum is much larger than the mass it is approximately proportional to the mass and inversely proportional to the momentum. Keep in mind, however, that particle decay is a probabilistic process: even though low-mass and high-momentum particles are more likely to decay with small opening angles, not all of them will. Hence, the distributions in Figure 2.12 take on many possible values.

The benefit of this simplified simulation is that it can be run much faster than MC simulation: each point in Figure 2.13 is calculated from the opening angle distribution of 1000 parent particle decays; 10 000 parent particle decays take 1-2 hours to simulate in GEANT4, whereas the plots in Figure 2.13 were generated in less than 5 minutes.

Since opening angle is peaked at its minimum value, and minimum opening angle is smallest for low parent mass or high parent momentum, improving merge detection can greatly increase the amount of data available in low-parent-mass or high-parent-momentum two-photon decays. An example of a high-momentum and low-mass parent particle is the π0 in the ω → γπ0 decay in Figure 2.14, where the ω was produced by e+e− → ωγ. Figure 2.14a shows that for daughter opening angles smaller than 5°, there are no parent π0 mesons with momentum less than 3 GeV; this is because the orange (p = 3 GeV) distribution in Figure 2.12a has no entries below 5°, so it is kinematically forbidden for π0 mesons with momentum of 3 GeV or less to have any opening angles in this region.
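The dependence of the minimum opening angle on mass and momentum can be checked with a short derivation. The sketch below assumes a parent of mass m, lab-frame momentum p and energy E decaying to two massless photons with lab-frame energies E_1 and E_2 and opening angle θ, in natural units (c = 1); these symbols are introduced here for illustration.

```latex
\begin{align}
  m^2 &= (p_1 + p_2)^2 = 2 E_1 E_2 \,(1 - \cos\theta),
      \qquad E_1 + E_2 = E = \sqrt{p^2 + m^2} \\
  % E_1 E_2 is largest, and theta smallest, when E_1 = E_2 = E/2:
  1 - \cos\theta_{\min} &= \frac{m^2}{2\,(E/2)^2} = \frac{2 m^2}{E^2}
      \quad\Longrightarrow\quad
      \sin\frac{\theta_{\min}}{2} = \frac{m}{E} \\
  \theta_{\min} &\approx \frac{2m}{p} \qquad (p \gg m)
\end{align}
```

For m = 0.135 GeV this gives θ_min ≈ 5.15° at p = 3 GeV and θ_min ≈ 3.09° at p = 5 GeV, consistent with the absence of opening angles below 5° at 3 GeV noted above.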
The momentum distribution in Figure 2.14b also shows that, in realistic events like this one (e+e− → π0, for example, is not realistic to expect), there are far fewer π0 mesons with momenta > 6 GeV than with ≤ 6 GeV⁶, justifying the optimizations done at ∼5 GeV in Chapter 3.

⁶ Since the pions in the list are the true pions generated by GEANT and were not formed by reconstruction of ECL data, this lack of data in the high-momentum range is definitely not due to merging.

(a) Opening angle distribution at fixed parent mass. [Plot: count vs opening angle (deg), 100 000 decays, fixed parent m = 0.135 GeV; curves for p = 2, 3, 4 and 5 GeV.]
(b) Opening angle distribution at fixed parent momentum. [Plot: count vs opening angle (deg), 100 000 decays, fixed parent p = 5 GeV; curves for m = 0.135, 0.400, 0.600 and 0.800 GeV.]
Figure 2.12: Distribution of photon opening angle in 100 000 two-photon decays, from independent simulation.

(a) Dependence of minimum opening angle on parent particle mass. [Plot: minimum opening angle (deg) vs parent mass (GeV), 1000 decays; curves for p = 1, 2, 3, 4 and 5 GeV.]
(b) Dependence of minimum opening angle on parent particle momentum. [Plot: minimum opening angle (deg) vs parent momentum (GeV), 1000 decays; curves for m = 0.1, 0.3, 0.5, 0.7 and 1 GeV.]
Figure 2.13: Minimum opening angle in two-photon decays.
Each data point is the result of simulating 1000 decays.

(a) Momentum distribution of the true π0 mesons that occurred in simulation, whose opening angle was less than 5°. [Histogram: entries per 0.1 GeV vs truth π0 momentum (GeV); 25382 entries, mean 5.186, std dev 1.235.]
(b) Momentum distribution of the true π0 mesons that occurred in simulation, whose opening angle was less than 10°. [Histogram: entries per 0.1 GeV vs truth π0 momentum (GeV); 63211 entries, mean 4.36, std dev 1.532.]
Figure 2.14: From simulation of e+e− → ωγ, ω → γπ0, momentum distribution of the true π0 mesons that occurred in simulation for two different opening angles. From Alon Hershenhorn via email on March 10, 2020.

Chapter 3
Methods

Section 1.3 described the goal of this project as developing “a tool, available to Belle II users, that can classify an ECL signal as two merged photons or a single photon to a percent certainty”. This tool will come in the form of a variable that characterizes the shape, size, energy or some other quantity of the ECLCluster such that it has very different values for a merged or single-photon cluster. This chapter describes how the best variable was found.

Recall that an “event” is a single e+e− collision and all the particles that result from it¹. A single “run”, in which the e+ and e− beams are run for a given amount of time, can contain many events, and when the detector is simulated, the number of events that occur can be specified. The following sections will describe how events were generated using Monte Carlo (MC) simulations and analyzed using Belle II analysis software, how a sample of exclusively-merged or exclusively-single photons was procured, which variables were tested and how their performance was evaluated, and how a convolutional neural network (CNN) was used to improve upon the performance of these variables.
Optimization was done for the barrel only.

¹ Though there is only one main e+e− collision per event, others can happen around the same time and result in background particles.

3.1 MC Generation and Data Analysis at Belle II

In Belle II analysis, all user interaction is done through Python scripts called “steering scripts” after raw data has been processed in C++². This process is outlined in Figure 3.1: after reconstruction of raw data from the real or simulated detector, steering scripts change this into a readable file with variables like cluster shape, energy and position, which describe the ECLClusters recorded in the event. Again, the goal of this project is to find one or many of these variables that can distinguish between a merged cluster and a single-photon cluster.

² The ECL reconstruction in Figure 2.8 is an example of this C++ processing.

Figure 3.1: Belle II analysis: direct information from the real or simulated detector is saved in raw data files. Events are reconstructed in C++ and useful variables and particle lists are extracted through user-written Python steering scripts [25].

After reconstruction, data files are analyzed in steering scripts using “cuts”: criteria that decide which particles, or which events, are kept. An example of a cut on particles is: including particles that have energy above 3 GeV, and discarding those which do not. An example of a cut on events is: including events where all particles have energy below 3 GeV, and discarding the events which do not. Cuts are essential to ensure that the only particles in a sample are the desired particles.

Steering scripts can also be written to generate events and simulate detector response using GEANT4, as in the bottom-left of Figure 3.1. When events have been generated by the user, the user can access “Monte-Carlo” (MC) variables that contain the true variable values that occurred during simulation, in addition to the variable values reconstructed by the detector.
For example: a photon may have been generated at 4 GeV but, due to energy leakage³, only 3.8 GeV was reconstructed from the raw data. The MC energy of this particle will be 4 GeV, while the reconstructed energy will be 3.8 GeV. In this study, the cuts on generated events were made based on both reconstructed and MC variables.

In this project, events were generated with a steering script (attached in Appendix B for reference): for a single event, it generates N particles of a given type specified by their PDG code: 22 for photon, 111 for π0, and so on.

In this analysis, a single particle was generated per event, with its momentum and angular position in the detector chosen from a given probabilistic distribution: fixed momentum, and all angles equally likely (uniform distribution). Momentum was fixed-value because shower shape was expected to change with momentum, and because separate optimizations were to be done at the momenta where a merge is more or less likely. Angles were chosen from a uniform distribution to ensure the optimization worked just as well anywhere in the detector. Events were generated with background⁴, which consists of an overlay on the ECL array and not additional particles generated by GEANT. This overlay is made by recording the ECL signals of random e+e− collisions, then superimposing these signals on the simulated event; this results not only in new ECLClusters, but also new low-energy ECLCalDigits that can distort the true photon signals.

³ Energy may be lost in the gaps between crystals, or the crystals may not be long enough to contain enough of the electromagnetic shower.

⁴ Background at Belle II is mostly low-energy photons and electrons/positrons resulting from interactions of the e+e− beams with the pipes they travel through.

3.2 Identifying Single-Photon Events

To find the right variable that distinguishes between two merged photons and a single photon, a sample of exclusively merged photons and a sample of exclusively single photons are needed.
To get a sample of exclusively single-photon clusters, the steering script from Section 3.1 was used to generate a single photon per event with a fixed momentum; in the following examples momentum will be left arbitrary as “X GeV”.

However, even though only a single photon was generated per event, not all events will result in a single photon hitting the ECL. Figure 3.2 shows some possible events that can result from generating a single photon. In addition to these events, there will also be low-energy ECLClusters from the background overlay.

Figure 3.2: Possible events when generating a single photon. Clockwise from top left: the photon doesn’t interact; the photon interacts and produces an e+e− pair, one of which interacts and produces a photon by Bremsstrahlung; the photon interacts and produces an e+e− pair and neither daughter interacts.

To ensure that a pure sample of single-photon clusters was procured, the following event cuts and list cuts were made. After the cuts in this section, roughly 70% of the barrel photons remain.

Event cuts; if an event fails any of these cuts, it is discarded:
1. Require that there is exactly one MC photon, and that it is in the barrel.
2. Require that this photon has no MC mother and no MC daughters (prevents the e+e− pair production events in Figure 3.2).

Cuts made on the list of reconstructed photons in the event; the resulting list is the one that is kept:
1. The photon has an energy close to X GeV (to cut out background photons).
2. The crystals used to reconstruct the photon contain a sufficient amount of the actual energy deposited by the one MC photon in the event.

3.3 Identifying Merged Events

To get a sample of merged clusters, one π0 meson was generated per event with the same momentum as the single-photon case; the π0 was chosen because its primary decay mode is into two photons and it has low mass, meaning a higher proportion of its two-photon decays will merge (Section 2.5).
The single photon and the π0 have the same momentum to ensure that the merged cluster and the single-photon cluster have the same energy. If their energies are different, any difference in variables tested may just be reflecting a difference in energy rather than distinguishing merged/single-photon clusters.

As in the single photon case, a lot can happen when a single π0 is generated per event. Figure 3.3 shows some possible events that can result from generating a single π0. Note that, even after eliminating all events in which pair production or other decay modes occur, the sample will not necessarily be a sample of pure merges. As Figure 2.12 showed, only a subset of all X GeV π0 → γγ decays will have an opening angle small enough to merge, and thus only a subset of the generated events will contain two merged photons.

Figure 3.3: Possible events when generating a single π0. Clockwise from top left: a π0 decay mode that isn’t π0 → γγ; a π0 daughter photon interacts and produces an e+e− pair; the π0 decays to two photons that don’t pair-produce, but they’re too far apart to merge; the π0 decays to two photons that don’t pair-produce, and they’re close enough to merge.

To ensure that a pure sample of merged clusters was procured, the following event cuts and list cuts were made. After the cuts in this section, roughly 30% of the barrel π0 mesons remain.

Event cuts; if an event fails any of these cuts, it is discarded:
1. Require that the MC π0 has exactly two daughters, and that these daughters are photons (this eliminates other π0 decay modes), and that they are in the barrel.
2. Require that the daughter photons have no daughters (this prevents e+e− pair production and bremsstrahlung before hitting the ECL).
3. Require that each event has exactly two MC photons, which are daughters of the π0, and which have no daughters.
4. Require that there is only one reconstructed cluster that has roughly the energy of the π0, per event.
In most cases, this cluster indicates a merge.
5. Require that the MC distance between the two daughter photons is less than 17 cm⁵.
6. Require that the second highest energy cluster that is within 20 cm of the highest-energy cluster has energy on the order of background.

⁵ For these cuts, distances between reconstructed clusters were calculated as: (MC or reconstructed angle between the two clusters) × (distance of one of the clusters from interaction point).

The last two event cuts can be explained by considering the asymmetrical two-photon decay in Figure 2.11b: by energy conservation, the first photon has energy E1 = Eπ − E2, where E1 and E2 are the energies of γ1 and γ2 in Figure 2.11b. From Section 2.5, the higher the lab-frame angle of separation between the two daughter photons, the more asymmetrical their energies. This means that event cut 4 will not cut out the case where E2 ≃ Eπ, leaving unmerged daughter photons. Event cut 5 discards the case where E1 in Figure 2.11b is on the order of background; on this scale, the E1 cluster will be far from the E2 cluster. Now that the two clusters are within merging distance and E1 is greater than background, event cut 6 discards the events where they do not merge⁶.

Cuts made on the list of reconstructed photons in the event; the resulting list is the one that is kept:
1. The “photon” is the highest-energy reconstructed cluster (to cut out background).

3.4 Quantifying Performance of a Variable

Once merged/single photon events have been isolated, whether or not they can be distinguished by a given variable depends on how distinct their distributions in this variable are.
An example of a good variable is shown in Figure 3.4a: the distributions for the merged and single-photon cases are well separated so, given a value of Q, it can be determined right away whether the cluster is merged or single. An example of a (very) bad choice of variable is energy, shown in Figure 3.4b: the γ and π0 were generated with the same momentum, so the resulting clusters will have very similar energy distributions.

⁶ The 20 cm distance was determined by monitoring the peak at 1 of the second moment distribution: second moment is normalized to 1 for single photons, so a peak at 1 indicates surviving single photons.

(a) Distribution of some ideal variable Q for merged and single-photon clusters.
(b) Energy distribution for merged and single-photon clusters generated at 5 GeV.
Figure 3.4: Distributions of two example variables for merged and single-photon clusters. Whether a cluster is merged or not can be determined with 100% accuracy by its Q value, but not by its energy.

In a real dataset, the merged/single-photon distributions will not be coloured separately and will overlap, like the variable X in Figure 3.5b. If the distributions are distinct enough, a single value can be chosen, on one side of which are all the single-photon cases, and on the other side are the merged cases. A good X value for this purpose might be X = 7. If all X values to the right of X = 7 are defined as merged, Figure 3.5a shows that this correctly identifies 90% of the merges as merges and correctly rejects 80% of the single photons as not merges. To see how well a variable distinguishes the two cases overall, the cutoff point A is scanned across all values in the distribution, plotting the fraction of successful single-photon rejections vs the fraction of successful merge identifications at each value of X using MC information.
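The scan described above can be sketched as follows. The function name and the number of cut values are illustrative choices, and the sketch assumes, as in the X example, that values above the cut are labelled as merged.

```python
import numpy as np


def roc_points(merged_values, single_values, n_cuts=101):
    """Scan a cut across a variable and return the points of an ROC curve.

    Values above the cut are labelled "merged" (an assumption matching
    the X example). Returns the cut values, the fraction of merges
    correctly identified, and the fraction of single photons correctly
    rejected at each cut.
    """
    merged = np.asarray(merged_values, dtype=float)
    single = np.asarray(single_values, dtype=float)
    lowest = min(merged.min(), single.min())
    highest = max(merged.max(), single.max())
    cuts = np.linspace(lowest, highest, n_cuts)

    merge_identified = np.array([(merged > cut).mean() for cut in cuts])
    single_rejected = np.array([(single <= cut).mean() for cut in cuts])
    return cuts, merge_identified, single_rejected


# Toy samples standing in for MC-labelled clusters (not thesis data).
rng = np.random.default_rng(0)
toy_merged = rng.normal(8.0, 1.5, 5000)
toy_single = rng.normal(5.0, 1.5, 5000)
cuts, merge_identified, single_rejected = roc_points(toy_merged, toy_single)
```

Plotting `single_rejected` against `merge_identified` gives the curve; a variable like the ideal Q would reach the point (1, 1).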
This makes an “ROC curve”, shown in Figure 3.5c for X and in Figure 3.5d for the ideal case Q.

The cluster variables that will be tested in this study are statistical moments, which describe the shape of a given cluster. An example of a statistical moment is the second moment, also known as the variance (but in two dimensions): it describes the spread of a dataset about its mean value. Another statistical moment that will be investigated is the “Zernike moment 40” (that is, n = 4 and m = 0), which quantifies the rotational symmetry of a shower and is rotation-invariant. The formula for Zernike moment (n,m) can be found in ref. [18]. The third statistical moment whose performance will be investigated is the “lateral moment”, which is a measure of radial symmetry.

(a) Distribution of some variable X for merged and single photons.
(b) Distribution of X as it would appear if MC information were not known.
(c) ROC curve of X.
(d) ROC curve of the ideal variable Q.
Figure 3.5: Distributions and ROC curves of an arbitrary variable X and the ideal variable Q.

3.5 Convolutional Neural Networks

In this study, a Convolutional Neural Network (CNN) was used to distinguish between merged and single-photon clusters in the case where statistical moments performed poorly. CNNs can be used in image processing, and previous work showed that distinguishing energy deposits on the grid of ECL crystals can be treated as an image classification problem [20].

A trained CNN classifies an image using different layers of analysis. There is much variation in the architecture of CNNs, but the setup used in this thesis is outlined as follows. The first layer is convolutional: the image, a 5×5 array of crystals, each with their own ECLCalDigit value, is iterated over using N×N “filters” that search for particular features [8]. This iteration, called convolution, returns a single value for each part of the image the filter steps through, reducing the size of the image and keeping only the important features.
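As a minimal illustration of this step, the sketch below slides a single 3×3 filter over a 5×5 image with a stride of one and no padding. The stride, padding, filter size and filter values are assumptions chosen for illustration; the values actually used are listed in Appendix A.

```python
import numpy as np


def convolve_valid(image, kernel):
    """Slide kernel over image (stride 1, no padding).

    Each output value is the sum of the element-wise product of the
    kernel with the patch it covers (cross-correlation, which is what
    CNN "convolution" layers usually compute).
    """
    k_rows, k_cols = kernel.shape
    out_rows = image.shape[0] - k_rows + 1
    out_cols = image.shape[1] - k_cols + 1
    out = np.empty((out_rows, out_cols))
    for i in range(out_rows):
        for j in range(out_cols):
            patch = image[i:i + k_rows, j:j + k_cols]
            out[i, j] = np.sum(patch * kernel)
    return out


image = np.arange(25.0).reshape(5, 5)   # stand-in for ECLCalDigit energies
averaging_filter = np.ones((3, 3)) / 9.0
feature_map = convolve_valid(image, averaging_filter)
print(feature_map.shape)  # (3, 3): the 5x5 image shrinks to 3x3
```

In a trained network the filter values are learned rather than fixed, so each filter responds to a different pattern in the crystal energies.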
Many filters can be implemented, each picking out different features of the image [8]. The output of convolution then passes through an “activation function” that changes the values it contains in a specified way; this has been shown to improve learning abilities in the layers that follow [26]. After this stage, the CNN flattens the result and passes the flattened array into two additional layers called “fully-connected layers”, in which classification attempts are made and corrected a number of times, the model is updated accordingly, classification attempts are again made and corrected, and so on for a specified number of “epochs” [23]. This process is shown in Figure 3.6.

[Diagram: Convolution → Flatten & Activation Fn → FC Layer 1 → FC Layer 2]
Figure 3.6: Architecture of the Convolutional Neural Network used in this project. Output is produced after FC layer 2.

Additional variables associated with the image, such as cluster second moment, can be passed into the fully-connected layers alongside the flattened array, giving the network more information to use in its classification. These are called extra inputs, and different combinations of extra inputs were attempted in this project. All the parameters associated with the CNN, such as number of epochs, filter size, and extra inputs, were varied and the ones that optimized the network’s classification abilities were kept. The final parameter values used in this project are listed in Appendix A. Variation was guess-and-check and not many combinations were attempted, so CNN performance may be greatly improved by automating the variation of parameters as described in Section 5.2.

Classification in this project was binary: whether an image contained a merged cluster or not. For each image, the CNN returned an output between 0 and 1, corresponding to the percent certainty that the given cluster was merged.
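The flow of data through this architecture can be sketched as a single forward pass in NumPy. This is not the network used in the thesis: the layer sizes, the ReLU and sigmoid functions, the random untrained weights and all names are assumptions for illustration (the real parameters are in Appendix A), and training is not shown.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(rng, n_filters=4, filter_size=3, n_extra=1, n_hidden=8,
                image_size=5):
    """Random, untrained weights; every size here is hypothetical."""
    out_side = image_size - filter_size + 1
    n_flat = n_filters * out_side ** 2 + n_extra
    return {
        "filters": rng.normal(size=(n_filters, filter_size, filter_size)),
        "w1": 0.1 * rng.normal(size=(n_hidden, n_flat)),
        "b1": np.zeros(n_hidden),
        "w2": 0.1 * rng.normal(size=n_hidden),
        "b2": 0.0,
    }


def forward(image, extra_inputs, params):
    """Convolution -> activation -> flatten (+ extra inputs) -> FC1 -> FC2."""
    size = params["filters"].shape[-1]
    windows = sliding_window_view(image, (size, size))
    # One feature map per filter, shape (n_filters, out_side, out_side).
    feature_maps = np.einsum("ijkl,fkl->fij", windows, params["filters"])
    flat = relu(feature_maps).ravel()
    x = np.concatenate([flat, np.atleast_1d(extra_inputs)])
    hidden = relu(params["w1"] @ x + params["b1"])
    # A sigmoid keeps the output between 0 and 1.
    return float(sigmoid(params["w2"] @ hidden + params["b2"]))


rng = np.random.default_rng(0)
params = init_params(rng)
image = rng.random((5, 5))          # stand-in for a 5x5 grid of energies
second_moment = 1.2                 # hypothetical extra input
output = forward(image, [second_moment], params)
```

With untrained weights the output carries no meaning; the sketch only shows where the image and the extra inputs enter the network and why the result lies between 0 and 1.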
This CNN output was another variable whose performance was evaluated by the means described in Section 3.4. A dataset of roughly 300 000 merged clusters and 300 000 single-photon clusters was procured, and a 5×5 grid of ECLCalDigits centred on the most energetic crystal in each cluster was passed as input to the CNN. Datasets must be of equal size to avoid a bias toward either image type.

Chapter 4
Results

Now that clean samples of single-photon clusters and of merged clusters have been procured, the ability of the cluster variables described in Section 3.4 to distinguish merged/single-photon clusters must be evaluated. In the following sections, the performance of select statistical moments and of a convolutional neural network are evaluated at various momenta through the plotting of distributions and ROC curves.

4.1 Classifying Clusters using Statistical Moments

The second moment distributions of single-photon samples and merged samples at 5 GeV, as well as their ROC curves at select momenta, are shown in Figure 4.1. The merged and single-photon distributions of Zernike moment 40 at 5 GeV and their ROC curves at select momenta are shown in Figure 4.2, and the same is shown for lateral moment in Figure 4.3.
Finally, the three moments are compared at select momenta in Figure 4.4.

(a) 5 GeV second moment distributions, scaled such that the distributions integrate to 1.
(b) ROC curve for second moment at select momenta.
Figure 4.1: Second moment distributions and ROC curves for approximately 25 000 merged photons and 60 000 single photons.

(a) 5 GeV Zernike moment 40 distributions, scaled so the distributions integrate to 1.
(b) ROC curve for Zernike moment 40 at select momenta.
Figure 4.2: Zernike moment 40 distributions and ROC curves for approximately 25 000 merged photons and 60 000 single photons.

(a) 5 GeV lateral moment distributions, scaled such that the distributions integrate to 1.
(b) ROC curve for lateral moment at select momenta.
Figure 4.3: Lateral moment distributions and ROC curves for approximately 25 000 merged photons and 60 000 single photons.

Figure 4.4: Second, Zernike 40, and lateral moment ROC curves at (a) 5 GeV, (b) 3 GeV, (c) 7 GeV.

4.2 Classifying Clusters using a Convolutional Neural Network

The CNN described in Section 3.5 was trained on momentum 5 GeV and 7 GeV samples. The three previous moments were passed into the 5 GeV CNN as extra inputs, but only cluster angular position was passed into the 7 GeV CNN. These extra inputs were used because they optimized CNN performance. Cluster angular position was a candidate for extra input because, referring to Figure 2.7, certain angular positions will result in larger distances between daughter photon clusters given the same initial opening angle; hence, less merging is expected at angular positions close to the endcaps.
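The geometric argument can be made concrete with a rough sketch that applies the distance estimate used for the cuts in Section 3.3 (opening angle times distance from the interaction point) to an idealized cylindrical barrel. The barrel radius of 125 cm, the cylindrical shape and the function name are assumptions for illustration, not values taken from this thesis.

```python
import math


def cluster_separation_cm(opening_angle_deg, polar_angle_deg,
                          barrel_radius_cm=125.0):
    """Approximate separation of two photons when they reach the barrel.

    Uses separation = opening angle (rad) x distance from the interaction
    point. The barrel is treated as an ideal cylinder whose radius is an
    assumed value; polar_angle_deg is measured from the beam axis.
    """
    distance_from_ip = barrel_radius_cm / math.sin(math.radians(polar_angle_deg))
    return math.radians(opening_angle_deg) * distance_from_ip


for polar_angle in (90.0, 60.0, 45.0):
    separation = cluster_separation_cm(3.1, polar_angle)
    print(f"polar angle {polar_angle:4.0f} deg: separation {separation:5.1f} cm")
```

Under these assumptions the same opening angle gives a larger separation as the polar angle moves away from 90° toward the ends of the barrel, which is the trend the extra input is meant to capture.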
The CNN output distributions and their ROC curves for single-photon samples and merged samples are shown separately in Figures 4.5 and 4.6, and their ROC curves are compared in Figure 4.7.

(a) CNN output distributions at 5 GeV, scaled such that the distributions integrate to 1 and plotted with log scale on vertical axis.
(b) ROC curve for CNN output at 5 GeV.
Figure 4.5: CNN output distributions and ROC curve for approximately 23 000 merges and 23 000 single photons generated with momentum 5 GeV.

(a) CNN output distributions at 7 GeV, scaled such that the distributions integrate to 1 and plotted with log scale on vertical axis.
(b) ROC curve for CNN output at 7 GeV.
Figure 4.6: CNN output distributions and ROC curve for approximately 36 000 merges and 36 000 single photons generated with momentum 7 GeV.

Figure 4.7: Zoomed-in ROC curve of CNN output for merges and single photons generated with momentum 5 GeV and 7 GeV.

Chapter 5
Discussion

5.1 Performance of Variables

Of all the statistical moments tested, Figure 4.4 shows that second moment performs best at all momenta, with lateral moment performing second best. Figure 4.1b shows that, at 5 GeV, cluster second moment distributions allow for roughly 95% of merged clusters to be successfully identified while rejecting roughly 95% of single-photon clusters. High performance at 5 GeV is good because merges are more likely to happen at this high momentum, while π0 mesons with momenta any higher are less likely to occur in the real detector. The majority of merged π0 → γγ decays can therefore be distinguished with second moment. Figure 4.1b shows that at momenta ∼3 GeV, nearly 100% of merged clusters can be identified with a nearly 0% chance of incorrectly identifying a single-photon cluster as merged.

The change in performance of second, lateral and Zernike 40 moments respectively indicates a change in shower spread, radial symmetry, and rotational symmetry as a function of momentum.
When two photons deposit their energies adjacent to each other (as in a merge), radial symmetry and data spread are expected to differ from the single-photon case because of the branching-out of a new particle shower; hence, second moment and lateral moment would be best at distinguishing the two cases. It is, however, surprising that there is a larger difference in data spread (second moment) than radial symmetry (lateral moment) between a merged and a single-photon cluster.

The ROC curves of all three variables stray farther from the ideal case as momentum increases and merges become more likely to occur: Figure 4.1b shows that at momenta ∼8 GeV, at most 90% of merges can be identified while incorrectly including almost 20% of the single-photon clusters in the merged sample. As is the case for most of the ROC curves shown, by choosing a cutoff corresponding to a point on the far right of the 8 GeV ROC curve in Figure 4.1b, it is still possible to correctly identify nearly 100% of the merged clusters, but at the risk of incorrectly labelling roughly 50% of the single-photon clusters as merged.

Instead of classifying the image indirectly using the three statistical moments in Figure 4.4, classifying it directly using a CNN allows for merged cluster identification probability and single-photon cluster rejection probability both greater than 99.5% at 5 GeV, and roughly 99% and 98% respectively at 7 GeV. This is shown in Figure 4.7.
The fact that the three moments perform differently when used separately indicates that they describe different parts of the cluster and that there was some benefit to combining them; this benefit may be a contributor to the high performance of the 5 GeV CNN, which took the three moments as extra inputs.

5.2 Next Steps

Even without implementation of the CNN, cluster second moment approximates the ideal case at momenta ≤ 5 GeV and is already available to Belle II users. However, making the output of the CNNs used in Section 4.2 available to Belle II users would be a huge step not only toward making the merged region in the ALP parameter space of Figure 2.4 explorable at Belle II, but also toward increasing the accuracy of any analysis involving two-photon decays with low parent particle mass or high parent particle momentum, as shown in Figure 1.2 and Section 2.5.

There are, however, difficulties associated with adding CNN output as a new variable: its value needs to be calculated during the reconstruction stage of Figure 3.1 because it requires ECLCalDigits, which are not readily available to the user after the reconstruction of real data¹. The resulting data files will also take up more storage space, and require more processing time to make: in this study it was found that the time needed to add CNN output as a variable can be up to 45 seconds per 1000 events. Adding any new variable can also introduce incompatibilities between new and old data, so to justify adding CNN output as a variable, the CNN must be as polished as possible and present a clear improvement over the already-built-in second moment for both simulated and real data. The next steps for this project are therefore described as follows:

¹ They take up too much space to keep.

• In this project, separate CNNs were made at 5 GeV and 7 GeV, but for the reasons listed above, adding more than one new CNN variable is not practicable.
CNNs must be trained at other momenta, and if a CNN trained at one momentum performs poorly at others, a choice must be made as to which momentum is the most important for merge identification. At higher momenta, a merge is more likely and statistical moments perform worse, but a particle with high momentum is less likely to be encountered in the real detector.

• It is important to continue varying CNN parameters to maximize output performance, and to test the network on real data: the ROC curve must be as close to ideal as possible for both simulated and real data before the network is implemented.

• It is also important to vary CNN parameters to minimize output processing time. Variables such as number of epochs can have a significant effect on the time it takes to calculate CNN output for a single image.

• It would be very helpful to develop a more thorough and precise method of optimizing CNN parameters with respect to the two goals listed previously. The current method was guess-and-check; to maximize output performance, for example, the optimization could be automated by varying parameters systematically such that the distance between the top-right of the ROC curve and the point (1, 1) is minimized.

Chapter 6

Conclusion

The goal of this project was to find or develop a variable available to Belle II users that can distinguish between merged and single-photon clusters. This was motivated in part by a region of a dark matter mediator’s parameter space that remains difficult to explore at Belle II: for certain masses and two-photon coupling constants of this Axion-Like Particle (ALP), its two-photon decay has daughter photon opening angles too small for the Belle II software to distinguish the pair from a single photon. This leads to the merged region in Figure 2.4.
The part of the detector that measures photon energy and position is the Electromagnetic Calorimeter (ECL), so this project focused on improving its two-photon resolution abilities.

To procure a large sample of merged and single-photon clusters, a large number of single photons and π0 mesons were generated using the GEANT4 simulation software described in Section 2.3. It was found that roughly 30% of the single photons did not reach the ECL; instead, they interacted and produced unwanted particles. In the case of the π0, even though its primary decay mode is to two photons, it was found that in roughly 70% of events these photons either did not reach the ECL (again, because of the production of unwanted particles) or did not merge. The cuts in Sections 3.2 and 3.3 eliminated the unwanted cases, producing pure samples of either single-photon or merged clusters. Cluster shape was characterized indirectly through statistical moments, and directly using the Convolutional Neural Network (CNN) described in Section 3.5, which returned a value between 0 and 1 representing the probability of a cluster being a merged pair of photons. ROC curves were made using the distributions of these variables (the statistical moments and the CNN output) to quantify their performance as described in Section 3.4.

After plotting the ROC curves of the three statistical moments that were tested, it was found that second moment performs best, and lateral moment second best, at all momenta. However, statistical moment performance decreased as the momentum of the generated particle increased, indicating a change in shower shape. This drop in performance necessitated a new variable for cluster classification: the direct image classification method of the CNN was used.
At 5 GeV and 7 GeV, the CNN returned near-ideal ROC curves.

Further work must be done to make this CNN output available to Belle II users. Most importantly, it must be decided which momentum to train it on, taking into consideration the likelihood of a two-photon decay and a merge occurring at the given momentum. Parameters must also be varied methodically in order to minimize training time, minimize output time, and maximize output performance.

In any case, regardless of the final form of the CNN used to identify merged clusters, this study has shown that machine learning (in particular, a Convolutional Neural Network) is very well-suited to this goal and can accomplish it with higher accuracy than the tools already available to Belle II users.

Bibliography

[1] S. Agostinelli et al. [GEANT4 Collaboration], “GEANT4: A Simulation toolkit,” Nucl. Instrum. Meth. A 506, 250 (2003). doi:10.1016/S0168-9002(03)01368-8
[2] J. Allison et al., “Geant4 developments and applications,” IEEE Trans. Nucl. Sci. 53, 270 (2006). doi:10.1109/TNS.2006.869826
[3] M. Tanabashi et al. (Particle Data Group), “Particle listings: π0,” Phys. Rev. D 98, 030001 (2018) and 2019 update.
[4] Belle II Collaboration, “Super KEKB and Belle II,” Belle II (2020). Accessed 2020, April 21. URL: kekb and belle ii/
[5] J. Brownlee, “A Gentle Introduction to Early Stopping to Avoid Overtraining Neural Networks,” Machine Learning Mastery (2018, Dec 7). Accessed 2020, April 21. URL:
[6] J. Brownlee, “Difference Between a Batch and an Epoch in a Neural Network,” Machine Learning Mastery (2018, July 20). Accessed 2020, April 21. URL:
[7] J. Brownlee, “Understand the Impact of Learning Rate on Neural Network Performance,” Machine Learning Mastery (2019, Jan 25). Accessed 2020, April 21. URL:
[8] A. Deshpande, “A Beginner’s Guide To Understanding Convolutional Neural Networks,” (2016, June 20). Accessed 2020, April 19. URL:
[9] M. J. Dolan, F. Kahlhoefer, C. McCabe and K. Schmidt-Hoberg, “A taste of dark matter: Flavour constraints on pseudoscalar mediators,” JHEP 1503, 171 (2015) Erratum: [JHEP 1507, 103 (2015)] doi:10.1007/JHEP07(2015)103, 10.1007/JHEP03(2015)171 [arXiv:1412.5174 [hep-ph]].
[10] M. J. Dolan, T. Ferber, C. Hearty, F. Kahlhoefer and K. Schmidt-Hoberg, “Revised constraints and Belle II sensitivity for visible and invisible axion-like particles,” JHEP 1712, 094 (2017) doi:10.1007/JHEP12(2017)094 [arXiv:1709.00009 [hep-ph]].
[11] U. Ellwanger and S. Moretti, “Possible Explanation of the Electron Positron Anomaly at 17 MeV in 8Be Transitions Through a Light Pseudoscalar,” JHEP 1611, 039 (2016) doi:10.1007/JHEP11(2016)039 [arXiv:1609.01669 [hep-ph]].
[12] T. Ferber, “Dark sector physics at BaBar and Belle II,” [presentation] Flavour and Dark Matter, Heidelberg (2017) doi:10.1007/JHEP11(2016)039
[13] C. Hearty, “Searching for new physics with the Belle II experiment,” [presentation] TRIUMF, Vancouver (2019).
[14] C. Hearty, “Searches for beyond-the-Standard-Model particles at Belle II,” [presentation] FPCP, Victoria (2019).
[15] A. Hershenhorn et al., “ECL shower shape variables based on Zernike moments,” BELLE2-NOTE-TE-2017-001 (2017).
[16] H. Ikeda et al., “Development of the CsI(Tl) Calorimeter for the Measurement of CP Violation at KEK B-Factory,” Nuclear Instruments and Methods in Physics Research A, 441, 401 (2000). doi:10.1016/S0168-9002(99)00992-4
[17] E. Kou et al. [Belle-II Collaboration], “The Belle II Physics Book,” arXiv:1808.10567 [hep-ex].
[18] A. Khotanzad and Y. H. Hong, “Invariant Image Recognition by Zernike Moments,” IEEE Trans. Pattern Anal. Mach. Intell. 12, 489 (1990). doi:10.1109/34.55109
[19] W. J. Marciano, A. Masiero, P. Paradisi and M. Passera, “Contributions of axion-like particles to lepton dipole moments,” Phys. Rev. D 94, no. 11, 115033 (2016) doi:10.1103/PhysRevD.94.115033 [arXiv:1607.01022 [hep-ph]].
[20] A. Narimani, “Low PT µ/π separation in the ECL using CNNs,” [presentation] Lepton ID Meeting (2020).
[21] M. De Nuccio, T. Ferber, “Tutorial: The Belle II Electromagnetic Calorimeter,” 4th Belle II Starter Kit Workshop (2019).
[22] F. Krauss et al., “Monte Carlo Particle Numbering Scheme,” (2019). URL:
[23] S. Saha, “A Comprehensive Guide to Convolutional Neural Networks — the ELI5 way,” Towards data science (2018, Dec 15). Accessed 2020, April 19. URL:
[24] Unknown User, “Software Jupyter Notebooks,” Belle II Confluence, Software Basf2manual (2018).
[25] H. Wakeling, “Introduction to the analysis package,” Belle II Starter Kit Workshop, KEK (2019).
[26] A. Walia, “Activation functions,” Towards data science (2017, May 29). URL:
[27] Wikipedia contributors, “Pion,” Wikipedia, The Free Encyclopedia (2020). Accessed 2020, April 22. URL:
[28] Wikipedia contributors, “Proton,” Wikipedia, The Free Encyclopedia (2020). Accessed 2020, April 22. URL:
[29] Wikipedia contributors, “Standard Model,” Wikipedia, The Free Encyclopedia (2020). Accessed 2020, April 22. URL: Model

Appendix A

CNN Parameters

Separate parameters were used for the CNN trained at 5 GeV and the CNN trained at 7 GeV. The parameters are described below, and their values for the two networks are given in Table A.1.

1. Scale: scales numerical inputs to within (0, 1).
2. Threshold: if an ECLCalDigit energy is below this, the energy of that crystal is set to 0.
3. Image length: if image length is N, the image used is the N × N grid of crystals centred on the cluster’s local maximum.
4. Extra inputs: cluster properties passed into fully-connected layers alongside the flattened, convolved image. SM, ZM and LM are second, Zernike and lateral moments.
ClusterTheta is an angle indicating the cluster’s position along the z axis of the cylindrical detector; it is used because the detector is a cylinder with the interaction point on its central axis, so low-opening-angle photon pairs heading toward either end of the detector have farther to travel and are less likely to merge.
5. Number of features: number of convolutional filters.
6. Filter size: size of the filter, the matrix that is convolved with the original N × N image.
7. Stride: how large a step the filter takes each time it moves.
8. Padding: pads the edges of the original image a number of times with zeroes, making it larger. As a result, the filter can scan across the image more finely and see its edges more clearly.
9. Drop out rate: randomly dropping out neurons and their connections from the FC layers during training can help prevent overtraining, which is when the network stops learning generalizable information and instead learns information specific to the dataset it was given, like statistical noise [5].
10. Neurons in fully-connected layers: corresponds to the size of the network that learns from the flattened, convolved image.
11. Epochs: number of times the network is trained on the full training dataset.
12. Learning rate: in the FC layers, quantifies the size of the adjustment made to the network after it is corrected on a guess [7].
13. Batch size: number of samples the network guesses on before it is corrected; each epoch consists of one or more batches [6].
14. Number of classes: in this project, the two classes are merged and single.

Table A.1: Parameter values for the 5 GeV and 7 GeV CNNs.

Parameter              5 GeV CNN      7 GeV CNN
Scale                  False          False
Threshold              1e−3           0
Image length           5              5
Extra inputs           SM, ZM, LM     ClusterTheta
Number of features     10             15
Filter size            3              3
Stride                 1              1
Padding                1              1
Drop out rate          25e2           0
Neurons in FC (1, 2)   (200, 200)     (200, 200)
Epochs                 100            30
Learning rate          1e−3           1e−3
Batch size             512            256
Number of classes      2              2

Appendix B

The particle-generating script, used to generate events with GEANT4, is shown below. It takes the number of events, a job number unique to this run, and a background setting (BG or noBG) as inputs.
Filenames omitted for concision.

import os
import sys

from basf2 import set_log_level, register_module, process, LogLevel, \
    set_random_seed, print_params, create_path, statistics
from simulation import add_simulation
from reconstruction import add_reconstruction, add_mdst_output

momentum = 3  # GeV
particle_PDG = 111

# Number of events = first argument
narg = len(sys.argv)
nevt = 100
if narg >= 2:
    nevt = int(sys.argv[1])

# Output name, random seed, run number and bg file are taken from
# second script argument
jobNumber = 0
if narg >= 3:
    jobNumber = int(sys.argv[2])

# if third argument = 0, BG mixing is disabled
addBG = 1
if narg == 4:
    addBG = int(sys.argv[3])

set_random_seed(jobNumber)

if addBG == 0:
    outputName = "..."
    bgpath = ""
else:
    outputName = "..."
    bgNumber = jobNumber % 10
    bgpath = "..."

print("outputName = ", outputName)
print(bgpath)
print("generating ", nevt, " events")

# suppress messages and warnings during processing:
set_log_level(LogLevel.ERROR)

particlegun = register_module('ParticleGun')
particlegun.param('pdgCodes', [particle_PDG])  # 111 is pi0, 22 is photon
particlegun.param('nTracks', 1)
particlegun.param('varyNTracks', False)
particlegun.param('momentumGeneration', 'fixed')
particlegun.param('momentumParams', [momentum])  # GeV
particlegun.param('thetaGeneration', 'uniform')
particlegun.param('thetaParams', [0, 360])  # min 0, max 360
particlegun.param('phiGeneration', 'uniform')
particlegun.param('phiParams', [0, 360])  # min 0, max 360
particlegun.param('vertexGeneration', 'fixed')
particlegun.param('xVertexParams', [0])
particlegun.param('yVertexParams', [0])
particlegun.param('zVertexParams', [0])
particlegun.param('independentVertices', False)
print_params(particlegun)

# ================================================
main = create_path()
main.add_module("EventInfoSetter", expList=1003, runList=jobNumber,
                evtNumList=nevt)
main.add_module("Progress")
main.add_module(particlegun)

if addBG == 0:
    add_simulation(main)
else:
    add_simulation(main, bkgfiles=bgpath)

add_reconstruction(main)
add_mdst_output(main, mc=True, filename=outputName,
                additionalBranches=['ECLCalDigits'])

# Process events
process(main)

# Print call statistics
print(statistics)
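The command-line handling of the script can be checked without a basf2 installation by isolating it. The sketch below mirrors the argument logic of the script above under stated assumptions: the names GunJob and parse_job_args, and the file name gun.py in the usage lines, are hypothetical and do not appear in the original.

```python
from typing import List, NamedTuple


class GunJob(NamedTuple):
    n_events: int    # number of events to generate
    job_number: int  # also used as random seed and run number in the script
    add_bg: bool     # whether background mixing is enabled
    bg_number: int   # job_number % 10; only meaningful when add_bg is True


def parse_job_args(argv: List[str]) -> GunJob:
    """Hypothetical helper reproducing the script's positional arguments."""
    n_events = int(argv[1]) if len(argv) >= 2 else 100
    job_number = int(argv[2]) if len(argv) >= 3 else 0
    # As in the script, the third argument is read only when exactly three
    # arguments follow the script name; a value of 0 disables BG mixing.
    add_bg = int(argv[3]) != 0 if len(argv) == 4 else True
    return GunJob(n_events, job_number, add_bg, job_number % 10)


# Usage with a hypothetical script name.
default_job = parse_job_args(["gun.py"])
no_bg_job = parse_job_args(["gun.py", "1000", "17", "0"])
```

Keeping the defaults (100 events, job number 0, background enabled) in one place makes it easier to confirm that a batch submission passes the arguments in the order the script expects.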

