UBC Theses and Dissertations

Comparative genome hybridization reveals widespread genome variation in pathogenic Cryptococcus species Liu, Iris 2008

Full Text

Comparative Genome Hybridization Reveals Widespread Genome Variation in Pathogenic Cryptococcus Species

by Iris Liu
B.Sc. University of British Columbia, 2006

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in The Faculty of Graduate Studies (Microbiology and Immunology)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)
June 2008
© Iris Liu, 2008

ABSTRACT

Genome variability can influence the virulence of pathogenic microbes. The availability of genome sequences for strains of the AIDS-associated fungal pathogens Cryptococcus neoformans and C. gattii presented an opportunity to use Comparative Genome Hybridization (CGH) to examine genome variability between strains of different molecular subtypes and ploidy. CGH analysis of 15 strains revealed extensive genomic variation including regions of difference (deletions and amplifications) and chromosome copy number variability. Although no common genomic change was observed for these 15 strains, three key observations came out of these studies. First, CGH identified putative recombination sites and the origins of specific segments of the genome for the common laboratory strain, JEC21. Second, CGH and subsequent PCR-RFLP (PCR-Restriction Fragment Length Polymorphism) analysis on 33 clinical, environmental and laboratory-generated AD hybrid strains revealed that chromosome 1 from the serotype A genome is preferentially retained in clinical strains. Third, CGH and subsequent qRT-PCR (quantitative real-time PCR) analysis revealed disomy for chromosome 13 in two clinical strains: CBS7779 and WM626. Further qRT-PCR and phenotypic studies on CBS7779 revealed a correlation between variable melanin production and disomy. Specifically, highly melanized strains were monosomic for chromosome 13 and less melanized strains were disomic for this chromosome. This correlation, however, only held for the initial CBS7779 isolates.
That is, subsequent screens of highly-melanized and less-melanized isolates derived from the initial CBS7779 strain no longer followed this pattern. These subsequent screens, however, did reveal that 1) disomy, once established, was a relatively stable trait and 2) having disomy at chromosome 13 seemed to increase the probability of developing disomy at chromosome 4. Finally, qRT-PCR of 13 additional strains from AIDS patients revealed that disomy of both chromosome 13 and chromosome 4 is common in freshly isolated, clinical strains. Overall, the data presented in this thesis reveal novel aspects of genome variability and lay the foundation for future studies on the relevance of variation in the virulence of C. neoformans.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication
1. Introduction
1.1 Background information on Cryptococcus neoformans
1.2 Background information on Comparative Genome Hybridization (CGH)
1.3 Karyotype variation, aneuploidy and phenotypic switching in yeast-like fungi
1.4 Research objectives and Hypothesis
2. Materials and Methods
2.1 DNA isolation
2.2 NimbleGen arrays (CGH Design)
2.3 Creation of a database of genome variation for selected strains of C. neoformans and C. gattii
2.4 Confirmation of selected data documented in the database by PCR-RFLP
2.5 Isolation of melanin variants of C. neoformans by plating on L-DOPA medium
2.6 Quantitative real-time PCR
3. Results
3.1 Evaluation of genome variation in serotype A and serotype B strains
3.1.1 Genome variation in serotype A strains
3.1.2 Genome variation in serotype B strains
3.1.3 Confirmation of a selected subset of CGH data
3.1.4 Overall analysis of genome variation in serotype A and serotype B strains
3.2 Examination of a genetic cross and recombination sites for serotype D strains
3.3 Examination of serotype AD strains
3.4 Chromosome Copy Number Variation (CCNV)
3.4.1 Experimental set-up to examine CCNV
3.4.2 CCNV in the serotype A strain CBS7779
3.4.3 CCNV in the serotype A strain WM626
3.4.4 CCNV in selected other strains
4. Discussion and Conclusions
4.1 Analysis of genome variation in pathogenic Cryptococcus species
4.1.1 Annotation of genome variation in C. neoformans and C. gattii
4.1.2 Examination of a genetic cross and recombination sites for serotype D strains
4.1.3 Examination of serotype AD strains
4.1.4 Final comments on genome variation
4.2 Chromosome Copy Number Variation in C. neoformans
4.2.1 CCNV in serotype A strains CBS7779 and WM626
4.2.2 CCNV in selected strains of C. neoformans and C. gattii
4.2.3 Final comments on CCNV
4.3 Conclusions
References
Appendices
Appendix A: Database of genomic variation observed in serotype A strains via CGH
Appendix B: Database of genomic variation observed in serotype B strains via CGH
Appendix C: Expanded analysis for section 3.2 (Preferential retention of certain chromosomes by AD strains)
Appendix D: Expanded analysis for section 3.4 and 3.5 (Chromosome Copy Number Variation (CCNV) for the serotype A strains CBS7779 and WM626 and selected additional strains)

LIST OF TABLES

Table 2.1 List of C. neoformans and C. gattii strains used for CGH
Table 2.2 List of primers used for confirmation of CGH data
Table 2.3 List of C. neoformans strains used for PCR-RFLP
Table 2.4 List of primers used for quantitative real-time PCR (qRT-PCR)
Table 2.5 List of C. neoformans and C. gattii strains used for qRT-PCR
Table 3.1 Summary of the total number of deletions (Log Ratio (LR) < -0.5) or amplifications (LR > 0.5) in four serotype A strains relative to the genome of the strain H99
Table 3.2 Comparison of Log Ratios (LR) and Standard Deviation (SD) for all 14 chromosomes of four serotype A strains that were hybridized to an array of the H99 genome
Table 3.3 Summary of the total number of deletions (LR < -0.5) or amplifications (LR > 0.5) in six serotype B strains relative to the genome of the strain WM276
Table 3.4 Comparison of LR and SD for all chromosomes of six serotype B strains that were hybridized to an array of the WM276 genome
Table 3.5 SD was used as a measure of LR (Log Ratio) divergence for each chromosome in the parental strains NIH433 and NIH12 upon hybridization to the JEC21 array
Table 3.6 Summary of the average LRs and standard deviation between NIH433 and NIH12 in selected areas across the genome
Table 3.7 Average LR of each of the chromosomes for the serotype AD strains
Table 3.8 Summary of the serotype-specificity of CHR 1 in selected strains
Table 3.9 Quantitative RT-PCR (qRT-PCR) analysis of gene copy number in strains JEC21 and CBS7779 relative to the H99 genome
Table 3.10 qRT-PCR results for a select number of black and white strains derived from the original CBS7779 stock
Table 3.11 qRT-PCR results for a select number of black and white strains derived from characterized black and white colonies selected from screen #1 (Table 3.10)
Table 3.12 qRT-PCR analysis of gene copy number in strains JEC21 and WM626 relative to the H99 genome
Table 3.13 Summary of the copy number of CHR 13 and CHR 4 for a select number of WM626 isolates
Table 3.14 Summary of the copy number at SMG1 (CHR 4) and CNN00820 (CHR 13) for a set of strains from AIDS patients
Table 3.15 Average copy number of two loci in strain JEC21 relative to the strain H99

LIST OF FIGURES

Figure 3.1 Confirmation of the presence of a deletion in the serotype A strain BT63
Figure 3.2 An overview of the hybridization of NIH433 and NIH12 DNA to a JEC21 array
Figure 3.3 Analysis and confirmation of the NIH433 / NIH12 hybridization patterns to the JEC21 array
Figure 3.4 Overview of the CGH data for the serotype AD hybrid strains
Figure 3.5 Analysis of serotype AD strains, with an emphasis on CHR 5
Figure 3.6 Analysis of serotype AD strains, with an emphasis on CHR 1
Figure 3.7 Analysis of serotype AD strains, with an emphasis on CHR 1 of single re-isolated colonies of various strains
Figure 3.8 Testing the efficiency and sensitivity of qRT-PCR for the primer CNA04650 in H99
Figure 3.9 CGH data of CHR 10 to CHR 14 of the strain CBS7779 hybridized to the H99 array
Figure 3.10 Photographs of ~1000 CBS7779 isolates on L-DOPA medium after three days of growth at 30°C
Figure 3.11 Photographs of two colonies (CBS7779 plated on L-DOPA) displaying the sectoring phenotype for melanin
Figure 3.12 Summary of the copy numbers of CHR 4 and CHR 13 in three successive screens and its melanin production phenotype in the strain CBS7779
Figure 3.13 CGH data of CHR 10 to CHR 14 of the strain WM626 hybridized to the H99 array
Figure 3.14 Photographs of ~1000 WM626 isolates on L-DOPA medium after three days of growth at 30°C

LIST OF ABBREVIATIONS

AVG LR: Average Log Ratio
CCNV: Chromosome Copy Number Variation
CGH: Comparative Genome Hybridization
CHR: Chromosome
CSF: Cerebral Spinal Fluid
FACS: Fluorescence Activated Cell Sorting
gDNA: Genomic DNA
GFP: Green Fluorescent Protein
LR: Log Ratio
PCR: Polymerase Chain Reaction
qRT-PCR: Quantitative Real-Time PCR
RFLP: Restriction Fragment Length Polymorphism
SD: Standard Deviation

ACKNOWLEDGEMENTS

First of all, I wish to thank Dr.
James Kronstad for his unprecedented patience and guidance throughout the entire process. Not only has he shown me how to think meticulously and design well thought-out experiments, but he has been a wonderful mentor. Moreover, I want to thank my labmates for their thoughtful comments and useful troubleshooting tips. In particular, Dr. Guanggan Hu was especially helpful with teaching me how to do qRT-PCR, virulence assays, biolistic transformation and many more techniques. I also wish to thank Dr. Phil Hieter for letting me use the various microscopes that made this thesis possible. In addition, my committee members, Dr. Erin Gaynor and Dr. Michael Murphy, have provided much-needed constructive feedback. My brothers and sisters in Christ (Nouver, Jessica, Meng, Linda and many more) deserve a big hug of thanks. They have prayed for me, listened to my concerns and put up with my ravings on the beauties and wonders of biology. Last, but not least, special thanks are owed to my parents, who have supported me throughout the years and put up with my odd hours at the lab.

He was there since the very beginning.
He placed the stars in the sky,
Set the planets spinning in motion,
Carved out mountains and valleys,
And designed the blueprint for life.
Yet He knows me by name.
He held my hand throughout it all.
His nail-pierced hands testify His love for me.
He provided me with everything that I needed,
And will forever welcome me with open arms.
Abba Father, to You, and You alone, I dedicate this thesis.

1. INTRODUCTION

1.1 Background information on Cryptococcus neoformans

The fungal pathogen Cryptococcus neoformans is the most common cause of meningitis among patients infected with HIV (Hakim et al., 2000). Cryptococcal disease starts as a respiratory infection (Hull & Heitman, 2002), but it can then either become latent or spread to other organs, with a predilection for the central nervous system (Lin et al., 2007).
Symptoms include headache, fever, malaise and altered mental status lasting several weeks; in more serious cases, the patient may develop raised intracranial pressure (Bicanic & Harrison, 2004). In some areas of the world such as sub-Saharan Africa, up to 50% of patients with HIV/AIDS may be co-infected with C. neoformans (Schutte et al., 2000). Thus, it is not surprising that in one study, cryptococcal disease was the most common cause of death among patients infected with HIV (Corbett et al., 2002).

Another area of concern is that one variety of C. neoformans, now classified as the separate species C. gattii, has recently been found to be capable of infecting and causing disease in immunocompetent individuals. An outbreak of this pathogen on Vancouver Island that started in 1999 had resulted in at least 59 hospitalizations by 2004 (Hoang et al., 2004). Additional cases continue to occur at a rate of ~25 per year and at least eight people have died since the start of the outbreak (MacDougall & Fyfe, 2006). Normally, eliminating the source(s) of infection can be used to combat disease outbreaks. This method, however, is not applicable to C. neoformans and C. gattii because the fungus is prevalent in a wide range of environmental niches: pigeon guano, trees, decaying wood and soil (Lin & Heitman, 2006). Because of these characteristics, it is difficult not only to locate the source of the outbreak but also to stop the spread of this pathogen.

Unfortunately, even when treatment for cryptococcal meningitis is available, mortality can still be as high as 25% (Horst et al., 1997), and for those patients who do recover, the rate of relapse is 12.8% (Antinori et al., 2001). As a result, most patients require long-term antifungal treatments (Antinori et al., 2001).
However, current drugs such as amphotericin B and fluconazole have toxic effects including nausea, elevated serum creatinine levels, hypokalemia, rash, headache, hemolytic anemia and gastrointestinal hemorrhage (Horst et al., 1997). Meanwhile, various isolates around the world are gaining resistance to commonly used antifungal drugs such as fluconazole and voriconazole (Chandenier et al., 2004). Unfortunately, C. neoformans is also highly mobile across the globe (Kidd et al., 2005), which suggests that resistance arising in specific areas may spread to other parts of the world. Thus, there is a need to develop new drugs to combat this pathogen.

One reason Cryptococcus species are successful pathogens is that they have developed a number of virulence factors that allow survival and growth inside the host (Idnurm et al., 2005). First, the fungus has the ability to grow at 37°C, thereby allowing it to proliferate efficiently at host temperature. Second, this pathogen produces a polysaccharide capsule that blocks phagocytosis. Even if the fungus is engulfed, the capsule promotes its survival inside phagocytes (e.g. macrophages and neutrophils). The capsule can also down-regulate the cellular and humoral immune responses. For example, it has been documented that the capsule can deplete complement proteins in the host (Janbon, 2004), thereby increasing the ability of the fungus to evade the immune system. Finally, even if C. neoformans and C. gattii are ingested by macrophages, the fungi have the ability to produce melanin, a black polymer in the cell wall that provides protection against oxidative killing (Zhong et al., 2008). Interestingly, the production of melanin may explain why C. neoformans can rapidly proliferate in the central nervous system and cerebral spinal fluid (CSF), which often has a deadly outcome (Clancy et al., 2006).
Specifically, melanin can be produced from the precursors dopamine and epinephrine, two neurotransmitters which are found in the brain (Lin & Heitman, 2006). In any case, the end result is that the production of capsule and melanin has given C. neoformans and C. gattii the ability to survive and replicate inside phagolysosomes (Bicanic & Harrison, 2004).

Aside from the three main virulence factors (ability to grow at 37°C, capsule and melanin production), another important virulence-associated factor for C. neoformans and C. gattii is the ability to mate. In strains of C. neoformans that have the D capsule serotype, MATα strains are more virulent than MATa strains, and in all serotypes (A, B, C and D), the majority of clinical isolates are MATα (Hull & Heitman, 2002). When MATa and MATα strains of the A serotype are co-inoculated into mice, MATα strains outcompete MATa strains for entry into the brain, suggesting that the ability to mate (or mating-type specificity) is an important aspect of virulence (Nielsen et al., 2005a, Nielsen et al., 2005b). This idea is also supported by the fact that the recent outbreak on Vancouver Island generally involves only MATα strains (Fraser et al., 2003). Furthermore, some studies have linked fertility to clinical prevalence. For example, one study found that 12 of 16 human clinical isolates were fertile, whereas only 7 of 55 environmental strains were weakly fertile (Campbell et al., 2005). The skewed distribution of fertile isolates in the clinic versus those that are found in the environment further supports the idea that mating is an important component of pathogenesis. This concept can also be seen in other pathogens such as Toxoplasma gondii (Grigg et al., 2005).

Although several virulence factors have been discovered and characterized over the past few decades, there is evidence to suggest that there are still many undiscovered factors that contribute to disease (McClelland et al., 2005).
Previously, researchers have shown that during the passage of various strains of C. neoformans through mice, the strains became more virulent (McClelland et al., 2004). Later, when the differences between the passaged and non-passaged strains were compared in terms of known virulence factors (McClelland et al., 2005), it was discovered that these known virulence factors could not fully explain the virulence differences that are observed between the two sets of strains. As a result, the authors concluded that there could be other unknown virulence factors that affect the pathogenicity of C. neoformans.

In order to study Cryptococcus more efficiently, a classification system based on serological and molecular methods has been developed. Isolates of C. neoformans can be divided into the four serotypes mentioned above (designated as A, B, C, D) and into a fifth hybrid serotype called AD. These serotypes are identified by antigenic differences in the capsular polysaccharide that surrounds the fungal cells (Bose et al., 2003). In general, serotype A isolates have a worldwide distribution, serotype D strains are found in Europe and North America, and serotypes B and C strains (now classified as C. gattii) are found in tropical and subtropical regions, although they can also be isolated in temperate climates (Bartlett et al., 2008, Bennett et al., 1977, Kwon-Chung & Varma, 2006, Mitchell & Perfect, 1995). PCR fingerprinting with minisatellite M13- and microsatellite (GACA)4-specific primers has also been used to group hundreds of isolates into nine major molecular genotypes (Ellis et al., 2000, Meyer et al., 1999). These molecular subtypes are defined such that VNI, VNII and VNB are for serotype A strains; VNIII is for serotype AD strains (a hybrid, diploid serotype); VNIV is for serotype D strains, and VGI to VGIV are for serotypes B and C strains.
Because most strains that infect AIDS patients are serotype A strains, most of these strains fall into the VNI and VNII categories. However, VNB is not usually found among clinical strains and it has only been recently discovered in a clinic in Botswana (Litvintseva et al., 2005, Litvintseva et al., 2006, Litvintseva et al., 2003).

In studying the role of serotype and virulence of Cryptococcus, the AD hybrid strains of C. neoformans present an interesting and special case for analysis. C. neoformans is usually haploid and it is only transiently diploid during the sexual cycle. However, it has been discovered that some of the strains that infect AIDS patients have the hybrid AD serotype and that many of these strains are aneuploid (Cogliati et al., 2001). These AD strains, many of which are from Africa, are thought to result from mating interactions between serotype D and serotype A parental strains. They have a wide range of virulence levels in animal models of cryptococcosis, as do the serotype A and D strains. More interestingly, these strains may retain chromosomes either from both serotypes or from only one serotype (Lengeler et al., 2001). Because of the medical importance of these AD strains, deciphering the genomic make-up of the hybrids in comparison with haploid isolates will be important in the battle against C. neoformans and will lead to more information regarding how genomic differences correlate with differences in virulence.

To summarize the background information on C. neoformans and C. gattii, it is clear that these fungi have and will continue to have a devastating impact on patients with AIDS. In addition, C. gattii is now being found in temperate regions (such as Vancouver Island) and has become an emerging pathogen. Its ability to cause disease in immunocompetent people indicates that it has unusual virulence properties compared with C. neoformans.
Therefore, a better understanding of this pathogen is clearly needed.

1.2 Background information on Comparative Genome Hybridization (CGH)

The goal of the work in this thesis was to use Comparative Genome Hybridization (CGH) to study genomic variation in selected strains representing different serotypes and molecular subtypes of C. neoformans and C. gattii. The CGH technique can rapidly survey an entire genome without the need to sequence the strain of interest. To perform CGH, the DNAs from a reference strain and from the strain of interest are first labeled with different fluorescent dyes. The labeled DNAs are then hybridized to a microarray containing DNA probes from the genome of the reference strain. The competing signals between the labeled DNAs can then be analyzed to reveal areas of conservation, amplification, divergence and deletion. Although a lot of information can be gained regarding the genomic variability of the strains, one of the caveats of CGH is that it cannot detect genomic rearrangements, such as translocation events. Also, it cannot identify genes that are not present in the reference strain. This technology, however, is quite sensitive because it is capable of detecting sequence divergences that are as low as 2% (Taboada et al., 2005). As a result, CGH is an excellent method for examining genome variability when reference genomes are available. It can also provide a way to examine sites of sexual recombination on a genomic scale (Gressmann et al., 2005); this information could be important for pathogens such as C. neoformans, in which fertility and mating type are associated with virulence.

In the past, CGH has been used to study several different pathogens. For example, in 2005, CGH was used to study the gain and loss of genes in Helicobacter pylori, the pathogen responsible for ulcers (Gressmann et al., 2005).
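As an illustration of how the competing signals described in section 1.2 can be summarized, the following minimal Python sketch classifies array probes by their log ratio. It is a sketch under stated assumptions, not the NimbleGen pipeline or the analysis used in this thesis: the -0.5 and 0.5 cut-offs echo the thresholds quoted in the table titles of this thesis, whereas the base-2 logarithm, the function names and the example intensities are assumptions made for the example.

```python
import math

# Cut-offs echoing the table titles in this thesis (LR < -0.5, LR > 0.5).
DELETION_LR = -0.5
AMPLIFICATION_LR = 0.5


def log_ratio(test_signal, reference_signal):
    """Log ratio of test to reference intensity (base 2 is an assumption)."""
    return math.log2(test_signal / reference_signal)


def classify_probe(lr):
    """Label one probe from its log ratio using the two cut-offs above."""
    if lr < DELETION_LR:
        # A low ratio cannot distinguish a deletion from strong divergence.
        return "deletion_or_divergence"
    if lr > AMPLIFICATION_LR:
        return "amplification"
    return "conserved"


def summarize(probes):
    """Count labels for a list of (test_signal, reference_signal) pairs."""
    counts = {"deletion_or_divergence": 0, "amplification": 0, "conserved": 0}
    for test_signal, reference_signal in probes:
        counts[classify_probe(log_ratio(test_signal, reference_signal))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical intensities: one weak, one strong and one equal signal.
    print(summarize([(100, 400), (800, 400), (400, 400)]))
```

Because the probes come only from the reference genome, a sketch like this shares the limitations noted above: it says nothing about rearrangements or about genes that are absent from the reference strain.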
In addition, CGH has been used to detect areas of conservation in Campylobacter jejuni, a pathogen that causes severe diarrhea and is the most common cause of acute bacterial enteritis (Taboada et al., 2005). Furthermore, CGH has been used to identify genomic regions that are related to virulence. For example, Herbert et al. (2005) used CGH to identify pathogenicity islands in Group B Streptococcal strains. The ability of CGH to identify evolutionary changes was also used to identify genes that resulted from horizontal gene transfers in a number of Brucella species (Rajashekara et al., 2004). In general, CGH has been successfully used to study genomic variation in a number of bacteria, fungi, parasites and higher eukaryotes.

Although CGH is a powerful tool, its novelty means that control experiments must be conducted to determine the sensitivity, reproducibility and the relevance of the CGH data. Fortunately, data from CGH experiments are usually highly reproducible, especially when compared with microarray experiments for gene expression (Watanabe et al., 2004), and data generated by different laboratories can be pooled together for meta-analyses (Taboada et al., 2004). As a result, the application of CGH to fungal pathogens such as Candida albicans and C. neoformans can contribute to a detailed understanding of genome variability and this information can potentially contribute to insights into virulence and drug resistance. In general, genomic variation between strains has been a well known predictor for phenotypic differences in a wide range of organisms including bacterial, fungal and viral pathogens (Powell et al., 2008, Koide et al., 2004, Ellison et al., 2008, Harriff et al., 2008, Jiang et al., 2006, Allen & Nuss, 2004, Nunes et al., 2003, Harvala et al., 2002).
In fact, in Xylella fastidiosa, genomic variation caused by transposable elements and other mobile elements (which contribute up to 18% of the genome) plays a major role in determining the virulence of this bacterial pathogen (Nunes et al., 2003).

In summary, CGH is a powerful tool for studying genomic variation. Although it has a few caveats, its strengths outweigh these limitations. For example, CGH can rapidly survey an entire genome and it is highly reproducible. Thus, CGH was chosen as the tool to study genome variation in Cryptococcus species.

1.3 Karyotype variation, aneuploidy and phenotypic switching in yeast-like fungi

In the study of genomic variation, it is important to remember that dynamic chromosomal rearrangements are common in fungi. Although the exact purpose of chromosomal rearrangement in fungi is unknown, it has been speculated that it may be involved in pathogenesis (Brandt et al., 1996; Fries et al., 2001). Specifically, it was reported that new patterns in the electrophoretic karyotypes were found in sequential isolates from 8 of 33 patients (Brandt et al., 1996). In this case, one of these patient samples had karyotype changes that correlated with changes in colony morphology (large and small) (Brandt et al., 1996). Yet, both of the karyotypes for these colony size variants were different from the initial karyotype of the isolate (Brandt et al., 1996). These types of studies illustrate that chromosomal rearrangements (including aneuploidy as found in AD hybrid strains) may be very dynamic in C. neoformans and may affect phenotype. This idea is intriguing because C. neoformans and C. gattii are known to switch phenotypes during the course of an infection (Guerrero et al., 2006) and it is possible that this phenomenon is mediated by chromosomal rearrangements.
Common switches during infection include a change from a mucoid to a smooth or serrated colony morphology (Guerrero et al., 2006, Fries et al., 2005). Less common switches involve growth rate, cell wall composition and capsule size. Presumably, phenotype switching allows the fungi to evade the immune system by making it more difficult for the host to generate antibodies and reactive T-cells (Pietrella et al., 2003). In any case, infections in which phenotype switching has been documented often lead to poorer outcomes (Guerrero et al., 2006).

The hypothesis that specific karyotype changes may regulate colony morphology switching is also supported by the fact that this phenomenon has been well-characterized in C. albicans, another fungal pathogen (Rustchenko-Bulgac et al., 1990, Suzuki et al., 1989). In fact, phenotype changes, chromosome copy number polymorphisms and other chromosomal rearrangements have been well documented in C. albicans following passage through an animal host (Rustchenko-Bulgac et al., 1990; Chen et al., 2004). In these studies, changes in the copy number of specific chromosomes have been linked to a number of phenotype changes. For example, the copy number of CHR 3 determines the ability of C. albicans to utilize L-sorbose as a carbon source (Janbon et al., 1998). More interestingly, it has been recently documented that a specific segmental aneuploidy of CHR 5 results in C. albicans resistance to azoles, drugs commonly used to treat fungal infections (Selmecki et al., 2006). In this case, the aneuploidy allows for the up-regulation of efflux pump genes (Selmecki et al., 2006) and various other transcription factors related to azole drug resistance (Coste et al., 2006). Although it may seem that aneuploidy gives C. albicans an advantage during infection, this is not always the case because trisomy of CHR 1 in C. albicans reduces its virulence (Chen et al., 2004).
As an additional example of the complexity behind aneuploidy and fitness, it is also known that defects in C. albicans genes involved in double-strand break repair will lead to an increase in genomic instability (which should be disadvantageous for the cell). As a result, these defects also make the strain more sensitive to fluconazole (Legrand et al., 2007). Also, these mutants are more sensitive to oxidative damage and grow slowly in culture. However, mutations in double-strand break repair also result in a higher frequency of drug-resistant mutants in comparison to wild-type strains (Legrand et al., 2007), therefore suggesting that these mutations could be advantageous for the species in some situations. Thus, aneuploidy and its role in pathogenicity can be very complicated.

CGH has been extensively used to study genomic variation in studies on Saccharomyces cerevisiae, breast cancer cells, Arabidopsis thaliana and C. albicans (Myers et al., 2004, Voullaire & Wilton, 2007, Henry et al., 2006, Selmecki et al., 2005b). Through the use of CGH and other recently developed techniques, aneuploidy and chromosome instability have been particularly well-characterized in S. cerevisiae. For example, it was discovered that yeast can compensate for a number of laboratory-generated mutations by gaining an extra copy of the chromosome in which the mutation resides (Hughes et al., 2000). In fact, these strains had a selective advantage over the mutants that did not gain an extra chromosome, such as the ability to grow faster in culture (Hughes et al., 2000). However, this compensation with an extra chromosome results in genome-wide changes in expression profiles. That is, the genes on the disomic chromosome generally produce double the number of transcripts regularly expressed. This suggests that there is no global dosage-compensation to normalize expression for each gene at the transcriptional level (Hughes et al., 2000, Torres et al., 2007).
In cases where the aneuploidy was not caused by mutations in other genes, however, strains with extra chromosomes grew slower than wild-type strains in culture (Torres et al., 2007). In any case, changes in chromosome copy number result in altered gene expression profiles for the chromosome in question. Aneuploidy in yeast can also cause an up-regulation of genes involved in ribosomal biogenesis, ribosomal RNA processing, nucleic acid metabolism and carbohydrate metabolism (Torres et al., 2007).

In summary, previous studies have shown that chromosomal changes in yeast-like fungi such as S. cerevisiae and C. albicans will result in changes in the phenotypes and gene expression patterns. In addition, the evidence that chromosomal changes occur in C. neoformans during infection, in association with phenotypic switching, suggests that genome variability may be important for virulence.

1.4 Research objectives and hypothesis

The main objective of the work in this thesis was to evaluate genome variation in selected strains of C. neoformans representing serotypes A, B, D and AD. A major discovery from this work was that some strains of serotype A showed elevated copy number for specific chromosomes. This discovery led to the second objective to investigate Chromosome Copy Number Variation (CCNV). This work resulted in the discovery and characterization of a relationship between CHR 13 and formation of the virulence factor melanin. It also laid the foundation for future studies to investigate CCNV in C. neoformans strains during the infection of mammalian hosts, including AIDS patients.¹

¹ Portions of the work presented in this thesis were performed in collaboration with Dr. Guanggan Hu and appeared in the following publication: *Hu, G., *Liu, I., Sham, A., Stajich, J.E., Dietrich, F.S., and Kronstad, J.W. 2008. Comparative hybridization reveals extensive genome variation in the AIDS-associated pathogen Cryptococcus neoformans. Genome Biology 9: R41.
* co-first authors
2 MATERIALS AND METHODS
2.1 DNA isolation
Genomic DNA (gDNA) was isolated with a previously published method (Pitkin et al., 1996). In brief, the method involves vigorous vortexing of cells with glass beads in a phenol:chloroform:isoamyl alcohol solution followed by a series of ethanol (70%) precipitations. The precipitated DNA was resuspended in distilled water.
2.2 NimbleGen arrays (CGH Design)
To perform the CGH analysis, gDNA was sent to NimbleGen Systems Inc.™ (Madison, WI, USA) (http://www.nimblegen.com/products/cgh/index.html; last accessed May 2008) for hybridization to high-density tiling arrays (Hu et al., 2008). The arrays were processed as described by Selzer et al. (2005). Each array contained roughly 386,000 probes; the average length of the probes was 50 bp (range 45 to 85 bp) and the average melting temperature of the probes was 76°C. The array spanned all 14 chromosomes for each genome; however, highly repetitive regions, such as those in the telomeres and the centromeres, were not represented on the array. The resulting data were viewed either with Microsoft Excel or with NimbleGen's SignalMap™ GFF visualization software. One hybridization experiment per genome was performed, followed by confirmation by PCR and sequencing for specific regions. Table 2.1 presents a list of the strains and genomes used for hybridization.
Table 2.1 List of C. neoformans and C. gattii strains used for CGH. The first strain listed for each serotype has a sequenced genome and was used as the reference strain for the hybridization experiment. For the AD strains, two reference strains (H99 and JEC21) were used.
Strain | Serotype (Mating type) | Molecular subtype | Source | Reference
H99 | A (a) | VNI (AFLP 1) | J. Heitman; Clinical isolate | (Heitman et al., 1999)
CBS7779 | A (a) | VNI (AFLP 1) | T. Boekhout; Clinical isolate (Brazil) | (Boekhout & van Belkum, 1997)
Bt63 | A (a) | VNB | J. Heitman; Clinical isolate (Botswana) | (Litvintseva et al., 2006)
125.91 | A (a) | VNI (AFLP 1) | J. Heitman; Clinical isolate (Tanzania) | (Lengeler et al., 2000)
WM626 | A (a) | VNII (AFLP 1A) | W. Meyer; Clinical isolate (Australia) | (Meyer & Mitchell, 1995)
JEC21 | D (a) | VNIV (AFLP 2) | J. Kwon-Chung; Laboratory generated, progeny of NIH12 and NIH433 | (Heitman et al., 1999)
NIH12 | D (a) | VNIV (AFLP 2) | J. Kwon-Chung; Clinical isolate (United States) | (Heitman et al., 1999)
NIH433 | D (a) | VNIV (AFLP 2) | J. Kwon-Chung; Environmental isolate (Denmark) | (Heitman et al., 1999)
CDC228 | A (a), D (a) | VNIII (AFLP 3) | J. Heitman; Clinical isolate | (Lengeler et al., 2001)
KW5 | A (a), D (a) | VNIII (AFLP 3) | J. Heitman; Clinical isolate | (Lengeler et al., 2001)
CDC304 | A (a), D (a) | VNIII (AFLP 3) | J. Heitman; Clinical isolate | (Lengeler et al., 2001)
WM276 | B (a) | VGI (AFLP 4) | W. Meyer; Environmental isolate (Australia) | (Fraser et al., 2004)
E566 | B (a) | VGI | J. Heitman; Environmental isolate (Australia) | (Fraser et al., 2004)
R794 | B (a) | VGI | K. Bartlett; Clinical isolate (Vancouver Island) | (Kidd et al., 2005)
KB3864 | B (?) | VGI | K. Bartlett; Environmental isolate (Vancouver Island) | K. Bartlett strain collection
KB7892 | B (a) | VGI | K. Bartlett; Environmental isolate (Vancouver Island) | (Kidd et al., 2005)
RV66095 | B (a) | VGI | T. Boekhout; Clinical isolate (Argentina) | T. Boekhout strain collection
WM276 GFP2 | B (a) | VGI | Constructed in this lab | J. Kronstad strain collection
2.3 Creation of a database of genome variation for selected strains of C. neoformans and C. gattii
To characterize regions of difference that were identified by CGH, sequence identities were calculated with the BioEdit program (developed by Tom Hall of Ibis Biosciences), which is available at http://www.mbio.ncsu.edu/BioEdit/bioedit.html (last accessed May 2008).
This program has been used in a number of previous studies (Hu et al., 2008; Mao et al., 2007; Wang et al., 2007; Wu et al., 2007). Microsoft Excel was used to calculate the average LR (AVG LR) and the standard deviation (SD) for the CGH hybridization data. Only the following values for regions of difference were recorded in the annotation work: AVG LR ± SD LR > 0.5 or AVG LR ± SD LR < -0.5. This cutoff was chosen because it corresponds to a sequence identity of approximately 95% (Hu et al., 2008). The annotations are summarized in Appendices A and B. Sequence information was retrieved using Duke University's Resources for Fungal Comparative Genomics, which is available at http://fungal.genome.duke.edu/ (last accessed Dec. 2007). Using the GLEAN gene models at this website, sequences were retrieved and BLASTn was used to determine whether any orthologous genes or functional information were associated with each sequence. When available, GO annotations were also recorded for additional insight into the function of the genes in the regions of difference.
2.4 Confirmation of selected data documented in the database by PCR-RFLP
Specific regions of difference identified by CGH were analyzed by PCR amplification, sequencing and restriction enzyme digestion. Primers were designed with the program Primer3, which is available at http://frodo.wi.mit.edu/cgi-bin/primer3/primer3.cgi (last accessed April 2008; Koressaar & Remm, 2007) (see Table 2.2 for a list of primers). Taq DNA Polymerase from Invitrogen was used in accordance with the manufacturer's instructions for all PCR reactions on purified gDNA. The resulting amplicons were then purified with either the Qiagen Gel extraction kit or the Qiagen Nucleotide removal kit, depending on whether or not multiple bands were present. TaKaRa Ex Taq from TaKaRa Biotechnology (Dalian) Co., Ltd. was used for colony PCR according to the manufacturer's instructions.
Products were sent to NAPS (Nucleic Acid Protein Service Unit) at UBC for sequencing. PCR-RFLP (Restriction Fragment Length Polymorphism) analysis was conducted with various restriction enzymes (listed in the figure legends) according to the manufacturer's instructions. The RFLP analysis was designed with the Sequence Manipulation Suite, which is available at http://www.ualberta.ca/~stothard/javascript/index.html (last accessed May 2008).
Table 2.2 List of primers used for the confirmation of the CGH data.
Table 2.2 A List of primers used for sequencing
Primer Name | Sequence Forward | Sequence Reverse
ino3phos | CGCTGACATGGGCTACTACA | AGATGGGCACAAAGACATCC
thiolox | CTACCCGCTCCAGGAGAATG | CGCGACTAAGGATGGTAGGG
acidphos | CCAGCTTfACGGCCTCAAC | GCCACTTGTCTCCACCATGA
CNI01750 | TAGATATGAGACACCCACCACACC | AACCTCGGTACTAGTGACCACCAT
CNG00780 | TCGCTACCATCGAGAGTGTCATAG | GTCTGCCTACCCTCCTATCAAG
Table 2.2 B List of primers used for detecting regions of deletion
Primer Name | Sequence Forward | Sequence Reverse
CNC06920_2 | GATCCCTTCTCCGCG1TGA | CCTYI’ACTCCCATGTCGCTGCT
04239H99 | GCCGTCATACGCCTGTATC | GGGAAGACGATCACCGAAT
Table 2.2 C List of primers used for RFLP (AD strain analysis)
Primer Name | Sequence Forward | Sequence Reverse
acidphos | CCAGCTTfl’ACGGCCTCAAC | GCCACTTGTCTCCACCATGA
CNE04380 | GAACGACGCTCCTTTGATACC | CTCCTCTTCGATCTCATCTCTTCC
CNA01230 | GGTGTCAAGACTGTCACCATCTAC | AACGACCTGAGTATGGCCTTG
CNB01970 | GTCATGAGAGACAACGTGCAG | CGATATCCTCACTAATGACGAAC
PCR-RFLP was used to confirm the preferential retention of specific chromosomes in the AD strains. In addition, this method was used to screen 33 strains from clinical, environmental and laboratory-generated sources (Table 2.3) to determine whether or not there was preferential retention of the serotype A version of CHR 1 in clinical strains.
Table 2.3 List of C. neoformans strains used for PCR-RFLP.
Strain | Serotype (Mating type) | Source | Reference
MAS920005 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920026 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920046 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920047 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920062 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920066 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920074 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920174 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920181 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920190 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920280 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920283 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920328 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920354 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920355 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
MAS920383 | AD (?) | T. Mitchell; Clinical isolate | T. Mitchell culture collection
XL1552 | A (a), D (a) | J. Heitman; Lab. generated (H99 and JEC21) | (Lin et al., 2008)
XL1548 | A (a), D (a) | J. Heitman; Lab. generated (H99 and JEC21) | (Lin et al., 2008)
XL1462 | A (a), D (a) | J. Heitman; Lab. generated (H99 and JEC21) | (Lin et al., 2008)
XL1514 | A (a), D (a) | J. Heitman; Lab. generated (H99 and JEC21) | (Lin et al., 2008)
KBL-AD1 | A (a), D (a) | J. Heitman; Lab. generated (JEC171 and H99) | (Lengeler et al., 2001)
nc5-19 | A (a), D (a) | J. Heitman; NC (USA), environmental strain | (Lin et al., 2007)
nc6-20 | A (a), D (a) | J. Heitman; NC (USA), environmental strain | (Lin et al., 2007)
nc42-10 | A (a), D (a) | J. Heitman; NC (USA), environmental strain | (Lin et al., 2007)
ICB134 | A (a), D (a) | J. Heitman; Pigeon dropping, Brazil | (Barreto de Oliveira et al., 2004)
it752 | A (a), D (a) | T. Mitchell; Italy, clinical | (Litvintseva et al., 2007)
zg287 | A (a), D (a) | T. Mitchell; China, clinical | (Litvintseva et al., 2007)
nc34-21 | A (a), D (a) | T. Mitchell; NC (USA), environmental strain | (Litvintseva et al., 2007)
mmRL1365 | A (a), D (a) | T. Mitchell; USA, clinical | (Litvintseva et al., 2007)
nc35-5 | AD (?) | T. Mitchell; NC (USA), environmental strain | (Litvintseva et al., 2007)
Zg290 | A (a), D (a) | T. Mitchell; China, clinical | (Litvintseva et al., 2007)
it756 | A (a), D (a) | T. Mitchell; Italy, clinical | (Litvintseva et al., 2007)
nc10-23 | AD (?) | T. Mitchell; NC (USA), environmental strain | (Litvintseva et al., 2007)
2.5 Isolation of melanin variants of C. neoformans by plating on L-DOPA medium
In order to screen for strains that had switched phenotypes (i.e. from high melanin production to low melanin production or vice versa), single colonies were first selected and restreaked onto an L-DOPA plate for single colonies. This colony isolation procedure was conducted at least three times in each successive screen to ensure the purity of each strain. After isolation, the strains were grown overnight in YPD (Difco) at 30°C, and cells were washed in water and plated on L-DOPA medium at 500 cells per plate as described previously (D'Souza et al., 2001). Changes in melanin production (darker or lighter colony pigmentation) were identified visually.
2.6 Quantitative real-time PCR
Quantitative real-time PCR was used to determine the copy number of CHRs 4 and 13 in selected strains. An ABI 7500 instrument (Applied Biosystems) was used along with the Power SYBR Green PCR Master Mix from Applied Biosystems according to the manufacturer's instructions. Each of the amplifications was conducted in triplicate. Primers were designed using the Primer Express program from Applied Biosystems (Table 2.4). In order to ensure the reliability of these primers, they were designed only in highly conserved (100% identity) regions between the genomes of strains H99 and JEC21.
In addition, the primers were designed using only exon regions (according to GLEAN models that were available at Duke University's Resources for Fungal Comparative Genomics). Also, the primer sequences were used in BLAST searches against the H99 genome (NCBI database) in order to ensure that they would (theoretically) bind to only one region in the genome. That is, only primers whose E-value hit to any off-target sequence was above 0.3 were used (see Appendix D, Table D.1).
Two controls were used in the qRT-PCR experiments; these were performed by amplification with primers designed for the actin gene CNA04650 on CHR 1 and the SMG1 gene on CHR 4. The CNA04650 gene (GenBank ID: XP_566591) encoding actin is present in only one copy in the genome of strain H99 (according to BLASTn searches). The SMG1 gene was chosen because it is duplicated in the genome of strain JEC21 (Fraser et al., 2005b). It therefore represents a useful positive control for identifying two copies of a sequence in a genome.
To calculate the normalized cycle number (ΔΔCt) at the regions that were amplified by qRT-PCR, the following formula was used: ΔΔCt = ΔCt(gene) - ΔCt(genome) (Livak & Schmittgen, 2001, Ferreira et al., 2006), where ΔCt(gene) = average cycle of the test gene - average cycle of the actin gene, and ΔCt(genome) = average cycle of the H99 gene - average cycle of the actin gene (H99). The copy number was then calculated with the following formula: Copy number = 2^(-ΔΔCt) (Fraser et al., 2005b). Only triplicates that gave an SD of less than 0.2 were included in the analysis.
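The copy number calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in the thesis: the function names and the Ct values are hypothetical, the exponent is taken to be -ΔΔCt as in the standard Livak and Schmittgen method, and the sample standard deviation is assumed for the triplicate filter (the text does not say which SD was used).

```python
from statistics import mean, stdev

MAX_TRIPLICATE_SD = 0.2  # triplicates with an SD of 0.2 or more are excluded


def mean_ct(triplicate):
    """Return the average Ct of a triplicate, or None if it fails the SD filter.

    Assumption: sample standard deviation (statistics.stdev).
    """
    if stdev(triplicate) >= MAX_TRIPLICATE_SD:
        return None
    return mean(triplicate)


def copy_number(test_gene, test_actin, ref_gene, ref_actin):
    """Estimate copy number relative to the reference strain (H99 in the text).

    Each argument is a list of three Ct values (hypothetical inputs).
    """
    averages = [mean_ct(t) for t in (test_gene, test_actin, ref_gene, ref_actin)]
    if any(ct is None for ct in averages):
        return None  # at least one triplicate was too variable to use
    gene, actin, reference_gene, reference_actin = averages
    delta_ct_gene = gene - actin                      # test strain, normalized to actin
    delta_ct_genome = reference_gene - reference_actin  # reference strain, normalized to actin
    delta_delta_ct = delta_ct_gene - delta_ct_genome
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: the test gene crosses threshold one cycle earlier
# than in the reference strain, which corresponds to about two copies.
estimate = copy_number(
    test_gene=[21.0, 21.1, 20.9],
    test_actin=[22.0, 22.1, 21.9],
    ref_gene=[22.0, 22.1, 21.9],
    ref_actin=[22.0, 22.1, 21.9],
)
print(round(estimate, 2))
```

Normalizing to actin within each strain before comparing strains means that differences in the amount of input DNA cancel out, so only the relative dosage of the test locus remains.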
Table 2.4 List of primers used for quantitative real-time PCR (qRT-PCR)
Name | CHR | Sequence Forward | Sequence Reverse
CNA04650 | 1 | CGTCACAAACTGGGACGACAT | GCGACACGGAGCTCAYI’GTA
SMG1 | 4 | CCCCGTAAGGGCCTGAT | TGGGCCAGAGTCTCGATGAG
CNN00820 | 13 | CTGCTGCAGTCCAAGTTGATG | CCTITGCCACCTrGAGTTTCTI’
CNN00150 | 13 | TCAGTGGCTCTGGCCTCH | TI’ACGGACCAGGTACCATGGA
CNN01890 | 13 | GCGTCACGAGAGGGAAAATACT | TGGTAGTGACATAAGTGTAGTGAGGAA
CNN02400 | 13 | GGCCCCTACGCCAGCTt | CTGACCm’AGCGAACCAAGATC
Neo | *13 | CGACAAGACCGGCflCCA | AAGCGAAACATCGCATCGA
00707 | 4 | GATATCGAGCCCAAGCACAAG | TCGGCTACTGGG1T17CG
*For one of the strains, a neomycin marker was inserted into CHR 13.
A number of strains (in addition to the ones used for CGH) were tested for possible disomy of CHR 4 and CHR 13 by qRT-PCR. These strains are listed in Table 2.5.
Table 2.5 List of C. neoformans and C. gattii strains used for qRT-PCR.
Strain | Serotype (Mating type) | Molecular subtype | Source | Reference
IFM46084 | D (?) | | E. Tanaka | (Tanaka et al., 2005)
IFM51645 | B (Untypable) | ? | E. Tanaka | (Ohkusu et al., 2002)
IFM51621 | A (a) | | E. Tanaka | (Ohkusu et al., 2002)
IFM51627 | A (a) | | E. Tanaka; Clinical strain | (Ohkusu et al., 2002)
IFM51636 | A (a) | ? | E. Tanaka; Clinical strain | (Ohkusu et al., 2002)
IFM51650 | A (Untypable) | ? | E. Tanaka; Clinical strain | (Ohkusu et al., 2002)
IFM51666 | A (a) | | E. Tanaka; Clinical strain | (Ohkusu et al., 2002)
IFM51640-Li | A (a) | | E. Tanaka | (Ohkusu et al., 2002)
IFM51649-Li | A (a) | | E. Tanaka | (Ohkusu et al., 2002)
IFM51654 | B (Untypable) | ? | E. Tanaka | (Ohkusu et al., 2002)
IFM51658-1 | A (a) | | E. Tanaka | (Ohkusu et al., 2002)
IFM51677-Li | A (a) | ? | E. Tanaka; Clinical strain | (Ohkusu et al., 2002)
IFM51678-Li | A (a) | | E. Tanaka; Clinical strain | (Ohkusu et al., 2002)
piO86 | A (a) | VNI | T. Mitchell; Clinical strain (Japan) | Shigefumi Maesaki
arg1373 | A (a) | VNI | T. Mitchell; Clinical strain (Argentina) | T. Mitchell Culture Collection
arg1366 | A (a) | VNI | T. Mitchell; Clinical strain (Argentina) | T. Mitchell Culture Collection
ug2467 | A (a) | VNI | T. Mitchell; Clinical strain (Uganda) | Messer, S. A., Univ. of Iowa
in2637 | A (a) | VNI | T. Mitchell; Clinical strain (India) | Gugnani, H. C., Univ. of Delhi
tn470 | A (a) | VNI | T. Mitchell; Clinical strain (Thailand) | (Archibald et al., 2004)
bt9 | A (a) | VNI | T. Mitchell; Clinical strain (Botswana) | (Litvintseva et al., 2003)
bt68 | A (a) | VNI | T. Mitchell; Clinical strain (Botswana) | (Litvintseva et al., 2003)
c8 | A (a) | VNI | T. Mitchell; Clinical strain (North Carolina) | (Litvintseva et al., 2005)
RTC23-1 | A (?) | | Fresh clinical isolate | T. Mitchell Culture Collection
RTC23-2 | A (?) | ? | Fresh clinical isolate | T. Mitchell Culture Collection
RTC23-3 | A (?) | | Fresh clinical isolate | T. Mitchell Culture Collection
RTC31-Mix | A (?) | | Fresh clinical isolate | T. Mitchell Culture Collection
RTC31-1 | A (?) | | Fresh clinical isolate | T. Mitchell Culture Collection
3. RESULTS
3.1 Evaluation of genome variation in serotype A and serotype B strains
The initial goal of the research in this thesis was to use CGH to characterize genome variation in C. neoformans. The rationale for this work was that the analysis of genome variation has previously provided a way to make detailed comparisons of pathogen isolates, and to potentially identify genomic regions associated with differing levels of virulence (Gressmann et al., 2005, Nash et al., 2006, Herbert et al., 2005, Rajashekara et al., 2004, Selmecki et al., 2005a, Taboada et al., 2004, Ellison et al., 2008). In this thesis, the characterization of the genomes of isolates from different serotypes and molecular subtypes of C. neoformans and C. gattii provides a foundation to understand the background level of variability in these pathogens, and to evaluate genomic differences that might be associated with known and new virulence factors. As mentioned earlier, the three main virulence factors in C.
neoformans are the ability to grow at 37°C, the production of melanin to combat oxidative killing in the host, and the production of the polysaccharide capsule (Idnurm et al., 2005). In the context of these main virulence factors, one might expect avirulent strains to have genomic differences in the specific subsets of genes that contribute to these traits. In addition, an understanding of genome variation might help to identify genes or gene families that were not previously known to influence virulence. The work presented here illustrates the power of CGH and establishes a database of genomic variation for C. neoformans and C. gattii.
3.1.1 Genome variation in serotype A strains
Initially, CGH was used to examine the C. neoformans genomes of selected isolates representing the three molecular subtypes of serotype A strains (called VNI, VNII and VNB). DNA from each of the four serotype A strains (BT63, 125.91, CBS7779, WM626) was hybridized to NimbleGen™ arrays containing oligonucleotide probes for the genome of strain H99, a well-characterized and highly virulent serotype A strain (Giles et al., 2007, Nazi et al., 2007, Wright et al., 2007, Bahn et al., 2005). The genome of H99 was sequenced at the Broad Institute and Duke University, with contributions from the Kronstad Laboratory and the Michael Smith Genome Sciences Centre. The results are summarized in Tables 3.1 and 3.2.
Table 3.1 Summary of the total number of predicted deletions (Log Ratio (LR)(a) < -0.5) or amplifications (LR > 0.5) detected in four serotype A strains relative to the genome of the strain H99.
 | BT63 (VNB) | 125.91 (VNI) | CBS7779 (VNI) | WM626 (VNII)
# of proposed deletions | 193 | 73 | 10 | 206
# of proposed amplifications | 19 | 17 | 5 | 11
(a) The LR cutoff values were chosen based on a dataset that compared sequence identity with the LR for ~20 genes at the MAT locus (Hu et al., 2008).
For CBS7779, CHR 13 was excluded from the analysis of the total number of amplifications, because it appears to be disomic.
Table 3.2 Comparison of Log Ratios (LR) and standard deviations (SD) for all 14 chromosomes of four serotype A strains that were hybridized to an array of the H99 genome.
CHR | BT63 (VNB) Average LR | BT63 SD | 125.91 (VNI) Average LR | 125.91 SD | CBS7779 (VNI) Average LR | CBS7779 SD | WM626 (VNII) Average LR | WM626 SD
1 | 0.007 | 0.674 | -0.019 | 0.438 | -0.011 | 0.212 | -0.014 | 0.634
2 | 0.013 | 0.634 | 0.007 | 0.312 | -0.013 | 0.159 | -0.019 | 0.637
3 | 0.034 | 0.600 | 0.020 | 0.272 | -0.018 | 0.190 | -0.011 | 0.660
4 | -0.051 | 0.733 | -0.002 | 0.329 | -0.018 | 0.170 | -0.105 | 0.778
5 | -0.093 | 0.900 | -0.097 | 0.623 | -0.005 | 0.212 | 0.006 | 0.622
6 | 0.019 | 0.632 | 0.023 | 0.290 | -0.035 | 0.263 | -0.042 | 0.687
7 | 0.013 | 0.686 | 0.016 | 0.338 | -0.029 | 0.258 | -0.022 | 0.715
8 | -0.048 | 0.773 | 0.020 | 0.323 | -0.014 | 0.183 | -0.032 | 0.724
9 | 0.031 | 0.625 | 0.018 | 0.292 | -0.036 | 0.246 | -0.026 | 0.643
10 | 0.050 | 0.612 | 0.033 | 0.310 | -0.081 | 0.522 | 0.034 | 0.611
11 | 0.013 | 0.653 | 0.012 | 0.289 | -0.019 | 0.212 | -0.047 | 0.726
12 | 0.019 | 0.668 | 0.075 | 0.441 | 0.005 | 0.232 | -0.085 | 0.827
13 | 0.025 | 0.627 | -0.026 | 0.345 | 0.562 | 0.214 | 0.721 | 0.884
14 | 0.026 | 0.622 | -0.006 | 0.328 | -0.027 | 0.232 | -0.060 | 0.685
Overall | 0.000 | 0.689 | 0.000 | 0.377 | 0.000 | 0.267 | 0.000 | 0.705
Chromosomes with suspected disomy are highlighted in yellow.
As shown in Table 3.1, the number of regions of difference (e.g. amplifications, deletions, or areas of divergent sequence) was greatest when the VNB and VNII strains were compared with the VNI strains. This was expected because all of these strains were compared to the strain H99, which has the VNI molecular subtype. That is, the amount of variation and deletion corresponds directly with previously known data regarding the molecular subtypes of these strains. Specifically, variation is seen such that VNII>VNB>VNI, relative to the H99 genome (Meyer et al., 1999, Litvintseva et al., 2006, Litvintseva et al., 2003).
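As an illustration of how the per-chromosome averages in Table 3.2 point to CHR 13, the short Python sketch below scans the average LR values for CBS7779 and WM626 and reports any chromosome whose average exceeds 0.5. Reusing the 0.5 amplification cutoff for whole-chromosome averages is an assumption made here for illustration, and the names `AVERAGE_LR` and `elevated_chromosomes` are hypothetical; the thesis does not describe an automated screen of this kind.

```python
# Average LR per chromosome (CHR 1 to CHR 14), copied from Table 3.2.
AVERAGE_LR = {
    "CBS7779": [-0.011, -0.013, -0.018, -0.018, -0.005, -0.035, -0.029,
                -0.014, -0.036, -0.081, -0.019, 0.005, 0.562, -0.027],
    "WM626": [-0.014, -0.019, -0.011, -0.105, 0.006, -0.042, -0.022,
              -0.032, -0.026, 0.034, -0.047, -0.085, 0.721, -0.060],
}

# Assumption: the amplification cutoff from Table 3.1 is applied to
# whole-chromosome averages.
AMPLIFICATION_CUTOFF = 0.5


def elevated_chromosomes(average_lr_by_chr, cutoff=AMPLIFICATION_CUTOFF):
    """Return the 1-based numbers of chromosomes whose average LR exceeds the cutoff."""
    return [
        chr_number
        for chr_number, average_lr in enumerate(average_lr_by_chr, start=1)
        if average_lr > cutoff
    ]


candidates = {
    strain: elevated_chromosomes(values) for strain, values in AVERAGE_LR.items()
}
print(candidates)  # {'CBS7779': [13], 'WM626': [13]}
```

A whole-chromosome average is a coarse signal, so a hit from a screen like this is only a candidate that still needs independent confirmation, for example by qRT-PCR.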
The specific regions of difference identified in the CGH analysis of all four strains are presented in Appendix A.
It is interesting to note that CBS7779 was previously reported to have a small genome of 15 Mb compared to the average size of ~19 Mb for C. neoformans (Boekhout et al., 1997). Based on the CGH data shown in Table 3.2, however, it appears that CBS7779 has all of the sequences found in H99. That is, the results in Table 3.2 suggest that the genome size for CBS7779 is at least 19 Mb (the size of the H99 genome) and, contrary to previous reports, there are no large missing segments of gDNA in this strain (Boekhout et al., 1997). This result probably reflects the fact that electrophoretic karyotyping, which was used by Boekhout et al. (1997), may not always yield accurate information regarding the genome size of a strain. In fact, contrary to the data of Boekhout et al. (1997), the results in Table 3.2 revealed that strain CBS7779 (and WM626) had an elevated LR for CHR 13, suggesting that the strain may have been disomic for this chromosome. Further analysis of this observation is presented later in this thesis in sections 3.4, 3.5 and 4.3.
3.1.2 Genome variation in serotype B strains
CGH was also used to examine genome variability in serotype B strains of C. gattii. DNAs from six serotype B strains (R794, KB3864, KB7892, RV66095, E566, WM276-GFP2) were hybridized to NimbleGen™ arrays prepared with the genome of WM276, an environmental isolate (Hu & Kronstad, 2006, Chaturvedi et al., 2005). The genome of strain WM276, which has a VGI molecular subtype, was sequenced by the Michael Smith Genome Sciences Centre. All of the strains that were hybridized to the WM276 array have the VGI molecular subtype (Fraser et al., 2004, Kidd et al., 2005). The VGI subtype is the most prevalent subtype in serotype B; it is found worldwide and is one of the three molecular subtypes found on Vancouver Island (Kidd et al., 2007).
Overall, this specific set of strains was chosen to gain a better understanding of the genome variation within the VGI subtype, and to make use of the locally generated sequence information (Appendix B, Table B.1 and Table B.2). A parallel analysis is in progress for strains of the other two molecular subtypes of serotype B strains (VGIIa and VGIIb) found on Vancouver Island.
Table 3.3 Summary of the total number of deletions (LR(a) < -0.5) or amplifications (LR > 0.5) in six serotype B strains relative to the genome of the strain WM276.
 | R794 | KB3864 | KB7892 | RV66095 | E566 | WM276 GFP2
# of deletions | 113 | 94 | 93 | 96 | 69 | 18
# of amplifications | 9 | 12 | 5 | 13 | 25 | 40
(a) The LRs for identifying deletions and amplifications were selected based on comparisons of the sequence identity at the MAT locus with the corresponding LRs determined by CGH (Hu et al., 2008).
Table 3.4 Comparison of LR and SD for all chromosomes of six serotype B strains that were hybridized to an array of the WM276 genome.
CHR | R794 Average LR | R794 SD | KB3864 Average LR | KB3864 SD | KB7892 Average LR | KB7892 SD | RV66095 Average LR | RV66095 SD | E566 Average LR | E566 SD | WM276 GFP2 Average LR | WM276 GFP2 SD
A | 0.025 | 0.310 | 0.053 | 0.439 | 0.055 | 0.316 | -0.008 | 0.456 | -0.004 | 0.261 | 0.017 | 0.134
B | -0.002 | 0.385 | 0.000 | 0.472 | 0.045 | 0.383 | 0.008 | 0.431 | 0.014 | 0.238 | 0.010 | 0.118
C | 0.011 | 0.359 | 0.011 | 0.456 | -0.020 | 0.384 | 0.013 | 0.417 | 0.010 | 0.248 | 0.015 | 0.132
D | 0.015 | 0.341 | 0.013 | 0.464 | 0.008 | 0.346 | 0.010 | 0.404 | -0.010 | 0.261 | 0.010 | 0.119
E | -0.012 | 0.465 | -0.029 | 0.594 | -0.018 | 0.472 | 0.004 | 0.474 | 0.022 | 0.250 | 0.012 | 0.120
F | 0.018 | 0.343 | 0.012 | 0.493 | 0.012 | 0.350 | -0.013 | 0.543 | 0.025 | 0.265 | 0.017 | 0.120
G | -0.011 | 0.493 | 0.034 | 0.431 | -0.016 | 0.504 | 0.006 | 0.537 | 0.033 | 0.240 | 0.020 | 0.119
H | -0.017 | 0.507 | -0.008 | 0.581 | -0.025 | 0.513 | 0.006 | 0.440 | 0.007 | 0.379 | 0.015 | 0.121
I | -0.005 | 0.458 | -0.011 | 0.582 | -0.013 | 0.465 | -0.100 | 0.632 | -0.221 | 0.959 | 0.010 | 0.122
J | -0.116 | 0.689 | -0.171 | 0.900 | -0.124 | 0.695 | -0.031 | 0.545 | 0.061 | 0.339 | 0.018 | 0.117
K | -0.006 | 0.383 | -0.023 | 0.540 | -0.014 | 0.389 | 0.028 | 0.390 | 0.030 | 0.268 | -0.216 | 1.064
L | -0.015 | 0.423 | -0.022 | 0.552 | -0.021 | 0.427 | -0.056 | 0.591 | 0.000 | 0.417 | 0.008 | 0.119
M | 0.009 | 0.361 | -0.064 | 0.666 | 0.001 | 0.367 | -0.056 | 0.592 | -0.021 | 0.510 | 0.016 | 0.122
N | 0.020 | 0.319 | 0.029 | 0.440 | 0.015 | 0.311 | 0.018 | 0.425 | 0.063 | 0.238 | 0.013 | 0.115
Overall | 0.001 | 0.319 | 0.000 | 0.523 | 0.001 | 0.415 | 0.000 | 0.467 | 0.000 | 0.365 | 0.000 | 0.288
Chromosomes with suspected deletions or continuous divergences over 50 kb in length are highlighted in yellow.
The results of the CGH analysis with the serotype B strains are summarized in Tables 3.3 and 3.4, and the detailed list of the specific regions of difference between the strains is presented in Appendix B (Tables B.1 and B.2). Three main conclusions can be drawn from these data. First, there appeared to be extensive sequence variation in these serotype B strains despite the fact that they all have the same molecular subtype (all are VGI). By comparison, the SDs of the LR for CBS7779 and 125.91 when hybridized to H99 (all three strains are of the molecular subtype VNI from serotype A) were 0.267 and 0.377. The SD for the serotype B strains of the VGI subtype, however, ranged from 0.288 (WM276-GFP2) to 0.523 (KB3864). Although more strains need to be screened, the results suggest that there could be a wider degree of sequence polymorphism within the VGI molecular subtype of serotype B strains than in the VNI molecular subtype of serotype A strains.
Second, none of the chromosomes in these strains showed the higher LRs indicative of CCNV that were found for the serotype A strains CBS7779 and WM626. Finally, the CGH analysis identified LR differences suggestive of relatively large deletions or areas of divergence (greater than 50 kb) in strains E566 and WM276-GFP2. The latter strain is an avirulent mutant that was serendipitously obtained during a transformation experiment to introduce the gene for green fluorescent protein into the strain WM276 (which is virulent in a mouse model of cryptococcosis). Further analysis of the deletion in WM276-GFP2 revealed that the 75 kb deletion included the telomeric region at the right end of CHR K. This deleted region contained 25 predicted genes (Appendix B, Table B.2). The 88 kb region in E566 with the low LR contains the mating locus, and this region is therefore not deleted. Instead, it is merely divergent from the reference strain. This result makes sense because E566 is MATa and WM276 is MATα (Appendix B, Table B.1).
3.1.3 Confirmation of a selected subset of CGH data
In order to confirm the presence of the deletions in the genomes of selected strains, two sets of primers were designed flanking two predicted deletions. The resulting PCR products were subjected to gel electrophoresis and sequencing. An example of the result for one deletion is shown in Figure 3.1. Similar results were obtained with the other deletion.
Figure 3.1 Confirmation of the presence of a deletion in the serotype A strain BT63. A. Hybridization of DNA from strains BT63 and 125.91 to the H99 array. The array data suggested that there was a deletion on CHR 3. Each probe (black bar) represents 400 bp, thus providing an estimated deletion size of 1200 bp. B. Confirmation of the presence of a deletion by PCR.
Subsequent sequencing of the amplified region in BT63 revealed that 1237 bp were deleted. The average LR in that region was -2.868. The deletion was also surrounded by highly repetitive sequences, suggesting that it may have been the result of a recombination event between the repeated sequences.
The results shown in Figure 3.1 confirmed the presence of a deletion in strain BT63, as predicted by CGH (see Appendix A). This deleted region was on CHR 3 (at approximately 1443600 - 1445200) and the deletion removes a gene that was identified by the GLEAN prediction model as glean_04239. Based on the Fungal Genome Database at Duke University and available EST sequences, glean_04239 is a transcribed gene. A search with PSI-BLAST revealed 41% identity with a gene encoding a putative trypsin domain. This domain has been implicated in a variety of functions including basal membrane formation, cell migration, cell differentiation, adhesion, signaling and chromosomal stability (Marchler-Bauer et al., 2007). Because there is such a wide range of possible functions, it is difficult to predict the exact influence that this gene deletion might have on the phenotype of BT63. In any case, its deletion and subsequent identification by CGH illustrated the ability of CGH to find regions of difference in the strains selected for this study.
3.1.4 Overall analysis of genome variation in serotype A and serotype B strains
In general, the comparisons of the regions of difference within the serotype A strains and within the serotype B strains did not reveal any clear patterns of genomic variation. That is, no specific types of genes or chromosomal regions (e.g., centromeres or telomeres) differed consistently between the strains. Instead, it was found that a wide variety of genes were either amplified or deleted between the different strains.
One possible exception was that transport and trafficking functions appeared to be commonly deleted or divergent in some of the serotype A and serotype B strains; these included putative drug transporters (missing in BT63 and WM626), trafficking proteins (missing in 125.91 and WM626), a monosaccharide transporter (missing in 125.91), efflux proteins (missing in WM626), an MFS alpha-glucoside transporter (missing in BT63), and hexose transporters (missing in R794, KB3864, 125.91, BT63 and WM626). One additional difference that could be related to virulence was that the gene for a delayed-type hypersensitivity antigen-related protein was missing in 125.91 and BT63.
One gene of interest that was deleted in BT63 encoded a phosphatidylinositol transporter (GenBank ID: XP_571089) (see Appendix A). This gene may be related to virulence because inositol has been reported to be important for C. neoformans during pathogenesis (Molina et al., 1999), and a previous study in our laboratory revealed that the addition of lithium, an inhibitor of inositol monophosphatase, decreased capsule production (Hu et al., 2007). In addition, the idea that inositol may play a role in virulence is supported by the fact that one of the most abundant transcripts produced by the strain H99 during the infection of rabbit CSF is a myo-inositol transporter (Dr. Guanggan Hu, unpublished data). Thus, it was predicted that the deletion of the gene for inositol synthetase (which is upstream of the inositol pathway (Saccharomyces Genome Database, available at http://www.yeastgenome.org/; last accessed May 2008)) might also influence the production of capsule (Dwight et al., 2002). However, the deletion mutant for the gene (Broad ID: CNC06440) did not reveal any clear changes in capsule size or melanin production and did not display any attenuation in a mouse model of cryptococcosis (I. Liu, data not shown).
The mutant, however, failed to grow in the absence of inositol, as expected from similar mutations in other fungi (Ulaszewski et al., 1978).

In summary for this section, the CGH analysis of different serotype A and B genomes revealed an extensive level of genome variation for these strains (see Appendices A and B). The regions of difference provide a foundation for future investigations to link these genes to any differences in virulence amongst these strains. In addition, the analysis of the serotype A strains resulted in the exciting discovery that strains CBS7779 and WM626 may have an elevated copy number for CHR 13. Further work on this discovery is presented in sections 3.4 and 3.5.

3.2 Examination of a genetic cross and recombination sites for serotype D strains

It has been shown that CGH can detect variation as low as 2% between the genomes of closely-related strains (Taboada et al., 2005). Thus, it was reasoned that CGH would be able to reveal the chromosomal sites of meiotic recombination between strains of the same serotype of C. neoformans. To specifically test this idea, the genomes of serotype D strains of mating types MATa and MATα were examined using NIH433 as a representative MATa strain and NIH12 as a representative MATα strain. These strains were originally used as the parental strains in a backcrossing experiment to generate congenic MATa and MATα strains (Fraser et al., 2005b; Heitman et al., 1999; Kavanaugh et al., 2006). One of the progeny, JEC21 (MATα), has been used extensively as a laboratory strain and its genome has been sequenced (Loftus et al., 2005). This presented an opportunity to compare the NIH12 and NIH433 parental genomes to the genome of the progeny strain JEC21 on a tiling array. The results (Figure 3.2) revealed regions of alternating patterns of divergence and similarity that likely identify which segments of the JEC21 genome originated from which parent.
This idea is supported by the observation that if the hybridization pattern of one parent was divergent, then the hybridization of the other parent was more similar. These patterns were revealed by the extent of noise in the LRs whereby the greater the noise, the greater the divergence. In short, the data suggested that recombination sites due to a meiotic event can be mapped throughout the genome by using the CGH method.

[Figure 3.2 image: JEC21 (Cy5) vs NIH433 or NIH12 (Cy3) on the JEC21 array; paired NIH433 and NIH12 hybridization tracks are shown for each chromosome.]

Figure 3.2 An overview of the hybridization of NIH433 and NIH12 DNA to a JEC21 array (a strain resulting from a cross of NIH433 and NIH12). Regions with variable LR represent regions of sequence diversity and regions with LRs close to zero represent regions of conformity. Comparing these different regions revealed putative recombination sites (see Table 3.5).
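The noise-based comparison described above, which is quantified by SD in Table 3.5, can be sketched in a few lines. This is only an illustration and not the procedure used to build the table: the function names are hypothetical, the SD values are the first five chromosome 1 rows of Table 3.5, and the assignment of each JEC21 segment to the less divergent parent is the working assumption stated in the text.

```python
# SDs of the log ratios for the first five regions of chromosome 1,
# copied from Table 3.5: (chromosome, start, end, SD NIH433, SD NIH12).
REGIONS = [
    (1, 1, 43532, 0.367, 0.837),
    (1, 43576, 856009, 0.381, 0.181),
    (1, 856053, 1104183, 0.176, 0.420),
    (1, 1104402, 1856406, 0.484, 0.207),
    (1, 1856450, 2299466, 0.337, 0.608),
]


def divergent_parent(sd_nih433, sd_nih12):
    """Return the parent whose hybridization is noisier (higher SD)."""
    return "NIH433" if sd_nih433 > sd_nih12 else "NIH12"


def inferred_origin(sd_nih433, sd_nih12):
    """Assumption: the JEC21 segment derives from the less divergent parent."""
    return "NIH12" if divergent_parent(sd_nih433, sd_nih12) == "NIH433" else "NIH433"


def putative_breakpoints(regions):
    """List (chromosome, left_end, right_start) where the inferred origin switches."""
    breakpoints = []
    for left, right in zip(regions, regions[1:]):
        same_chromosome = left[0] == right[0]
        if same_chromosome and inferred_origin(*left[3:]) != inferred_origin(*right[3:]):
            breakpoints.append((left[0], left[2], right[1]))
    return breakpoints


if __name__ == "__main__":
    for chrom, left_end, right_start in putative_breakpoints(REGIONS):
        print(f"CHR {chrom}: putative recombination site between {left_end} and {right_start}")
```

A simple "higher SD wins" rule has no notion of uncertainty, so regions where the two SDs are close (such as the rows highlighted in blue in Table 3.5) would still need visual inspection or sequencing.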
CHR  Coordinates  Divergent Strain  AVG LR of NIH433  SD of NIH433  AVG LR of NIH12  SD of NIH12
1  1-43532  NIH12  -0.032  0.367  -0.593  0.837
1  43576-856009  NIH433  0.001  0.381  0.070  0.181
1  856053-1104183  NIH12  0.073  0.176  -0.015  0.420
1  1104402-1856406  NIH433  -0.015  0.484  0.098  0.207
1  1856450-2299466  NIH12  0.040  0.337  -0.121  0.608
2  1-14866  NIH433  0.066  0.504  -0.205  0.386
2  14910-59776  NIH12  -0.110  0.362  -0.307  0.441
2  59820-79744  NIH433  -0.027  0.402  -0.109  0.367
2  79788-905375  NIH12  0.051  0.190  -0.01  0.412
2  905419-1023729  NIH433  0.017  0.343  0.074  0.164
2  1023773-1388827  NIH12  0.070  0.182  -0.021  0.422
2  1388871-1433328  NIH433  -0.009  0.356  0.067  0.208
2  1433372-1632054  NIH12  0.020  0.194  -0.219  0.658
3  1-1218953  NIH433  -0.003  0.378  0.08  0.187
3  1218997-1656795  NIH12  0.071  0.257  -0.037  0.652
3  1656839-2105433  NIH433  -0.049  0.466  0.016  0.201
4  1-381958  NIH433  -0.156  0.766  0.028  0.202
4  382002-943451  NIH12  0.066  0.178  0.015  0.414
4  943495-1668854  NIH433  -0.281  1.023  0.094  0.202
4  1668898-1782959  NIH12  -0.042  0.236  -0.412  0.750
5  1-969309  NIH12  0.059  0.179  -0.062  0.489
5  969423-1507393  NIH433  -0.176  0.904  0.109  0.280
6  1-75987  NIH12  -0.034  0.209  -0.273  0.433
6  76020-1291564  NIH433  0.044  0.288  0.014  0.382
6  1291702-1431588  NIH12  0.006  0.197  -0.127  0.406
6  1431632-1438769  NIH433  0.178  0.745  0.142  0.526
7  1-230738  NIH12  0.014  0.199  -0.072  0.403
7  230827-1346622  NIH433  -0.005  0.421  0.077  0.200
8  1-318879  NIH433  0.058  0.435  0.024  0.210
8  318923-529999  NIH12  0.045  0.175  -0.073  0.420
8  530208-1194145  NIH433  -0.018  0.448  0.043  0.199
9  1-300354  NIH12  0.033  0.193  -0.108  0.460
9  300419-487358  NIH433  0.020  0.456  0.097  0.209
9  487446-954505  NIH12  0.047  0.174  -0.009  0.369
9  954549-1168552  NIH433  -0.068  0.412  -0.02  0.217
10  whole chromosome  NIH433?  -0.014  0.397  0.055  0.200
11  whole chromosome  NIH12?  0.036  0.191  -0.083  0.455
12  1-689977  NIH12  0.023  0.276  0.014  0.358
12  690021-906616  NIH433  -0.036  0.322  -0.076  0.360
13  whole chromosome  NIH12?  0.042  0.206  -0.087  0.584
14  whole chromosome  NIH12?  0.055  0.209  -0.063  0.474

Table 3.5 SD was used as a measure of LR (Log Ratio) divergence for each chromosome in the parent strains NIH433 and NIH12 upon hybridization to the JEC21 array. In each region, the higher SD correlated with the more divergent LR as determined by visual inspection (highlighted in yellow). Regions which contradict this idea were highlighted in blue.

To confirm the CGH results, regions from four chromosomes were selected, amplified by PCR and sequenced. As shown in Figure 3.3 and Table 3.6, regions with highly divergent LRs (higher SDs) were correlated with lower sequence similarities and regions with LRs close to zero (lower SDs) were correlated with high sequence similarities. Visual inspection revealed that some of the chromosomes did not display a clear divergence in either NIH433 or NIH12 (CHRs 10, 11, 13 and 14). Thus, an attempt was made to quantify this phenomenon by calculating the average LR and standard deviation (SD) of each of these chromosomes (Table 3.5). Although previous data would suggest that LRs can be used to extrapolate the sequence similarity for a given region (Hu et al., 2008), in the case of CHRs 10, 11, 13 and 14 it was necessary to use SD. The reason is that the average LR for each of these regions was close to zero (Table 3.5), which on its own would suggest that there was no sequence divergence. Nevertheless, an inspection of the SD data suggested otherwise. That is, it was found that the SD method can be used to reveal the predicted parental origin of each chromosome when a visual inspection of the hybridization data does not reveal any clear information. Overall, the data in Table 3.5 clarified the origin of the gDNA segments in JEC21: in general, regions with LRs close to zero (the parent) correlated with a lower SD.

Table 3.6 Summary of the average LRs and SD between NIH433 and NIH12 in selected areas across the genome.
Primer Name  Region of Interest  AVG LR of NIH433  SD of LR of NIH433  Sequence Identity of NIH433 to JEC21 (%)  AVG LR of NIH12  SD of LR of NIH12  Sequence Identity of NIH12 to JEC21 (%)
acidphos  CHR 3, 1652975-1654900  0.686  0.319  96.0  0.204  0.195  100
ino3phos  CHR 3, 1870085-1871085  0.236  0.110  100  0.026  0.382  96.8
thiolox  CHR 5, 340983-342137  0.080  0.170  100  0.066  0.685  98.4
CNG007  CHR 7, 225200-227200  -0.017  0.130  100  0.234  0.760  97.1
CNI017  CHR 9, 487200-489600  Did not sequence  -1.804  0.950  89.3

Data were generated through PCR, sequencing and aligning the sequence against JEC21 via BioEdit. For a pictorial representation, see Figure 3.3. The more divergent sequence as determined by visual inspection is highlighted in yellow.

Figure 3.3 A: Chromosome 5 (hybridization tracks for positions 16584 - 1507394)
LR  0.179  0.213  0.097  0.285  0.258  -0.129  -0.155  0.069  -0.095
Similarity (%)  100  100  100  100  100  100  100  100  100
# of base pairs  53  45  45  57  58  45  45  45  45

LR  0.247  -0.405  -1.014  0.388  0.590  0.405  0.657  0.951  -0.734
Similarity (%)  100  97  95  100  100  100  100  97  97
# of base pairs  53  45  45  57  58  45  45  45  45

Figure 3.3 B: Chromosome 9 (hybridization tracks for positions 141 - 1176553)
LR  -2.080  -2.453  -0.679  -2.158  -0.584  -2.858
Similarity (%)  96  82  91  77  97  93
# of base pairs  50  45  45  45  48  45

Figure 3.3 Analysis and confirmation of the NIH433 / NIH12 hybridization patterns to the JEC21 array.
Segments of four different chromosomes are shown in the figure. The resulting PCR and sequencing supported the idea that CGH can be used as a tool to identify recombination sites across the genome. That is, regions that CGH revealed to be more divergent (as shown by the greater range of values for the LR) were confirmed by sequencing.

Figure 3.3 C: Chromosome 7 (hybridization tracks for positions 78 - 1346623)
LR  -0.048  -0.168  0.215  -1.093  0.921  1.116  0.697
Similarity (%)  96  96  96  97  100  100  95
# of base pairs  59  58  51  45  45  57  47

Figure 3.3 D: Chromosome 3 (hybridization tracks for positions 1406 - 2105434; two sequenced areas, shown left and right)
LR (left)  0.696  1.092  0.610  0.817  0.217
Similarity (%)  91  97  95  100  97
# of base pairs  46  45  45  53  45
LR (right)  0.321  0.367  0.200  0.084  0.298  0.147
Similarity (%)  100  100  100  100  100  100
# of base pairs  45  45  54  46  57  52

LR (left)  0.370  0.281  0.189  -0.125  0.304
Similarity (%)  100  100  100  100  100
# of base pairs  45  45  45  53  45
LR (right)  0.111  -0.550  -0.079  -0.023  0.638  0.058
Similarity (%)  97  93  98  97  100  96
# of base pairs  45  45  54  46  57  52

In summary for this section, the calculation of the average LRs and the SDs suggested that CGH is a sensitive technique, capable of tracking the recombination history of a strain. This sensitivity was confirmed by sequencing and subsequent sequence alignments.

3.3 Examination of serotype AD strains

CGH was also used to determine the chromosome content of the serotype AD hybrid strains. As mentioned, C. neoformans is normally haploid and a given strain will possess only one capsular serotype (e.g.
A, B, C or D) (Idnurm et al., 2005). However, clinical strains have been identified which are diploid, and some of these possess two serotypes (such as the AD strains) (Brandt et al., 1993; Tanaka et al., 1999). In general, they are thought to arise from mating events between parental strains of differing serotype (e.g. serotype A and serotype D) (Lin et al., 2008). In the past, there have been multiple attempts to characterize AD strains with respect to specific genes (Lengeler et al., 2000; Cogliati et al., 2001; Lin & Heitman, 2006), but no comprehensive attempt has been made to identify the exact composition of the genomes of these strains. Meanwhile, comparisons of serotype A and D isolates generally indicate that serotype A strains are more virulent than serotype D strains. For example, serotype A isolates predominate over serotype D isolates in patients with AIDS (Ohkusu et al., 2002). Thus, it was hypothesized that clinical serotype AD hybrid strains might preferentially retain the chromosomes from the serotype A parent over those of the serotype D parent. In addition, it was predicted that any serotype A chromosomes that are preferentially retained may encode functions that would explain the virulence differences between serotype A and serotype D.

To test this hypothesis, DNAs from three clinical serotype AD strains (CDC228, KW5 and CDC304) were hybridized to arrays for a serotype A genome (strain H99) as well as a serotype D genome (strain JEC21) to determine chromosome content. An overview of the data is presented in Figure 3.4 and Table 3.7. For comparison, environmental strains were also tested for the predominance of serotype A chromosomes as described below. In this case, PCR-RFLP analysis was used instead of CGH.
[Figure 3.4 image: CGH tracks for strains CDC228, KW5 and CDC304, each hybridized to the JEC21 (serotype D) array and to the H99 (serotype A) array; chromosomes 1 to 14 are shown left to right.]

Figure 3.4 Overview of the CGH data for the AD hybrid strains. Each colored segment represents a different chromosome and the chromosomal numbers are based on the TIGR annotations used for JEC21.

Visual inspection of the hybridization data (Figure 3.4 and Table 3.7) suggested that chromosomes of both serotypes were not equally represented in the AD strains. Differences were apparent among the three strains for CHRs 1, 6 and 7 upon hybridization to the JEC21 array, and for CHRs 5 and 8 for the H99 array. In order to quantify the data shown in Figure 3.4, the average LR of each chromosome was calculated and any average LR below -0.5 was classified as a deletion. This quantification, shown in Table 3.7, revealed that the AD strains do not carry equal contributions from the serotype A and serotype D parental genomes. In fact, these clinical AD strains seem to preferentially retain the serotype A chromosomes. In the seven instances where only one parental chromosome was retained, five of them retained the serotype A chromosome.
For example, strain KW5 appeared to have only the serotype A version of CHRs 6 and 7 and only the serotype D version of CHR 8. Strain CDC304, however, had only the serotype D version of CHR 5. Most strikingly, all three strains have LRs < -0.5 at CHR 1 when hybridized to the serotype D array. This suggests that these strains did not retain the serotype D version of CHR 1. The copy number of the remaining serotype A copy of CHR 1 for these strains is not known. In summary, the AD hybrids chosen in this study were mostly diploid with two copies of each chromosome (Lengeler et al., 2001), but the CGH data indicated that each chromosome may not necessarily be represented by both a serotype D and a serotype A version.

Table 3.7 Average LR of each of the chromosomes for the serotype AD strains

Strain:  CDC228  CDC228  KW5  KW5  CDC304  CDC304
Hybridized to:  JEC21  H99  JEC21  H99  JEC21  H99
CHR 1 ✓  -0.837  0.388  -0.642  0.386  -0.888  0.171
CHR 2 ✓  0.296  0.099  0.271  -0.055  0.462  0.109
CHR 3 ✓  0.120  -0.034  0.281  -0.085  0.061  0.085
CHR 4  0.069  -0.075  0.234  -0.098  0.033  0.088
CHR 5 ✓  0.128  -0.095  0.288  -0.091  0.480  -0.983
CHR 6 ✓  0.113  -0.057  -0.741  0.389  0.077  0.083
CHR 7  0.115  -0.046  -0.752  0.403  0.079  0.087
CHR 8  0.077  -0.196  0.575  -1.361  0.034  -0.073
CHR 9 ✓  0.108  -0.038  0.271  0.087  0.068  0.075
CHR 10 ✓  0.086  -0.036  -0.076  0.321  0.052  0.100
CHR 11  0.117  0.055  0.253  -0.072  -0.016  0.022
CHR 12  0.087  -0.065  0.230  -0.151  0.039  0.041
CHR 13  0.118  -0.097  0.285  -0.126  0.038  0.022
CHR 14  0.052  -0.097  0.263  -0.112  0.011  0.020

Serotype D was represented by JEC21 and serotype A was represented by H99. In most cases, each chromosome had both a serotype D copy and a serotype A copy. However, there were some chromosomes where only one serotype was present. Here, the missing serotype was highlighted in yellow. Red check marks represent chromosomes that have been confirmed by RFLP in this study. Blue check marks represent chromosomes that have been confirmed by PCR in another study (Lengeler et al., 2001).
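The deletion call used for Table 3.7 (an average LR below -0.5) can be reproduced with a short script. This is a minimal sketch rather than the analysis code used in this study: the dictionary and function names are hypothetical, and the values are the chromosome averages listed in Table 3.7 (CHR 1 to CHR 14, in order).

```python
DELETION_THRESHOLD = -0.5  # average LR below this value is called a deletion

# Average LR per chromosome (CHR 1..14) on the serotype D (JEC21) array.
JEC21_LR = {
    "CDC228": [-0.837, 0.296, 0.120, 0.069, 0.128, 0.113, 0.115,
               0.077, 0.108, 0.086, 0.117, 0.087, 0.118, 0.052],
    "KW5": [-0.642, 0.271, 0.281, 0.234, 0.288, -0.741, -0.752,
            0.575, 0.271, -0.076, 0.253, 0.230, 0.285, 0.263],
    "CDC304": [-0.888, 0.462, 0.061, 0.033, 0.480, 0.077, 0.079,
               0.034, 0.068, 0.052, -0.016, 0.039, 0.038, 0.011],
}

# Average LR per chromosome (CHR 1..14) on the serotype A (H99) array.
H99_LR = {
    "CDC228": [0.388, 0.099, -0.034, -0.075, -0.095, -0.057, -0.046,
               -0.196, -0.038, -0.036, 0.055, -0.065, -0.097, -0.097],
    "KW5": [0.386, -0.055, -0.085, -0.098, -0.091, 0.389, 0.403,
            -1.361, 0.087, 0.321, -0.072, -0.151, -0.126, -0.112],
    "CDC304": [0.171, 0.109, 0.085, 0.088, -0.983, 0.083, 0.087,
               -0.073, 0.075, 0.100, 0.022, 0.041, 0.022, 0.020],
}


def missing_versions(strain):
    """Return (chromosome, missing serotype) pairs for one AD strain."""
    missing = []
    pairs = zip(JEC21_LR[strain], H99_LR[strain])
    for chrom, (lr_d, lr_a) in enumerate(pairs, start=1):
        if lr_d < DELETION_THRESHOLD:  # weak signal on the serotype D array
            missing.append((chrom, "D"))
        if lr_a < DELETION_THRESHOLD:  # weak signal on the serotype A array
            missing.append((chrom, "A"))
    return missing


if __name__ == "__main__":
    for strain in JEC21_LR:
        for chrom, serotype in missing_versions(strain):
            print(f"{strain}: CHR {chrom} lacks the serotype {serotype} version")
```

With these values the script reports seven single-parent chromosomes, five of which lack the serotype D version, in line with the counts given in the text.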
In order to test the hypothesis that there is preferential chromosome retention from a particular parent, a series of PCR-RFLP analyses were performed. A representative example of one of these experiments for CHR 5 is shown in Figure 3.5. The rest of the data can be found in Appendix B. The amplification of a PCR product from the gDNA of these AD strains and digestion with a restriction enzyme confirmed that only the serotype D version of CHR 5 is present in CDC304.

[Figure 3.5 gel image: lanes 1 to 12 (JEC21, H99, CDC228, KW5, CDC304); size markers at 2 and 1.6 kb; the pattern lacking the serotype A fragment is marked.]

Lanes:
1. DNA ladder
2. JEC21 (Serotype D)
3. JEC21 (Serotype D) digested
4. H99 (Serotype A)
5. H99 (Serotype A) digested
6. CDC228
7. CDC228 digested
8. KW5
9. KW5 digested
10. CDC304
11. CDC304 digested
12. No template control

Figure 3.5 Analysis of serotype AD strains, with an emphasis on CHR 5 using primer pair CNE04380 F/R. As in the CGH analysis, serotype D is represented by JEC21 and serotype A is represented by H99. The amplicons were digested by Taq I after purification through the Qiagen Nucleotide removal kit.

The resulting patterns support the CGH data (Table 3.7). That is, CHR 5 for CDC304 appears to come from only the serotype D parent and not from the serotype A parent, whereas the other AD strains, CDC228 and KW5, have CHR 5 copies that originated from both serotype D and serotype A. In total, CHRs 1, 2, 3 and 5 were tested with PCR-RFLP and the results agreed with the CGH data (Table 3.7; the analysis of other chromosomes is presented in Appendix C). In these PCR-RFLP studies, CHRs 1 and 5 were confirmed to show preferential retention, and CHRs 2 and 3 were used as controls for the retention of sequences from both A and D parents. In previous studies reported in the literature, CHR 6 in strain KW5 had both copies from the serotype D parent (Lengeler et al., 2001) and this agrees with the CGH data.
In the process of studying the serotype A bias of CHR 1 for the three AD strains, however, a minor discrepancy was observed between the CGH data and the PCR-RFLP data (Figure 3.6).

[Figure 3.6 gel image: lanes 1 to 11; size markers at 2, 1 and 0.5 kb; the serotype-specific bands are marked.]

Lanes:
1. DNA ladder
2. JEC21 (Serotype D)
3. JEC21 (Serotype D) digested
4. H99 (Serotype A)
5. H99 (Serotype A) digested
6. CDC228
7. CDC228 digested
8. KW5
9. KW5 digested
10. CDC304
11. CDC304 digested

Figure 3.6 Analysis of serotype AD strains, with an emphasis on CHR 1 using primer pair CNA01230 F/R. As in the CGH analysis, serotype D is represented by JEC21 and serotype A is represented by H99. The amplicons were digested by AvaI after purification through the Qiagen Nucleotide removal kit.

As shown in Figure 3.6, it is clear that all three strains have the serotype A version of CHR 1, as predicted by CGH. The faint bands seen in lane 11 for CDC304, however, suggest that the story is not that simple. There are three possible ways to interpret these data:

1. The original colony that was used for the PCR-RFLP study in Figure 3.6 was not a pure population. That is, some of the cells had both a serotype D and a serotype A version of CHR 1 and some of the cells (the majority) had only the serotype A version of CHR 1.
2. There was a translocation event such that the locus in CHR 1 that was used to confirm the CGH data in Figure 3.6 was duplicated, giving the cell both a serotype A and a serotype D version of this locus.
3. The primer bound to another chromosome to give a non-specific amplification product.

In order to test these possibilities, several single colonies from CDC304 were selected and gDNA was extracted. PCR-RFLP analysis was then performed to again assess CHR 1, as shown in Figure 3.7.

[Figure 3.7 gel image: lanes 1 to 17; size markers at 2 and 1.6 kb.]

Lanes:
1. 1 kb plus ladder
2. JEC21
3. JEC21 (Ava I digest)
4. H99
5. H99 (Ava I digest)
6. CDC228
7. CDC228 - original sample (Ava I digest)
8. CDC228 - new single colony (Ava I digest)
9. CDC228 - original sample (Ava I digest)
10. KW5
11. KW5 - original sample (Ava I digest)
12. CDC304
13. CDC304 - original sample (Ava I digest)
14. CDC304 - new single colony (Ava I digest)
15. CDC304 - new single colony (Ava I digest)
16. CDC304 - new single colony (Ava I digest)
17. 1 kb plus ladder

Figure 3.7 Analysis of CHR 1 in single, re-isolated colonies of the serotype AD strains using the primer pair CNA01230 F/R. As in the CGH analysis, serotype D is represented by JEC21 and serotype A is represented by H99. Single colonies were selected for this RFLP analysis.

As shown in Figure 3.7, some of the colonies displayed a different RFLP pattern (lanes 13 to 16) and the results therefore support the first possibility, that the original CDC304 strain was a mixture of cells with different complements of CHR 1. Also, it appeared that the majority of these cells had only the serotype A version of CHR 1 (based on the RFLP analysis and the CGH data). That is, only a minor population of the cells (seen in lane 15) had both a serotype A and a serotype D version of CHR 1. This is important because it suggests that the strain started with both a serotype A and a serotype D copy of CHR 1, but the serotype D version of CHR 1 was lost (perhaps through selective pressure either in the patient or during laboratory passage). This raised the question of whether the prevalence of the serotype A (or AA) pattern for CHR 1 in these clinical AD strains is due to selective pressure inside the host. That is, it is possible that clinical strains (which may be more virulent than environmental isolates) have a bias towards serotype A chromosomes. To test this idea further, a number of clinical and environmental isolates were selected and characterized by PCR-RFLP analysis (see Table 2.3 for a list of the isolates).
Table 3.8 Summary of the serotype-specificity of CHR 1 in selected strains.

Strain type  DD  AD  AA
Clinical strains  1  8  12
Environmental  1  5  1
Lab generated  1  3  1

The data were gathered by performing PCR-RFLP analysis on the CNA01230 locus. Restriction enzymes AvaI and MspI were used.

As shown in Table 3.8, there appeared to be a bias for clinical strains to have the serotype AA pattern for CHR 1 (12 of 21 strains) compared to the serotype DD pattern (1 of 21 strains). An attempt was made to test whether the particular strains which are AA for CHR 1, instead of AD or DD, share enhanced virulence characteristics. However, tests for the three main virulence factors (Idnurm et al., 2005), namely growth at 37°C, the ability to produce capsule and the production of melanin, did not reveal any correlation with chromosome complement at CHR 1 (data not shown). Overall, these results support the idea that there could be preferential retention of specific chromosomes in hybrid strains; this genomic pattern could introduce biases in the molecular typing of certain chromosomes in C. neoformans, depending on the chromosomal locations of sequences used for strain identification.

3.4 Chromosome Copy Number Variation (CCNV)

3.4.1 Experimental set-up to examine CCNV

During the course of the analysis of genome variation in section 3.1, it was noted that two strains, CBS7779 and WM626, appeared to have an elevated copy number for CHR 13 (Table 3.2). As a result of this observation, it was hypothesized that genome variation in the form of disomy for CHR 13, called chromosome copy number variation (CCNV), might affect the expression of virulence factors such as melanin in these two strains. This hypothesis was developed after Dr. Guanggan Hu found that CBS7779 and WM626 showed reduced melanin production when compared with strain H99 (Hu et al., 2008).
If there is a link to melanin production, CCNV could potentially be a mechanism to regulate phenotype switching, and hence control the level of virulence factor production. That is, C. neoformans might alter its virulence through CCNV depending on host conditions (e.g., tissue location, extracellular growth vs. growth inside phagocytes).

It is important to remember that Cryptococcus is normally a haploid organism (Idnurm et al., 2005) and that disomy had not been previously characterized in this fungus. Because this was the first time that CGH was used to study C. neoformans, quantitative Real-Time PCR (qRT-PCR) was used to confirm the initial findings with strains CBS7779 and WM626. The qRT-PCR method was also used to further investigate CCNV and its relation to virulence factors (specifically melanin production). As a first experiment to test the sensitivity of the qRT-PCR assay, a serial dilution of template DNA was performed to examine how well copy number correlated with the amount of gDNA used in the reactions. The primer set used for this experiment (Figure 3.8) was for the gene CNA04650, which encodes actin. This primer set was used as a control for all of the qRT-PCR experiments described below because the gene is present on CHR 1 as a single copy in both JEC21 and H99.

[Figure 3.8 plot: "Sensitivity of qRT-PCR"; x-axis: concentration of gDNA added per well (ng/µl), logarithmic scale.]

Figure 3.8 Testing the efficiency and the sensitivity of qRT-PCR for the primer CNA04650 in H99. The R² value was 0.9972.

Based on the results shown in Figure 3.8, it appears that qRT-PCR is very sensitive. Given the high R² value (0.9972), qRT-PCR can be used within the concentration range of 0.8 ng/µl to 80 ng/µl to accurately detect the copy number of the locus in question.
Thus, qRT-PCR can be used to detect changes in copy number of certain genes in gDNA, a finding that is supported by previously published data that used qRT-PCR to determine gene copy number in other fungi (Selmecki et al., 2006). Furthermore, the primer efficiency as calculated by the R² value is within the acceptable range seen in most qRT-PCR experiments (Lim et al., 2008; Ingham et al., 2001; Ferreira et al., 2006).

To examine CHR 13 in more detail, a series of additional primers were designed along the chromosome (CNN00820, CNN01890 and CNN02400) and these were used to test for disomy by qRT-PCR. An additional primer pair (called SMG1) was designed at the SMG1 locus on CHR 4 to serve as a positive control because this gene is duplicated in the strain JEC21 but not in H99 (Fraser et al., 2005b). Unless otherwise indicated, the same gDNA that was used for the CGH experiments was also used for the qRT-PCR experiments. In addition, the analysis of CHR 13 was extended to other strains originating from the CSF of AIDS patients, as well as strains which had been identified to have more than the usual amount of DNA via FACS (Fluorescence Activated Cell Sorter) analysis (Ohkusu et al., 2002) (see Table 2.5 for a list of these strains).

3.4.2 CCNV in the serotype A strain CBS7779

As mentioned, the initial CGH data suggested that CBS7779 may have disomy at CHR 13 (Figure 3.9 and Table 3.2). The analysis of CBS7779 by qRT-PCR is shown in Table 3.9. The first observation was that the SMG1 primers showed the expected normalized copy number (1.85) in JEC21. This positive control indicated that the qRT-PCR approach can detect CCNV in gDNA. Thus, if CBS7779 were disomic, the expected normalized copy number would be 2 for each of the genes of interest on CHR 13 (CNN00820, CNN01890 and CNN02400). As seen in Table 3.9, however, the primers gave normalized copy numbers ranging from 1.27 to 1.45.
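The normalized copy numbers reported in Table 3.9 follow the comparative Ct calculation given in the table header (2 raised to the power of -ΔΔCt, with actin as the internal control and H99 as the reference genome). The sketch below illustrates that arithmetic; the function name is hypothetical, the Ct values are taken from Table 3.9, and the calculation assumes roughly 100% amplification efficiency for both primer sets.

```python
def normalized_copy_number(ct_gene, ct_actin, ref_ct_gene, ref_ct_actin):
    """Gene copy number relative to the reference strain, as 2^(-ddCt).

    ct_gene / ct_actin: average Ct of the test gene and of actin in the sample.
    ref_ct_gene / ref_ct_actin: the same two values for the reference (H99).
    Assumes both primer pairs amplify with an efficiency close to 100%.
    """
    delta_ct_sample = ct_gene - ct_actin             # dCt gene
    delta_ct_reference = ref_ct_gene - ref_ct_actin  # dCt genome (H99)
    delta_delta_ct = delta_ct_sample - delta_ct_reference
    return 2 ** (-delta_delta_ct)


if __name__ == "__main__":
    # CNN00820 (CHR 13) in CBS7779, with H99 as the reference (Table 3.9).
    print(round(normalized_copy_number(17.43, 18.65, 18.53, 19.41), 2))  # 1.27
    # SMG1 positive control in JEC21, with H99 as the reference (Table 3.9).
    print(round(normalized_copy_number(17.24, 18.49, 19.05, 19.41), 2))  # 1.85
```

Because each Ct unit corresponds to a two-fold difference in template, a true duplication shifts ΔΔCt by only about -1, which is why the standard deviations of the Ct measurements matter for distinguishing one copy from two.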
Two possible interpretations of these data are that only a portion of the cells in the population have an elevated copy number for CHR 13, or that disomy of CHR 13 in CBS7779 is unstable.

[Figure 3.9 image: CGH hybridization tracks for CHR 10 to CHR 14 of CBS7779 on the H99 array.]

Figure 3.9 CGH data of CHR 10 to CHR 14 of the strain CBS7779 hybridized to the H99 array. The red arrow highlights the possible disomy of CHR 13.

The idea that CBS7779 has a disomic chromosome is supported by separate experiments in which Dr. Guanggan Hu created a knockout of the gene at CNN01890 (in CHR 13) (Hu et al., 2008). This work involved the use of Southern blot hybridizations (conducted by Dr. Hu) and qRT-PCR experiments on both the knockout mutant and the original CBS7779 isolate (see Appendix D, Table D.2). In the experiment, the knockout of the gene at CNN01890 in CHR 13 only yielded transformants that were monosomic at CHR 13, even though qRT-PCR experiments on the original isolate suggested that CBS7779 is disomic at CHR 13. One possible way to resolve this issue is if the disomy is actually unstable in CBS7779. For instance, if the disomy is unstable, then it is likely that the original gDNA isolation came from a stock of cells that was a mixed population. This would explain why the normalized copy numbers for the loci on CHR 13 display such a wide range. That is, if the gDNA was from a mixed population of cells, the presence of cells with monosomic and disomic copies of CHR 13 would distort the normalized copy number. This is exactly what was observed in the data (Table 3.9).
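The mixed-population argument can be made concrete with a simple back-of-the-envelope model. This is an illustrative assumption rather than an analysis performed in this study: if a fraction f of the cells is disomic and the remainder monosomic, and every cell contributes equally to the gDNA preparation, the expected normalized copy number is (1 - f) x 1 + f x 2 = 1 + f. The function name below is hypothetical.

```python
def disomic_fraction(normalized_copy_number):
    """Estimate the fraction of disomic cells in a mixed population.

    Assumes cells are either monosomic (1 copy) or disomic (2 copies) and
    contribute equally to the gDNA, so that copy number = 1 + fraction.
    The estimate is clamped to the interval [0, 1] to absorb assay noise.
    """
    fraction = normalized_copy_number - 1.0
    return min(max(fraction, 0.0), 1.0)


if __name__ == "__main__":
    # Range of normalized copy numbers observed for CHR 13 loci in Table 3.9.
    for copy_number in (1.27, 1.35, 1.45):
        share = disomic_fraction(copy_number)
        print(f"copy number {copy_number:.2f} -> about {share:.0%} disomic cells")
```

Under these assumptions, the values of 1.27 to 1.45 in Table 3.9 would correspond to roughly one quarter to one half of the cells in the original stock being disomic for CHR 13.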
Hence, it was hypothesized that purer populations (as isolated by re-streaking the cells) should produce normalized copy numbers which are closer to 2.0, and this is what was observed upon retesting single colony isolates (Table 3.10). In summary, it seems that the original freezer stock of CBS7779 used for both the CGH experiments and the initial qRT-PCR experiments (Table 3.9) was not a pure population of cells. Regardless, subsequent experiments support the idea that CBS7779 is disomic at CHR 13.

Table 3.9 Quantitative RT-PCR analysis of gene copy number in strains JEC21 and CBS7779 relative to the H99 genome.

Primer  Strain  Ct (Avg.) test gene  Ct (Avg.) Actin  ΔCt gene  ΔCt genome  ΔΔCt  Normalized gene copy number relative to H99 (2^-ΔΔCt)
CNN00820  H99  18.53 (0.03)  19.41 (0.20)  -0.88  -0.88  0.00  1.00
CNN00820  JEC21  17.65 (0.03)  18.49 (0.03)  -0.84  -0.88  0.04  0.97
CNN00820  CBS7779  17.43 (0.04)  18.65 (0.02)  -1.22  -0.88  -0.34  1.27
CNN01890  H99  18.69 (0.06)  19.41 (0.20)  -0.72  -0.72  0.00  1.00
CNN01890  JEC21  18.03 (0.03)  18.49 (0.03)  -0.46  -0.72  0.26  0.84
CNN01890  CBS7779  17.50 (0.04)  18.65 (0.02)  -1.15  -0.72  -0.43  1.35
SMG1  H99  19.05 (0.02)  19.41 (0.20)  -0.36  -0.36  0.00  1.00
SMG1  JEC21  17.24 (0.19)  18.49 (0.03)  -1.25  -0.36  -0.89  1.85
SMG1  CBS7779  18.37 (0.03)  18.65 (0.02)  -0.28  -0.36  0.08  0.95
CNN02400  H99  18.81 (0.04)  19.41 (0.20)  -0.60  -0.60  0.00  1.00
CNN02400  JEC21  18.28 (0.04)  18.49 (0.03)  -0.21  -0.60  0.39  0.76
CNN02400  CBS7779  17.51 (0.05)  18.65 (0.02)  -1.14  -0.60  -0.54  1.45

The three genes tested are all on CHR 13 (CNN00820, CNN01890 and CNN02400). Values in parentheses are standard deviations. To simplify the data tables, the standard deviations will not be included in subsequent qRT-PCR tables.

During the work with CBS7779, it was noted that the strain displayed variable melanin production. That is, approximately 1/1000 colonies were less melanized (white), compared to the normally black or brown colonies formed by C. neoformans on L-DOPA medium (Figure 3.10).

Figure 3.10 Photographs of ~1000 CBS7779 colonies on L-DOPA medium after three days of growth at 30°C.
The arrows highlight sample colonies that display the variable melanin production phenotype. The photographs were taken using the HP scanner. As mentioned earlier, it was hypothesized that this phenotypic change might be due to the instability of the disomy found for CHR 13. To test this idea, qRT-PCR was performed on black colonies and white colonies with the result that the black colonies were found to be monosomic for CHR 13 and the white colonies were disomic for CHR 13 (representative isolates are shown in Table 3.10). 50 RESULTS Normalized copy numbers which are higher than 1.5 are highlighted in yellow. the locus CNNOO 820 is on CHR 13. The assays shown in Table 3.10 were repeated and similar data were obtained (see Appendix D, Tables D.3 to D.6). In addition, more variants were screened for a total of 12 white and 12 black strains with the same correlation between disomy and melanin formation (Appendix D, Tables D.3 to D.6). The results in Table 3.10 suggest that white isolates (or those which are less melanized) are disomic for CHR 13 and those that are capable of producing more melanin are monosomic for CHR 13. This is particularly interesting as it suggests that the level of melanin production, which is a virulence factor, can be influenced by chromosome copy number. This agrees with published literature in the sense that it is well documented that Table 3.10 qRT-PCR results for a select number of black and white strains derived from the original CBS7779 stock. Ct(Avg.) Ct (Avg.) 
Normalized gene copy Strain Primer test gene Actin Act gene ACtgenome AACt number relative to H99(2t) H99 SMG1 19.38 20.07 -0.69 -0.69 0.00 1.00 JEC21 SMG1 17.33 19.24 -1.91 -0.69 -1.22 2.33 CBS7779B1 SMG1 17.08 17.91 -0.83 -0.69 -0.14 1.10 CBS7779B2 SMG1 17.00 17.80 -0.80 -0.69 -0.11 1.08 CBS7779 B3 SMG1 16.05 16.96 -0.91 -0.69 -0.22 1.16 CBS7779 B4 SMG1 16.99 18.08 -1.09 -0.69 -0.40 1.32 CBS7779 Wi SMG1 17.01 18.08 -1.07 -0.69 -0.38 1.30 CBS7779 W2 SMG1 16.69 17.59 -0.90 -0.69 -0.21 1.16 CBS7779 W3 SMG1 17.12 17.95 -0.83 -0.69 -0.14 1.10 H99 CNNOO82O 18.71 20.07 -1.36 -1.36 0.00 1.00 JEC21 CNNOO82O 17.78 19.24 -1.46 -1.36 -0.10 1.07 CBS7779B1 CNNOO82O 16.42 17.91 -1.49 -1.36 -0.13 1.09 CBS7779 B2 CNNOO82O 16.51 17.80 -1.29 -1.36 0.07 0.95 CBS7779B3 CNNOO82O 15.49 16.96 -1.47 -1.36 -0.11 1.08 CBS7779B4 CNNOO82O 16.71 18.08 -1.37 -1.36 -0.01 1.01 CBS7779W1 CNNOO82O 15.84 18.08 -2.24 -1.36 -0.88 1.84 CBS7779 W2 CNNOO82O 15.22 17.59 -2.37 -1.36 -1.01 2.01 CBS7779 W3 CNNOO82O 15.61 17.95 -2.34 -1.36 -0.98 1.97 The locus SMG1 is on CHR 4 and 51 RESULTS changes in chromosome number can affect the gene expression (Hughes et a!., 2000; Torres et a!., 2007) and hence the phenotypes of an organism. It is interesting to note that during the isolation of the black and white colonies for the analysis in Table 3.10, some of the colonies displayed a “sectoring” phenotype as shown in Figure 3.11. Figure 3.11 A Figure 3.11 B Figure 3.11 Photographs of two colonies (CB57779 plated on L-DOPA) displaying the sectoring phenotype for melanin. The bar represents 0.5 mm. The photographs were taken was at 50X magnification (a) and 25X magnification (b) using a Wild Heerbrugg microscope. This phenotype is very similar to the sectoring phenotype seen in yeast cells with aneuploidy (James et a?., 1974). 
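The normalized copy numbers reported in Tables 3.9 and 3.10 follow from the column headings of those tables: the Ct of the test gene is normalized to actin, compared with the same difference in the H99 reference, and converted to a fold value as 2^-ΔΔCt. The following is a minimal sketch of that arithmetic in Python, not the analysis software used in this work. The function name and argument names are illustrative, and the calculation assumes that the amplicon doubles every cycle (equal, 100% primer efficiency), which is the usual assumption behind the 2^-ΔΔCt conversion.

```python
def normalized_copy_number(ct_test, ct_actin, ct_test_ref, ct_actin_ref):
    """Relative copy number of a locus by the comparative Ct calculation.

    ct_test, ct_actin         : average Ct values for the strain of interest
    ct_test_ref, ct_actin_ref : average Ct values for the reference (H99)
    Returns (delta_ct, delta_ct_ref, delta_delta_ct, copy_number).
    Assumes perfect doubling per cycle for both primer sets.
    """
    delta_ct = ct_test - ct_actin              # "ΔCt gene" column
    delta_ct_ref = ct_test_ref - ct_actin_ref  # "ΔCt genome" column
    delta_delta_ct = delta_ct - delta_ct_ref   # "ΔΔCt" column
    copy_number = 2 ** (-delta_delta_ct)       # "2^-ΔΔCt" column
    return delta_ct, delta_ct_ref, delta_delta_ct, copy_number


# Ct values taken from Table 3.10, CNN00820 primers (CHR 13), H99 as reference.
white_w2 = normalized_copy_number(15.22, 17.59, 18.71, 20.07)
black_b1 = normalized_copy_number(16.42, 17.91, 18.71, 20.07)

print(round(white_w2[3], 2))  # white isolate CBS7779 W2
print(round(black_b1[3], 2))  # black isolate CBS7779 B1
```

With these inputs the sketch gives values that round to the 2.01 and 1.09 listed for CBS7779 W2 and B1 in Table 3.10. A ΔΔCt of about -1 therefore corresponds to one extra copy of the locus relative to H99.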
Although the sectors have not yet been analyzed for the copy number of CHR 13, the observation is consistent with the correlation between disomy of CHR 13 in CBS7779 and a reduction in melanin production.

Later experiments, however, showed that the correlation between disomy and melanin formation for strain CBS7779 was only applicable to the initial set of black and white colonies that were isolated from the freezer stock (Figure 3.12). That is, when white colonies from the initial screen were used to isolate subsequent colonies with variable melanin production, it was found that black colonies from this second screen did not show a strict correlation with monosomy (Table 3.11). Instead, it was found that there was no correlation between the copy number of CHR 13 and melanin production. Interestingly, further analysis revealed that some of these strains in the second screen were disomic for CHR 4 (note that disomy of CHR 4 is detected with the SMG1 primers and was not seen in the initial screen). Overall, this analysis led to the generation of a chart that details the state of CHR 13 and CHR 4 over three successive screens (Figure 3.12). The overall conclusion is that there is chromosome instability for CHR 4 and CHR 13 in strain CBS7779, but that the correlation between melanin and CHR 13 disomy is limited to the first set of strains analyzed. These results suggest that changes in melanin production are not solely dependent on changes in the copy number of CHR 13. Instead, the copy number of other chromosomes or other epigenetic/genetic phenomena may play a role in controlling melanin production. See Appendix D for strain name nomenclature and Tables D.7 to D.16.

Table 3.11 qRT-PCR results for a select number of black and white strains derived from characterized black and white colonies selected from screen #1 (Table 3.10).

Strain                     Primer     Ct (Avg.)   Ct (Avg.)   ΔCt gene   ΔCt genome   ΔΔCt    Normalized gene copy number
                                      test gene   Actin                                       relative to H99 (2^-ΔΔCt)
H99                        SMG1       16.28       16.86       -0.58      -0.58         0.00   1.00
JEC21                      SMG1       17.56       19.22       -1.66      -0.58        -1.08   2.11 *
CBS7779 White 2, Black A   SMG1       15.83       17.35       -1.52      -0.58        -0.94   1.92 *
CBS7779 White 2, Black B   SMG1       16.82       17.41       -0.59      -0.58        -0.01   1.01
CBS7779 White 2, Black C   SMG1       16.12       17.47       -1.35      -0.58        -0.77   1.71 *
CBS7779 White 2, White A   SMG1       17.93       18.39       -0.46      -0.58         0.12   0.92
CBS7779 White 2, White B   SMG1       17.57       17.97       -0.40      -0.58         0.18   0.88
CBS7779 White 2, White C   SMG1       18.28       18.73       -0.45      -0.58         0.13   0.91
CBS7779 Black 1            SMG1       17.57       17.90       -0.33      -0.58         0.25   0.84
CBS7779 White 2            SMG1       17.07       17.55       -0.48      -0.58         0.10   0.93
H99                        CNN00820   15.17       16.86       -1.69      -1.69         0.00   1.00
JEC21                      CNN00820   17.86       19.22       -1.36      -1.69         0.33   0.80
CBS7779 White 2, Black A   CNN00820   14.94       17.35       -2.41      -1.69        -0.72   1.65 *
CBS7779 White 2, Black B   CNN00820   14.93       17.41       -2.48      -1.69        -0.79   1.73 *
CBS7779 White 2, Black C   CNN00820   15.00       17.47       -2.47      -1.69        -0.78   1.72 *
CBS7779 White 2, White A   CNN00820   16.07       18.39       -2.32      -1.69        -0.63   1.55 *
CBS7779 White 2, White B   CNN00820   15.46       17.97       -2.51      -1.69        -0.82   1.77 *
CBS7779 White 2, White C   CNN00820   16.33       18.73       -2.40      -1.69        -0.71   1.64 *
CBS7779 Black 1            CNN00820   16.40       17.90       -1.50      -1.69         0.19   0.88
CBS7779 White 2            CNN00820   15.11       17.55       -2.44      -1.69        -0.75   1.68 *

Normalized gene copy numbers higher than 1.5 are marked with an asterisk (highlighted in yellow in the original table). The SMG1 locus is on CHR 4 and the CNN00820 locus is on CHR 13.

Two important observations can be made regarding the qRT-PCR analysis of isolates from CBS7779.
First, from the analysis summarized in Figure 3.12, once a strain was identified as disomic at CHR 13, subsequent isolates generally retained disomy at CHR 13, regardless of the amount of melanin production. Second, it was noted that some of the strains which had disomy at CHR 13 gained a second disomic chromosome (CHR 4) in conjunction with the change in the melanin phenotype. However, this new disomy of CHR 4 did not correlate with changes in melanin production. It is interesting, however, that disomy at CHR 4 was only observed to arise in the screen in which the starting isolate already had CCNV. This could be a chance occurrence that depended on the genotype of the colony selected for subsequent analysis, and more analysis is clearly needed to understand the patterns of changes in chromosome copy number. Alternatively, once disomy is established in a particular strain, the chance that it will develop a second disomy may be higher (6 of 23 strains screened) compared to strains that were not disomic to begin with (0 of 12 strains screened).

The change in CHR 4 copy number was confirmed by qRT-PCR with a new set of primers (designated as 00_707). The locus was designated as glean_00707 on the Duke University Fungal Resources for Fungal Comparative Genomics website. The two loci, SMG1 and glean_00707, were chosen because they were present as single copies in both H99 and JEC21, and they are on opposite ends of CHR 4. As a result, the use of the two primer sets, 00_707 and SMG1, provided evidence that the entire sequence of CHR 4 is disomic, rather than just a segmental or local duplication.

Figure 3.12 Summary of the copy number of CHR 4 and CHR 13 in 3 successive screens, and the melanin production phenotype, in the strain CBS7779. [Flowchart: starting from the CBS7779 freezer stock, screens #1 to #3 each record the frequency of black or white variant colonies and the CHR 13 and CHR 4 copy number of the isolates tested.]

In summary, Figure 3.12 supports the idea that disomy of CHR 13 is a relatively stable trait once it has been established. Moreover, once disomy at CHR 13 is established, it may increase the probability that the strain will gain disomy at CHR 4. Because only CHR 4 and CHR 13 were tested in this screen, one cannot rule out the possibility that there could be copy number changes in other chromosomes. Overall, the analysis to date supports the idea that CCNV in CBS7779, if not in C. neoformans in general, is complicated and will require further analyses.

3.4.3 CCNV in the serotype A strain WM626

The initial CGH data also suggested that strain WM626 showed CCNV; specifically, the strain (like CBS7779) may have disomy at CHR 13 (Figure 3.13 and Table 3.2). This disomy was also analyzed by using qRT-PCR. Again, Dr. Guanggan Hu integrated a selectable marker at the CNN01890 locus (Hu et al., 2008) in WM626. He then confirmed the deletion by PCR and Southern blot hybridization. The resulting mutants were then analyzed by qRT-PCR. In this case, the Southern blot hybridization of the mutants clearly showed that these strains had both an intact copy of CNN01890 and a disrupted copy, thereby supporting the idea that WM626 was disomic at CHR 13. The qRT-PCR experiments on these mutants also support this view (see Appendix D, Table D.2). Also, the fact that WM626 maintained the disomy at CHR 13 even after transformation (unlike CBS7779) may suggest that the CCNV in WM626 is more stable than that in CBS7779.

This idea is further supported by comparisons of the qRT-PCR results for CBS7779 and WM626. Recall that the original qRT-PCR experiments for CBS7779 were conducted with gDNA that was isolated fresh from the freezer stock and the results did not show copy numbers of 2.0 (Table 3.9). Instead, it was only after restreaking the strain to produce single isolates for gDNA isolation that a higher copy number was achieved (Table 3.10). As a result, it was concluded that the disomy of CHR 13 in CBS7779 was unstable. Unlike CBS7779, the qRT-PCR with WM626 gDNA that was isolated fresh from the freezer stock produced copy numbers that were close to 2.0. This result suggested that the disomy of CHR 13 observed in WM626 (Table 3.12), unlike in CBS7779, is more stable or that a greater proportion of cells in the WM626 freezer stock were disomic at CHR 13.

Figure 3.13 CGH data of CHRs 10 - 14 of the strain WM626 hybridized to the H99 array. The red arrow highlights the possible disomy of CHR 13.

Table 3.12 Quantitative RT-PCR analysis of gene copy number in strains JEC21 and WM626 relative to the H99 genome.

Primer     Strain   Ct (Avg.)   Ct (Avg.)   ΔCt gene   ΔCt genome   ΔΔCt    Normalized gene copy number
                    test gene   Actin                                       relative to H99 (2^-ΔΔCt)
CNN00820   H99      18.53       19.41       -0.88      -0.88         0.00   1.00
CNN00820   JEC21    17.65       18.49       -0.84      -0.88         0.04   0.97
CNN00820   WM626    17.57       19.28       -1.71      -0.88        -0.83   1.78 *
CNN01890   H99      18.69       19.41       -0.72      -0.72         0.00   1.00
CNN01890   JEC21    18.03       18.49       -0.46      -0.72         0.26   0.84
CNN01890   WM626    17.77       19.28       -1.52      -0.72        -0.80   1.74 *
CNN02400   H99      18.81       19.41       -0.60      -0.60         0.00   1.00
CNN02400   JEC21    18.28       18.49       -0.21      -0.60         0.39   0.76
CNN02400   WM626    17.79       19.28       -1.49      -0.60        -0.89   1.85 *
SMG1       H99      19.05       19.41       -0.36      -0.36         0.00   1.00
SMG1       JEC21    17.24       18.49       -1.25      -0.36        -0.89   1.85 *
SMG1       WM626    18.98       19.28       -0.30      -0.36         0.06   0.96

The three genes tested are all on CHR 13 (CNN00820, CNN01890 and CNN02400). The SMG1 locus is on CHR 4. Normalized gene copy numbers higher than 1.5 are marked with an asterisk (highlighted in yellow in the original table).

As described above, highly melanized isolates from the first screen of CBS7779 colonies were monosomic for CHR 13 whereas lighter colonies were disomic for CHR 13 (Table 3.10 and Figure 3.12). Based on that result, it was hypothesized that WM626 (which also displayed variable melanin production among colonies from the original culture) might also show an association between CCNV and the melanin phenotype (Figure 3.14).

Figure 3.14 Photographs of ~1000 WM626 colonies on L-DOPA medium after three days of growth at 30°C. The arrows highlight sample colonies that display the variable melanin production phenotype. The photographs were taken using the HP scanner.

To test this idea, a series of qRT-PCR amplifications were performed on 12 black and 12 white strains derived from the original stock of WM626.
The results, however, showed the contrary, in that no association was found between the melanin production of the various WM626 isolates and the copy number of CHR 13. See Table 3.13 and Appendix D (Tables D.17 to D.23) for a summary of the results.

Table 3.13 Summary of the copy number of CHR 13 and CHR 4 for a select number of WM626 isolates.

Strain                          Copy number of CHR 13 / CHR 4
WM626 B1                        Disomic for CHR 13 and CHR 4
WM626 B2 to B3, B5 to B7        Disomic for CHR 13
WM626 B4                        Monosomic for CHR 13
WM626 W1 to W12                 Disomic for CHR 13

Primer sets for CNN00820, CNN01890 and CNN02400 were used to confirm the copy number of CHR 13 for WM626 strains B1 to B3 and WM626 strains W1 to W3. Primer set CNN00820 was used to confirm the copy number of CHR 13 for the remaining strains. Primer sets SMG1 and 00_707 were used to confirm the copy number of CHR 4 for WM626 strains B1 to B3 and WM626 strains W1 to W3. In order to simplify the data presentation, original copy numbers are not shown because the results were compiled from 5 separate experiments, each with its own internal control. (See Appendix D.15 to D.21 for details.)

In summary, there was no apparent correlation between the copy number of CHR 13 and the amount of melanin produced (Table 3.13), and disomy of CHR 13 is more common in subcultured isolates of WM626 than in CBS7779. In addition, the finding that CHR 4 is disomic for isolate WM626 B1 again supports the idea that CCNV in general may be very common in C. neoformans.

3.4.4 CCNV in selected other strains

A search of the literature revealed descriptions of C. neoformans strains that show variable melanin production and the frequent occurrence of melanin variants (Tanaka et al., 2005, Ohkusu et al., 2002). A set of these strains was obtained from the IFM stock centre (Research Center for Pathogenic Fungi and Microbial Toxicoses, Chiba University).
These strains included IFM51645 (Serotype B) and IFM46084 (Serotype D), thus indicating that other serotypes produce colonies with variable melanin production besides serotype A, which is the serotype of both CBS7779 and WM626 (Tanaka et al., 2005). To examine whether disomy occurred in these melanin variable strains, qRT-PCR analysis was performed on both white and black isolates of IFM46084 and IFM51645.

Strain IFM46084 produced mainly black isolates and only 1 in 10^3 colonies showed a marked decrease in melanin production. Unfortunately, based on the qRT-PCR data, there was no disomy at CHR 13 or CHR 4 for IFM46084 (see Appendix D, Tables D.24 and D.25). Strain IFM51645 (also known as 484) was found to have mostly white isolates and only 1 in 10^3 colonies showed a marked increase in melanin production. Unfortunately, the qRT-PCR failed to give amplification. One possible reason for this result is that IFM51645 is a serotype B strain, but the primers were originally designed using only serotype A and D genomes (H99 and JEC21, respectively). Subsequent testing of the molecular subtype of this strain, IFM51645, via PCR-RFLP revealed that it did not have a VGI or VGII molecular subtype (data not shown). This was an issue because the only two genome sequences available for serotype B strains were from the VGI and VGII molecular subtypes. Thus, it will be difficult to efficiently design any new set of primers to test the disomy of CHR 13 and CHR 4 in IFM51645.

An additional 11 strains were also screened to detect disomy at either CHR 4 or CHR 13 (see Appendix D, Tables D.26 to D.27). These strains were selected because FACS analysis revealed that they have more than the usual amount of chromosomal DNA (Tanaka et al., 2005, Ohkusu et al., 2002). Of these 11 strains, one was a serotype B strain (IFM51654) and the others were serotype A strains of unknown molecular subtype.
Again, qRT-PCR failed to amplify a product for testing IFM51654, and thus it is unknown whether or not there is CCNV. In this case, however, IFM51654 had the VGII subtype (discovered by PCR-RFLP, data not shown). Thus, it will be possible to design a set of primers that specifically test CCNV in this strain. For the remaining 10 strains, qRT-PCR revealed that disomy was not present for CHR 4 or CHR 13.

In brief, the search for CCNV in these strains (i.e. strains that FACS analysis has identified as aneuploids) revealed that there was no disomy at CHR 4 or CHR 13. However, other disomic chromosomes remain a possibility.

Finally, an additional 13 clinical strains of the serotype A, VNI molecular subtype, were tested. These strains were selected from a wide variety of geographical backgrounds (see Table 2.5), but all of them were of clinical origin. Clinical strains were chosen because it was hypothesized that CCNV might arise as a result of selective pressure while the fungus is growing inside the host. Also, only strains of the VNI molecular subtype were chosen, to maximize the chance of efficient primer binding. See Appendix D, Tables D.28 to D.37.

Table 3.14 Summary of the copy number at the SMG1 locus (CHR 4) and at the CNN00820 locus (CHR 13) for a set of strains from AIDS patients.

Strain       SMG1      SMG1    CNN00820   CNN00820
             Average   SD      Average    SD
JP1086       1.41      0.03    0.57       0.08
Arg1373      1.50      0.29    1.43       0.29
Arg1363      1.77      0.29    1.57       0.23
Ug2467       1.24      0.05    1.09       0.09
In2637       1.34      0.03    0.45       0.14
Tn470        1.65      0.24    1.39       0.13
Bt9          1.39      0.11    0.93       0.52
Bt68         2.06      0.94    1.04       0.15
C8           1.39      0.08    0.83       0.26
RTC23-1      1.71      0.26    1.48       0.12
RTC23-2      1.66      0.22    1.31       0.13
RTC23-3      1.15      0.27    1.08       0.05
RTC31-Mix    1.72      0.19    1.37       0.19
RTC31-1      2.73      0.84    1.80       0.16

Each amplification was conducted at least three times (i.e. on at least three separate 96-well plates), with each qRT-PCR reaction in triplicate. Cells with copy numbers higher than 1.3, after the SD was taken into consideration, were highlighted in yellow in the original table. For each locus, the average copy number is shown in the first column and the SD in the second column.

The first finding from the results in Table 3.14 was that the copy numbers for this set of experiments were highly variable (i.e. large SDs were observed). To address this issue, each strain was analyzed by qRT-PCR in at least three separate experiments. Next, the SD was taken into consideration, and if the copy numbers were higher than 1.3, the loci were considered as "disomic." The rationale for using 1.3 as a cut-off value and the variability are discussed below.

Based on the selected cut-off value, it appeared that disomy (more specifically, elevated copy number) is widespread amongst clinical strains, such that 11 of the 13 strains showed possible duplication at the SMG1 locus (CHR 4) and three of the 13 strains showed possible duplication at the CNN00820 locus (CHR 13). This supports the idea that CCNV may play an important role in virulence and/or that it may develop during infection, for two reasons. First, all of these strains (unlike the strain set from Chiba University) are of clinical origin. Second, some of these strains (those with the RTC designation) were freshly collected from the CSF of AIDS patients and only grown in culture twice.

The idea that CCNV may play an important role in virulence is further supported by the qRT-PCR data with the "RTC" strains. It is important to note that RTC23-1, RTC23-2 and RTC23-3 came from the same patient, yet all three isolates have a different pattern of copy number at CNN00820 and SMG1. When RTC31 was isolated from the patient, over 300 colonies were obtained from the CSF pellet. To keep matters simple, a single colony (RTC31-1) was chosen for the qRT-PCR experiment and the remaining colonies were scraped and pooled into one batch (resulting in RTC31-Mix). Again, RTC31-Mix and RTC31-1 have different patterns of copy number at CNN00820 and SMG1.
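The cut-off step described above for Table 3.14 can be sketched as follows. This is only an illustration in Python under stated assumptions, not the procedure used to produce the highlighted cells: the text does not spell out exactly how the SD was "taken into consideration", so the hypothetical helper is_elevated offers two possible readings (average alone, or average minus one SD), and the replicate values passed to summarize are invented for the example. The reference values 1.05 and 0.15 are the JEC21 average and SD at CNN00820 reported in Table 3.15.

```python
from statistics import mean, stdev


def assay_upper_limit(ref_mean, ref_sd):
    """Upper limit expected for a single-copy locus: average plus one SD."""
    return ref_mean + ref_sd


def summarize(copy_numbers):
    """Average and sample SD of copy numbers from separate experiments."""
    return mean(copy_numbers), stdev(copy_numbers)


def is_elevated(avg, sd, cutoff=1.3, subtract_sd=False):
    """Compare a copy number with the cut-off.

    subtract_sd=False : the average alone must exceed the cut-off.
    subtract_sd=True  : a stricter reading (an assumption, not confirmed by
                        the text) in which average minus SD must exceed it.
    """
    value = avg - sd if subtract_sd else avg
    return value > cutoff


# Single-copy reference: JEC21 at CNN00820 (Table 3.15).
limit = assay_upper_limit(1.05, 0.15)

# Hypothetical copy numbers from three separate plates (illustrative only).
avg, sd = summarize([1.9, 2.1, 2.3])

# Averages and SDs at the SMG1 locus taken from Table 3.14.
print(is_elevated(1.24, 0.05))                    # Ug2467
print(is_elevated(1.50, 0.29))                    # Arg1373, average only
print(is_elevated(1.50, 0.29, subtract_sd=True))  # Arg1373, stricter reading
print(is_elevated(2.73, 0.84, subtract_sd=True))  # RTC31-1
```

The two readings can disagree for a strain such as Arg1373, whose average lies above 1.3 while its average minus SD does not, which is one reason the large SDs in this data set matter for interpretation.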
All of this suggests that there could be a wide variety or mixture of CCNV within isolates from the same patient infected with C. neoformans.

Finally, the data in Table 3.14 support the idea that there could be a wide degree of sequence variation within the same molecular subtype. Specifically, strain JP1086 has a copy number of 0.57 ± 0.08 at the CNN00820 locus (Table 3.14). This suggests that either this particular locus is deleted in JP1086 or that the primer set for CNN00820 does not bind well to the DNA of this strain. Given that the melting temperature of the PCR product for JP1086 gDNA and CNN00820 was the same as the melting temperature for other PCR products involving the same primer set (data not shown), the results suggest that the relatively low copy number could merely be a result of poor primer binding efficiency and not because the locus is deleted.

This issue of poor primer binding efficiency and variable copy number led to a more detailed analysis of the average and SD of the copy number at the SMG1 and CNN00820 loci (Table 3.15). JEC21 was chosen for this detailed analysis because the published sequence indicates that there are two copies of SMG1 and only one copy of CNN00820 in JEC21 relative to the H99 genome. This analysis was necessary in order to determine the amount of variability one would normally expect to see if the primers were binding efficiently. This kind of analysis was not necessary for CBS7779 or WM626 because the CGH data showed that these genomes did not differ greatly from H99 (one of the reference genomes used to design the qRT-PCR primers).

Table 3.15 Average copy number of two loci for the strain JEC21 relative to the strain H99.

           SMG1    CNN00820
Average    2.11    1.05
SD         0.31    0.15

These data form the basis for the rationale behind using 1.3 as a cut-off value for the 13 clinical strains that were screened. The data represent at least 8 separate amplifications (i.e. 8 separate 96-well plates, with each plate having 3 replicates per locus).

Based on Table 3.15, one can establish the amount of variation expected in the qRT-PCR experiments when using primers that exactly match the target sequences. Based on the genome sequence, it was known that CNN00820 exists in a single copy in JEC21. As a result of this analysis, a cut-off value of 1.3 was chosen to indicate whether a particular chromosome had an elevated copy number; i.e. copy numbers lower than 1.3 were designated as "single copy" for that locus. This number was chosen because the average copy number for JEC21 at CNN00820 is 1.05 and the SD is 0.15, and the sum of these numbers is 1.20. It is unlikely that this cut-off value is too low because any variation in the target sequence should delay the Ct (threshold cycle). Consequently, the final calculation would give a lower copy number. Naturally, the genome that the primers were originally designed for would produce the highest binding efficiency. Thus, 1.20 indicates the upper limit that might be expected from variation in the assay, and it was therefore assumed that any number higher than 1.3 would suggest that the locus may be duplicated. This information was used to analyze the data from the qRT-PCR for the additional 13 clinical isolates (Table 3.14).

Overall, the qRT-PCR experiments supported the idea that CCNV is common in C. neoformans, and especially in clinical isolates. Furthermore, the presence of chromosome copy number heterogeneity within the same patient also suggests that CCNV can be very dynamic. It is interesting to speculate that CCNV may be a common feature of fungal cells within the host and that a haploid chromosome complement may be a feature of growth in culture.

4. DISCUSSION AND CONCLUSIONS

4.1 Analysis of genome variation in pathogenic Cryptococcus species

4.1.1 Annotation of genome variation in C. neoformans and C. gattii

The data presented in this thesis demonstrate the utility of CGH for identifying regions of difference between strains and lay the groundwork for future studies to examine how genome variation may influence virulence in Cryptococcus species. A number of interesting insights arose from comparing C. neoformans strains of different molecular subtypes, C. gattii strains within one subtype and a set of C. neoformans parental strains used in a genetic cross.

For the hybridizations involving the serotype A strains, the level of genomic variation was as expected based on comparisons of strains representing the three molecular subtypes VNI, VNII and VNB (Table 3.2). That is, larger SDs were observed for strains that had been previously identified through molecular subtyping to be more divergent from the genome of the reference strain H99. Conversely, serotype A strains of C. neoformans that were within the same molecular subtype (VNI) had a smaller range of SDs. However, the hybridizations involving the serotype B strains of C. gattii of the same molecular subtype (VGI) had a wide range of SDs (Tables 3.2 and 3.4). The SDs did not appear to correlate with geographical origin: the serotype B strain R794, which had a SD of 0.319, is from Vancouver Island, while strain E566 from Australia had a SD of 0.365. Although these studies employed a relatively small number of strains, it also appears that there was relatively more divergence within the VGI subtype of C. gattii than within the VNI subtype of C. neoformans. It is possible that the wider range of genome variation seen in the serotype B (C. gattii) strains reflects environmental influences or differences in evolutionary history (e.g., the level of sexual recombination).

This then raises the question as to whether the degree of sequence divergence is related to geographical separation.
Previously, when RFLP analysis was used to classify the molecular subtypes of a number of serotype B strains from around the world, it was found that there was no association between geographic location and genotype (Kidd et al., 2005). This is particularly worrisome because it suggests that C. gattii strains are not restricted to specific regions. Thus, it may be possible that drug resistant strains and emerging C. gattii isolates of higher virulence, such as those which are causing disease in immunocompetent patients, are spreading around the world.

Another key observation from these hybridization experiments is the possible presence of disomy in two of the serotype A strains (CBS7779 and WM626), but the lack of disomy in any of the serotype B strains that were used in the hybridization experiments. One possible way to interpret these data is that disomy is more common in the serotype A strains than in the serotype B strains. Although this may be valid, it is important to note that the majority of strains chosen for the serotype B hybridizations (five of six) were environmental isolates, whereas all four strains chosen for the serotype A hybridizations were clinical isolates. Thus, it may be possible that CCNV is more prevalent in clinical isolates than in environmental isolates. A more detailed discussion of disomy is presented in section 4.2 below.

In general, the extensive genomic variations revealed in these hybridization studies lay the groundwork for identifying new candidate genes that may be associated with virulence, and the work also demonstrates the use of CGH for characterizing mutants. During the annotation of the genome variation in both the serotype A and the serotype B strains, a number of carbon utilization and sugar transporter genes were shown to be variable amongst the strains. On the surface, one might not expect differences in carbon-source utilization genes to have an impact on virulence.
However, a further examination of the literature involving carbon-source utilization and virulence reveals a different story. Specifically, it was found that in Candida albicans, a specific subset of genes (glyoxylate cycle genes) are up-regulated upon phagocytosis, thereby facilitating anabolic metabolism in the absence of fermentable carbon sources (Barelle et al., 2006). In fact, the ability to use alternative carbon sources in Candida albicans is an important tool for surviving inside the host (Ramirez & Lorenz, 2007). The idea that alternative carbon-source utilization inside the host is an important aspect of virulence can also be seen in Mycobacterium species which, like the Cryptococcus species, cause granulomas inside the host (Lorenz & Fink, 2002). In fact, Mycobacterium tuberculosis relies on using alternative carbon sources to survive during the granuloma stage of growth. Thus, it is interesting to speculate that differences in carbon-source utilization and transport genes may contribute to differences in virulence that have been observed between the serotype A and serotype B strains used in this study.

Also, in the annotations of genome variability, one strain of interest is the WM276-GFP strain that was constructed in the Kronstad laboratory. WM276, from which the WM276-GFP strain was derived, is a virulent strain, whereas WM276-GFP is avirulent in a mouse model of cryptococcosis (Dr. Guanggan Hu, unpublished data). The WM276-GFP mutant strain was constructed by introducing the GFP gene into WM276. Subsequent PCR and Southern hybridization analysis in the Kronstad Laboratory (C. D'Souza, unpublished data) revealed that the GFP insertion did not disrupt any known genes. Later, CGH analysis revealed that WM276-GFP harbors a 75 kb deletion in CHR K, and it may be possible that the loss of virulence is due to this deletion. What is exciting is that none of the genes in the deleted region have any previously known functions in pathogenesis.
Instead, they encode a wide range of predicted functions including sugar transporters, glucosidases, enzymes for lipid metabolism and ubiquitin-related proteins. Experiments are currently underway to insert these genes back into the WM276-GFP strain in the hope that virulence can be restored. Of course, the difference in virulence between WM276-GFP and WM276 may not be related to this 75 kb deletion at all. In fact, it is plausible that during the generation of WM276-GFP, single nucleotide polymorphisms (SNPs) were introduced into the genome, thereby disrupting a gene essential for virulence. This kind of mutation would not be detectable by CGH.

The large deletions observed in strain WM276-GFP, and the elevated copy numbers of entire chromosomes in CBS7779 and WM626, suggest that the genomes of C. neoformans can undergo dramatic changes. The idea that C. neoformans may not have a stable genome has been raised before (Fries et al., 1996, Franzot et al., 1998) and a further discussion of how these genomic changes may relate to virulence can be found in section 4.2. The database of genome variation initiated by the work in this thesis illustrates not only the sensitivity of CGH but also the wide degree of genome variability. Future experiments could include using CGH to document and annotate genomic changes that might occur during infection of animal hosts and/or in response to drug treatments. It would also be useful to continue to tabulate the genomic differences that are observed between clinical and environmental strains.

4.1.2 Examination of a genetic cross and recombination sites for serotype D strains

The CGH analysis of the serotype D strains NIH12 and NIH433 putatively identified the recombination sites (as described in section 3.2) throughout the genome of the progeny strain JEC21. This clearly highlights the sensitivity of the CGH method to detect relatively small levels of sequence divergence (~2%).
There is precedent for the detection of crossover events by whole genome array analysis in S. cerevisiae. Specifically, Winzeler et al. (1998) observed 97 crossovers across the whole genome upon analysis of the four meiotic progeny from a single tetrad. This sensitivity could be quite useful in characterizing the association between mating, sexual recombination and virulence in the Cryptococcus species. For example, MATα strains predominate in the environment and recent evidence suggests that same-sex mating can occur, thereby generating new MATa strains (Fraser et al., 2005a, Hiremath et al., 2008, Nadal & Gold, 2007). In addition, there is growing evidence that sexual recombination occurs in various subpopulations and may contribute to the ability of the fungus to adapt to the host and the environment (Litvintseva et al., 2003). The CGH experiments presented here demonstrate that this technique could potentially be used to characterize the progeny of mating events in the wild. In addition, future experiments could employ CGH to examine the progeny that arise from crossing avirulent strains with virulent strains. The virulence properties of the progeny in a mouse model could be compared with those of the parental strains in order to track the genome segments that could be associated with virulence.

4.1.3 Examination of serotype AD strains

The CGH analysis described in this thesis also sheds new light on the genomic makeup of the serotype AD strains, which are thought to arise from mating events between serotype A and serotype D strains. These strains typically have reduced virulence when compared to either parent (Lin et al., 2008). The CGH data for strains KW5, CDC228 and CDC304 illustrated that only certain parental chromosomes are retained. In particular, CHR 1 appeared to be preferentially retained from the serotype A parent.
However, this idea of specific chromosome retention is complicated by the fact that the average LR for the hybridization of CHR 1 to serotype A in both CDC228 and KW5 was double the average LR for the hybridization of CHR 1 to serotype A in CDC304 (Table 3.7). This suggested that CDC228 and KW5 have two copies of CHR 1 (from serotype A) and that CDC304 may have only one copy of this chromosome. These results agree with a previous report that CDC304 is aneuploid (1-2n) (Lengeler et al., 2001). This skewed distribution of CHR 1 in clinical AD strains to favor the serotype A chromosome suggests that there is a selective advantage for this preferential retention during host infection. It is interesting to note that this kind of skewing is only observed for CHR 1, suggesting that there may be a specific subset of genes on CHR 1 that are particularly advantageous for pathogenesis if the cell has the serotype A version rather than the serotype D version of the chromosome. This is likely because CHR 1 is the largest chromosome and also contains a number of identified virulence factors such as the LAC1 and LAC2 genes that are responsible for melanin synthesis. Although no virulence attributes (including melanin production) were found to be common amongst the AD strains that preferentially retained a serotype A version of CHR 1, it may still be possible that other untested virulence attributes (e.g. the ability to withstand nitrosative or oxidative stress) are common amongst these unique AD strains.

It may be possible to test the hypothesis that the serotype origin of CHR 1 influences virulence by using the strain CDC304 because it appears to be in the process of losing its serotype D copy of CHR 1. The experiment would involve assaying the ability of two different isolates of CDC304 to infect mice: one isolate would have the serotype AD complement of CHR 1 and the other isolate would have only the A complement.
If the hypothesis is correct, then the isolate with only the A complement of CHR 1 will be more virulent than the other isolate. Another way to further examine the importance of CHR 1 would be to mark the chromosome with a selectable marker in an AD strain and then to passage the strain through mice. If the serotype A chromosome does confer a selective advantage for the hybrid strain, then one would expect the serotype D version of CHR 1 to be more easily lost. If this proves to be correct, it would then be interesting to look at exactly which stressor in the host triggers this change and then to look more specifically at the virulence levels of these strains before and after the loss of the serotype D version of CHR 1.

Another important point to consider with regard to the AD strains is the reported virulence differences between the three strains analyzed by CGH. Specifically, strain KW5 is more virulent in a mouse model than either CDC304 or CDC228 (Lengeler et al., 2001). Because all three strains were missing a serotype D copy of CHR 1, it is unlikely that this is the reason behind the difference in virulence. The marked difference in the assortment of CHRs 6, 7 and 8 (Figure 3.4 and Table 3.7) may also contribute to virulence differences between these three strains. The idea that the virulence of AD strains is dependent on chromosome assortment is intriguing and may provide a new avenue to explore why the A and D serotypes have differences in virulence. It should be noted that the preferential reassortment of chromosomes in response to environmental conditions has been observed in C. albicans, another fungal pathogen (Forche et al., 2008). In this case, exposure to antifungal drugs and specific carbon sources can influence the assortment of chromosomes in this fungus. On the other hand, the differing chromosome assortments may not contribute to virulence.
Instead, mutations in various other genes involved in virulence, which may not be detectable by CGH, could be the cause of the differences in virulence.

4.1.4 Final comments on genome variation

Overall, the genomic variation observed by CGH provides the foundation for future studies. The generation of a database of genomic differences for different serotypes, different molecular subtypes, strains from various geographical regions, and strains representing clinical vs. environmental sources provides a wealth of information for future research. In addition, the discovery that clinical AD strains preferentially retain the serotype A version of CHR 1 provides interesting insights and new potential avenues of pursuit to understand the pathogenesis of C. neoformans.

4.2 Chromosome Copy Number Variation in C. neoformans

4.2.1 CCNV in serotype A strains CBS7779 and WM626

The CGH data revealed CCNV in strains CBS7779 and WM626 of C. neoformans. In the initial screen of CBS7779, there appeared to be a strong correlation between melanin production and the copy number of CHR 13. However, this correlation did not apply to the WM626 strains, nor did it apply to subsequent screenings of the CBS7779 strains. This suggests that there could be more than one mechanism for variation in melanin production. Although CCNV for CHR 4 appeared in subsequent screens, it was not correlated with melanin production. The initial observations with CBS7779 might suggest that CCNV is associated with growth inside the host, although it is not known how long the strain has been in culture since its original isolation. One can speculate that specific trigger(s) inside the host that were not present in later screens (cells were grown overnight at 30°C in rich medium) might have caused the cells to gain an extra copy of CHR 13 and thereby influence melanin production.
Possible stressful factors inside the host that may trigger CCNV include oxidative and nitrosative stress, growth at host temperature and nutritional conditions. Another possible trigger for CCNV in either CBS7779 or WM626 could be the drug treatments the patient was receiving, because both CBS7779 and WM626 are clinical strains. For example, a number of drugs such as fluconazole, which is commonly used to treat fungal infections, have been shown to cause nondisjunction events in the fungal pathogen C. albicans (Perepnikhatka et al., 1999). In fact, the varying levels of melanin production in CBS7779 could be a response to antifungal treatments, because a high production of melanin is known to provide protection against amphotericin B (Ikeda et al., 2003). Unfortunately, no information is available regarding the drug treatments that the patients were receiving for the infections with either the CBS7779 or WM626 strains. Thus, future experiments should employ a well characterized aneuploid strain such as CBS7779, whose copy number for each chromosome is already known via CGH (Hu et al., 2008), and passage the strain through a mouse with and without drug treatments to look for changes in aneuploidy. In parallel, the strain should also be passaged in culture and its CCNV examined. A comparison of these “passage” models (i.e., “laboratory” passage vs. “mouse” passage, with and without subinhibitory drug treatments) may provide information regarding the mechanism(s) of CCNV. The information will also be useful for correlating any virulence phenotype differences among these passaged isolates with CCNV. The idea that passaging through a host can promote chromosome copy number polymorphisms (Chen et al., 2004) as well as other chromosomal rearrangements has also been established in C. albicans (Rustchenko-Bulgac et al., 1990) and it may be possible that C. neoformans uses a similar mechanism.
Other possible drugs or chemicals that can be used to study the mechanism of CCNV include sulfacetamide, saccharin and methyl benzimidazole carbamate. These chemicals were chosen because they generally induce chromosomal copy number changes in yeast without causing other kinds of mutations (Parry et al., 1979, Barton & Gull, 1992). These experiments might reveal whether certain chromosomes (e.g. CHRs 4 and 13) are more susceptible than others to copy number changes. These experiments might also provide strains to test whether CCNV can be associated with gains or losses in virulence. If CCNV is relevant to virulence, then it would be logical to systematically tag each chromosome with a gene for drug resistance and to select for strains which are disomic for each chromosome. These strains could then be used to examine the phenotypic influence of disomy on a genome-wide scale, as recently described in yeast (Torres et al., 2007). However, in these experiments it will be important to remember that previous studies with C. albicans have shown that the longer the treatment with fluconazole, the more likely there would be a nondisjunction event for CHR 17, which will, in turn, induce drug resistance (Perepnikhatka et al., 1999). In addition, azole resistance in C. albicans has been linked to a specific segmental duplication in CHR 5 (Coste et al., 2007, Selmecki et al., 2006). With these issues in mind, it is even more important to consider the relationship between CCNV and drug treatment because cryptococcal infections often require prolonged antifungal therapy (Antinori et al., 2001).

In looking at the changes in aneuploidy for C. neoformans, it is important to remember that changes in chromosome copy number may have a fitness cost. In S.
cerevisiae, aneuploidy results in cells that reach saturation at a smaller population size (as measured by optical density at 600 nm) and these cells also lose viability upon prolonged growth in media (after 5 days) (Torres et al., 2007). On the other hand, there are a number of differences between C. neoformans and S. cerevisiae. For example, the aneuploids observed in this thesis are all clinical strains. That is, they must have arisen and must have been growing successfully at elevated temperatures. And yet, aneuploid yeast strains grow slowly at high temperatures (Torres et al., 2007). This suggests that either the mechanisms for the generation and maintenance of aneuploidy differ between yeast and C. neoformans or that the driving mechanism in CCNV for Cryptococcus inside the host overrides the disadvantages of growing slowly at elevated temperatures. This is possible since stressful environments (such as those inside a host) have been implicated in driving phenotypic variation (Massey & Buckling, 2002).

Although gaining a better understanding of the mechanism of CCNV is interesting, what is more important is determining whether or not CCNV causes any phenotype changes. The idea that specific phenotypic changes could be caused by CCNV is supported by the fact that the presence of extra chromosomes in S. cerevisiae is correlated with a variety of changes in gene expression (Torres et al., 2007). In fact, the presence of an extra chromosome in Candida glabrata, which is another human fungal pathogen, is also correlated with an almost global down-regulation of a number of different proteins (Marichal et al., 1997). Perhaps the mechanism behind the down-regulation of protein production in C. glabrata is similar to the down-regulation of melanin that is seen in the initial CBS7779 strains.
Another possible way that CCNV can influence melanin in the initial CBS7779 isolates might be through an indirect influence on the expression of other genes required for the regulation of melanin production. Previously, it was reported that extra copies of chromosomes were linked with the up-regulation of high-affinity glucose transporters (Torres et al., 2007), suggesting that the state of having extra chromosomes affects the level of glucose inside the cell. In C. neoformans, the presence of elevated glucose in culture medium reduces melanin production (Pukkila-Worley et al., 2005). In fact, glucose sensing and transport mechanisms may be advantageous for a pathogen because there are relatively low levels of glucose inside the host (and particularly inside macrophages) (Appelberg, 2006). Thus, it may be possible that CCNV influences the expression of a certain number of genes (such as those involved in sugar transport) and this in turn influences the production of melanin.

Another example of the possible downstream gene expression effects of CCNV can be seen in C. albicans. Here, a strain with trisomy for CHR 1 is avirulent compared to euploid cells in mouse models of candidiasis (Chen et al., 2004). Also, even though the gene responsible for L-sorbose assimilation, SOU1, is not on CHR 3, the copy number of CHR 3 has been demonstrated to regulate the expression of SOU1 (Janbon et al., 1998). Thus, in looking at CHR 13 disomy for C. neoformans, it is important not to concentrate solely on the possible changes in expression for genes on CHR 13, but also to look at the expression profile of the entire genome. Future experiments could employ microarrays to compare the transcriptomes of monosomic and disomic variants of a strain.

It may be possible, however, that there is no link between the copy number of CHR 13 and the production of melanin.
Certainly, the lack of correlation between the copy number of CHR 13 and melanin production in WM626 clearly illustrates this point. Previous studies have already identified a number of genes which contribute to melanin production (Panepinto et al., 2005, Walton et al., 2005) and it may be possible that the initial screening process was selecting for strains with mutations in these melanin production genes and the presence of disomy was merely a by-product of this selection.

Although this thesis mainly focused on the production of melanin and its possible correlation to CCNV, it is important to remember that there could be other phenotypic changes in disomic strains. For example, it was previously reported that specific changes in the electrophoretic karyotypes of certain strains are associated with changes in capsule size, growth rates at 37°C and colony morphologies (smooth vs. mucoid) (Fries & Casadevall, 1998). These ideas reinforce the need to develop an understanding of the mechanism(s) leading to CCNV and the resulting influence of CCNV on gene expression in C. neoformans.

On that note, it is interesting that both CBS7779 and WM626 showed disomy at CHR 13. It may be that certain chromosomes become disomic more easily than others or, conversely, that certain chromosomes are more resistant to duplication. This idea has been observed in S. cerevisiae, where losses or gains of CHR 8 are not as well tolerated as other chromosome copy number changes (Waghmare & Bruschi, 2005). One can easily test this idea by screening a number of clinical strains for CCNV by CGH. In this case, the use of qRT-PCR would not be appropriate because qRT-PCR would only allow one to assay the copy number of a specific locus, whereas CGH would allow one to survey the entire genome.
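A genome-wide survey of this kind could be summarized one chromosome at a time. The sketch below is an illustration only and is not the analysis pipeline used in this thesis. It assumes that each array probe has already been reduced to a log2(test/reference) ratio with a chromosome label, that the reference strain is haploid (so a disomic chromosome would be expected near a log2 ratio of 1), and that a cut-off of 0.5 is a reasonable hypothetical threshold. The function name, the thresholds and the example values are all invented for illustration.

```python
from collections import defaultdict
from statistics import median


def flag_copy_number_changes(probes, gain_threshold=0.5, loss_threshold=-0.5):
    """Summarize probe-level log2 ratios per chromosome.

    probes: iterable of (chromosome_label, log2_ratio) pairs.
    The thresholds are hypothetical cut-offs, not values from the thesis.
    Returns a dict mapping each chromosome to "gain", "loss" or "no change".
    """
    by_chromosome = defaultdict(list)
    for chromosome, log2_ratio in probes:
        by_chromosome[chromosome].append(log2_ratio)

    calls = {}
    for chromosome, ratios in by_chromosome.items():
        # The median is less sensitive than the mean to a few outlier probes.
        centre = median(ratios)
        if centre >= gain_threshold:
            calls[chromosome] = "gain"
        elif centre <= loss_threshold:
            calls[chromosome] = "loss"
        else:
            calls[chromosome] = "no change"
    return calls


# Invented example values for two chromosomes of one hypothetical strain.
example_probes = [
    ("CHR 4", 0.05), ("CHR 4", -0.10), ("CHR 4", 0.02),
    ("CHR 13", 0.95), ("CHR 13", 1.10), ("CHR 13", 0.88),
]
calls = flag_copy_number_changes(example_probes)
print(calls)
```

Because every chromosome is summarized in the same pass, a change on a chromosome that nobody thought to assay would still be flagged, which is the advantage over a locus-specific assay.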
From these experiments, it may be possible to observe whether other specific changes in chromosome copy number can be correlated to other phenotypes (e.g., colony morphologies).

This interplay between CCNV and phenotype becomes even more complicated when one considers the pattern that emerges from the results of the second screen with the CBS7779 isolates (Figure 3.12). Although the correlation observed in the first screen between CHR 13 copy number and melanin production no longer applied, it appeared that having an extra copy of CHR 13 was a stable trait (all eleven strains). This stability was also seen in the third screen, where eleven of the twelve strains screened maintained the disomy at CHR 13. This is particularly interesting because aneuploidy can be associated with a reduction in fitness (Torres et al., 2007, Birchler et al., 2007). As mentioned for S. cerevisiae, aneuploidy results in cells which reach saturation at a smaller population size and which lose viability upon prolonged growth in culture (after 5 days) (Torres et al., 2007). This phenotype is also observed in cryptococcal strains with CCNV, in which strains that are disomic at CHR 13 grow at 90% of the rate of cells that are monosomic at CHR 13 (I. Liu, unpublished results). One possible reason for the stability is that the strains were grown in a rich medium (YPD), and previous experiments in C. albicans have shown that aneuploids grown in rich media are more stable than aneuploids grown in nutrient deficient media (Chen et al., 2004, Rustchenko-Bulgac et al., 1990). The stability may also be due to the fact that CBS7779 may have an unidentified mutation which is suppressed by an extra copy of CHR 13, in a manner similar to the extra chromosomes that have been found in 8% of a laboratory yeast deletion set, where they offset deleterious mutations (Hughes et al., 2000).
If this is the case for the strain CBS7779, then losing the disomy may actually cause a reduction instead of a gain in fitness. In any case, the differences in the stability and prevalence of CHR 13 disomy in CBS7779 and WM626 suggest that the mechanism for the development and maintenance of CCNV can be multi-factorial.

The complexities of chromosomal instability are further illustrated by the finding that among the strains that were disomic to begin with, four of the six strains gained a disomy at CHR 4, whereas none of the strains which were monosomic to begin with gained this particular disomy of CHR 4. This disomy of CHR 4 is a completely new development, because subsequent testing of the original strains for CHR 4 failed to show that the parent strain was disomic. This suggests that once a cell is an aneuploid, the subsequent daughter cells can gain extra copies of other chromosomes relatively easily. One possible explanation is that cells which had an extra copy of CHR 13 gained this extra copy because of a mutation in a gene that is important in maintaining genome stability. These could include genes involved in chromatin dynamics, cell cycle control, DNA replication or mitotic chromosome segregation (Ouspenski et al., 1999) as well as genes encoding other DNA binding proteins (de Lahondes et al., 2003). As a result, a mutation in any one of these genes could make it easier for the cell to lose or gain other chromosomes.

4.2.2 CCNV in selected strains of C. neoformans and C. gattii

The discovery of CCNV in two clinical strains (CBS7779 and WM626) prompted an investigation of additional strains to establish the prevalence of the phenomenon. Disomy at CHR 4 and CHR 13 was not found among the strains obtained from the IFM stock center (Research Center for Pathogenic Fungi and Microbial Toxicoses, Chiba University), suggesting that this kind of CCNV may not be common.
It is important to remember, however, that the assay only looked at two of the fourteen chromosomes and there may be disomies at other chromosomes. Thus, future experiments should include CGH experiments with serotype specific arrays to rapidly screen clinical and environmental isolates for changes at all chromosomes. Also, the initial screen and subsequent screens for CCNV in CBS7779 produced different results, suggesting that the host may play an important role in the development and maintenance of CCNV in C. neoformans. In relation to the strains from the IFM stock center, it is not known how many times these strains were passaged in culture. Thus, it is difficult to assess whether the lack of CCNV in the IFM strains is an indication that CCNV is a rare event in C. neoformans or that CCNV is simply more prevalent in freshly isolated clinical strains.

In fact, in contrast to the results with the IFM strains, subsequent qRT-PCR experiments involving strains freshly isolated from patients suggested that CCNV may be very common. Eleven of the thirteen strains showed differences for CHR 4 and three showed differences for CHR 13. These results support the idea that the IFM strains might have lacked the selective pressure to maintain aneuploidy and therefore have balanced haploid genomes. This idea was supported by the fact that aneuploidy was particularly prevalent for the strains freshly isolated from patients (strains beginning with the designation RTC in Table 3.15). One possible explanation for this phenomenon is that there is selective pressure for aneuploidy in the host. Thus, one would predict that subsequent laboratory passaging of strains isolated from patients would result in the loss of CCNV over time. This idea that laboratory passaging can affect both the phenotype and the genotype of C. neoformans has been previously observed. In fact, in vitro passaging can lead to dramatic changes in virulence.
In one case, strain ATCC24067 lost virulence completely (Franzot et al., 1998). Further support for this idea can be found in the fact that in vitro passaging has also been shown to result in karyotype differences (Franzot et al., 1998). This piece of evidence, therefore, again illustrates the need to characterize CCNV in a mouse model as mentioned before.

Not only does there appear to be more CCNV in freshly isolated clinical strains, but there also appears to be heterogeneity within the same patient in terms of the copy number of CHRs 4 and 13 (Table 3.15, strains RTC23 and RTC31). Although chromosomal heterogeneity within a patient has been previously documented (Fries et al., 1996, Almeida et al., 2007), this is the first time that heterogeneity was linked to a specific chromosome / locus. In one study, eight of the patients tested (24%) showed alterations in electrophoretic karyotyping between relapsing cryptococcal infections (Brandt et al., 1996). In another study, it was discovered that the electrophoretic karyotypes of isolates before and after infection in a mouse model also differed (Fries et al., 1996). Also, a survey of 32 clinical isolates in two hospitals revealed a wide range of karyotype patterns, with chromosome numbers ranging from 6 to 11 (Fries et al., 1996). This kind of microevolution on a chromosome scale can become stable if the infection is persistent or chronic (Sukroongreung et al., 2001). Also, the microevolution may be dependent on where in the host these cells are found because isolates from different body sites have been reported to have different karyotypes (Sukroongreung et al., 2001). All of these results support the idea that microevolution within the patient via changes in chromosome copy number may occur in C. neoformans.
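The copy-number comparisons for CHRs 4 and 13 discussed above rest on relative quantification by qRT-PCR. As a hedged illustration of how such a relative copy number is commonly derived, the sketch below uses the widely used comparative Ct (2^-ΔΔCt) calculation. It is not taken from the methods of this thesis: the function name and the Ct values are invented, and the calculation assumes that the target and control primer sets both amplify with close to 100% efficiency, an assumption that breaks down when a strain carries sequence variation at a primer binding site.

```python
def relative_copy_number(ct_target_test, ct_control_test,
                         ct_target_calibrator, ct_control_calibrator):
    """Comparative Ct (2^-ddCt) estimate of relative copy number.

    ct_target_*  : Ct for a locus on the chromosome of interest.
    ct_control_* : Ct for a locus on a chromosome assumed to be single copy.
    *_test       : values measured in the strain being screened.
    *_calibrator : values measured in a reference strain assumed to be euploid.
    Assumes roughly 100% amplification efficiency for both primer sets.
    """
    delta_test = ct_target_test - ct_control_test
    delta_calibrator = ct_target_calibrator - ct_control_calibrator
    delta_delta_ct = delta_test - delta_calibrator
    return 2 ** (-delta_delta_ct)


# Invented Ct values: the target locus crosses threshold one cycle earlier
# in the test strain than in the calibrator, relative to the control locus.
ratio = relative_copy_number(
    ct_target_test=19.0, ct_control_test=20.0,
    ct_target_calibrator=20.0, ct_control_calibrator=20.0,
)
print(ratio)  # 2.0, i.e. consistent with two copies relative to the calibrator
```

A one-cycle shift corresponds to a two-fold difference in template, which is why a disomic chromosome in a haploid background is expected to give a ratio near 2 under these assumptions.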
Because in vitro passaging could result in a loss of aneuploidy, it will be important to perform CGH on cells directly originating from the patient without any laboratory passage. This kind of experiment is now possible because the technique for performing genome amplification on a single cell and the subsequent use of the gDNA for CGH analysis has been developed (Geigl & Speicher, 2007). If a wide range of CCNV is found within the same patient, then the next logical experiment would be to assess whether the strains have any differing phenotypes.

The next question that one may then ask is: why does CCNV seem to be more prevalent in clinical strains? As mentioned, C. neoformans can undergo phenotype switching and previous experiments have shown that switching occurs in vivo. This kind of switching is linked with higher virulence levels (especially if the switch involves changing from a mucoid colony to a smooth colony) (Fries et al., 2001). It has been proposed that switching may cause changes in the capsular polysaccharide which in turn may help the cells evade the immune system (Fries et al., 2001, Fries et al., 2005, Pietrella et al., 2003). If CCNV can generate new phenotypes such as those observed in the first screening of CBS7779 (Figure 3.12), then it may be possible that CCNV could be the mechanism responsible for phenotype switching. Thus, a logical experiment would be to characterize phenotypes such as the capsular components (another common phenotype that switches (Fries et al., 1999)), and test whether CCNV heterogeneity within a patient correlates with phenotypic heterogeneity.

Of course, CCNV may contribute to genetic diversity and evolution. That is, CCNV may provide a mechanism for the expansion of gene families or changes in the expression of genes involved with virulence. Previously, it has been reported that extra copies of genes have been associated with enhanced virulence.
For example, Magnaporthe grisea has an extra copy of a gene involved in melanin biosynthesis when compared to its non-pathogenic fungal relative, Neurospora crassa (Thompson et al., 1997). Other examples of pathogenic fungi which have chromosome number aberrations include Candida parapsilosis, a fungal pathogen associated with intravascular catheter infections in which six of the thirteen clinically isolated strains were aneuploids (Fundyga et al., 2004). In addition, aneuploidy has also been found in other yeast species including S. pastorianus (Bond et al., 2004).

The qRT-PCR experiments, besides revealing information regarding the copy number of CHRs 4 and 13, also provided interesting insights into the sequence variation of the strains. For example, as shown in Table 3.15, there was a low copy number for CHR 13 in strain JP1086 when the primer set CNN00820 was used. Because the melting temperature of the PCR product was the same as the melting temperatures of PCR products from other strains (with the same primer set), the results suggest that the low copy number is not due to sequence deletion, but due to primer binding inefficiency. This in turn suggests that JP1086 may have a sequence variation at CNN00820 despite being of the same molecular type (VNI) as the reference genome (H99). In addition, the relatively large SDs observed in Table 3.15 suggest that there could be a wide range of sequence variation within the same molecular subtype, because large SDs are usually observed when there are sequence polymorphisms (Lim et al., 2008). Given these two observations, the qRT-PCR results suggest that there could be sequence divergences within the same molecular subtype of VNI, and that strain-specific primers may be needed for accurately assessing CCNV.

4.2.3 Final comments on CCNV

A key result in this thesis work is that CCNV appears to be particularly prevalent among clinical strains.
In addition, the data from screening isolates indicate that once CCNV is established, it is a relatively stable trait and it may lead to the generation of other chromosome copy number changes. Also, the presence of CCNV may affect certain virulence factors such as melanin production. Interestingly, this work provides evidence that there could be CCNV heterogeneity within the same patient during the course of an infection. Future experiments need to focus on deciphering the mechanism of CCNV and understanding the effects of CCNV on the phenotypes of C. neoformans.

4.3 Conclusions

Overall, the data presented in this thesis lay the groundwork for future studies into cryptococcal genome variation. The database of genomic variation for serotype A and serotype B strains can be used to generate many hypotheses regarding the roles of specific genes and gene-gene interactions in virulence. This work has also established the methodology for using CGH to identify the recombination history of strains and this may become an important method for mapping segments in the genome that play a role in virulence. In addition, the discovery that there is a bias for CHR 1 in AD strains to retain a serotype A copy strongly suggests that there is a selective advantage to this retention. This paves the way for future experiments involving chromosome tagging and virulence assays to investigate the role of CHR 1 in pathogenesis.

Finally, the discovery that there are chromosome copy number changes in C. neoformans is quite exciting because this phenomenon had not been previously characterized. This will be an important area for future studies that is justified by the prevalence of CCNV in clinical isolates and the possible link between chromosome copy number and virulence traits such as melanin.

REFERENCES

Allen, T. D. & D. L.
Nuss, (2004) Linkage between mitochondrial hypovirulence and viral hypovirulence in the chestnut blight fungus revealed by cDNA microarray analysis. Eukaryot Cell 3: 1227-1232.
Almeida, A. M., M. T. Matsumoto, L. C. Baeza, E. S. R. B. de Oliveira, A. A. Kleiner, S. Melhem Mde & M. J. Mendes Giannini, (2007) Molecular typing and antifungal susceptibility of clinical sequential isolates of Cryptococcus neoformans from Sao Paulo State, Brazil. FEMS Yeast Res 7: 152-164.
Antinori, S., L. Galimberti, C. Magni, A. Casella, L. Vago, F. Mainini, M. Piazza, M. Nebuloni, M. Fasan, C. Bonaccorso, G. M. Vigevani, A. Cargnel, M. Moroni & A. Ridolfo, (2001) Cryptococcus neoformans infection in a cohort of Italian AIDS patients: natural history, early prognostic parameters, and autopsy findings. Eur J Clin Microbiol Infect Dis 20: 711-717.
Appelberg, R., (2006) Macrophage nutriprive antimicrobial mechanisms. J Leukoc Biol 79: 1117-1128.
Archibald, L. K., M. J. Tuohy, D. A. Wilson, O. Nwanyanwu, P. N. Kazembe, S. Tansuphasawadikul, B. Eampokalap, A. Chaovavanich, L. B. Reller, W. R. Jarvis, G. S. Hall & G. W. Procop, (2004) Antifungal susceptibilities of Cryptococcus neoformans. Emerg Infect Dis 10: 143-145.
Bahn, Y. S., K. Kojima, G. M. Cox & J. Heitman, (2005) Specialization of the HOG pathway and its impact on differentiation and virulence of Cryptococcus neoformans. Mol Biol Cell 16: 2285-2300.
Barelle, C. J., C. L. Priest, D. M. MacCallum, N. A. R. Gow, F. C. Odds & A. J. P. Brown, (2006) Niche-specific regulation of central metabolic pathways in a fungal pathogen. Cell Microbiol 8: 961-971.
Barreto de Oliveira, M. T., T. Boekhout, B. Theelen, F. Hagen, F. A. Baroni, M. S. Lazera, K. B. Lengeler, J. Heitman, I. N. Rivera & C. R. Paula, (2004) Cryptococcus neoformans shows a remarkable genotypic diversity in Brazil. J Clin Microbiol 42: 1356-1359.
Bartlett, K. H., S. E. Kidd & J. W. Kronstad, (2008) The Emergence of Cryptococcus gattii in British Columbia and the Pacific Northwest.
Curr Infect Dis Rep 10: 58-65.
Barton, R. C. & K. Gull, (1992) Isolation, characterization, and genetic analysis of monosomic, aneuploid mutants of Candida albicans. Mol Microbiol 6: 171-177.
Bennett, J. E., K. J. Kwon-Chung & D. H. Howard, (1977) Epidemiologic differences among serotypes of Cryptococcus neoformans. Am J Epidemiol 105: 582-586.
Bicanic, T. & T. S. Harrison, (2004) Cryptococcal meningitis. British Medical Bulletin 72: 99-118.
Birchler, J. A., H. Yao & S. Chudalayandi, (2007) Biological consequences of dosage dependent gene regulatory systems. Biochim Biophys Acta 1769: 422-428.
Boekhout, T. & A. van Belkum, (1997) Variability of karyotypes and RAPD types in genetically related strains of Cryptococcus neoformans. Curr Genet 32: 203-208.
Boekhout, T., A. van Belkum, A. C. Leenders, H. A. Verbrugh, P. Mukamurangwa, D. Swinne & W. A. Scheffers, (1997) Molecular typing of Cryptococcus neoformans: taxonomic and epidemiological aspects. Int J Syst Bacteriol 47: 432-442.
Bond, U., C. Neal, D. Donnelly & T. C. James, (2004) Aneuploidy and copy number breakpoints in the genome of lager yeasts mapped by microarray hybridisation. Curr Genet 45: 360-370.
Bose, I., A. J. Reese, J. J. Ory, G. Janbon & T. L. Doering, (2003) A yeast under cover: the capsule of Cryptococcus neoformans. Eukaryot Cell 2: 655-663.
Brandt, M. E., S. L. Bragg & R. W. Pinner, (1993) Multilocus enzyme typing of Cryptococcus neoformans. J Clin Microbiol 31: 2819-2823.
Brandt, M. E., M. A. Pfaller, R. A. Hajjeh, E. A. Graviss, J. Rees, E. D. Spitzer, R. W. Pinner & L. W. Mayer, (1996) Molecular subtypes and antifungal susceptibilities of serial Cryptococcus neoformans isolates in human immunodeficiency virus-associated Cryptococcosis. Cryptococcal Disease Active Surveillance Group. J Infect Dis 174: 812-820.
Campbell, L. T., J. A. Fraser, C. B. Nichols, F. S. Dietrich, D. Carter & J.
Heitman, (2005) Clinical and environmental isolates of Cryptococcus gattii from Australia that retain sexual fecundity. Eukaryot Cell 4: 1410-1419.
Chandenier, J., K. D. Adou-Bryn, C. Douchet, B. Sar, M. Kombila, D. Swinne, M. Therizol-Ferly, Y. Buisson & D. Richard-Lenoble, (2004) In vitro activity of amphotericin B, fluconazole and voriconazole against 162 Cryptococcus neoformans isolates from Africa and Cambodia. Eur J Clin Microbiol Infect Dis 23: 506-508.
Chaturvedi, S., P. Ren, S. D. Narasipura & V. Chaturvedi, (2005) Selection of optimal host strain for molecular pathogenesis studies on Cryptococcus gattii. Mycopathologia 160: 207-215.
Chen, X., B. B. Magee, D. Dawson, P. T. Magee & C. A. Kumamoto, (2004) Chromosome 1 trisomy compromises the virulence of Candida albicans. Mol Microbiol 51: 551-565.
Clancy, C. J., M. H. Nguyen, R. Alandoerffer, S. Cheng, K. Iczkowski, M. Richardson & J. R. Graybill, (2006) Cryptococcus neoformans var grubii isolates recovered from persons with AIDS demonstrate a wide range of virulence during murine meningoencephalitis that correlates with the expression of certain virulence factors. Microbiol 152: 2247-2255.
Cogliati, M., M. C. Esposto, D. L. Clarke, B. L. Wickes & M. A. Viviani, (2001) Origin of Cryptococcus neoformans var. neoformans diploid strains. J Clin Microbiol 39: 3889-3894.
Corbett, E. L., G. J. Churchyard, S. Charalambos, B. Samb, V. Moloi, T. C. Clayton, A. D. Grant, J. Murray, R. J. Hayes & K. M. d. Cock, (2002) Morbidity and mortality in South African gold miners: impact of untreated disease due to human immunodeficiency virus. Clin Infect Dis 34: 1251-1258.
Coste, A., A. Selmecki, A. Forche, D. Diogo, M. E. Bougnoux, C. d’Enfert, J. Berman & D. Sanglard, (2007) Genotypic evolution of azole resistance mechanisms in sequential Candida albicans isolates. Eukaryot Cell 6: 1889-1904.
Coste, A., V. Turner, F. Ischer, J. Morschhauser, A. Forche, A. Selmecki, J. Berman, J. Bille & D.
Sanglard, (2006) A mutation in Tac1p, a transcription factor regulating CDR1 and CDR2, is coupled with loss of heterozygosity at chromosome 5 to mediate antifungal resistance in Candida albicans. Genetics 172: 2139-2156.
D’Souza, C. A., J. A. Alspaugh, C. Yue, T. Harashima, G. M. Cox, J. R. Perfect & J. Heitman, (2001) Cyclic AMP-dependent protein kinase controls virulence of the fungal pathogen Cryptococcus neoformans. Mol Cell Biol 21: 3179-3191.
de Lahondes, R., V. Ribes & B. Arcangioli, (2003) Fission yeast Sap1 protein is essential for chromosome stability. Eukaryot Cell 2: 910-921.
Dwight, S. S., M. A. Harris, K. Dolinski, C. A. Ball, G. Binkley, K. R. Christie, D. G. Fisk, L. Issel-Tarver, M. Schroeder, G. Sherlock, A. Sethuraman, S. Weng, D. Botstein & J. M. Cherry, (2002). Nucleic Acids Res 1: 69-72.
Ellis, D., D. Marriott, R. A. Hajjeh, D. Warnock, W. Meyer & R. Barton, (2000) Epidemiology: surveillance of fungal infections. Med Mycol 38 Suppl 1: 173-182.
Ellison, D. W., T. R. Clark, D. E. Sturdevant, K. Virtaneva, S. F. Porcella & T. Hackstadt, (2008) Genomic comparison of virulent Rickettsia rickettsii and avirulent Rickettsia rickettsii. Infect Immun 76: 542-550.
Ferreira, I. D., V. E. Rosario & P. V. Cravo, (2006) Real-time quantitative PCR with SYBR Green I detection for estimating copy numbers of nine drug resistance candidate genes in Plasmodium falciparum. Malar J 5: 1.
Forche, A., K. Alby, D. Schaefer, A. D. Johnson, J. Berman & R. J. Bennett, (2008) The Parasexual Cycle in Candida albicans provides an alternative pathway to meiosis for the formation of recombinant strains. PLoS Biol 6: e110.
Franzot, S. P., J. Mukherjee, R. Cherniak, L. C. Chen, J. S. Hamdan & A. Casadevall, (1998) Microevolution of a standard strain of Cryptococcus neoformans resulting in differences in virulence and other phenotypes. Infect Immun 66: 89-97.
Fraser, J. A., S. Diezmann, R. L. Subaran, A. Allen, K. B. Lengeler, F. S. Dietrich & J.
Heitman, (2004) Convergent evolution of chromosomal sex-determining regions in the animal and fungal kingdoms. PLoS Biol 2: e384.
Fraser, J. A., S. S. Giles, E. C. Wenink, S. G. Geunes-Boyer, J. R. Wright, S. Diezmann, A. Allen, J. E. Stajich, F. S. Dietrich, J. R. Perfect & J. Heitman, (2005a) Same-sex mating and the origin of the Vancouver Island Cryptococcus gattii outbreak. Nature 437: 1360-1364.
Fraser, J. A., J. C. Huang, R. Pukkila-Worley, J. A. Alspaugh, T. G. Mitchell & J. Heitman, (2005b) Chromosomal translocation and segmental duplication in Cryptococcus neoformans. Eukaryot Cell 4: 401-406.
Fraser, J. A., R. L. Subaran, C. B. Nichols & J. Heitman, (2003) Recapitulation of the sexual cycle of the primary fungal pathogen Cryptococcus neoformans var. gattii: Implications for an outbreak on Vancouver Island, Canada. Eukaryot Cell 2: 1036-1045.
Fries, B. C. & A. Casadevall, (1998) Serial isolates of Cryptococcus neoformans from patients with AIDS differ in virulence for mice. J Infect Dis 178: 1761-1766.
Fries, B. C., F. Chen, B. P. Currie & A. Casadevall, (1996) Karyotype instability in Cryptococcus neoformans infection. J Clin Microbiol 34: 1531-1534.
Fries, B. C., E. Cook, X. Wang & A. Casadevall, (2005) Effects of antifungal interventions on the outcome of experimental infections with phenotypic switch variants of Cryptococcus neoformans. Antimicrob Agents Chemother 49: 350-357.
Fries, B. C., D. L. Goldman, R. Cherniak, R. Ju & A. Casadevall, (1999) Phenotypic switching in Cryptococcus neoformans results in changes in cellular morphology and glucuronoxylomannan structure. Infect Immun 67: 6076-6083.
Fries, B. C., C. P. Taborda, E. Serfass & A. Casadevall, (2001) Phenotypic switching of Cryptococcus neoformans occurs in vivo and influences the outcome of infection. J Clin Invest 108: 1639-1648.
Fundyga, R. E., R. J. Kuykendall, W. Lee-Yang & T. J. Lott, (2004) Evidence for aneuploidy and recombination in the human commensal yeast Candida parapsilosis.
Infect Genet Evol 4: 37-43.
Geigl, J. B. & M. R. Speicher, (2007) Single-cell isolation from cell suspensions and whole genome amplification from single cells to provide templates for CGH analysis. Nat Protoc 2: 3173-3184.
Giles, S. S., A. K. Zaas, M. F. Reidy, J. R. Perfect & J. R. Wright, (2007) Cryptococcus neoformans is resistant to surfactant protein A mediated host defense mechanisms. PLoS ONE 2: e1370.
Gressmann, H., B. Linz, R. Ghai, K.-P. Pleissner, R. Schlapbach, Y. Yamaoka, C. Kraft, S. Suerbaum, T. F. Meyer & M. Achtman, (2005) Gain and loss of multiple genes during the evolution of Helicobacter pylori. PLoS Genetics 1: e43.
Grigg, M. E., S. Bonnefoy, A. B. Hehl, Y. Suzuki & J. C. Boothroyd, (2005) Success and virulence in Toxoplasma as the result of sexual recombination between two distinct ancestries. Science 294: 161-165.
Guerrero, A., N. Jain, D. L. Goldman & B. C. Fries, (2006) Phenotypic switching in Cryptococcus neoformans. Microbiol 152: 3-9.
Hakim, J. G., I. T. Gangaidzo, R. S. Heyderman, J. Mielke, E. Mushangi, A. Taziwa, V. J. Robertson, P. Musvaire & P. R. Mason, (2000) Impact of HIV infection on meningitis in Harare, Zimbabwe: a prospective study of 406 predominantly adult patients. AIDS 14: 1401.
Harriff, M. J., M. Wu, M. L. Kent & L. E. Bermudez, (2008) Species of environmental mycobacteria differ in their abilities to grow in human, mouse, and carp macrophages and with regard to the presence of mycobacterial virulence genes, as observed by DNA microarray hybridization. Appl Environ Microbiol 74: 275-285.
Harvala, H., H. Kalimo, L. Dahllund, J. Santti, P. Hughes, T. Hyypia & G. Stanway, (2002) Mapping of tissue tropism determinants in coxsackievirus genomes. J Gen Virol 83: 1697-1706.
Heitman, J., B. Allen, A. Alspaugh & K. J. Kwon-Chung, (1999) On the origins of congenic MATalpha and MATa strains of the pathogenic yeast Cryptococcus neoformans. Fungal Genet Biol 28: 1-5.
Henry, I. M., B. P. Dilkes & L.
Comai, (2006) Molecular karyotyping and aneuploidy detection in Arabidopsis thaliana using quantitative fluorescent polymerase chain reaction. Plant J 48: 307-319.
Herbert, M. A., C. J. E. Beveridge, D. McCormick, E. Aten, N. Jones, L. A. S. Snyder & N. J. Saunders, (2005) Genetic islands of Streptococcus agalactiae strains NEM316 and 2603VR and their presence in other Group B Streptococcal strains. BMC Microbiol 5: 31.
Hiremath, S. S., A. Chowdhary, T. Kowshik, H. S. Randhawa, S. Sun & J. Xu, (2008) Long-distance dispersal and recombination in environmental populations of Cryptococcus neoformans var. grubii from India. Microbiology 154: 1513-1524.
Hoang, L. M. N., J. A. Maguire, P. Doyle, M. Fyfe & D. L. Roscoe, (2004) Cryptococcus neoformans infections at Vancouver Hospital and Health Science Centre (1997-2002): epidemiology, microbiology and histopathology. J Med Microbiol 53: 935-940.
Horst, C. M. v. d., M. S. Saag, G. A. Cloud, R. J. Hamill, J. R. Graybill, J. D. Sobel, P. C. Johnson, C. U. Tuazon, T. Kerkering, B. L. Moskovitz, W. G. Powderly & W. E. Dismukes, (1997) Treatment of cryptococcal meningitis associated with the acquired immunodeficiency syndrome. National Institute of Allergy and Infectious Diseases Mycoses Study Group and AIDS Clinical Trials Group. N Engl J Med 337: 15-21.
Hu, G. & J. W. Kronstad, (2006) Gene disruption in Cryptococcus neoformans and Cryptococcus gattii by in vitro transposition. Curr Genet 49: 341-350.
Hu, G., I. Liu, A. Sham, J. E. Stajich, F. S. Dietrich & J. W. Kronstad, (2008) Comparative hybridization reveals extensive genome variation in the AIDS-associated pathogen Cryptococcus neoformans. Genome Biol 9: R41.
Hu, G., B. R. Steen, T. Lian, A. P. Sham, N. Tam, K. L. Tangen & J. W. Kronstad, (2007) Transcriptional regulation by protein kinase A in Cryptococcus neoformans. PLoS Pathog 3: e42.
Hughes, T. R., C. J. Roberts, H. Dai, A. R. Jones, M. R. Meyer, D. Slade, J. Burchard, S. Dow, T. R. Ward, M. J.
Kidd, S. H. Friend & M. J. Marton, (2000) Widespread aneuploidy revealed by DNA microarray expression profiling. Nat Genet 25: 333-337.
Hull, C. M. & J. Heitman, (2002) Genetics of Cryptococcus neoformans. Annu Rev Genet 36: 557-615.
Idnurm, A., Y.-S. Bahn, K. Nielsen, X. Lin, J. A. Fraser & J. Heitman, (2005) Deciphering the model pathogenic fungus Cryptococcus neoformans. Nature Rev Microbiol 3: 753-764.
Ikeda, R., T. Sugita, E. S. Jacobson & T. Shinoda, (2003) Effects of melanin upon susceptibility of Cryptococcus to antifungals. Microbiol Immunol 47: 271-277.
Ingham, D. J., S. Beer, S. Money & G. Hansen, (2001) Quantitative real-time PCR assay for determining transgene copy number in transformed plants. Biotechniques 31: 132-134, 136-140.
James, A. P., E. R. Inhaber & G. J. Prefontaine, (1974) Lethal sectoring and the delayed induction of aneuploidy in yeast. Genetics 77: 1-9.
Janbon, G., (2004) Cryptococcus neoformans capsule biosynthesis and regulation. FEMS Yeast Res 4: 765-771.
Janbon, G., F. Sherman & E. Rustchenko, (1998) Monosomy of a specific chromosome determines L-sorbose utilization: a novel regulatory mechanism in Candida albicans. Proc Natl Acad Sci USA 95: 5150-5155.
Jiang, R. H., R. Weide, P. J. van de Vondervoort & F. Govers, (2006) Amplification generates modular diversity at an avirulence locus in the pathogen Phytophthora. Genome Res 16: 827-840.
Kavanaugh, L. A., J. A. Fraser & F. S. Dietrich, (2006) Recent evolution of the human pathogen Cryptococcus neoformans by intervarietal transfer of a 14-gene fragment. Mol Biol Evol 23: 1879-1890.
Kidd, S. E., Y. Chow, S. Mak, P. J. Bach, H. Chen, A. O. Hingston, J. W. Kronstad & K. H. Bartlett, (2007) Characterization of environmental sources of the human and animal pathogen Cryptococcus gattii in British Columbia, Canada, and the Pacific Northwest of the United States. Appl Environ Microbiol 73: 1433-1443.
Kidd, S. E., H. Guo, K. H. Bartlett, J. Xu & J. W.
Kronstad, (2005) Comparative gene genealogies indicate that two clonal lineages of Cryptococcus gattii in British Columbia resemble strains from other geographical areas. Eukaryot Cell 4: 1629-1638.
Koide, T., P. A. Zaini, L. M. Moreira, R. Z. Vencio, A. Y. Matsukuma, A. M. Durham, D. C. Teixeira, H. El-Dorry, P. B. Monteiro, A. C. da Silva, S. Verjovski-Almeida, A. M. da Silva & S. L. Gomes, (2004) DNA microarray-based genome comparison of a pathogenic and a nonpathogenic strain of Xylella fastidiosa delineates genes important for bacterial virulence. J Bacteriol 186: 5442-5449.
Koressaar, T. & M. Remm, (2007) Enhancements and modifications of primer design program Primer3. Bioinformatics 23: 1289-1291.
Kwon-Chung, K. J. & A. Varma, (2006) Do major species concepts support one, two or more species within Cryptococcus neoformans? FEMS Yeast Res 6: 574-587.
Legrand, M., C. L. Chan, P. A. Jauert & D. T. Kirkpatrick, (2007) Role of DNA Mismatch Repair and Double-Strand Break Repair in Genome Stability and Antifungal Drug Resistance in Candida albicans. Eukaryot Cell 6: 2194-2205.
Lengeler, K. B., G. M. Cox & J. Heitman, (2001) Serotype AD strains of Cryptococcus neoformans are diploid or aneuploid and are heterozygous at the mating-type locus. Infect Immun 69: 115-122.
Lengeler, K. B., P. Wang, G. M. Cox, J. R. Perfect & J. Heitman, (2000) Identification of the MATa mating-type locus of Cryptococcus neoformans reveals a serotype A MATa strain thought to have been extinct. Proc Natl Acad Sci USA 97: 14455-14460.
Lim, J., H. Do, S. G. Shin & S. Hwang, (2008) Primer and probe sets for group-specific quantification of the genera Nitrosomonas and Nitrosospira using real-time PCR. Biotechnol Bioeng 99: 1374-1383.
Lin, X. & J. Heitman, (2006) The biology of Cryptococcus neoformans species complex. Annu Rev Microbiol 60: 69-105.
Lin, X., A. P. Litvintseva, K. Nielsen, S. Patel, A. Floyd, T. G. Mitchell & J.
Heitman, (2007) alpha AD alpha hybrids of Cryptococcus neoformans: evidence of same-sex mating in nature and hybrid fitness. PLoS Genet 3: 1975-1990.
Lin, X., K. Nielsen, S. Patel & J. Heitman, (2008) Impact of Mating Type, Serotype, and Ploidy on Virulence of Cryptococcus neoformans. Infect Immun 21: 21.
Litvintseva, A. P., L. Kestenbaum, R. Vilgalys & T. G. Mitchell, (2005) Comparative analysis of environmental and clinical populations of Cryptococcus neoformans. J Clin Microbiol 43: 556-564.
Litvintseva, A. P., X. Lin, I. Templeton, J. Heitman & T. G. Mitchell, (2007) Many globally isolated AD hybrid strains of Cryptococcus neoformans originated in Africa. PLoS Pathog 3: e114.
Litvintseva, A. P., R. E. Marra, K. Nielsen, J. Heitman, R. Vilgalys & T. G. Mitchell, (2003) Evidence of sexual recombination among Cryptococcus neoformans Serotype A isolates in Sub-Saharan Africa. Eukaryot Cell 2: 1162-1168.
Litvintseva, A. P., R. Thakur, R. Vilgalys & T. G. Mitchell, (2006) Multilocus sequence typing reveals three genetic subpopulations of Cryptococcus neoformans var. grubii (Serotype A), including a unique population in Botswana. Genetics 172: 2223-2238.
Livak, K. J. & T. D. Schmittgen, (2001) Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods 25: 402-408.
Lorenz, M. C. & G. R. Fink, (2002) Life and death in a macrophage: Role of the glyoxylate cycle in virulence. Eukaryot Cell 1: 657-662.
MacDougall, L. & M. Fyfe, (2006) Emergence of Cryptococcus gattii in a novel environment provides clues to its incubation period. J Clin Microbiol 44: 1851-1852.
Mao, Z. Q., Y. Huang, M. Sun, Q. Rua, Y. Qi, R. He, Y. J. Huang, Y. P. Ma, Y. H. Ji, Z. R. Sun & H. Gao, (2007) Genetic polymorphism of UL144 open reading frame of human cytomegalovirus DNA detected in colon samples from infants with Hirschsprung’s disease. World J Gastroenterol 13: 4350-4354.
Marchler-Bauer, A., J. B. Anderson, M. K.
Derbyshire, C. DeWeese-Scott, N. R. Gonzales, M. Gwadz, L. Hao, S. He, D. I. Hurwitz, J. D. Jackson, Z. Ke, D. Krylov, C. J. Lanczycki, C. A. Liebert, C. Liu, F. Lu, S. Lu, G. H. Marchler, M. Mullokandov, J. S. Song, N. Thanki, R. A. Yamashita, J. J. Yin, D. Zhang & S. H. Bryant, (2007) CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Res 35: D237-240.
Marichal, P., H. Vanden Bossche, F. C. Odds, G. Nobels, D. W. Warnock, V. Timmerman, C. Van Broeckhoven, S. Fay & P. Mose-Larsen, (1997) Molecular biological characterization of an azole-resistant Candida glabrata isolate. Antimicrob Agents Chemother 41: 2229-2237.
Massey, R. C. & A. Buckling, (2002) Environmental regulation of mutation rates at specific sites. Trends Microbiol 10: 580-584.
McClelland, E. E., F. R. Adler, D. L. Granger & W. K. Potts, (2004) Major histocompatibility complex controls the trajectory but not host-specific adaptation during virulence evolution of the pathogenic fungus Cryptococcus neoformans. Proc Biol Sci 271: 1557-1564.
McClelland, E. E., W. T. Perrine, W. K. Potts & A. Casadevall, (2005) Relationship of virulence factor expression to evolved virulence in mouse-passaged Cryptococcus neoformans lines. Infect Immun 73: 7047-7050.
Meyer, W., K. Marszewska, M. Amirmostofian, R. P. Igreja, C. Hardtke, K. Methling, M. A. Viviani, A. Chindamporn, S. Sukroongreung, M. A. John, D. H. Ellis & T. C. Sorrell, (1999) Molecular typing of global isolates of Cryptococcus neoformans var. neoformans by polymerase chain reaction fingerprinting and randomly amplified polymorphic DNA - a pilot study to standardize techniques on which to base a detailed epidemiological survey. Electrophoresis 20: 1790-1799.
Meyer, W. & T. G. Mitchell, (1995) Polymerase chain reaction fingerprinting in fungi using single primers specific to minisatellites and simple repetitive DNA sequences: strain variation in Cryptococcus neoformans. Electrophoresis 16: 1648-1656.
Mitchell, T. G. & J.
R. Perfect, (1995) Cryptococcosis in the era of AIDS - 100 years after the discovery of Cryptococcus neoformans. Clin Microbiol Rev 8: 515-548.
Molina, Y., S. B. Ramos, T. Douglass & L. S. Klig, (1999) Inositol synthesis and catabolism in Cryptococcus neoformans. Yeast 15: 1657-1667.
Myers, C. L., M. J. Dunham, S. Y. Kung & O. G. Troyanskaya, (2004) Accurate detection of aneuploidies in array CGH and gene expression microarray data. Bioinformatics 20: 3533-3543.
Nadal, M. & S. E. Gold, (2007) Sex in broad daylight: turning a new leaf-fungal style. Cell Host Microbe 1: 246-248.
Nash, J. H. E., W. A. Findlay, C. C. Luebbert, O. L. Mykytczuk, S. J. Foote, E. N. Taboada, C. D. Carrillo, J. M. Boyd, D. J. Colquhoun, M. E. Reith & L. L. Brown, (2006) Comparative genomics profiling of clinical isolates of Aeromonas salmonicida using DNA microarrays. BMC Genomics 7: 43.
Nazi, I., A. Scott, A. Sham, L. Rossi, P. R. Williamson, J. W. Kronstad & G. D. Wright, (2007) Role of homoserine transacetylase as a new target for antifungal agents. Antimicrob Agents Chemother 51: 1731-1736.
Nielsen, K., G. M. Cox, A. P. Litvintseva, E. Mylonakis, S. D. Malliaris, J. Daniel K. Benjamin, S. S. Giles, T. G. Mitchell, A. Casadevall, J. R. Perfect & J. Heitman, (2005a) Cryptococcus neoformans alpha strains preferentially disseminate to the central nervous system during coinfection. Infect Immun 73: 4922-4933.
Nielsen, K., R. E. Marra, F. Hagen, T. Boekhout, T. G. Mitchell, G. M. Cox & J. Heitman, (2005b) Interaction between genetic background and the mating-type locus in Cryptococcus neoformans virulence potential. Genetics 171: 975-983.
Niwa, O., Y. Tange & A. Kurabayashi, (2006) Growth arrest and chromosome instability in aneuploid yeast. Yeast 23: 937-950.
Nunes, L. R., Y. B. Rosato, N. H. Muto, G. M. Yanai, V. S. da Silva, D. B. Leite, E. R. Goncalves, A. A. de Souza, H. D. Coletta-Filho, M. A. Machado, S. A. Lopes & R. C.
de Oliveira, (2003) Microarray analyses of Xylella fastidiosa provide evidence of coordinated transcription control of laterally transferred elements. Genome Res 13: 570-578.
Ohkusu, M., N. Tangonan, K. Takeo, E. Kishida, M. Ohkubo, S. Aoki, K. Nakamura, T. Fujii, I. C. Siqueira, E. A. Maciel, S. Sakabe, G. M. Almeida, E. M. Heins-Vaccari & S. Lacaz Cda, (2002) Serotype, mating type and ploidy of Cryptococcus neoformans strains isolated from patients in Brazil. Rev Inst Med Trop Sao Paulo 44: 299-302.
Ouspenski, I. I., S. J. Elledge & B. R. Brinkley, (1999) New yeast genes important for chromosome integrity and segregation identified by dosage effects on genome stability. Nucleic Acids Res 27: 3001-3008.
Panepinto, J., L. Liu, J. Ramos, X. Zhu, T. Valyi-Nagy, S. Eksi, J. Fu, H. A. Jaffe, B. Wickes & P. R. Williamson, (2005) The DEAD-box RNA helicase Vad1 regulates multiple virulence-associated genes in Cryptococcus neoformans. J Clin Invest 115: 632-641.
Parry, J. M., D. Sharp & E. M. Parry, (1979) Detection of mitotic and meiotic aneuploidy in the yeast Saccharomyces cerevisiae. Environ Health Perspect 31: 97-111.
Perepnikhatka, V., F. J. Fischer, M. Niimi, R. A. Baker, R. D. Cannon, Y. K. Wang, F. Sherman & E. Rustchenko, (1999) Specific chromosome alterations in fluconazole-resistant mutants of Candida albicans. J Bacteriol 181: 4041-4049.
Pietrella, D., B. Fries, P. Lupo, F. Bistoni, A. Casadevall & A. Vecchiarelli, (2003) Phenotypic switching of Cryptococcus neoformans can influence the outcome of the human immune response. Cell Microbiol 5: 513-522.
Pitkin, J. W., D. G. Panaccione & J. D. Walton, (1996) A putative cyclic peptide efflux pump encoded by the TOXA gene of the plant-pathogenic fungus Cochliobolus carbonum. Microbiology 142: 1557-1565.
Powell, A. J., G. C. Conant, D. E. Brown, I. Carbone & R. A. Dean, (2008) Altered patterns of gene duplication and differential gene gain and loss in fungal pathogens. BMC Genomics 9: 147.
Pukkila-Worley, R., Q. D. Gerrald, P. R. Kraus, M. Boily, M. J. Davis, S. S. Giles, G. M. Cox, J. Heitman & J. A. Alspaugh, (2005) Transcriptional network of multiple capsule and melanin genes governed by the Cryptococcus neoformans cyclic AMP cascade. Eukaryot Cell 4: 190-201.
Rajashekara, G., J. D. Glasner, D. A. Glover & G. A. Splitter, (2004) Comparative Whole Genome Hybridization reveals genomic islands in Brucella species. J Bacteriol 186: 5040-5051.
Ramirez, M. A. & M. C. Lorenz, (2007) Mutations in alternative carbon utilization pathways in Candida albicans attenuate virulence and confer pleiotropic phenotypes. Eukaryot Cell 6: 280-290.
Rustchenko-Bulgac, E. P., F. Sherman & J. B. Hicks, (1990) Chromosomal rearrangements associated with morphological mutants provide a means for genetic variation of Candida albicans. J Bacteriol 172: 1276-1283.
Schutte, C. M., C. H. V. d. Meyden & D. S. Magazi, (2000) The impact of HIV on meningitis as seen at a South African academic hospital (1994 to 1998). Infection 28: 3-7.
Selmecki, A., S. Bergmann & J. Berman, (2005) Comparative genome hybridization reveals widespread aneuploidy in Candida albicans laboratory strains. Mol Microbiol 55: 1553-1565.
Selmecki, A., A. Forche & J. Berman, (2006) Aneuploidy and isochromosome formation in drug-resistant Candida albicans. Science 313: 367-370.
Selzer, R. R., T. A. Richmond, N. J. Pofahl, R. D. Green, P. S. Eis, P. Nair, A. R. Brothman & R. L. Stallings, (2005) Analysis of chromosome breakpoints in neuroblastoma at sub-kilobase resolution using fine-tiling oligonucleotide array CGH. Genes Chromosomes Cancer 44: 305-319.
Sukroongreung, S., S. Lim, S. Tantimavanich, B. Eampokalap, D. Carter, C. Nilakul, S. Kulkeratiyut & S. Tansuphaswadikul, (2001) Phenotypic switching and genetic diversity of Cryptococcus neoformans. J Clin Microbiol 39: 2060-2064.
Suzuki, T., I. Kobayashi, T. Kanbe & K.
Tanaka, (1989) High frequency variation of colony morphology and chromosome reorganization in the pathogenic yeast Candida albicans. J Gen Microbiol 135: 425-434.
Taboada, E. N., R. R. Acedillo, C. D. Carrillo, W. A. Findlay, D. T. Medeiros, O. L. Mykytczuk, M. J. Roberts, C. A. Valencia, J. M. Farber & J. H. Nash, (2004) Large scale comparative genomics meta-analysis of Campylobacter jejuni isolates reveals low level of genome plasticity. J Clin Microbiol 24: 4566-4576.
Taboada, E. N., R. R. Acedillo, C. C. Luebbert, W. A. Findlay & J. H. E. Nash, (2005) A new approach for the analysis of bacterial microarray-based Comparative Genome Hybridization: insights from an empirical study. BMC Genomics 6: 78.
Tanaka, E., S. Ito-Kuwa, K. Nakamura, S. Aoki, V. Vidotto & M. Ito, (2005) Comparisons of the laccase gene among serotypes and melanin-deficient variants of Cryptococcus neoformans. Microbiol Immunol 49: 209-217.
Tanaka, R., K. Nishimura & M. Miyaji, (1999) Ploidy of serotype AD strains of Cryptococcus neoformans. Nippon Ishinkin Gakkai Zasshi 40: 31-34.
Thompson, J. E., G. S. Basarab, A. Andersson, Y. Lindqvist & D. B. Jordan, (1997) Trihydroxynaphthalene reductase from Magnaporthe grisea: realization of an active center inhibitor and elucidation of the kinetic mechanism. Biochemistry 36: 1852-1860.
Torres, E. M., T. Sokolsky, C. M. Tucker, L. Y. Chan, M. Boselli, M. J. Dunham & A. Amon, (2007) Effects of aneuploidy on cellular physiology and cell division in haploid yeast. Science 317: 916-924.
Ulaszewski, S., J. R. Woodward & V. P. Cirillo, (1978) Membrane damage associated with inositol-less death in Saccharomyces cerevisiae. J Bacteriol 136: 49-54.
Voullaire, L. & L. Wilton, (2007) Comparative genomic hybridization on single cells. Methods Mol Med 132: 101-115.
Waghmare, S. K. & C. V. Bruschi, (2005) Differential chromosome control of ploidy in the yeast Saccharomyces cerevisiae. Yeast 22: 625-639.
Walton, F. J., A. Idnurm & J.
Heitman, (2005) Novel gene functions required for melanization of the human pathogen Cryptococcus neoformans. Mol Microbiol 57: 1381-1396.
Wang, F. X., H. Zhou, H. Ling, H. Z. Zhou, W. H. Liu, Y. M. Shao & J. Zhou, (2007) Subtype and sequence analysis of HIV-1 strains in Heilongjiang Province. Chin Med J 22: 2006-2010.
Watanabe, T., Y. Murata, S. Oka & H. Iwahashi, (2004) A new approach to species determination for yeast strains: DNA microarray-based comparative genomic hybridization using yeast DNA microarray with 6000 genes. Yeast 21: 351-365.
Winzeler, E. A., D. R. Richards, A. R. Conway, A. L. Goldstein, S. Kalman, M. J. McCullough, J. H. McCusker, D. A. Stevens, L. Wodicka, D. J. Lockhart & R. W. Davis, (1998) Direct allelic variation scanning of the yeast genome. Science 281: 1194-1197.
Wright, L. C., R. M. Santangelo, R. Ganendren, J. Payne, S. T. Djordjevic & T. C. Sorrell, (2007) Cryptococcal lipid metabolism: phospholipase B1 is implicated in transcellular metabolism of macrophage-derived lipids. Eukaryot Cell 6: 37-47.
Wu, M. L., T. P. Lin, M. Y. Lin, Y. P. Cheng & S. Y. Hwang, (2007) Divergent evolution of the chloroplast small heat shock protein gene in the genera Rhododendron (Ericaceae) and Machilus (Lauraceae). Ann Bot 99: 461-475.
Zhong, J., S. Frases, H. Wang, A. Casadevall & P. E. Stark, (2008) Following fungal melanin biosynthesis with solid-state NMR: Biopolymer molecular structures and possible connections to cell-wall polysaccharides. Biochem 47: 4701-4710.

Appendix A. Database of genomic variation observed in serotype A strains via CGH
Regions of difference in the genomes of four serotype A strains compared with the sequenced genome of strain H99. Regions of difference that overlap are in the same colour. The first column indicates the chromosome (CHR) number.
T he su bs eq ue nt c o lu m ns a re la be le d a s fo llo w s: a N uc le ot id e c o o rd in at es o f th e s e gm en t id en tif ie d by C G H . bG le an n u m be r fo r th e H 99 ge ne in th e re gi on ba se d o n th e a n n o ta tio n a t th e B ro ad In st itu te . c G er ]3 fl ID o f to p B L A ST n. T he E -v al ue o f th e B L A ST n re s u lt is in cl ud ed in th e fo llo w in g c o lu m n. dC oo rd jn ate s o f th e sp ec if ic ge ne in th e se gm en t id en tif ie d by C G H . e Fn nc tio na l in fo rm at io n a bo ut th e to p B L A ST hi t. N ot e th at if n o o rg an ism n am e is gi ve n, th e fu nc tio na li nf or m at io n is fro m th e C. n eo fo rm an sa n n o ta tio n o fJ E C 21 (T IG R) . A V G LR Lo w es tL R fo rr eg io n H ig he st LR fo rr eg io n C oo rd .- G le an n u m be r° G eo B an k u ic -v al ue C oo rd . - CB S W M CB S W M CB S W M Pr ed ic te d fu nc tio n B t6 3 12 5. 91 C hr B t6 3 12 5.9 1 B i6 3 12 5. 91 se gm en t G en ed 77 79 62 6 77 79 62 6 77 79 62 6 0- 20 00 G LE A N _0 52 58 X P_ 00 12 59 19 1 7.O OE -0 6 16 8- 54 06 H el ic as e, pu ta tIv e - 2. 39 2 - 0. 88 7 1.0 73 1. 17 6 - 4 32 6 - 1. 68 9 0. 75 2 0. 29 8 1. 38 9 0. 53 4 1.6 33 2. 44 5 [N eo sar tor ya fis ch er i]. 12 83 0- A m id oh yd ro la se ,p ut at iv e 12 80 0- 24 80 0 G LE A N _0 52 59 X P_ 74 77 06 4.O OE -0 4 - 3. 11 0 - 2. 89 0 - 4. 82 6 - 5. 54 5 0. 55 5 0. 90 3 13 57 8 [A sp erg illu sf lim ig at us ] 14 11 5- 12 80 0- 24 80 0 G LE A N _0 52 55 X P_ 56 72 65 6.O OE -5 8 16 37 0 D ru gt ra ns po rte r - 3. 11 0 - 2. 89 0 - 4. 82 6 - 5. 54 5 0. 55 5 0. 90 3 16 72 3- U nn am ed pr ot ei n pr od uc t - 3. 11 0 - 2. 89 0 - 4. 82 6 - 5. 54 5 0. 55 5 0. 90 3 12 80 0. 24 80 0 G LE A N _0 52 54 BA E5 73 78 0. 02 3 19 38 8 [A sp erg illu so ry za ej. 20 90 6- TP R do m ai n pr ot ei n 12 80 0- 24 80 0 G LE A N _0 52 53 X P_ 00 12 61 01 1 2.O OE -1 77 - 3. 11 0 - 2. 89 0 - 4. 82 6 - 5. 54 5 0. 55 5 0. 
903 22642 [Neosartorya fischeri]
24790- 12800-24800 GLEAN_05252 XP_777999 2.00E-120 26105 Hypothetical protein -3.110 -2.890 -4.826 -5.545 0.555 0.903
339300- 341653- GLEAN_05190 XP_568354 6.00E-65 350700 Hypothetical protein 1.000 -0.094 1.754
339300- 345835- GLEAN_05318 EAU93688 1.00E-18 350700 349644 Predicted protein [Coprinopsis cinerea]. 1.000 -0.094 1.754
339600- 341653- GLEAN_05190 XP_568354 6.00E-65 359090 Hypothetical protein 1.857 -3.038 -0.645 -5.127 3.076 -0.331
339600- 345835- GLEAN_05318 EAU93688 1.00E-18 350000 349644 Predicted protein [Coprinopsis cinerea]. 1.857 -3.038 -0.645 -5.127 3.076 -0.331
344300- 345835- GLEAN_05318 EAU93688 1.00E-18 350200 349044 Predicted protein [Coprinopsis cinerea]. 1.027 -0.983 2.411
446300- 447026- GLEAN_05173 XP_778147 3.00E-179 450300 448723 Hypothetical protein -2.261 . 41 92 0. 82
90 92 00 - 999727- GLEAN_05066 XP_771810 9.00E-08 Hypothetical protein -1.433 -2.440 -0.329 1002800 1000137
999200- 1000224- GLEAN_05065 XP_567971 2.00E-83 1002800 1001851 Retrotransposon nucleocapsid protein -1.433 -2.440 -0.329
1001300- 1000224- GLEAN_05065 XP_567971 2.00E-83 1002900 1001851 Retrotransposon nucleocapsid protein -1.677 -3.516 0.414
1001300- 1001982- GLEAN_05460 XP_771825 3.00E-85 1002900 1002620 Hypothetical protein -1.677 -3.516 0.414
1375600- 1354814- Sodium P-type ATPase, putative -2.
194 -4.328 -0.621 GLEAN_04987 XP_751881 6.00E-162 1376800 1358607 [Aspergillus fumigatus].
1931000- 1930978- GLEAN_04878 XP_567034 2.00E-55 1932300 1931772 Hypothetical protein -1.913 -3.979 -0.285
2150700- 2146941- Nuclear cohesin complex subunit (Psc3), -1.683 -3.197 -0.003 2159600 GLEAN_04838 XP_755943 3.00E-61 2150760 putative [Aspergillus fumigatus].
2150700- 2152561- GLEAN_05671 XP_566921 6.00E-60 Hypothetical protein -1.683 -3.197 -0.003 2159600 2154878
2150700- 2155124- GLEAN_04837 XP_777765 0.00E+00 2159600 2159016 Hypothetical protein -1.683 -3.197 -0.003
2152600- 2152581- GLEAN_05617 XP_566908 0.00E+00 Hypothetical protein -1.936 -2.937 -0.985 2159000 2154878
2152600- 2155124- GLEAN_04837 XP_777765 0.00E+00 2159000 2159016 Hypothetical protein -1.936 -2.937 -0.985
2273900- 2275865- GLEAN_05695 XP_776539 5.00E-27 2289800 2276751 Hypothetical protein -2.514 -4.268 -0.021
2273900- 2277032- Predicted protein [Ostreococcus GLEAN_05696 XP_001421492 6.30E-01 -2.514 -4.268 -0.021 2289800 2277691 lucimarinus].
2273900- 2279601- GLEAN_05697 XP_776317 1.00E-62 Hypothetical protein -2.514 -4.268 -0.021 2289800 2282091
2273900- 2287656- Retrotransposable element slacs 132 kda GLEAN_04816 XP_568729 4.00E-84 -2.514 -4.
268 -0.021 2289800 2288399 protein
----------
Chr 2
57196- 59000-60000 GLEAN_02795 XP_776539 1.00E-32 59316 Hypothetical protein -0.644 -1.861 0.084
84279- 88400-89700 GLEAN_02825 XP_568772 0.00E+00 Gamma DNA-directed DNA polymerase -1.269 -2.638 -0.151 88794
89661- ATP synthase delta chain, mitochondrial 88400-89700 GLEAN_02791 XP_568774 7.00E-25 -1.269 -2.638 -0.151 90954 precursor
102000- 98930- GLEAN_02827 XP_777007 9.00E-82 Hypothetical protein -1.931 -4.031 0.659 102800 103013
150400- 150656- GLEAN_02839 XP_776539 4.00E-31 155600 152503 Hypothetical protein -0.692 -1.791 0.437
153876 Small nucleolar ribonucleoprotein 150400- GLEAN_02840 XP_755620 8.00E-82 155600 155916 complex subunit, putative [Aspergillus -0.692 -1.791 0.437 fumigatus].
219200- 219651- GLEAN_02851 AAW41778 0.00E+00 221000 222324 Chitin synthase regulator 3 -2.273 -3.834 0.068
260600- 260648- GLEAN_02661 XP_772444 8.00E-73 263900 261960 Hypothetical protein -2.871 -4.233 -0.744
260600- 262148- GLEAN_02764 EAU81015 3.40E-01 263900 263310 Predicted protein [Coprinopsis cinerea]. -2.871 -4.233 -0.744
267000- 268199- GLEAN_02863 XP_776539 1.00E-31 283800 270025 Hypothetical protein -0.987 -2.123 -0.012
Small nucleolar ribonucleoprotein 267000- 271382- GLEAN_02864 XP_001275564 2.00E-38 283800 272566 complex subunit, putative [Aspergillus -0.987 -2.123 -0.012 clavatus].
267000- 272970- Hypothetical protein [Tetrahymena GLEAN_02865 XP_001028745 4.00E-65 -0.987 -2.123 -0.012 283800 275425 thermophila].
267000- 274227- Hypothetical protein [Neosartorya GLEAN_02761 XP_001267666 3.00E-19 -0.987 -2.123 -0.012 283800 274500 fischeri].
267000- 275643- Mitochondrial protein GLEAN_02866 NP_690845 1.00E-15 -0.987 -2.123 -0.012 283800 275900 [Saccharomyces cerevisiae].
267000- 276346- Hypothetical protein [Phaeosphaeria GLEAN_02867 EAT89175 5.00E-13 -0.987 -2.123 -0.012 283800 276498 nodorum].
267000- 276583- GLEAN_02868 XP_001214108 7.00E-04 283800 276678 Hypothetical protein [Aspergillus terreus]. -0.987 -2.123 -0.012
267000- 277188- rRNA intron-encoded homing -0.987 -2.123 -0.012 283800 GLEAN_02869 AAK13589 3.00E-15 277878 endonuclease [Oryza sativa]
267000- 278649- GLEAN_02870 XP_778898 3.10E+00 Hypothetical protein [Giardia lamblia]. -0.987 -2.123 -0.012 283800 278891
267000- 280981- Putative protein [Saccharomyces GLEAN_02759 NP_013263 6.00E-01 -0.987 -2.123 -0.012 283800 281144 cerevisiae].
267000- 281245- Senescence-associated protein GLEAN_02871 NP_665229 2.00E-38 -0.987 -2.123 -0.012 283800 281900 [Cryptosporidium hominis].
267000- 282502- Hypothetical protein [Neosartorya -0.987 -2.123 -0.012 GLEAN_02758 NP_001267666 3.00E-19 283800 282775 fischeri].
398700- 398261- GLEAN_02737 XP_777205 2.00E-137 402800 399525 Hypothetical protein -0.888 -3.066 0.634
398700- 400430- GLEAN_02893 XP_568809 5.00E-65 DNA-directed RNA polymerase -0.888 -3.066 0.634 402800 401114
419200- 419600- GLEAN_02897 EAU92024 2.00E-09 Predicted protein [Coprinopsis cinerea]. -1.564 -3.411 0.405 420400 420387
1406700- 1406896- Stress response protein kdsl, putative GLEAN_03083 NP_001274539 7.00E-04 -1.690 -3.376 -0.218 1408200 1407932 [Aspergillus clavatus].
1408400- 1409939- GLEAN_03084 NP_568991 6.00E-50 Protein-methionine-R-oxide reductase -2.521 -3.619 -1.262 1409500 1410687
1518400- 1518456- Hypothetical protein [Streptococcus GLEAN_02513 NP_268911 2.50E-02 -1.796 -4.500 1.489 1529300 1518773 phage 370.1].
1518400- 1519702- GLEAN_02512 NP_566846 4.00E-89 1529300 1521218 Transposable element-crypton-Cn1 -1.796 -4.500 1.489
1518400- 1522304- GLEAN_03106 NP_777460 2.00E-75 1529300 1523255 Hypothetical protein -1.796 -4.500 1.489
1518400- 1524291- GLEAN_03107 NP_566922 2.00E-108 1529300 1525830 Hypothetical protein -1.796 -4.500 1.489
1518400- 1526263- GLEAN_02511 NP_566921 9.00E-05 1529300 1526669 Hypothetical protein -1.796 -4.500 1.489
1518400- 1527916- GLEAN_02510 NP_775648 3.00E-11 1529300 1528519 Hypothetical protein -1.796 -4.500 1.
489
1524800- 1524291- GLEAN_03107 NP_566922 2.00E-108 1529200 1525830 Hypothetical protein -1.378 -4.066 1.643
1524800- 1526263- GLEAN_02311 XP_566921 9.00E-05 1529200 1526669 Hypothetical protein -1.378 -4.066 1.643
1524800- 1527916- GLEAN_02510 NP_775648 3.00E-11 Hypothetical protein -1.378 -4.066 1.643 1529200 1528519
----------
448700- 448806- Carboxymuconolactone decarboxylase -2.155 -4.069 -0.029 450300 GLEAN_04435 YP_951772 8.00E-06 449781 [Mycobacterium vanbaalenii]
654000- 654633- GLEAN_04651 NP_570011 4.00E-51 655000 655234 UDP-glucose:sterol glucosyltransferase -1.155 -2.573 0.146
846400- GLEAN_04350 Hypothetical protein -1.136 -3.704 0.156 847800
859600- 860509- GLEAN_04348 NP_776539 2.00E-34 863400 862619 Hypothetical protein -1.688 -3.125 -0.902
1230000- 1229698- GLEAN_04278 NP_567785 3.00E-79 1230900 1230607 Hypothetical protein 1.251 0.592 1.886
1230000- 1230749- Hypothetical protein UM02358.1 1.251 0.592 1.886 GLEAN_04757 NP_758505 2.00E-25 1230900 1233656 [Ustilago maydis]
1334800- 1334064- GLEAN_04775 XP_567626 2.00E-178 1340800 1335083 Hypothetical protein -0.775 -2.276 1.294
1334800- 1335789- 1340800 GLEAN_04776 XP_567626 2.00E-20 1336219 Hypothetical protein -0.775 -2.276 1.
294
1334800- 1337863- NADH dehydrogenase subunit 4L GLEAN_04777 AAM81269 2.00E-15 -0.775 -2.276 1.294 1340800 1339910 [Cryptococcus neoformans var. grubii].
1336000- 1335719- GLEAN_04776 XP_567626 2.00E-20 Hypothetical protein -1.837 -4.503 1.054 1340500 1336219
1336000- 1337863- NADH dehydrogenase subunit 4L -1.837 -4.503 1.054 GLEAN_04777 AAM81269 2.00E-15 1340500 1339980 [Cryptococcus neoformans var. grubii]
1420400- 1419521- Hypothetical protein [Coprinopsis 0.751 0.801 1.607 GLEAN_04794 EAU92919 8.00E-58 1422800 1421658 cinerea].
1420400- 1422354- Hypothetical protein [Coprinopsis GLEAN_04244 EAU81955 4.00E-46 0.751 0.801 1.607 1422800 1426421 cinerea].
1433300- 1431670- GLEAN_04242 AAL35341 0.00E+00 Sodium-hydrogen antiporter -0.478 -3.916 2.160 1446600 1433335
1433300- 1434628- GLEAN_04241 XP_567607 1.00E-163 Membrane transporter -0.478 -3.916 2.160 1446600 1437056
1433300- 1439204- Methyltransferase [Schizosaccharomyces GLEAN_04240 NP_595254 2.00E-26 -0.478 -3.916 2.160 1446600 1440114 pombe].
1433300- 1440427- GLEAN_04797 XP_567605 5.00E-140 Autophagy-related protein -0.478 -3.916 2.160 1446600 1442012
1433300- 1442238- GLEAN_04798 XP_772605 5.00E-136 Hypothetical protein -0.478 -3.916 2.160 1446600 1443480
1433300- 1443932- GLEAN_04239 XP_567603 4.00E-08 1446600 1444264 Hypothetical protein -0.478 -3.916 2.160
1433300- 1444887- GLEAN_04238 XP_567603 7.00E-15 1446600 1445696 Hypothetical protein -0.
478 -3.916 2.160
1433300- 1446737- GLEAN_04237 XP_772605 5.00E-136 Hypothetical protein -0.478 -3.916 2.160 1446600 1449052
1433300- 1450056- GLEAN_04236 XP_567601 5.00E-62 Hypothetical protein -0.478 -3.916 2.160 1446600 1451939
1433300- 1431670- GLEAN_04242 AAL35341 0.00E+00 Sodium-hydrogen antiporter -1.605 -3.444 0.101 1434900 1433335
1433300- 1434628- GLEAN_04241 XP_567607 1.00E-163 1434900 1437056 Membrane transporter -1.605 -3.444 0.101
1565500- 1558846- C6 finger domain protein, putative -3.193 -4.823 -0.445 GLEAN_04217 XP_001258626 2.00E-115 1571900 1561544 [Neosartorya fischeri].
1565500- 1562223- Hexose transporter protein [Aspergillus GLEAN_04216 XP_748341 2.00E-176 -3.193 -4.823 -0.445 1571900 1564413 fumigatus].
1565500- 1566524- Isochorismatase family hydrolase, GLEAN_04215 XP_001261612 2.00E-51 -3.193 -4.823 -0.445 1571900 1567137 putative [Neosartorya fischeri].
1565500- 1568423- GLEAN_04214 AAG59831 3.00E-84 Beta-glucosidase [Volvariella volvacea]. -3.193 -4.823 -0.445 1571900 1571892
----------
Chr 4
200-1100 NO GENE 1.123 0.882 1.697
142800- 142196- GLEAN_00688 XP_572600 4.00E-130 145200 144358 Cytosine-purine permease -2.737 -4.348 0.233
187200- 186294- GLEAN_00678 XP_773452 1.00E-84 Hypothetical protein -2.323 -3.601 188900 188701
233500- 235727- GLEAN_00670 XP_568131 0.00E+00 240200 237978 GabA permease -2.297 -4.659 -0.037
233500- 238465- Polyketide synthase, putative -2.297 -4.659 -0.
037 GLEAN_00669 XP_001259379 4.00E-06 240200 239705 [Neosartorya fischeri].
233500- 240125- GLEAN_00759 XP_773471 0.00E+00 242641 Hypothetical protein -2.297 -4.659 -0.037 240200
249200- 251303- 255200 GLEAN_00762 XP_568124 0.00E+00 253135 Amine oxidase -2.297 -4.659 -0.037
249200- 253643- GLEAN_00665 XP_561123 9.00E-106 Mangoesterase -2.297 -4.659 -0.037 255200 254756
382800- NO GENE NO GENE -1.899 -3.502 0.728 385200
436000- NO GENE NO GENE -2.842 -4.537 -0.216 436900
646400- 647320- GLEAN_00841 XP_567996 0.00E+00 647900 649005 Hypothetical protein -0.716 -2.004 0.501
754600- 753095- BSD domain protein [Aspergillus GLEAN_00573 XP_749615 6.00E-08 -0.705 -2.112 -0.051 756000 754838 fumigatus].
754600- 755150- GLEAN_00867 XP_567968 0.00E+00 Protein threonine/tyrosine kinase -0.705 -2.112 -0.051 756000 757860
829000- 829807- GLEAN_00560 XP_567737 1.00E-68 831300 830546 Hypothetical protein -1.132 -3.538 0.548
884800- 884707- GLEAN_00550 XP_572526 3.00E-73 Membrane protein 0.599 0.103 1 54 88 62 00 886512
992300- 992540- GLEAN_00912 XP_568378 3.00E-04 Hypothetical protein -2.776 -3.889 0.485 994400 993955
1058000- 1058296- Allantoate permease [Aspergillus GLEAN_00516 XP_001274983 3.00E-04 -2.449 -4.
684 0.285 1073700 1059831 clavatus].
1058000- 1061847- GLEAN_00923 XP_567775 9.00E-10 Beta-fructofuranosidase -2.449 -4.684 0.285 1073700 1063580
1058000- 1063965- Hypothetical protein [Chaetomium -2.449 -4.684 0.285 GLEAN_00515 XP_001222740 4.20E+00 1073700 1065856 globosum].
1058000- 1067407- GLEAN_00924 XP_772566 6.00E-37 1073700 1070812 Hypothetical protein -2.449 -4.684 0.285
NAD-binding Rossmann fold 1058000- 1071378- GLEAN_00514 NP_001259059 5.00E-61 oxidoreductase family protein -2.449 -4.684 0.285 1073700 1072715 [Neosartorya fischeri].
1058000- 1073621- GLEAN_00925 NP_571460 0.00E+00 Trehalose transporter -2.449 -4.684 0.285 1073700 1075767
1066700- 1067407- GLEAN_00924 NP_772566 6.00E-37 1080400 1070812 Hypothetical protein -2.514 -4.426 -0.075
NAD-binding Rossmann fold 1066700- 1071378- GLEAN_00514 NP_001259059 5.00E-61 oxidoreductase family protein -2.514 -4.426 -0.075 1080400 1072715 [Neosartorya fischeri].
1066700- 1073621- GLEAN_00925 NP_571460 0.00E+00 Trehalose transporter -2.514 -4.426 -0.075 1080400 1075767
1066700- 1078797- GLEAN_00926 XP_775062 1.00E-51 1080400 1079543 Hypothetical protein -2.514 -4.426 -0.075
----------
Chr 5
0-5200 GLEAN_02155 NO GENE -2.179 -3.865 0.254
50-800 GLEAN_02155 NO GENE -0.750 -1.479 -0.104
4200-22000 GLEAN_02155 NO GENE 1.104 0.014 1.594
4200-22000 GLEAN_02156 NP_572149 4.00E-40 6685-6955 Hypothetical protein 1.104 0.014 1.
594
4200-22000 GLEAN_02154 NP_775060 1.00E-165 7285-8614 Hypothetical protein 1.104 0.014 1.594
MFS allantoate transporter, putative 1.104 0.014 1.594 4200-22000 GLEAN_02157 NP_753692 3.00E-73 9145-10906 [Aspergillus fumigatus].
11829- HpcH/HpaI aldolase/citrate lyase family 4200-22000 GLEAN_02153 XP_001264611 9.00E-29 1.104 0.014 1.594 12746 protein [Neosartorya fischeri].
14554- 5-oxo-L-prolinase, putative [Aspergillus 1.104 0.014 1.594 4200-22000 GLEAN_02152 XP_751307 3.00E-163 16504 fumigatus].
16982- 4200-22000 GLEAN_02158 CAD70763 0.00E+00 5-oxoprolinase 1.104 0.014 1.594 19485
21439- Hypothetical protein [Coprinopsis 00 14 15 94 4200-22000 GLEAN_02159 EAU83894 7.00E-29 24277 cinerea].
6700-32800 GLEAN_02156 XP_572149 4.00E-40 6685-6955 Hypothetical protein -3.035 -4.782 1.110
6700-32800 GLEAN_02154 XP_775060 1.00E-165 7285-8614 Hypothetical protein -3.035 -4.782 1.110
MFS allantoate transporter, putative 6700-32800 GLEAN_02157 XP_753692 3.00E-73 9145-10906 -3.035 -4.782 1.110 [Aspergillus fumigatus].
11829- HpcH/HpaI aldolase/citrate lyase family 6700-32800 GLEAN_02153 XP_001264611 9.00E-29 -3.035 -4.782 1.110 12746 protein [Neosartorya fischeri].
14554- 5-oxo-L-prolinase, putative [Aspergillus -3.035 -4.782 1.
110 6700-32800 GLEAN_02152 XP_751307 3.00E-163 16504 fumigatus].
16982- 6700-32800 GLEAN_02158 CAD70763 0.00E+00 19485 5-oxoprolinase -3.035 -4.782 1.110
21439- Hypothetical protein [Coprinopsis 6700-32800 GLEAN_02159 EAU83894 7.00E-29 -3.035 -4.782 1.110 24277 cinerea].
24811- DUF1445 domain protein [Aspergillus -3.035 -4.782 1.110 6700-32800 GLEAN_02150 XP_748173 5.00E-23 25964 fumigatus].
27696- 6700-32800 GLEAN_02160 XP_570691 0.00E+00 29485 Hypothetical protein -3.035 -4.782 1.110
30264- 6700-32800 GLEAN_02149 XP_570692 5.00E-130 Enolase 1 -3.035 -4.782 1.110 32039
16982- 17700-20800 GLEAN_02158 XP_571053 0.00E+00 19485 5-oxoprolinase 0.696 -1.158 1.725
27696- 27400-29500 GLEAN_02160 XP_570691 0.00E+00 Hypothetical protein -2.452 -4.320 0.296 29485
95790- 94100-96600 GLEAN_02177 XP_776197 1.00E-131 98444 Hypothetical protein -3.135 -4.423 -0.445
126800- NO GENE NO GENE -2.124 -3.576 -0.040 128700
126800- 128556- GLEAN_02183 XP_570389 1.00E-144 Glucosidase -2.124 -3.576 -0.040 128700 131058
184800- 183859- FAO1 [Cryptococcus neoformans var. GLEAN_02192 AAN75167 7.00E-170 -2.324 -2.371 -4.656 -4.865 1.015 0.693 204000 185218 grubii].
184800- 185988- GLEAN_02193 XP_570106 6.00E-157 Sexual development regulator -2.324 -2.371 -4.656 -4.865 1.015 0.693 204000 187349
184800- 187761- GLEAN_02194 XP_773197 9.00E-69 Hypothetical protein -2.324 -2.371 -4.656 -4.865 1.015 0.
693 204000 188578
184800- 189844- GLEAN_02126 AAY25038 6.00E-71 204000 192096 CAP1 [Cryptococcus gattii]. -2.324 -2.371 -4.656 -4.865 1.015 0.693
184800- 194439- Phospholipase D1 (PLD1), putative -2.324 -2.371 -4.656 -4.865 1.015 0.693 GLEAN_02124 XP_755038 1.00E-152 204000 200633 [Aspergillus fumigatus].
222400- 222500- MF alpha; hypothetical protein -3.317 -3.069 -4.766 -5.517 0.282 0.693 GLEAN_02119 AAS92522 8.00E-04 238400 222616 [Cryptococcus gattii].
222400- 222500- GLEAN_02119 AAK55608 8.00E-04 Pheromone alpha -3.317 -3.069 -4.766 -5.517 0.282 -0.060 238400 222616
222400- 222931- MF alpha; hypothetical protein -3.317 -3.069 -4.766 -5.517 0.282 -0.060 GLEAN_02200 AAS92522 2.00E-03 238400 223047 [Cryptococcus gattii].
222400- 224303- Class V myosin (Myo4), putative -3.317 -3.069 -4.766 -5.517 0.282 -0.06 238400 GLEAN_02201 XP_001273891 1.00E-109 230136 [Aspergillus clavatus]
222400- 230585- GLEAN_02118 AAN75615 6.00E-134 STE20 -3.317 -3.069 -4.766 -5.517 0.282 -0.06 238400 232996
247300- 245718- Dihydrolipoamide dehydrogenase GLEAN_02116 XP_001265776 2.00E-143 -2.167 -1.880 -4.607 -4.973 1.011 1.447 288200 247860 [Neosartorya fischeri].
247300- 248166- Hypothetical protein [Coprinopsis GLEAN_02115 EAU87643 9.00E-51 -2.167 -1.880 -4.607 -4.973 1.011 1.447 288200 251116 cinerea].
247300- 252101- GLEAN_02114 XP_570546 7.00E-155 288200 255269 Hypothetical protein -2.167 -1.880 -4.607 -4.973 1.011 1.447
247300- 256901- PHD transcription factor (Rum1), putative GLEAN_02204 XP_748000 6.00E-99 -2.167 -1.880 -4.607 -4.973 1.011 1.447 288200 262890 [Aspergillus fumigatus].
247300- 263567- BSP1 [Cryptococcus neoformans var. GLEAN_02205 AAV98454 6.00E-164 -2.167 -1.880 -4.607 -4.973 1.011 1.447 288200 266323 grubii].
380400- 380101- GLEAN_02227 XP_776063 5.00E-88 382100 380994 Hypothetical protein -1.965 -4.368 1.447
380400- 381223- GLEAN_02228 XP_570055 2.00E-63 Hypothetical protein -1.965 -4.368 0.092 382100 382308
380300- 380101- GLEAN_02227 XP_776063 5.00E-88 382300 380994 Hypothetical protein -1.669 -4.465 0.027
380300- 381223- GLEAN_02228 XP_570055 2.00E-63 382300 382308 Hypothetical protein -1.669 -4.465 0.027
478600- 476013- GLEAN_02072 XP_771818 5.00E-67 488000 478809 Hypothetical protein -1.349 -3.573 1.139
478600- 479169- GLEAN_02071 XP_569289 4.00E-51 488000 479992 Hypothetical protein -1.349 -3.573 1.139
478600- 481074- GLEAN_02252 XP_566921 6.00E-60 Hypothetical protein -1.349 -3.573 1.139 488000 483301
478600- 483547- GLEAN_02070 XP_777765 0.00E+00 Hypothetical protein -1.349 -3.573 1.
139 488000 487439
478700- 476013- GLEAN_02072 XP_771818 5.00E-67 488500 478809 Hypothetical protein 0.435 0.023 1.072
478700- 479169- GLEAN_02071 XP_569289 4.00E-51 488500 479992 Hypothetical protein 0.435 0.023 1.072
478700- 481074- GLEAN_02252 XP_566921 6.00E-60 488500 483301 Hypothetical protein 0.435 0.023 1.072
478700- 483547- GLEAN_02070 XP_777765 0.00E+00 488500 487439 Hypothetical protein 0.435 0.023 1.072
480500- 481074- GLEAN_02252 XP_566921 6.00E-60 Hypothetical protein -2.058 -2.045 -3.563 -3.707 -0.409 0.092 487300 483301
480500- 483547- GLEAN_02070 XP_777765 0.00E+00 487300 487439 Hypothetical protein -2.058 -2.045 -3.563 -3.707 -0.409 0.092
589800- 587866- Polyprotein [Phanerochaete GLEAN_02271 AAZ28943 3.00E-24 1.591 0.441 3.284 591500 588468 chrysosporium].
589800- 588520- GLEAN_02272 XP_776289 1.00E-15 591500 589247 Hypothetical protein 1.591 0.441 3.284
1776000- 1775551- ATP-binding cassette transporter GLEAN_01807 XP_001258991 4.00E-153 1.025 -0.553 1.952 1786400 1780923 [Neosartorya fischeri].
1776000- 1782265- GLEAN_01806 XP_570359 6.00E-165 1786400 1784468 Myo-inositol transporter 1.025 -0.553 1.952
1790000- 1790719- putative O-acetyltransferase -2.585 -4.050 0.274 GLEAN_01804 DAA05956 8.00E-98 1792900 1791427 [Cryptococcus neoformans var. grubii].
1809200- 1808147- Chain A, Rat Brain 3-Hydroxyacyl-CoA GLEAN_02489 1E3W_A 2.00E-06 -1.541 -2.701 0.
12 1814500 1809297 Dehydrogenase
1809200- 1811119- 1814500 GLEAN_02490 XP_776539 2.00E-31 1813224 Hypothetical protein -1.541 -2.701 0.12
----------
Chr 6
2100-3800 GLEAN_05969 XP_5fl787 2.00E-156 3409-5246 Fungal specific transcription factor -2.189 -4.017 0.356
499200- NO GENE NO GENE -1.534 -3.854 0.779 501200
730000- 730825- Polyprotein [Phanerochaete -0.855 -2.568 1.497 GLEAN_05833 AAZ28942 1.00E-08 733000 731331 chrysosporium].
988000- 987694- GLEAN_05782 XP_570931 4.00E-06 Exonuclease II 0.566 0.074 1.476 991200 990861
Prion-like-(Q/N-rich)-domain-bearing 1049100- 1049803- GLEAN_05773 NP_508554 3.10E+00 protein family member (pqn-37) -0.979 -2.474 0.867 1051600 1050555 [Caenorhabditis elegans].
1060400- 1061163- Inositol 5-phosphatase, putative GLEAN_05771 XP_753874 1.30E+00 -0.979 -2.474 0.867 1062800 1062348 [Aspergillus fumigatus].
1174800- 1174200- MGMF family protein [Aspergillus GLEAN_06199 XP_001274925 3.00E-07 -0.782 -2.599 1.233 1179200 1174846 clavatus].
1174800- 1175332- GLEAN_05747 XP_571089 4.00E-149 1179200 1176294 Phosphatidylinositol transporter -0.782 -2.599 1.233
1174800- 1177308- Related to putative cytoplasmic structural -0.
782 -2.599 1.233 GLEAN_05746 CAE76225 9.00E-05 1179200 1183318 protein [Neurospora crassa].
1313000- 1314349- prostaglandin-endoperoxide synthase 2 GLEAN_06217 XP_524999 3.90E+00 -2.301 -4.317 0.109 1315100 1314725 [Pan troglodytes].
1322000- 1320689- GLEAN_06220 XP_775207 9.00E-145 1326100 1322100 Hypothetical protein -1.049 -2.681 1.215
1322000- 1322542- GLEAN_06221 XP_775208 5.00E-124 1326100 1324133 Hypothetical protein -1.049 -2.681 1.215
1322000- 1324193- Dimeric dihydrodiol dehydrogenase, GLEAN_05716 NP_001269413 4.00E-38 -1.049 -2.681 1.215 1326100 1325679 putative [Aspergillus clavatus].
109300- 110446- GLEAN_05950 NP_568595 3.00E-08 Hypothetical protein -1.965 -4.119 0.376 112200 111423
201200- 199934- GLEAN_06010 NP_570806 2.00E-126 204000 201582 Vacuole protein -0.996 -2.878 0.300
201200- 203319- GLEAN_06011 NP_775357 0.00E+00 204000 205522 Hypothetical protein -0.996 -2.878 0.300
212900- 211750- 3-carboxymuconate cyclase [Vibrionales -0.891 -3.347 1.567 GLEAN_06014 E0K25849 2.00E-07 224600 213126 bacterium].
212900- 213646- GLEAN_06015 NP_775360 0.00E+00 224600 217692 Hypothetical protein -0.891 -3.347 1.567
212900- 218037- GLEAN_05932 NP_570814 0.00E+00 Flavin-containing monooxygenase -0.891 -3.347 1.567 224600 220112
NAD binding Rossmann fold 212900- 222190- GLEAN_06016 XP_748637 7.00E-80 oxidoreductase, putative [Aspergillus -0.891 -3.347 1.567 224600 223649 fumigatus].
Chr | Segment coord. | Glean number | GenBank ID | E-value | Gene coord. | Predicted function | AVG LR (Bt63, 125.91, CBS7779, WM626) | Lowest LR for region | Highest LR for region
| 446700-453700 | GLEAN_05896 | NP_001264569 | 1.00E-24 | 442072-446783 | SH3 domain protein, putative [Neosartorya fischeri] | -2.345 | -4.752 | 0.160
| 446700-453700 | GLEAN_06061 | NP_569289 | 2.00E-09 | 447800-448867 | Hypothetical protein | -2.345 | -4.752 | 0.160
| 446700-453700 | GLEAN_05895 | NP_566921 | 9.00E-16 | 450454-451207 | Hypothetical protein | -2.345 | -4.752 | 0.160
| 446700-453700 | GLEAN_05894 | ABA98804 | 4.00E-36 | 452045-452669 | Retrotransposon protein, putative, Ty1-copia subclass [Oryza sativa] | -2.345 | -4.752 | 0.160
| 727800-733800 | GLEAN_05833 | AAZ28942 | 1.00E-08 | 730825-731331 | Polyprotein [Phanerochaete chrysosporium] | -2.450 | -4.257 | 1.308
6 | 727800-733800 | GLEAN_05832 | NP_350221 | 3.70E+00 | 731542-732501 | Oligopeptide ABC transporter, permease component [Clostridium acetobutylicum] | -2.450 | -4.257 | 1.308
| 727800-733800 | GLEAN_06118 | XP_001274192 | 5.00E-64 | 733227-735245 | Homoserine O-acetyltransferase [Aspergillus clavatus] | -2.450 | -4.257 | 1.308
| 1045200-1048100 | GLEAN_05775 | XP_570964 | 0.00E+00 | 1041966-1046131 | Actin cytoskeleton organization and biogenesis-related protein | | -4.343 |
| 1045200-1048100 | GLEAN_05774 | XP_777918 | 2.00E-42 | 1046465-1048420 | Hypothetical protein | -2.045 | -4.343 | 0.108
| 1415300-1420100 | GLEAN_05698 | CAL55333 | 5.00E-76 | 1416143-1418177 | Myc-regulated DEAD/H box 18 RNA helicase-like (ISS) [Ostreococcus tauri] | -2.961 | -4.886 | 0.166
| 1415300-1420100 | GLEAN_06239 | XP_775062 | 7.00E-75 | 1420795-1421725 | Hypothetical protein | -2.961 | -4.886 | 0.166
7 | 6600-10600 | GLEAN_00262 | XP_001268434 | 2.00E-06 | 7512-10013 | Ferric-chelate reductase, putative [Aspergillus clavatus] | -2.724 | -4.379 | 0.093
| 142700-144900 | NO GENE | NO GENE | | | | -2.409 | -4.728 | 0.180
| 383600-395900 | GLEAN_00330 | XP_571655 | 2.00E-117 | 382103-384126 | Rtf1 protein | -2.778 | -4.832 | 0.326
| 383600-395900 | GLEAN_00200 | XP_001258574 | 7.00E-18 | 384584-385725 | Haloalkanoic acid dehalogenase, putative [Neosartorya fischeri] | -2.778 | -4.832 | 0.326
| 383600-395900 | GLEAN_00199 | ABF72274 | 3.20E+00 | 386801-387565 | HSV-1 UL36-like protein [Gallid herpesvirus 2] | -2.778 | -4.832 | 0.326
| 383600-395900 | GLEAN_00331 | ZP_01509465 | 4.00E-57 | 388229-389873 | Amidohydrolase [Burkholderia phytofirmans] | -2.778 | -4.832 | 0.326
| 383600-395900 | GLEAN_00198 | XP_001260343 | 1.00E-55 | 390974-392935 | MFS allantoate transporter, putative [Neosartorya fischeri] | -2.778 | -4.832 | 0.326
| 383600-395900 | GLEAN_00332 | YP_833124 | 2.00E-10 | 393793-394439 | Hypothetical protein [Arthrobacter sp. FB24] | -2.778 | -4.832 | 0.326
| 383600-395900 | GLEAN_00333 | XP_571657 | 5.00E-78 | 394832-395588 | Hypothetical protein | -2.778 | -4.832 | 0.326
| 383600-395900 | GLEAN_00197 | XP_571632 | 5.00E-175 | 395779-397255 | GPI-anchor transamidase 1 | -2.778 | -4.832 | 0.326
| 384000-395400 | GLEAN_00330 | XP_571655 | 2.00E-117 | 382103-384126 | Rtf1 protein | -3.140 | -4.826 | 0.907
| 384000-395400 | GLEAN_00200 | XP_777454 | 3.00E-31 | 384584-385725 | Hypothetical protein | -3.140 | -4.826 | 0.907
| 384000-395400 | GLEAN_00199 | ABF72274 | 3.20E+00 | 386801-387565 | HSV-1 UL36-like protein [Gallid herpesvirus 2] | -3.140 | -4.826 | 0.907
| 384000-395400 | GLEAN_00331 | XP_001217242 | 8.00E-86 | 388229-389873 | Conserved hypothetical protein [Aspergillus terreus] | -3.140 | -4.826 | 0.907
| 384000-395400 | GLEAN_00198 | XP_001260343 | 1.00E-55 | 390974-392935 | MFS allantoate transporter, putative [Neosartorya fischeri] | -3.140 | -4.826 | 0.907
| 384000-395400 | GLEAN_00332 | YP_833124 | 2.00E-10 | 393793-394439 | Hypothetical protein [Arthrobacter sp. FB24] | -3.140 | -4.826 | 0.907
| 384000-395400 | GLEAN_00333 | XP_571657 | 5.00E-78 | 394832-395588 | Hypothetical protein | -3.140 | -4.826 | 0.907
| 425600-426800 | GLEAN_00192 | XP_571429 | 2.00E-171 | 425706-427378 | Hypothetical protein | -0.716 | -2.739 | 0.329
| 586000-586800 | GLEAN_00166 | BAD23582 | 7.00E-44 | 586777-588429 | Putative nicotianamine aminotransferase A [Oryza sativa] | -0.920 | -1.511 | 0.043
| 673600-676000 | GLEAN_00148 | YP_081854 | 3.20E+00 | 672933-673705 | Chaperonin GroEL [Bacillus cereus] | 0.471 | -0.354 | 00
| 724700-725200 | GLEAN_00399 | P19711 | 6.40E-01 | 724939-729221 | Polyprotein protease/helicase | 0.944 | 0.277 | 0.939
| 982800-983600 | GLEAN_00441 | EAU91579 | 9.00E-121 | 979312-983768 | Hypothetical protein [Coprinopsis cinerea] | 0.471 | 0.354 | 1.919
| 1002500-1012100 | GLEAN_00443 | XP_571285 | 3.00E-151 | 1005135-1006425 | Alpha-1,6-mannosyltransferase | -1.918 | -3.467 | 0.417
| 1002500-1012100 | GLEAN_00081 | XP_571582 | 5.00E-40 | 1006673-1007113 | Hypothetical protein | -1.918 | -3.467 | 0.417
| 1002500-1012100 | GLEAN_00444 | XP_571557 | 0.00E+00 | 1008850-1011921 | GTPase activating protein | -1.918 | -3.467 | 0.417
| 1018900-1020800 | GLEAN_00446 | XP_571276 | 0.00E+00 | 1016998-1019119 | Cytochrome P450 | -1.044 | -2.340 | 0.401
| 1018900-1020800 | GLEAN_00447 | NP_001040379 | 3.00E-20 | 1020300-1022266 | mRNA cap-binding protein eIF4E [Bombyx mori] | -1.044 | -2.340 | 0.401
| 1100700-1102000 | GLEAN_00467 | XP_571251 | 5.00E-166 | 1105573-1108232 | Glycosyltransferase | -1.193 | -2.231 | 0.371
| 1128000-1131000 | GLEAN_00052 | XP_774711 | 3.00E-91 | 1127189-1129177 | Hypothetical protein | -0.687 | -1.597 | 0.126
| 1128000-1131000 | GLEAN_00051 | YP_001108517 | 2.00E-15 | 1129581-1130772 | Phosphoglycerate mutase family protein [Saccharopolyspora erythraea] | -0.687 | -1.597 | 0.126
| 1166600-1167900 | NO GENE | NO GENE | | | | -1.148 | -2.874 | 0.212
| 1188000-1196000 | GLEAN_00040 | XP_772401 | 3.00E-138 | 1189465-1193489 | Hypothetical protein | -1.613, -1.866 | -4.192, -5.280 | 0.772, 0.213
| 1188000-1196000 | GLEAN_00477 | ABA94155 | 2.40E+00 | 1194262-1195759 | Expressed protein [Oryza sativa] | -1.613, -1.866 | -4.192, -5.280 | 0.772, 0.213
| 1189100-1192900 | GLEAN_00040 | XP_772401 | 3.00E-138 | 1189465-1193489 | Hypothetical protein | 0.868 | -0.156 | 1.852
| 1262200-1280000 | GLEAN_00026 | XP_715482 | 1.00E-176 | 1256635-1261872 | RNA polymerase III large subunit [Candida albicans] | -0.612 | -2.656 | 0.213
| 1262200-1280000 | GLEAN_00025 | XP_568378 | 2.00E-122 | 1262534-1263576 | Hypothetical protein | -0.612 | -2.656 | 1.513
| 1262200-1280000 | GLEAN_00489 | XP_772260 | 3.00E-14 | 1265210-1265557 | Hypothetical protein | -0.612 | -2.656 | 1.513
| 1262200-1280000 | GLEAN_00024 | XP_774752 | 1.00E-37 | 1266928-1267252 | Hypothetical protein | -0.612 | -2.656 | 1.513
| 1262200-1280000 | GLEAN_00023 | XP_571067 | 3.00E-04 | 1268789-1268917 | Hypothetical protein | -0.612 | -2.656 | 1.513
| 1262200-1280000 | GLEAN_00490 | XP_001260553 | 4.00E-24 | 1273360-1275465 | C2H2 transcription factor (Con7), putative [Neosartorya fischeri] | -0.612 | -2.656 | 1.513
| 1262200-1280000 | GLEAN_00491 | XP_568378 | 3.00E-129 | 1276325-1277391 | Hypothetical protein | -0.612 | -2.656 | 1.513
| 1367600-1369800 | GLEAN_00004 | BAD31398 | 3.20E+00 | 1367766-1368428 | Hypothetical protein [Oryza sativa] | -1.702 | -3.171 | 0.064
| 1396800-1399489 | GLEAN_00513 | XP_383545 | 2.00E-146 | 1395135-1396994 | Hypothetical protein [Gibberella zeae] | -2.408 | -1.985 | 1.701
8 | 0-14000 | GLEAN_06517 | XP_776387 | 8.00E-59 | 4033-6312 | Hypothetical protein | -1.282 | -4.162 | 1.267
| 0-14000 | GLEAN_06516 | XP_776539 | 1.00E-28 | 10046-12151 | Hypothetical protein | -1.282 | -4.162 | 1.267
| 0-14000 | GLEAN_06515 | XP_746529 | 4.00E-43 | 13769-14323 | L-PSP endoribonuclease family protein, putative [Aspergillus fumigatus] | -1.212 | -4.162 | 1.267
8 | 4000-33700 | GLEAN_06517 | XP_776387 | 8.00E-59 | 4033-6312 | Hypothetical protein | -2.840 | -4.660 | 0.346
| 4000-33700 | GLEAN_06516 | XP_776539 | 1.00E-28 | 10046-12151 | Hypothetical protein | -2.840 | -4.660 | 0.346
| 4000-33700 | GLEAN_06515 | XP_746529 | 4.00E-43 | 13769-14323 | L-PSP endoribonuclease family protein, putative [Aspergillus fumigatus] | -2.840 | -4.660 | 0.346
| 4000-33700 | GLEAN_06518 | CAM37292 | 3.10E+00 | 15320-15848 | Hypothetical protein, conserved [Leishmania braziliensis] | -2.840 | -4.660 | 0.346
| 4000-33700 | GLEAN_06519 | XP_001262470 | 1.00E-11 | 16360-18636 | Fungal specific transcription factor, putative [Neosartorya fischeri] | -2.840 | -4.660 | 0.346
| 4000-33700 | GLEAN_06514 | XP_751072 | 2.00E-37 | 19161-21270 | MFS alpha-glucoside transporter, putative [Aspergillus fumigatus] | -2.840 | -4.660 | 0.346
| 4000-33700 | GLEAN_06520 | ZP_00767178 | 1.00E-16 | 21436-21865 | Glycoside hydrolase [Chloroflexus aurantiacus] | -2.840 | -4.660 | 0.346
| 4000-33700 | GLEAN_06521 | ZP_01531626 | 9.00E-56 | 22016-23958 | Glycoside hydrolase [Roseiflexus castenholzii] | -2.840 | -4.660 | 0.346
| 4000-33700 | GLEAN_06513 | XP_572141 | 1.00E-28 | 24519-26490 | Efflux protein EncT | -2.840 | -4.660 | 0.346
| 4000-33700 | GLEAN_06512 | XP_569475 | 2.00E-06 | 28052-29426 | Integral membrane protein | -2.840 | -4.660 | 0.346
| 4000-33700 | GLEAN_06511 | XP_777695 | 2.00E-56 | 30227-32238 | Hypothetical protein | -2.840 | -4.660 | 0.346
| 14000-36000 | GLEAN_06518 | CAM37292 | 3.10E+00 | 15320-15848 | Hypothetical protein, conserved [Leishmania braziliensis] | -3.562 | -4.852 | -0.243
| 14000-36000 | GLEAN_06519 | XP_001262470 | 1.00E-11 | 16360-18636 | Fungal specific transcription factor, putative [Neosartorya fischeri] | -3.562 | -4.852 | -0.243
| 14000-36000 | GLEAN_06514 | XP_751072 | 2.00E-37 | 19161-21270 | MFS alpha-glucoside transporter, putative [Aspergillus fumigatus] | -3.562 | -4.852 | -0.243
| 14000-36000 | GLEAN_06520 | ZP_00767178 | 1.00E-16 | 21436-21865 | Glycoside hydrolase [Chloroflexus aurantiacus] | -3.562 | -4.852 | -0.243
| 14000-36000 | GLEAN_06521 | ZP_01531626 | 9.00E-56 | 22016-23958 | Glycoside hydrolase [Roseiflexus castenholzii] | -3.562 | -4.852 | -0.243
| 14000-36000 | GLEAN_06513 | XP_572141 | 1.00E-28 | 24519-26490 | Efflux protein EncT | -3.562 | -4.852 | -0.243
| 14000-36000 | GLEAN_06512 | XP_681373 | 2.00E-07 | 28052-29426 | Hypothetical protein AN8104.2 [Aspergillus nidulans] | -3.562 | -4.852 | -0.243
| 14000-36000 | GLEAN_06511 | XP_777695 | 2.00E-56 | 30227-32238 | Hypothetical protein | -3.562 | -4.852 | -0.243
| 14000-36000 | GLEAN_06522 | XP_773071 | 4.00E-173 | 34189-35782 | Hypothetical protein | -3.562 | -4.852 | -0.243
| 33700-37400 | GLEAN_06522 | XP_773071 | 4.00E-173 | 34189-35782 | Hypothetical protein | 0.945 | -0.982 | 2.333
| 33700-37400 | GLEAN_06510 | XP_571050 | 6.00E-47 | 37464-38143 | Hypothetical protein | 0.945 | -0.982 | 2.333
| 36000-37800 | GLEAN_06510 | XP_571050 | 6.00E-47 | 37464-38143 | Hypothetical protein | -1.489 | -2.315 | -0.324
| 453100-458500 | GLEAN_06597 | XP_568316 | 5.00E-36 | 453515-454111 | Hypothetical protein | -1.301 | -2.987 | 0.263
| 453100-458500 | GLEAN_06598 | ABA98785 | 9.00E-29 | 454935-456752 | Retrotransposon protein [Oryza sativa] | -1.301 | -2.987 | 0.263
| 453100-458500 | GLEAN_06599 | AAG50698 | 1.00E-40 | 456852-457928 | Copia-type polyprotein [Arabidopsis thaliana] | -1.301 | -2.987 | 0.263
| 453100-458500 | GLEAN_06600 | BAD34493 | 1.00E-32 | 458200-458805 | Gag-Pol [Ipomoea batatas] | -1.301 | -2.987 | 0.263
| 453600-458400 | GLEAN_06597 | XP_568316 | 5.00E-36 | 453515-454111 | Hypothetical protein | -1.399 | -2.987 | 0.263
| 454000-457700 | GLEAN_06597 | XP_568316 | 5.00E-36 | 453515-454111 | Hypothetical protein | 0.889, 0.755 | -0.697, -1.178 | 2.459, 2.108
| 454000-457700 | GLEAN_06598 | ABA98785 | 9.00E-29 | 454935-456752 | Retrotransposon protein, Ty1-copia subclass [Oryza sativa] | 0.889, 0.755 | -0.697, -1.178 | 2.459, 2.108
| 454000-457700 | GLEAN_06599 | AAG50698 | 1.00E-40 | 456852-457928 | Copia-type polyprotein [Arabidopsis thaliana] | 0.889, 0.755 | -0.697, -1.178 | 2.459, 2.108
| 521600-522400 | NO GENE | NO GENE | | | | -0.737 | -2.322 | 0.897
| 654200-655200 | GLEAN_06637 | XP_761106 | 3.00E-123 | 651656-654505 | Hypothetical protein [Ustilago maydis] | -1.819 | -3.227 | 0.428
| 717500-718600 | NO GENE | NO GENE | | | | -1.287 | -2.397 | -0.132
| 1044400-1045600 | GLEAN_06710 | AAO92638 | 2.00E-134 | 1045012-1046596 | Putative transposase | -1.122 | -2.444 | 0.588
| 1049700-1051700 | GLEAN_06305 | XP_572141 | 3.00E-61 | 1049445-1051172 | Efflux protein EncT | -2.814 | -4.721 | 0.463
| 1059800-1066300 | NO GENE | NO GENE | | | | -0.970 | -2.090 | 0.075
| 1059800-1066300 | GLEAN_06712 | XP_772401 | 9.00E-136 | 1061916-1065914 | Hypothetical protein | -0.970 | -2.090 | 0.075
| 1060500-1066300 | NO GENE | NO GENE | | | | -2.218 | -4.349 | 0.665
| 1060500-1066300 | GLEAN_06712 | XP_772401 | 9.00E-136 | 1061916-1065914 | Hypothetical protein | -2.218 | -4.349 | 0.665
| 1059800-1066300 | NO GENE | NO GENE | | | | -2.495 | -5.417 | 0.416
| 1059800-1066300 | GLEAN_06712 | XP_772401 | 9.00E-136 | 1061916-1065914 | Hypothetical protein | -2.495 | -5.417 | 0.416
9 | 1300-8000 | GLEAN_03978 | XP_772644 | 1.00E-120 | 1159-4637 | Hypothetical protein | -1.678 | -4.234 | 0.523
| 1300-8000 | GLEAN_03977 | XP_001210248 | 6.00E-30 | 5739-7035 | Conserved hypothetical protein [Aspergillus terreus] | -1.678 | -4.234 | 0.523
| 1300-4400 | GLEAN_03978 | XP_772644 | 1.00E-120 | 1159-4637 | Hypothetical protein | 1.386 | -0.710 | 2.187
| 5000-7400 | GLEAN_03977 | XP_001210248 | 6.00E-30 | 5739-7035 | Conserved hypothetical protein [Aspergillus terreus] | -2.576 | -4.139 | -0.013
Chr | Segment coord. | Glean number | GenBank ID | E-value | Gene coord. | Predicted function | AVG LR (Bt63, 125.91, CBS7779, WM626) | Lowest LR for region | Highest LR for region
| 11000-17700 | GLEAN_03976 | XP_747218 | 7.00E-26 | 11549-13695 | DUF895 domain membrane protein [Aspergillus fumigatus] | -2.825 | -4.456 | -0.057
| 11000-17700 | GLEAN_03980 | ZP_01000094 | 4.00E-167 | 14230-16090 | Mucin-desulfating sulfatase (N-acetylglucosamine-6-sulfatase) [Oceanicola batsensis] | -2.825 | -4.456 | -0.057
| 11000-17700 | GLEAN_03975 | XP_567067 | 1.70E+00 | 17638-17871 | Hypothetical protein | -2.825 | -4.456 | -0.057
| 32200-34700 | NO GENE | NO GENE | | | | 1.232 | -0.158 | 3.205
| 32200-34700 | GLEAN_03984 | YP_421819 | 9.00E-24 | 33861-35408 | Aldo/keto reductase [Magnetospirillum magneticum] | 1.232 | -0.158 | 3.205
| 74009-76000 | GLEAN_03962 | XP_573019 | 3.00E-87 | 73847-74740 | Hypothetical protein | 0.385 | -0.473 | 1.146
| 74009-76000 | GLEAN_03991 | EAU86881 | 3.30E-02 | 75333-77198 | Predicted protein [Coprinopsis cinerea] | 0.385 | -0.473 | 1.146
| 299800-292600 | GLEAN_04033 | EAU91035 | 1.00E-113 | 292512-294915 | Hypothetical protein [Coprinopsis cinerea] | -1.536 | -3.050 | 0.929
| 718000-718800 | GLEAN_04123 | XP_572808 | 1.00E-34 | 718424-719134 | Trafficking-related protein | -1.607 | -3.028 | 0.115
| 1038400-1042100 | GLEAN_03774 | XP_774001 | 5.00E-138 | 1040389-1043142 | Hypothetical protein | -0.765 | -2.222 | 1.052
| 1063600-1066100 | GLEAN_03769 | XP_964702 | 2.00E-09 | 1063408-1064053 | Hypothetical protein [Neurospora crassa] | -3.267 | -4.397 | -1.626
| 1063600-1066100 | GLEAN_03768 | XP_964702 | 3.30E-01 | 1064105-1065009 | Hypothetical protein [Neurospora crassa] | -3.267 | -4.397 | -1.626
| 1081700-1082800 | NO GENE | NO GENE | | | | -2.192 | -3.290 | -0.386
| 1170200-1186500 | GLEAN_04209 | XP_772644 | 2.00E-120 | 1173012-1174844 | Hypothetical protein | -2.607 | -4.127 | -0.481
| 1170200-1186500 | GLEAN_04210 | XP_001257273 | 2.00E-09 | 1175586-1180935 | DEAD/DEAH box helicase, putative [Neosartorya fischeri] | -2.607 | -4.127 | -0.481
| 1170200-1186500 | GLEAN_04211 | XP_775062 | 2.00E-42 | 1182351-1183092 | Hypothetical protein | -2.607 | -4.127 | -0.481
| 1170200-1186500 | GLEAN_04212 | XP_571065 | 4.10E-02 | 1184099-1184272 | Hypothetical protein | -2.607 | -4.127 | -0.481
| 1170200-1186500 | GLEAN_04213 | XP_775062 | 4.00E-49 | 1184512-1184954 | Hypothetical protein | -2.607 | -4.127 | -0.481
10 | 1700-4000 | GLEAN_06977 | XP_001260919 | 5.00E-97 | 1908-3411 | MFS monosaccharide transporter, putative [Neosartorya fischeri] | -2.517 | -4.695 | 0.717
| 18800-20200 | GLEAN_06972 | XP_569289 | 1.00E-72 | 19344-20375 | Hypothetical protein | 0.531 | 0.030 | 1.403
| 24800-26300 | NO GENE | NO GENE | | | | -2.591 | -3.788 | 0.056
| 83100-84000 | GLEAN_06960 | XP_570319 | 5.00E-130 | 82310-84327 | MFS transporter | -1.958 | -3.366 | 0.686
| 581700-587000 | GLEAN_07088 | P10978 | 3.00E-86 | 582143-586513 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 | -2.921 | -4.509 | -0.212
| 581600-587000 | GLEAN_07088 | P10978 | 3.00E-86 | 582143-586513 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 | -3.198 | -4.967 | -0.110
| 806800-808000 | GLEAN_06825 | XP_567457 | 2.00E-154 | 808170-810351 | Hypothetical protein | -2.201 | -3.714 | -0.244
| 823800- | NO GENE | NO GENE | | | | -0.691 | -1.266 | -0.237
| 961700-963200 | NO GENE | NO GENE | | | | -1.959 | -3.968 | 0.221
| 1031700-1048700 | GLEAN_06784 | YP_705633 | 4.00E-81 | 1029797-1032095 | Probable dimethylaniline monooxygenase (N-oxide-forming) [Rhodococcus sp. RHA1] | -3.151 | -4.988 | -0.026
| 1031700-1048700 | GLEAN_07172 | XP_772566 | 4.00E-66 | 1034330-1037705 | Hypothetical protein | -3.151 | -4.988 | -0.026
| 1031700-1048700 | GLEAN_07174 | XP_572946 | 3.00E-09 | 1039203-1039490 | Polyadenylation factor 64 kDa subunit | -3.151 | -4.918 | -0.026
| 1031700-1048700 | GLEAN_06783 | XP_572773 | 7.00E-72 | 1041534-1043168 | Endoplasmic reticulum receptor | -3.151 | -4.988 | -0.026
| 1031700-1048700 | GLEAN_07176 | YP_001135576 | 3.00E-05 | 1046465-1047590 | Short-chain dehydrogenase/reductase SDR [Mycobacterium gilvum] | -3.151 | -4.988 | -0.026
| 1053500-1059200 | GLEAN_07178 | XP_572474 | 3.00E-03 | 1053017-1053523 | Hypothetical protein | -0.754 | -2.149 | 1.178
| 1053500-1059200 | GLEAN_07179 | XP_772644 | 2.00E-58 | 1054504-1057596 | Hypothetical protein | -0.754 | -2.149 | 1.178
10 | 1055100-1059200 | GLEAN_07167 | XP_567269 | 1.00E-147 | 1004892-1006231 | rRNA (adenine-N6,N6-)-dimethyltransferase | -1.191 | -3.982 | 1.347
11 | 3000-9900 | GLEAN_01511 | XP_772644 | 1.00E-55 | 3630-5205 | Hypothetical protein | -3.086 | -4.668 | -0.136
| 3000-9900 | GLEAN_01510 | XP_775062 | 2.00E-42 | 6729-7470 | Hypothetical protein | -3.086 | -4.668 | -0.136
| 3000-9900 | GLEAN_01509 | XP_776387 | 2.00E-04 | 8577-9286 | Hypothetical protein | -3.086 | -4.661 | -0.136
| 3000-13900 | GLEAN_01511 | XP_772644 | 1.00E-55 | 3630-5205 | Hypothetical protein | -2.835 | -4.696 | 0.614
| 3000-13900 | GLEAN_01510 | XP_775062 | 2.00E-42 | 6729-7470 | Hypothetical protein | -2.835 | -4.696 | 0.614
| 3000-13900 | GLEAN_01509 | XP_776387 | 2.00E-04 | 8577-9286 | Hypothetical protein | -2.835 | -4.696 | 0.614
| 3000-13900 | GLEAN_01508 | XP_001261163 | 2.00E-37 | 11099-13034 | Sodium bile acid transporter family protein, putative [Neosartorya fischeri] | -2.835 | -4.696 | 0.614
| 3000-13900 | GLEAN_01507 | XP_748091 | 5.00E-24 | 13611-17959 | GARP complex component (Vps54), putative [Aspergillus fumigatus] | -2.835 | -4.696 | 0.614
| 8400-10000 | GLEAN_01509 | XP_776387 | 2.00E-04 | 8577-9286 | Hypothetical protein | 1.538 | 0.113 | 2.359
| 94700-96600 | GLEAN_01523 | ABE73177 | 9.00E-101 | 95209-95983 | Putative O-acetyl transferase [Cryptococcus neoformans var. grubii] | -1.995 | -4.198 | 1.645
| 804000-805900 | GLEAN_01353 | XP_569551 | 9.00E-84 | 803478-804198 | Import inner membrane translocase subunit tim22 | -1.690 | -3.632 | 0.724
| 926800-931800 | GLEAN_01677 | AAZ28935 | 7.00E-03 | 926825-927363 | Polyprotein [Phanerochaete chrysosporium] | -1.080 | -2.993 | 1.315
| 966600-968600 | GLEAN_01689 | XP_568729 | 9.00E-74 | 966140-966876 | Retrotransposable element SLACS 132 kDa protein | -1.233 | -3.374 | 0.135
| 966600-968600 | GLEAN_01690 | XP_572149 | 2.00E-69 | 967322-967825 | Hypothetical protein | -1.233 | -3.374 | 0.135
| 1040800-1042400 | GLEAN_01309 | XP_569994 | 2.00E-93 | 1038260-1041361 | Receptor, putative | -1.076 | -2.515 | 0.659
| 1040800-1042400 | GLEAN_01706 | XP_569644 | 7.00E-166 | 1042367-1044262 | Long-chain acyl-CoA synthetase | -1.076 | -2.515 | 0.659
| 1169300-1172500 | GLEAN_01285 | XP_567697 | 0.00E+00 | 1167426-1169655 | Protein-lysine N-methyltransferase | -2.221 | -4.159 | 0.253
| 1169300-1172500 | GLEAN_01284 | XP_772756 | 1.00E-45 | 1171247-1171820 | Hypothetical protein | -2.221 | -4.159 | 0.253
| 1281700-1282500 | NO GENE | NO GENE | | | | -2.739 | -4.487 | -1.290
| 1386600-1390000 | GLEAN_01771 | XP_771821 | 2.00E-129 | 1387178-1388221 | Hypothetical protein | -2.615 | -3.395 | -0.619
| 1386600-1390000 | GLEAN_01244 | XP_771822 | 4.00E-21 | 1388488-1388664 | Hypothetical protein | -2.615 | -3.395 | -0.619
| 1386600-1390000 | GLEAN_01243 | XP_771122 | 2.00E-27 | 1389040-1389583 | Hypothetical protein | -2.615 | -3.395 | -0.619
| 1387800-1392200 | GLEAN_01771 | XP_771821 | 2.00E-129 | 1387178-1388221 | Hypothetical protein | -1.956 | -2.801 | -0.814
| 1387800-1392200 | GLEAN_01244 | XP_771822 | 4.00E-21 | 1388488-1388664 | Hypothetical protein | -1.956 | -2.801 | -0.814
| 1387800-1392200 | GLEAN_01243 | XP_771822 | 2.00E-27 | 1389040-1389583 | Hypothetical protein | -1.956 | -2.801 | -0.814
| 1387900-1394200 | GLEAN_01771 | XP_771821 | 2.00E-129 | 1387178-1388221 | Hypothetical protein | -2.987 | -3.916 | -0.509
| 1405200-1411700 | GLEAN_01240 | XP_001344503 | 2.90E-01 | 1404859-1405737 | Hypothetical protein [Danio rerio] | -2.858 | -4.834 | 1.003
| 1405200-1411700 | GLEAN_01239 | XP_759004 | 2.00E-24 | 1407095-1410234 | Hypothetical protein [Ustilago maydis] | -2.858 | -4.834 | 1.003
| 1405600-1411400 | GLEAN_01240 | XP_001344503 | 2.90E-01 | 1404859-1405737 | Hypothetical protein [Danio rerio] | -3.225 | -4.700 | 0.049
| 1405600-1411400 | GLEAN_01239 | XP_759004 | 2.00E-24 | 1407095-1410234 | Hypothetical protein [Ustilago maydis] | -3.225 | -4.700 | 0.049
| 1531200-1562000 | GLEAN_01219 | XP_567753 | 3.00E-18 | 1531328-1531639 | Hypothetical protein | -2.408 | -4.792 | 1.522
| 1531200-1562000 | GLEAN_01795 | XP_568995 | 8.00E-132 | 1532812-1535621 | Xenobiotic-transporting ATPase | -2.408 | -4.792 | 1.522
| 1531200-1562000 | GLEAN_01796 | XP_753711 | 5.00E-66 | 1539120-1541018 | DUF1479 domain protein [Aspergillus fumigatus] | -2.408 | -4.792 | 1.522
| 1531200-1562000 | GLEAN_01797 | XP_776387 | 1.00E-62 | 1541783-1546004 | Hypothetical protein | -2.408 | -4.792 | 1.522
| 1531200-1562000 | GLEAN_01798 | XP_771818 | 9.00E-15 | 1549079-1549589 | Hypothetical protein | -2.408 | -4.792 | 1.522
| 1531200-1562000 | GLEAN_01800 | XP_772644 | 2.00E-120 | 1553482-1555314 | Hypothetical protein | -2.408 | -4.792 | 1.522
| 1531200-1562000 | GLEAN_01801 | XP_001257273 | 2.00E-09 | 1556056-1561225 | DEAD/DEAH box helicase, putative [Neosartorya fischeri] | -2.408 | -4.792 | 1.522
12 | 0-12800 | GLEAN_03603 | XP_772644 | 2.00E-32 | 962-2141 | Hypothetical protein | -0.815 | -2.207 | 0.978
| 0-12800 | GLEAN_03602 | XP_773252 | 2.00E-122 | 7475-8494 | Hypothetical protein | -0.815 | -2.207 | 0.978
| 0-12800 | GLEAN_03601 | XP_776074 | 8.00E-36 | 12447-13498 | Hypothetical protein | -0.815 | -2.207 | 0.978
| 2900-12500 | GLEAN_03602 | XP_773252 | 2.00E-122 | 7475-8494 | Hypothetical protein | -1.850 | -3.817 | 1.791
| 2900-12500 | GLEAN_03601 | XP_776074 | 8.00E-36 | 12447-13498 | Hypothetical protein | -1.850 | -3.817 | 1.791
| 7300-12600 | GLEAN_03602 | XP_773252 | 2.00E-122 | 7475-8494 | Hypothetical protein | -0.563 | -1.108 | 0.099
| 7300-12600 | GLEAN_03601 | XP_776074 | 8.00E-36 | 12447-13498 | Hypothetical protein | -0.563 | -1.108 | 0.099
| 12800-16300 | GLEAN_03601 | XP_776074 | 8.00E-36 | 12447-13498 | Hypothetical protein | 0.618 | 0.175 | 1.532
| 12800-16300 | GLEAN_03600 | XP_771818 | 1.00E-83 | 13917-15485 | Hypothetical protein | 0.618 | 0.175 | 1.532
| 12800-16300 | GLEAN_03599 | XP_567148 | 7.00E-64 | 15532-16111 | Hypothetical protein | 0.618 | 0.175 | 1.532
| 15600-20000 | GLEAN_03599 | XP_567148 | 7.00E-64 | 15532-16111 | Hypothetical protein | -0.933 | -2.008 | 1.824
| 15600-20000 | GLEAN_03604 | NP_923271 | 4.30E+00 | 17145-19082 | Crp family transcriptional regulatory protein [Gloeobacter violaceus] | -0.933 | -2.008 | 1.824
| 37000-40900 | GLEAN_03608 | XP_568372 | 0.00E+00 | 38318-41392 | Amino acid transporter | -2.974, -3.164 | -5.037, -4.568 | 1.824, -0.212
| 102600-103800 | NO GENE | NO GENE | | | | 1.290 | 0.536 | 0.375
| 160000-165600 | GLEAN_03574 | XP_567971 | 4.00E-85 | 159542-160129 | Retrotransposon nucleocapsid protein | -1.032 | -2.349 | 1.079
| 160000-165600 | GLEAN_03573 | XP_567971 | 0.00E+00 | 160470-161894 | Retrotransposon nucleocapsid protein | -1.032 | -2.349 | 1.079
| 160000-165600 | GLEAN_03572 | ABE82053 | 4.00E-32 | 163221-165320 | Integrase [Medicago truncatula] | -1.032 | -2.349 | 1.079
| 160000-168300 | GLEAN_03574 | XP_567971 | 4.00E-85 | 159542-160129 | Retrotransposon nucleocapsid protein | -0.739 | -2.742 | 1.102
| 160000-168300 | GLEAN_03573 | XP_567971 | 0.00E+00 | 160470-161894 | Retrotransposon nucleocapsid protein | -0.739 | -2.742 | 1.102
| 160000-168300 | GLEAN_03572 | ABE82053 | 4.00E-32 | 163221-165320 | Integrase [Medicago truncatula] | -0.739 | -2.742 | 1.102
| 160000-168300 | GLEAN_03571 | XP_569055 | 5.00E-109 | 165669-166418 | Hypothetical protein | -0.739 | -2.742 | 1.102
| 160000-168300 | GLEAN_03570 | XP_750151 | 4.00E-12 | 167371-171903 | Y1 domain family [Aspergillus fumigatus] | -0.739 | -2.742 | 1.102
| 188600-189800 | NO GENE | NO GENE | | | | -1.659 | -3.617 | 0.513
| 196400-197800 | GLEAN_03566 | XP_568448 | 0.00E+00 | 194388-196624 | Hypothetical protein | -1.547 | -2.843 | -0.489
| 196400-197800 | GLEAN_03640 | XP_568339 | 2.00E-116 | 196765-197842 | Cytoplasmic protein | -1.547 | -2.843 | -0.489
| 254600-256400 | GLEAN_03650 | XP_568398 | 3.00E-130 | 256216-257443 | Delayed-type hypersensitivity antigen-related protein | -0.716 | -2.263 | 0.328
| 254700-257300 | GLEAN_03650 | XP_568398 | 3.00E-130 | 256216-257443 | Delayed-type hypersensitivity antigen-related protein | -2.905 | -4.475 | -0.711
| 258700-260400 | GLEAN_03552 | XP_568729 | 3.00E-104 | 258356-259020 | Retrotransposable element SLACS 132 kDa protein | 0.780 | -0.384 | 1.953
| 362200-365700 | GLEAN_03669 | XP_568435 | 3.00E-56 | 365657-366909 | Cardiolipin synthase | -2.681 | -4.773 | 0.304
| 403700-405100 | GLEAN_03526 | XP_001263452 | 6.00E-02 | 404657-405192 | Cyclin N-terminal domain protein, putative [Neosartorya fischeri] | -1.544 | -3.064 | 0.322
| 536700-537500 | GLEAN_03699 | XP_568308 | 2.00E-167 | 535169-537300 | Cytoplasmic protein | -1.609 | -2.200 | -0.896
| 544200-546900 | GLEAN_03500 | XP_568304 | 0.00E+00 | 540944-544319 | DNA unwinding-related protein | -2.486 | -4.317 | -0.274
| 544200-546900 | GLEAN_03499 | XP_568303 | 0.00E+00 | 546566-549439 | Origin recognition complex subunit 4 | -2.486 | -4.317 | -0.274
| 605600-607800 | GLEAN_03711 | XP_568501 | 1.00E-154 | 604177-605546 | Hypothetical protein | -3.177 | -4.205 | -2.033
| 605600-607800 | GLEAN_03493 | XP_772030 | 4.00E-71 | 605702-607346 | Hypothetical protein | -3.177 | -4.205 | -2.033
| 757300-774000 | GLEAN_03469 | XP_001265924 | 2.00E-05 | 756670-757367 | Conserved hypothetical protein [Neosartorya fischeri] | -2.867 | -4.753 | 0.688
| 757300-774000 | GLEAN_03742 | XP_568421 | 0.00E+00 | 757763-759398 | Hypothetical protein | -2.867 | -4.753 | 0.688
| 757300-774000 | GLEAN_03744 | XP_775243 | 1.00E-43 | 761411-764308 | Hypothetical protein | -2.867 | -4.753 | 0.688
| 757300-774000 | GLEAN_03468 | XP_571044 | 8.00E-113 | 764509-768109 | Beta-glucosidase | -2.867 | -4.753 | 0.688
| 757300-774000 | GLEAN_03745 | XP_571045 | 1.00E-176 | 768794-770909 | Hexose transport-related protein | -2.867 | -4.753 | 0.688
| 757300-774000 | GLEAN_03467 | XP_572478 | 5.00E-53 | 773214- | Hypothetical protein | -2.867 | -4.753 | 0.688
| 762100-774000 | GLEAN_03744 | XP_571043 | 2.00E-44 | 761411-764308 | Hypothetical protein | -1.022 | -3.584 | 0.135
| 762100-774000 | GLEAN_03468 | XP_571044 | 8.00E-113 | 764509-768109 | Beta-glucosidase | -1.022 | -3.584 | 0.135
| 762100-774000 | GLEAN_03745 | XP_571045 | 1.00E-176 | 768794-770908 | Hexose transport-related protein | -1.022 | -3.584 | 0.135
12 | 762100-774000 | GLEAN_03467 | XP_572478 | 5.00E-53 | 773214- | Hypothetical protein | -1.022 | -3.584 | 0.135
| 770900-773900 | GLEAN_03745 | XP_571045 | 1.00E-176 | 761794-770908 | Hexose transport-related protein | -1.242 | -3.364 | 0.
45 4 77 09 00 - 77 32 14 - G LE A N 03 46 7 X P 57 24 78 5.O OE -5 3 H yp ot he tic al pr ot ei n - 1. 24 2 - 3. 36 4 0. 45 4 77 39 00 — — 77 39 09 — — — — — 13 0- 12 00 0 G LE A N _0 I0 76 X P_ 77 50 62 Se -6 8 21 18 - 29 73 H yp ot he tic al pr ot ei n CN BE 53 8O - 0 20 8 - 2. 84 3 1. 10 5 0- 12 00 0 G LE A N _0 t0 75 X P7 72 64 4 6e -3 9 51 54 -6 57 9 H yp ot he tic al pr ot ei n CN BK OI 8O - 0. 20 8 - 2. 84 3 1 10 5 0- 12 00 0 G LE A N _0 10 74 X P_ 56 92 89 7e -4 5 75 82 - 83 08 H yp ot he tic al pr ot ei n CN B0 54 40 - 0. 20 8 - 2. 84 3 1. 10 5 0- 12 00 0 G LE A N _0 10 73 X P_ 56 7l 48 2e -6 5 83 55 - 89 34 H yp ot he tic al pr ot ei n CN AO ZI ZO - 0. 20 8 - 2 84 3 1. 10 5 10 06 1 - Pr ed ic te d pr ot ei n (C op rin op sis ci ne re a - 0. 20 8 - 2 84 3 1 10 5 0- 12 00 0 G LE A N _0 10 77 EA U 87 76 9 2e -l0 10 88 3 o ka ya m a7 #1 30 ]. 12 54 4- 12 30 0- 25 30 0 G LE A N _0 10 78 X P_ 56 85 23 1.O OE -9 1 14 85 9 A lp ha -g lu co sid e: hy dr og en sy m po rte r - 2. 78 3 - 4. 66 8 0. 87 7 17 42 0- 12 30 0- 25 30 0 G LE A N _0 10 72 X P_ 56 85 22 O. OO E+ 00 H yd ro la se - 2. 78 3 4. 66 8 0. 87 7 19 82 8 20 42 0- 12 30 0- 25 30 0 G LE A N _0 10 79 X P_ 56 85 2t 2.O OE -1 7 H yp ot he tic al pr ot ei n - 27 83 4. 66 8 0. 87 7 20 69 3 20 79 1 - 12 30 0- 25 30 0 G LE A N _0 10 80 X P_ 77 18 29 l.O OE -7 2 21 29 1 H yp ot he tic al pr ot ei n - 2. 78 3 - 4. 66 8 0. 87 7 2l 72 6- 12 30 0 - 25 30 0 G LE A N _0 l0 81 CA B9 10 97 5.O OE -0 8 23 57 9 D ex tra na se (P en ici lliu m fli ni cu lo su m ]. - 2. 78 3 - 46 61 0. 87 7 24 66 7- 12 30 0- 25 30 0 G LE A N _0 l0 82 X P_ 77 l8 55 3. OO E- l4 25 11 0 H yp ot he tic al pr ot ei n - 2. 78 3 - 4 66 8 0. 87 7 52 92 4 - 52 00 0- 54 30 0 G LE A N _0 I0 66 N P_ l0 57 53 7.O OE -1 8 A rg in as e (M eso rhi zo biu m lo ti]. - 2. 06 9 - 3 82 4 1. 79 4 12 59 00 - 12 63 95 - G LE A N 01 09 8 X P 77 65 39 2.O OE -3 5 12 83 00 - — 12 82 21 H yp ot he tic al pr ot ei n - 0. 08 6 - 1. 31 9 - 0. 01 1 - 2. 45 0 0. 24 0 - 0. 
13 9 23 95 00 - 23 47 22 - G LE A N 01 03 1 XI ’ 56 86 04 0 AB C tr an sp or te r P M R5 - 0. 30 6 - 1 32 2 0. 92 5 24 08 00 — — 23 99 16 36 64 00 - 36 66 11 - G LE A N 01 00 1 X P 77 19 36 2.O OE -6 7 36 76 00 — — 36 70 74 H yp ot he tic al pr ot ei n - 1. 16 2 - 3. 38 3 - 0. 25 6 36 64 00 - 36 75 56 - G LE A N 01 00 0 XI ’ 56 85 61 2. OO E- 14 l 36 76 00 — — 36 18 12 H yp ot he tic al pr ot ei n - 1. 16 2 - 3. 38 3 - 0. 25 6 36 77 00 - 36 75 56 - G LE A N 01 00 0 X P 56 85 61 2. 00 E- 14 t 36 88 00 — — 36 88 12 H yp ot he tic al pr ot ei n 2. 69 6 - 0. 07 2 5. 15 8 36 72 00 - 36 75 56 - G LE A N 01 00 0 X P 56 85 61 2.O OE -l4 1 36 84 00 — — 36 88 12 H yp ot he tic al pr ot ei n 1.7 63 - 0. 70 0 0. 57 0 36 70 00 - 36 66 11 - G LE A N 01 00 1 XI ’ 77 19 36 2.O OE -6 7 36 84 00 — — 36 70 74 H yp ot he tic al pr ot ei n 3. 67 3 2. 23 9 4. 86 3 36 70 00 - 36 75 56 - GL EA N 01 00 0 XI ’ 56 85 61 2.O OE -t4 1 36 84 00 — — 36 88 12 H yp ot he tic al pr ot ei n 3. 67 3 2. 23 9 4. 86 3 36 76 00 - 36 66 11 - G LE A N 01 00 1 XI ’ 77 19 36 2.O OE -6 7 36 86 00 — — 36 70 74 H yp ot he tic al pr ot ei n 3. 57 0 0. 89 2 53 45 36 76 00 - 36 75 56 - GL EA N 01 00 0 XI ’ 56 85 61 2. OO E- 14 l 36 86 00 — — 36 88 12 H yp ot he tic al pr ot ei n 3. 57 0 0. 89 2 5. 34 5 74 24 00 - 74 03 12 - G LE A N 01 21 5 XI ’ 57 22 46 0 G al ar to se tra ns po rte r - 0. 30 6 - 1. 32 2 0. 92 5 75 18 00 — — 74 24 68 74 24 00 - 74 36 93 - Pr ed ic te d pr ot ei n (C op rin op sis ci ne re a 0 30 6 - 1 32 2 0 92 5 G LE A N _0 09 28 EA U 93 68 8 6e -1 0 75 18 00 74 45 77 o lc ay am a7 #l 30 ]. - k A V G LR Lo w es tL R fo rr eg io n H ig he st LR ro r re gi on C oo rd . C hr G le an n u m be ra G en B an k ID ’ E- va lu e C oo rd .- CB S W M CB S W M CB S W M Pr ed ic te d fu nc tio n B t6 3 12 5. 91 B t6 3 12 5.9 1 flt 63 12 5. 91 se gm en . 
G en ed 77 79 62 6 77 79 62 6 77 79 62 6 74 24 00 - 74 55 38 - 75 18 00 G LE A N _0 12 16 X P_ 56 71 48 9e -5 1 74 90 88 H yp ot he tic al pr ot ei n CN AO 2I 2O - 0. 30 6 - 1. 32 2 0. 92 5 74 24 00 - 74 66 20 - G LE A N 01 21 7 X P 77 18 18 2e -7 9 H yp ot he tic al pr ot ei n CN BN I9 6O - 0. 30 6 - 1. 32 2 0. 92 5 75 18 00 — — 74 76 91 75 19 00 - 75 33 03 - G LE A N 01 21 8 X P 77 32 52 3.O OE -1 26 75 59 00 — — 75 43 22 H yp ot he tic al pr ot ei n - 2. 52 8 - 3. 83 8 0. 08 5 — — — — 14 50 0- 37 00 G LE A N _0 33 07 X P_ 56 85 23 5.O OE -6 4 56 9- 16 71 A lp ha -g lu co sid e: hy dr og en sy m po rte r - 2. 78 3 - 4. 35 2 0. 01 2 50 0- 37 00 G LE A N _0 33 08 X P_ 00 12 71 19 7 3.O OE -0 8 25 66 - 51 52 C6 tr an sc rip tio n fa ct or (M ut3 ). pu ta tiv e - 2. 78 3 - 4. 35 2 0. 01 2 IA sp er gi llu s cl av at us ]. 11 24 00 - N O G EN E - 2. 19 9 - 3. 68 5 - 0. 14 0 11 33 00 14 03 00 - 13 81 45 - G LE A N 03 28 6 X P 57 22 37 6.O OE -1 57 M yo -in os ito lt ra ns po rte r 1 - 0. 97 7 - 3. 38 0 1. 04 4 14 95 00 — — 14 03 59 14 03 00 - 14 14 31 - 3- hy dr ox ya cy l-C 0A de hy dr og en as e, G LE A N 03 28 5 ZP 01 22 53 40 2.O OE -1 0 pu ta tiv e [m ari ne ga m m a pr ot eo ha ct er iu m - 0. 97 7 - 3. 38 0 1. 04 4 14 95 00 — — 14 25 46 H TC C2 2O 7]. 14 03 00 - 14 30 93 - Tr an sc rip tio na lr eg ul at or ,1 cm G LE A N 03 33 3 Y P 00 11 03 86 6 1.O OE -2 0 14 95 00 — — 14 42 97 fa m ily /re gu ca lc in IS ac ch ar op ol yu po ra - 0. 97 7 - 3. 38 0 1. 04 4 u yt hr ae a]. 14 03 00 - 14 48 01 - G LE A N 03 33 4 X P 77 25 66 0.O OE +0 0 14 95 00 — — 14 86 67 H yp ot he tic al pr ot ei n - 0. 97 7 - 3. 38 0 1. 04 4 20 24 00 - 19 89 91 - G LE A N 03 27 5 X P 57 22 56 0.O OE +0 0 20 45 00 — — 20 27 66 R gu yn uc le ot id ee xc ha ng ef ac to r - 0. 68 9 - 2. 93 4 0. 40 2 20 85 00 - N O G EN E N O G EN E - 2. 38 7 - 3. 88 1 - 0. 25 5 20 93 00 28 04 00 - 27 85 37 - A ap ar ag in e ay nt ha se (g lut am ine G LE A N 03 25 5 X P 57 22 87 0.O OE +0 0 - 0. 97 8 - 4. 
07 1 1. 30 4 28 64 00 — — 28 07 28 hy dr ol yz in g) 28 04 00 - 28 33 64 - G LE A N 03 35 1 N P 86 25 99 5.O OE +0 0 M ob pr ot ei n [S ino rh izo biu m m el ilo ti] . - 0. 97 8 - 4. 07 1 1. 30 4 28 64 00 — — 28 38 40 28 04 00 - 28 62 01 - G LE A N 03 35 2 X P 77 24 73 2.O OE -1 07 28 64 00 — — 28 70 28 H yp ot he tic al pr ot ei n - 0. 97 8 - 4. 07 1 1. 30 4 28 08 00 - 27 85 37 - G LE A N 03 25 5 X P 57 22 87 0.O OE +0 0 28 60 00 — — 28 07 28 A ap ar ag in e sy nt ha se - 0. 53 5 - 2. 29 0 0. 57 0 28 08 00 - 28 33 64 - G LE A N 03 35 1 N P 86 25 99 5.O OE +0 0 28 60 00 — — 28 38 40 M ob pr ot ei n [S ino rh izo biu m m el ilo ti] . - 0. 53 5 - 2. 29 0 0. 57 0 28 10 00 - 28 33 64 - G LE A N 03 35 1 N P 86 25 99 5.O OE +0 0 M ob pr ot ei n [S ino rh izo biu m m el ilo ti] . - 1. 29 5 - 4. 89 5 0. 90 6 28 91 00 — — 28 38 40 28 10 00 - 14 48 01 - G LE A N 03 33 4 X P 77 25 66 0.O OE +0 0 28 91 00 — — 14 86 67 H yp ot he tic al pr ot ei n - 1. 29 5 - 4. 89 5 0. 90 6 28 10 00 - 28 82 75 - G LE A N 03 35 3 X P 57 22 88 0.O OE +0 0 28 91 00 — — 29 11 48 Tr an sc rip tio ni ni tia tio nf ac to rT Fl lD - 1. 29 5 - 4. 89 5 0. 90 6 30 75 00 - 30 75 32 - G LE A N 03 35 6 X P 77 50 20 0.O OE +0 0 30 93 00 — — 30 89 23 H yp ot he tic al pr ot ei n 1.2 93 0. 10 0 1. 99 7 30 77 00 - 30 75 32 - G LE A N 03 35 6 XI ’ 77 50 20 0.O OE +0 0 H yp ot he tic al pr ot ei n 0.3 51 - 0. 31 3 0. 95 2 30 92 00 — — 30 89 23 35 16 00 - 35 16 26 - G LE A N 03 36 4 XI ’ 77 24 94 0.O OE +0 0 35 48 00 — — 35 30 14 H yp ot he tic al pr ot ei n - 1. 22 9 - 2. 17 6 0. 07 5 42 73 00 - 42 89 08 - G LE A N 03 37 6 X P 77 46 42 7.O OE -0 5 H yp ot he tic al pr ot ei n - 0. 78 3 - 2. 01 4 0. 84 4 43 09 00 — — . 42 91 03 44 51 00 - 43 78 92 - G LE A N 03 37 9 X P 57 23 33 2.O OE -1 48 M M S2 - 2. 16 3 - 3. 85 2 - 0. 13 8 46 04 00 — — 44 17 87 44 51 00 - 44 24 73 - G LE A N _0 33 80 X P_ 56 92 89 2.O OE -7 5 46 04 00 44 50 23 H yp ot he tic al pr ot ei n - 2. 16 3 - 3. 85 2 - 0. 
13 8

LR sub-columns: Bt63, 125.91, CBS7779, WM626.
Chr | Coord. segment | Glean number | GenBank ID | E-value | Coord. gene | Predicted function | AVG LR | Lowest LR for region | Highest LR for region
 | 445100-460400 | GLEAN_03220 | ABE94506 | 3.00E-19 | 447918-448957 | CENP-B protein; Homeodomain-like [Medicago truncatula] | -2.163 | -3.852 | -0.138
 | 445100-460400 | GLEAN_03381 | XP_567971 | 3.00E-113 | 450536-453340 | Retrotransposon nucleocapsid protein | -2.163 | -3.852 | -0.138
 | 445100-460400 | GLEAN_03382 | ABA99612 | 3.00E-20 | 455778-457770 | Retrotransposon protein [Oryza sativa] | -2.163 | -3.852 | -0.138
 | 445100-460400 | GLEAN_03383 | P10978 | 3.00E-57 | 458257-459256 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 | -2.163 | -3.852 | -0.138
 | 445500-460500 | GLEAN_03220 | ABE94506 | 3.00E-19 | 447918-448957 | CENP-B protein; Homeodomain-like [Medicago truncatula] | -0.538 | -2.001 | 0.072
 | 445500-460500 | GLEAN_03381 | XP_567971 | 3.00E-113 | 450536-453340 | Retrotransposon nucleocapsid protein | -0.538 | -2.001 | 0.072
 | 445500-460500 | GLEAN_03382 | ABA99612 | 3.00E-20 | 455778-457770 | Retrotransposon protein, putative, unclassified [Oryza sativa] | -0.538 | -2.001 | 0.072
 | 445500-460500 | GLEAN_03383 | P10978 | 3.00E-57 | 458257-459256 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 | -0.538 | -2.001 | 0.072
 | 587300-591300 | GLEAN_03404 | XP_568361 | 1.00E-05 | 587004-587437 | Hypothetical protein | -0.716 | -2.475 | 0.417
 | 587300-591300 | GLEAN_03191 | XP_776539 | 2.00E-33 | 588928-590733 | Hypothetical protein | -0.716 | -2.475 | 0.417
 | 833300-834900 | GLEAN_03449 | XP_572452 | 0.00E-100 | 829500-833428 | Protein tyrosine/threonine phosphatase | -0.844 | -2.387 | 0.641
 | 833300-834900 | GLEAN_03450 | XP_001271965 | J.00E-19 | 834615-837025 | Transporter protein smf2 [Aspergillus clavatus] | -0.844 | -2.387 | 0.641
 | 913000-925200 | GLEAN_03464 | XP_001481488 | 3.80E-fOO | 913868-914840 | Class V chitinase, putative [Aspergillus fumigatus] | -1.616, -1.700 | -4.113, -4.351 | 0.695, 1.045
 | 913000-925200 | GLEAN_03465 | XP_776539 | 1.00E-27 | 916876-918702 | Hypothetical protein | -1.616, -1.700 | -4.113, -4.351 | 0.695, 1.045
 | 913000-925200 | GLEAN_03466 | XP_776387 | 5.00E-49 | 920609-923050 | Hypothetical protein | -1.616, -1.700 | -4.113, -4.351 | 0.695, 1.045

Appendix B
Database of genomic variation observed in serotype B strains via CGH

Table B.1 Regions of difference in the genomes of three serotype B strains compared with the sequenced genome of strain WM276. Regions of difference that overlap are in the same colour. The first column indicates the chromosome (CHR) number.
a Nucleotide coordinates of the segment identified by CGH.
b Gene ID of top BLASTn. The E-value of the BLAST result is included in the following column.
c Coordinates of the specific gene in the segment identified by CGH.
d Functional information about the top BLAST hit. The GO ontology is included in the following column.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ A ve ra ge LD St an da rd D ev ia tio n o fL R C oo rd in at es C H R de sc rib in g Pr ot ei n ID a E- va lu e C oo rd in at es Pr ed ic te d Fu nc tio nd G en e O nt ol og y R7 94 K B 38 64 K B 78 92 R 79 4 K B 38 64 K B 78 92 o fG en e th e a re ? X en ob io tic -tr an sp or tin g A TP as e, A 45 18 - 10 68 7 A A W 41 68 8. I 0 59 1 - 62 77 pu ta tiv e (C ryp toc oc cu sn eo fo rm an s G O _c om po ne nt , p la sm a m e m br an e, G O fu nc ti on . x e n o b io ti c- tr am p o rt m g - 1. 23 1 0. 85 0 A TP as e ac tiv ity ;G O _p ro ce ss : dr ug tr an sp or t; GO _p ro ce ss : re sp on se to dr ug v ar . n e o fo rm an s JE C 2I I G lu co se 1- de hy dr og en as e, pu ta tiv e G O _c om po ne nt : p er ox iso m al m at rix ;G O _f un ct io n: 2, 4- di en oy l-C 0A A A W 42 OI O. l 0 10 50 1 - 11 52 7 (C ryp toc oc ca sn eo fo rn ia ns va r. re du ct as e (N AD PH ) a ct iv ity ; G O_ pr oc es s: sp or ul at io n (se ns u - 1. 23 1 0. 85 0 n eo fo rm an sJ EC 2I ] Sa cc ha ro m yc es ); GO pr oc es s: fa tty ac id ca ta bo lis m G lu co se 1- de hy dr og en as e, pu ta tiv e G O _c om po ne nt : p er ox iso m al m at rix ;G O _f un ct io n: 2, 4- di en oy l-C oA 67 76 - 10 51 6 A A W 42 OI O. l 0 10 50 1 - 11 52 7 (C ryp toc oc cu sn eo fo rm an o va r. re du ct as e (N AD PH ) a ct iv ity ;G O_ pr oc es s: sp or ul at io n (se nsu - 1. 53 9 - 1. 51 0 0.6 71 0. 64 7 n eo fo rm as s IE C2 I] Sa cc ha ro m yc es ); GO _p ro ce ss :f att y ac id ca ta bo lis m 5” O xi do re du ct as e, pu ta tiv e 5’: G o_ co m po ne nt n u cl eu s; G o_ co m po ne nt : cy to pl as m ; G O _f un ct io n: [C ry pto co cc us n eo fo rm an s v ar al de hy de re du ct as e ac tiv ity ; G o_ fu nc tio n: al do -k et o re du cta se ac tiv ity ; G o_ fu nc tio n: o x id or ed uc ta se ac tiv ity ; G O _p ro ce ss :a ra bi no se m et ab ol ism ; 26 98 42 - 5’: A A W 4I 72 7. 1; 0 26 89 96 - n eo fo rm on s JB C2 1]; 27 20 58 3’: A A W 41 65 5. 
1 27 00 05 3’: Co ns er ve d hy po th et ic al pr ot ei n GO _p ro ce ss : D -x yl os e m et ab ol ism - 0. 72 6 0.4 11 [C ry pto co cc us n eo fo rm am v ar 3’ : G o_ co m po ne nt :m ito ch on dr ia l m at rix ;G o_ fu nc tio n: th io l-d isu lfi de n eo fo rm an s JE C2 1] ex ch an ge in te rm ed ia te ac tiv ity ;G O _p ro ce ss : re sp on se to o sm o tic st re ss ; GO _p ro ce ss :r es po ns e to o x id at iv e st re ss O xi do re du ct as e, pu ta tiv e G o_ co m po ne nt : n u cl eu s; G O _c om po ne nt : c yt op la sm ;G o_ fu nc tio n: al de hy de 27 05 53 - re du cta se ac tiv ity ;G o_ fu nc tio n: al do -k et o re du cta se ac tiv ity ;G o_ fu nc tio n: A A W 41 72 7. 1 0.O OE +0 0 [C rp to co cc us n eo fo rm an s va r. o x id or ed uc ta se ac tiv ity ;G O _p ro ce ss :a ra bi no se m et ab ol ism ;G O _p ro ce ss : D - - 0. 72 6 0.4 11 27 20 31 n eo fo rm an s J EC 21 ]; x yl os e m et ab ol ism 5’: G o_ co m po ne nt :n u cl eu s; G o_ co m po ne nt : c yt op las m ;G o_ fu nc tio n: 5’. Ox id or ed uc tas e, pu ta tiv e al de hy de re du ct as e ac tiv ity ;G o_ fu nc tio n: al do -k et o re du cta se ac tiv ity ; [C ryp toc oc cu sn eo fo rm an sv a r. G o_ fu nc tio n: o x id or ed uc ta se ac tiv ity ;G o _ p ro ce ss : ar ab in os e m et ab ol ism ; 26 99 23 - 5’: A A W 41 72 7. 1; 0 26 89 96 - se o fo rm an s JE C2 I]; 27 26 27 3’: A A W 41 65 5. 1 27 00 05 3’: Co ns er ve d hy po th et ic al pr ot ei n G o_ pr oc es s: D -x yl os e m et ab ol ism - 0. 68 7 0. 
42 8 [C ryp toc oc cu sn eo fo rm am 3’ : G o_ co m po ne nt :m ito ch on dr ia l m at rix ; G o_ fu nc tio n: th io l-d isu lfi de s e o fo rm an s JE C2 I] e x c h an g e in te rm ed ia te a c ti v it y ; G o_ pr oc es s: r e s p o n se to o s m o ti c o tr e st ; G o pr oc es s: re s p o n se to o x id at iv e s tr e ss O x id o re d u ct as e, p u ta ti v e G o _ co m p o n en t: n u c le us ; G o _ co m p o n en t: c y to p la sm ; G o _ fu n ct io n : a ld eh yd e 27 05 53 - re du ct aa e a c ti vi ty ; G o _ fu n ct io n : a ld o -k et o re du ct as e a c ti v it y ; G o _ fu n ct io n : - 0. 68 7 0. 42 8 A A W 41 72 7. 1 0.O OE +0 0 27 20 31 [C ry pt oc oc ca s n e o fo rm an s va r. o x id or es tu ct as e a c ti vi ty ; G o _ p ro ce ss : a ra b in o se m e ta bo li sm ; G o _ p ro ce ss : D n e o fo rn ia n s JE C 2I ], x y lo se m e ta bo li sm SA R s m a ll m o n o m e ri c G T P as e, 57 63 91 - 5 7 5 5 1 9 - G o _ co m p o n en t CO P1 1 v e s ic le c o a t; G o _ fu n ct io n : SA R s m a ll m o n o m e ri c A A W 4I 6I O .1 0 p u ta ti v e [C ry pt oc oc ca s n eo fo rm an s - 0. 49 6 0. 41 0 58 17 52 57 65 39 G TP as e a c ti vi ty ; G o _ p ro ce ss : ER to G ol gi tr a n sp o rt va r. n e o fo rm an s JE C2 II 5’: C on se rv ed h y p o th et ic al p ro te in [C ry pt oc oc ca s n e o fo rm an s va r. 5’: n o G o te rm s 5’: A A W 41 60 9. l; 57 74 16 - n e o fo rm an sJ E C 2 lj; 5.O OE -2 0 3’ A A W 4I 61 O 1 57 80 78 3” SA R s m a ll m o n o m e ri c G T P as e, 3’ : G o _ co m p o n en t: CO PU v e s ic le c o a t; G o _ fu n ct io n : SA R s m a ll m o n o m e ri c - 0. 49 6 0. 41 0 G TP as e a c ti vi ty ; G O _p ro ce ss : ER to G ol gi tr a m p o rt p u ta ti v e [C ry pt oc oc cu s n e o fo rm an s v a r. n e o fo rm an s JE C2 I] _ _ _ _ _ _ _ _ _ _ _ A ve ra ge LR St a. ”a rd D ev ia tr i o fL R Co or di na te s C H R de sc rib in g Pr ot ei n 1D b E- va lu e Co or di na te s Pr ed ic te d F. 
.c tio nd G en e O nt ol og y R 79 4 K B 38 64 K B 78 92 R 79 4 K B 38 64 K B 78 92 o fG en e’ th e a re a 57 83 85 H yp ot he tic al pr ot ei n [C ryp toc oc cu s A X P_ 77 72 90 .l l.O OE -1 42 - n eo fo rm an sv ar . n eo fo rm an sB - no GO te rm s - 0. 49 6 0. 41 0 58 04 38 35 01 Al G O _c om po ne nt : S AG A co m pl ex ; G O _f un ct io n st ru ct ur al m o le cu le ac tiv ity ; 58 05 83 - H yp ot he tic al pr ot ei n IC ry pto co cc us G O _p ro ce ss co n jug ati on w ith ce llu la r f us io n, G O _p eo ce ss :p ro te in co m pl ex A A W 4l 60 8. 1 5.O OE -t6 2 - 0. 49 6 0. 41 0 58 39 51 n eo fo rm an s v at . n eo fo rm an s JE C2 I] as se m bl y, G O _p ro ce ss : ch ro m at in m o di fic at io n. G O _p ro ce ss hi sto ne ac et yl at io n H yp ot he tic al pr ot ei n [C ryp toc oc cu s 57 83 85 - X P_ 77 72 90 .t l.O O E- t4 2 n eo fo rm an s va r. n eo fo rm an s B- no GO te rm s - 0. 49 6 0 41 0 58 04 38 35 01 A] 5’: Ch ol in e- ph os ph at e cy tid yl yt tra ns fe ra se , p ut at iv e 5’: G O _c om po ne nt : n u cl eu s; G O _c om po ne nt :n u cl ea r m em br an e; [C ryp toc oc cu s n eo fo rm an s va r, G O _c om po ne nt : G ol gi ap pa ra tu s; G O _f un ct io n: ch ol in e- ph os ph at e 83 94 26 - 5’: A A W 40 94 8. 1; 83 96 34 - n eo fo rm an s JE C2 I] cy tid yl yl tra ns fe ra se ac tiv ity ;G O _p ro ce ss : ph os pl ia tid yl ch ol in e bi os yn th es is; 5.O OE -3 9 - 0. 80 8 0. 84 0 84 39 11 3’: A A W 40 95 0. I 83 98 43 3’ :E R o rg an iz at io n an d bi og en es is- G O _p ro ce ss : C D P- ch ol in e pa th w ay re la te d pr ot ei n, pu ta tiv e 3’ :G O _c om po ne nt : e n do pl as m ic re tic ul um m em br an e; G O _p ro ce ss :E R [C sy pto co cc us n eo fo rm an s v at , o rg an iz at io n an d bi og en es is . 
_ _ _ _ _ _ _ n eo fo rm an s J EC 21 I ER o rg an iz at io n an d bi og en es is- 84 02 30 - re la te d pr ot ei n, pu ta tiv e G O _c om po ne nt : e n do pl as m ic re tic ul um m em br an e; G O _p ro ce ss : ER AA W 4O 9S O. i 9.O OE -1 32 - 0. 80 8 0. 84 0 84 28 47 [C ry pto co cc us n eo fo rm an s va r. o rg an iz at io n an d bi og en es is n eo fo rm an s JE C2 1I Ty pe 2C Pr ot ei n Ph oa ph at as e, 84 29 58 - A A W 40 95 2. I 0.O OE +0 0 84 49 70 pu ta tiv e [C ry pto co cc us n eo fo rm an s G O _f un ct io n: pr ot ei n ph os ph at as e ty pe 2C ac tiv ity - 0. 80 8 0. 84 0 va r, n eo fo rm an s JE C2 I] 5’. Co ns er ve d hy po th et ic al pr ot ei n 5’: G O _c om po ne nt :m em br an e; G O _f un ct io n: tr an sp or te ra ct iv ity ; [C ryp toc oc cu s n eo fo rm am va r. G O _p ro s’ tr an sp or t 19 81 13 0- 5’ .A A W 4I 2S 3. t, 19 80 66 9- n eo fo rm an s JE C2 I] 3” G O _c om po ne nt : p la sm a m em br an e; G O _f un ct io n fru ct os e tra ns po rte r - 0 89 8 - 0. 85 1 0. 40 6 0. 39 5 0.O OE +0 0 19 84 76 0 3’: AA W 41 25 4.1 19 82 08 2 3’: H ex os e tr an sp or t-r el at ed pr ot ei n, pu ta tiv e [C ryp toc oc cu sn eo fo rm an s ac tiv ity ;G O _f un ct io n: gl uc os e tr an sp or te r a ct iv ity ;G O _f un ct io n: m an n o se v ar n eo fo rm an s JE C2 I] tr an sp or te ra ct iv ity ; G O_ pe oc es s: he xo se tr an sp or t 19 84 41 3 - H yp ot he tic al pr ot ei n IC ry pt oc oc cu s GO ta nm - 0 89 8 - 0. 85 1 0. 40 6 0. 39 5 X P_ 71 t8 79 . 1 5. 30 E- 01 19 84 62 2 n eo fo rm an s v ar n eo fo rm ai m JE C2 I] 5’. Co ns er ve d hy po th et ic al pr ot em 5’: G O _c om po ne nt : m em br an e; G O _f un ct io n tr an sp or te r a ct iv ity ; lC ry pt oc oc cn s n eo fo rm an sv ar . G O _l m s’ tr an sp or t 19 81 48 7- 5’ :A AW 41 25 3. 1; 19 80 66 9- n eo fo rm an s JE C2 IJ GO co m po ne nt pl as m a m em br an e; G O _f un ct io n fru ct os e tr an sp or te r - 0 99 5 0 54 3 0.O OE +0 0 19 84 76 0 3’ :A AW 41 25 4. 
1 19 82 08 2 3’: H ex os e tr an sp or t-r el at ed pr ot ei n, pu ta tiv e [C iyp toc oc cu s n eo fo rm an s a c tiv ity ;G O _c tio n: gl uc os e tr an sp or te r a ct iv ity ,G O_ fu nc tio n’ m an n o se tr an sp or te r a ct iv ity ;G O_ pr oc es a: he xo se tr an sp or t va r. n eo fo rm an sJ EC 2l ] 19 84 41 3- H yp ot he tic al pr ot ei n [C ry pto co cc us GO t e - 0 99 5 0 54 3 X P_ 7t 88 79 .1 5. 30 E- 01 19 84 62 2 n eo fo rm an s va r. n eo fo rm an s JE C2 IJ — — G O _c om po ne nt : n uc les is; G O _f un ct io n: tr an sc rip tio n lit ct or ac tiv ity ; H yp ot he tic al pr ot em (C ryp toc oc cu s G O _p ro ce ss : ca rb oh yd ra te m et ab ol ism ,G O _p ro ce ss re gu la tio n o f - 1. 47 6 1. 35 5 B 16 56 -6 93 6 A A W 42 7O I. I 0.O OE +0 0 93 3- 39 73 n eo fo rn ia ns v ar . n eo fo rm an sJ EC 2I I tr an sc rip tio n, D N A -d ep en de nt A A W 4I 4I O .1 0.O OE +O 0 51 11 - 63 53 Ex pr es se d pr ot ei n [C ry pto co cc us n o G O te rm s - 1. 47 6 1.3 55 n eo fo na an s va r. n eo fo rm an s JE C2 I] Ex pr es se d pr ot em [C ry pto co cc us no GO te rm s - l 47 6 1 35 5 A A W 40 61 6. 1 0.O OE +0 0 68 32 - 86 24 n eo fo rm an s va r. n eo fo rm an sJ EC 2I ] G O _c om po ne nt : n u cl eu s; G O _f un ct io n: tr an sc rip tio n Iit cto ra ct iv ity ; H yp ot he tic al pr ot ei n [C ryp toc oc cu s G O _p ro ce ss : ca rb oh yd ra te m et ab ol ism ; G O _p ro ce ss : r eg ul at io n o f - 1. 05 7 1.2 13 16 56 -7 73 5 A A W 42 70 1. 1 00 0E +0 0 93 3- 39 73 n eo fo rm an s va r. n eo fo rm an s JE C2 IJ tr an sc rip tio n, D N A -d ep en de nt A A W 4I 4I O .I 0.O OE +0 0 51 11 - 63 53 Ex pr es se d pr ot ei n lC ry pt oc oc cu s n o G O te rm s - 1. 05 7 1.2 13 n eo fo rm an s va r. n eo fo rm an s JE C2 I] A A W 40 61 6. l 0.O OE +0 0 68 32 - 86 24 Ex pr es se d pr ot ei n [C ry pto co cc us no GO te rm s - 1. 05 7 1.2 13 n eo fo rm an s v at . 
n eo fo rm an s J EC 2I ] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ A v e ra g e L it S ta n d a rd _ D e v ia ti o n o f L R C o o rd in a te s CH R de sc rib in g P ro te in 11 3’ 8 -v a ln e C o o rd in a te s Pr ed ic te d F u n c ti o n ’ G e n e O n to lo g y R 7 9 4 K B3 86 4 K .B 7 8 9 2 R 7 9 4 K B3 86 4 K B 7 8 9 2 o fG en e’ th e a re a B 2 8 8 5 8 4 Cr yp to co cc us g a tt ii st ra in E5 66 A Y 71 04 29 5.O OE -1 75 M A Ts lo cu s, co m pl et e se qu en ce , N O n /a - 1 84 6 0. 92 1 29 06 30 G E N E 34 84 10 - 3 4 7 9 8 4 H yp ot he tic al pr ot ei n [C ryp toc oc cu s X P 77 81 23 .1 5. O O E- 65 n e o fo rm an s v a r. n e o fo rm am B - n o (3 0 te rm s - 3. 67 0 1. 03 3 34 95 05 — 34 82 08 35 01 A 1 N A D H d e h y d ro g e n a se s u b u n it S 34 97 03 - [m ito ch on dr ion Cr yp toc oc cu a n o G O te im u - 3 6 7 0 1 .0 3 3 A A N 3 7 5 8 4 .1 1 .O O E -7 1 34 99 06 n eo fo rm an s va r. gr ub ii (F ilo ba sid iel la n eo fo rm an s se ro ty pc A)1 34 81 28 - H yp ot he tic al pr ot ei n IC ry pt oc oc cu s X P 77 81 23 .1 5. O O E- 65 n eo fo rm as u va r. n eo fo rn ian a B - n o G O te rm s - 2. 75 5 1. 36 9 35 05 42 — 34 82 08 35 01 A ] N A D H d e h y d ro g e n su e su bu ni t 5 34 97 03 - [m ito ch on dri on Cr yp to co cc us n o te rm a - 2 75 5 13 69 A A N 3 7 5 8 4 .1 1 .O O E -7 1 3 4 9 9 0 6 n e o fo n n a n s v a r . gt ub ii (F il ob ne id ie ll a n eo fo rm an ss er o ty pe A) ] 34 81 28 - 34 79 84 - H yp ot he tic al pr ot ei n [C ryp toc oc cu u Xl ’ 77 81 23 .1 5.O OE -65 n e o fo rm a n s v a r . n eo fo rm an aB - n o GO te rm s - 2. 41 2 1. 50 9 35 05 81 — 34 82 08 35 01 A] N A D H d e h y d ro g e n a u e su bu ni t5 34 97 03 - [m ito ch on dri on Cr yp to co cc us n o GO te rm s - 2. 41 2 1. 50 9 A A N 37 58 4. 1 1.O OE -71 34 99 06 ne of lu rm am va r. 
gr ub ii (Fi lob na idi ell a - _ _ _ _ _ _ _ _ n e o fo rn ia n s se ro ty pe A) ] 1: tE N A iso pe nt en yl tra ns fe ra se , pu ta ti ve [C ry pt oc oc cu s n e o fo rm an s 1: GO _c om po ne nt :n u c le us ; G O _c om po ne nt : n u c le ol us ; GO _c om po ne nt : m it oc ho nd ri on ; G O _c om po ne nt : c yt os ol ; G O _f un ct io n: tR N A 93 16 65 - 1. A A W 40 86 1. l, 89 08 24 - va r. n e o fo rm an s JE C2 IJ ; is op en te ny lt rs ns fe ra se a c ti vi ty ; G O pr oc es s: tR N A m o di fi ca tio n - 1. 10 4 - 1. 12 7 0. 93 0 0. 95 0 0.O OE +0 0 96 29 18 2. A A W 40 86 3. 1 90 88 84 2. ri bo nu cl ea se H , pu ta tI ve 2: G O _c om po ne nt : ce ll; G O _f un ct io n: ri bo nu cl ea se H a c tiv ity ; G O pr oc es s: IC iw to co cc us n e o fo rm am V at D N A re pl ic at io n; G O _p ro ce ss : ce ll w a ll o rg an iz at io n an d bi og en es is n eo fo rm an s J EC 21 ]; 1: E xp re ss ed pr ot ei n IC ry pt oc oc cu s 90 89 58 n e o fo ri si an s v a r. n e o fo rm an s JE C2 I]; 1 n o GO te rm s; 1: A A W 40 87 5. 1; - 0. O O E+ 00 2 A A w 41 42 0 1 92 25 94 2: Co ns er ve d hy po th et ic al pr ot ei n 2: n o GO te rm s - 1 .t 04 - 1. 12 7 0. 93 0 0. 95 0 IC iy pt oc oc cu s n eo fo rm an sv a r. n e o fu rm an si E C 2l ] G O _c om po ne nt : s yn ap to ne m al c o m pl ex ; G O _c om po ne nt : n u c le us ; D N A to po is om er as e II , pu ta ti ve G O _f un ct io n: D N A to ps is om er as e (A TP -h yd ro ly zi ng ) a c ti vi ty ; G O _p ro ce ss : 92 30 16 - A A W 40 88 1. 1 0. O O E+ 00 [C ryp toc oc cu sn e o fo rm am v a r. re gu la ti on o fm it ot ic re c o m bi na ti on ;G O _p ro ce ss : D N A to po lo gi ca l c ha ng e; - 1. 10 4 - 1. 12 7 0. 93 0 0. 
95 0 92 81 31 n e o fo rm an s JE C2 I] G O _p ro ce ss : D N A s tr an d e lo ng at io n; G O _p ro ce ss : c h ro m a ti c a s s e m bl y o r - di sa ss em bl y; GO pr oc es s: m e io ti c re c o m bi na ti on 5’: G O _c om po ne nt : s yn ap to ne m al c o m pl ex ; G O _c om po ne nt : n u c le us ; G O _f un ct io n: D N A to po is om er as e (A TP -h yd ro ly zi ng ) a c tiv ity ; G O _p ro ce ss : 5’: D N A to po is om er as e II, pu ta ti ve re gu la ti on o fm it ot ic re c o m bi na tio n; G O _p ro ce ss : D N A to po lo gi ca lc ha ng e; [C ry pt oc oc cu s n e o fo rm an s v a r. G O _p ro ce ss : D N A s tr an d e lo ng at io n; G O _p ro ce ss : c h ro m a ti c a s s e m bl y o r 5’: A A W 4O 88 1. 1; 94 36 68 - n e o fo rm an s JE C 2I ] di su ss em bl y; G O _p ro ce ss : m e io ti c re c o m bi na ti on ; 0. O O E+ 00 - 1. 10 4 - 1. 12 7 0. 93 0 09 50 3’: A A W 40 88 4. 1 94 81 68 3’ :c al ci um tra ns po rti ng A T Pa se , 3’: G O _c om po ne nt G ol gi m e m br an e; G O _c om po ne nt : pl as m a m e m br an e; pu ta ti ve [C ry pt oc oc cu s n e o fo rm an s G O _c om po ne nt : tr an s- G ol gi n e tw o rk tr a n s p o rt v e s ic le ; G O _f un ct io n: v a r. 
n c o fo rm an s JE C2 I] ph os ph ol ip id -t ra ns lo ca ti ng A TP as e a c ti vi ty ; G O _f un ct io n: A TP as e a c tiv ity : G O _p ro ce ss : in tr ac el lu la r pr ot ei n tr an sp or t; G O _p ro ce ss : po st -G ol gi tr an sp or t; G O _p ro ce ss : pr oc es si ng o f2 0S pr e- rR N A A ve ra ge L R S tn ’4 a rd D e v ia t o fU k Co or di na te s C H R de sc rib in g Pr ot ei n fl) h E -v al ue Co or di na te s Pr ed ic te d Fu nc tio nd G en e O nt ol og y R 79 4 K B 38 64 K B 78 92 R7 94 1C B3 86 4 K B 78 92 o fG en e th e a re a 5’: G O _c om po ne nt : sy na pt on em al c o m pl ex ; G O _c om po ne nt : n u c le us ; G O _f un ct io n: D N A to po is om er as e (A TP .h yd ro ly zi ng )a c tiv ity ; G O _p ro ce ss : 5’: D N A to po is om er as e II , pu ta tiv e re gu la tio n o fm ito tic re c o m bi na tio n; GO _p ro ce ss :D N A to po lo gi ca l c ha ng e; Cr yp to co cc us n e o fo rm an s v a r. G O pr oc es s: D N A st ra n d e lo ng at io n; G O _p ro ce ss : ch ro m at in a ss e m bl y o r B 5’: A A W 4O 88 1. 1; 94 82 35 - n e o fo rn ia m JE C 2I 1 di sa ss em bl y; GO _p ro ce ss :m e io tic re c o m bi na tio n; - 1. 10 4 - 1. 12 7 0. 93 0 0. 95 0 0. 00 E +0 0 3’ : A A W 40 88 4. 1 95 27 35 3’ :c al ci um tr an sp or tin g A T Pa se , 3’ :G O _c om po ne nt : G ol gi m e m br an e; G O _c om po ne nt : pl as m a m e m br an e; pu ta tiv e [C ry pt oc oc cu s n e o fo rm an s G O _c om po ne nt : ts an s- G ol gi n e tw o rk tr an sp or tv e si cl e; G O _f un ct io n: v a r. 
neoformans JEC21] phospholipid-translocating ATPase activity; GO_function: ATPase activity; GO_process: intracellular protein transport; GO_process: post-Golgi transport; GO_process: processing of 20S pre-rRNA 1: tRNA isopentenyltransferase, putative [Cryptococcus neoformans 1: GO_component: nucleus; GO_component: nucleolus; GO_component: mitochondrion; GO_component: cytosol; GO_function: tRNA 931878 - 1: AAW40861.1; 890824 - var. neoformans JEC21]; isopentenyltransferase activity; GO_process: tRNA modification -1.141 0.949 0.00E+00 963047 2: AAW40863.1 908884 2: ribonuclease H, putative [Cryptococcus neoformans 2: GO_component: cell; GO_function: ribonuclease H activity; GO_process: DNA replication; GO_process: cell wall organization and biogenesis neoformans JEC21]; 1: Expressed protein [Cryptococcus neoformans var. neoformans JEC21]; 1: no GO terms; 1: AAW40875.1; 908958 - -1.141 0.949 0.00E+00 2: Conserved hypothetical protein 2: no GO terms 2: AAW41420.1 922594 [Cryptococcus neoformans var. neoformans JEC21] GO_component: synaptonemal complex; GO_component: nucleus; DNA topoisomerase II, putative GO_function: DNA topoisomerase (ATP-hydrolyzing) activity; GO_process: 923016 - AAW40881.1 0.00E+00 [Cryptococcus neoformans var. regulation of mitotic recombination; GO_process: DNA topological change; -1.141 0.
949 928131 neoformans JEC21] GO_process: DNA strand elongation; GO_process: chromatin assembly or disassembly; GO_process: meiotic recombination 5’: GO_component: synaptonemal complex; GO_component: nucleus; GO_function: DNA topoisomerase (ATP-hydrolyzing) activity; GO_process: 5’: DNA topoisomerase II, putative regulation of mitotic recombination; GO_process: DNA topological change; [Cryptococcus neoformans var. GO_process: DNA strand elongation; GO_process: chromatin assembly or 5’: AAW40881.1; 943668 - neoformans JEC21] disassembly; GO_process: meiotic recombination; 0.00E+00 -1.141 0.949 3’: AAW40884.1 948168 3’: calcium transporting ATPase, 3’: GO_component: Golgi membrane; GO_component: plasma membrane; putative [Cryptococcus neoformans GO_component: trans-Golgi network transport vesicle; GO_function: var. neoformans JEC21] phospholipid-translocating ATPase activity; GO_function: ATPase activity; GO_process: intracellular protein transport; GO_process: post-Golgi transport; GO_process: processing of 20S pre-rRNA 5’: GO_component: synaptonemal complex; GO_component: nucleus; GO_function: DNA topoisomerase (ATP-hydrolyzing) activity; GO_process: 5’: DNA topoisomerase II, putative regulation of mitotic recombination; GO_process: DNA topological change; [Cryptococcus neoformans var. GO_process: DNA strand elongation; GO_process: chromatin assembly or 5’: AAW40881.
1; 948235 - neoformans JEC21] disassembly; GO_process: meiotic recombination; -1.141 0.949 0.00E+00 3’: AAW40884.1 952735 3’: calcium transporting ATPase, 3’: GO_component: Golgi membrane; GO_component: plasma membrane; putative [Cryptococcus neoformans GO_component: trans-Golgi network transport vesicle; GO_function: var. neoformans JEC21] phospholipid-translocating ATPase activity; GO_function: ATPase activity; GO_process: intracellular protein transport; GO_process: post-Golgi transport; GO_process: processing of 20S pre-rRNA 1131833 - 1130695 - CAP4p [Cryptococcus neoformans no GO terms -0.666 0.616 DAA05954.1 0.00E+00 1134857 1133555 var. grubii] 1133875 - Expressed protein [Cryptococcus no GO terms -0.666 0.616 AAW41700.1 1.00E-161 1136054 neoformans var. neoformans JEC21] 1197009 - 1197082 - Expressed protein [Cryptococcus no GO terms 1.051 0.709 AAW41832.1 0.00E+00 1198151 1199343 neoformans var. neoformans JEC21] 1197909 - 1197082 - Expressed protein [Cryptococcus no GO terms 1.075 0.716 AAW41832.1 0.00E+00 1198232 1199343 neoformans var. neoformans JEC21]
CHR; Coordinates describing the area; Protein ID; E-value; Coordinates of Gene; Predicted Function; Gene Ontology; Average LR (R794, KB3864, KB7892); Standard Deviation of LR (R794, KB3864, KB7892)
5’: Conserved hypothetical protein B 1238560 - 5’: AAW45247.1; 1238932 - [Cryptococcus neoformans var.
0.00E+00 neoformans JEC21] no GO terms -2.361 -2.374 1.125 1.145 1242125 3’: AAW45026.1 1240917 3’: MMS2, putative [Cryptococcus neoformans var. neoformans JEC21] 1: Expressed protein [Cryptococcus neoformans var. neoformans JEC21] 2126334 - 1: AAW41767.1; 1968449 - 2: Vacuole organization and 0.00E+00 no GO terms -2.179 1.465 2127536 2: AAW41768.1 2149964 biogenesis-related protein, putative [Cryptococcus neoformans var. neoformans JEC21] 2161180 - 2163452 - Uroporphyrinogen decarboxylase, GO_component: nucleus; GO_component: cytoplasm; GO_function: AAW41442.1 0.00E+00 -0.565 0.535 2165268 2164705 putative [Cryptococcus neoformans uroporphyrinogen decarboxylase activity; GO_process: heme biosynthesis var. neoformans JEC21] 2172137 - 2172774 - Cytokine inducing-glycoprotein, AAW42661.1 0.00E+00 2186552 2173476 putative [Cryptococcus neoformans no GO terms -1.873 1.416 var. neoformans JEC21] Conserved hypothetical protein 2175261 - AAW42044.1 0.00E+00 2176229 [Cryptococcus neoformans var. no GO terms -1.873 1.416 neoformans JEC21] 2176760 - Hypothetical protein [Cryptococcus no GO terms -1.873 1.416 AAW45167.1 0.00E+00 2178061 neoformans var. neoformans JEC21] 5’: GO_component: nucleus; GO_function: DNA binding; GO_process: chromatin silencing at telomere; GO_process: short-chain fatty acid 5’: hst3 protein, putative metabolism 5’: AAW40943.
1; 2179351 - 3’: Transcriptional activator gcn5, 3’: GO_component: SAGA complex; GO_component: Ada2/Gcn5/Ada3 -1.873 1.416 0.00E+00 3’: AAW40830.1 2180179 putative [Cryptococcus neoformans var. neoformans transcription activator complex; GO_function: histone acetyltransferase activity; GO_process: chromatin modification; GO_process: histone acetylation 5’: Expressed protein [Cryptococcus neoformans var. neoformans JEC21]; 5’: AAW47184.1; 2181053 - 2.00E-153 3’: AAW47191.1 2181765 3’: Epoxide hydrolase 1, putative no GO terms -1.873 1.416 [Cryptococcus neoformans var. neoformans JEC21] 2181919 - Expressed protein [Cryptococcus AAW41410.1 0.00E+00 no GO terms -1.873 1.416 2183160 neoformans var. neoformans JEC21] 2184290 - Hypothetical protein [Cryptococcus GO_component: nucleus; GO_function: transcription factor activity; AAW42701.1 0.00E+00 2187330 neoformans var. neoformans JEC21] GO_process: carbohydrate metabolism; GO_process: regulation of -1.873 1.416 transcription, DNA-dependent 2177513 - 2176760 - Hypothetical protein [Cryptococcus AAW45167.1 0.00E+00 no GO terms -2.447 1.149 2186552 2178061 neoformans var. neoformans JEC21] 5’: GO_component: nucleus; GO_function: DNA binding; GO_process: chromatin silencing at telomere; GO_process: short-chain fatty acid 5’: hst3 protein, putative metabolism 5’: AAW40943.
1; 2179351 - 3’: Transcriptional activator gcn5, 3’: GO_component: SAGA complex; GO_component: Ada2/Gcn5/Ada3 -2.447 1.149 0.00E+00 3’: AAW40830.1 2180179 putative [Cryptococcus neoformans var. neoformans JEC21] transcription activator complex; GO_function: histone acetyltransferase activity; GO_process: chromatin modification; GO_process: histone acetylation 5’: Expressed protein [Cryptococcus neoformans var. neoformans JEC21]; 5’: AAW47184.1; 2181053 - 2.00E-153 3’: AAW47191.1 2181765 3’: Epoxide hydrolase 1, putative no GO terms -2.447 1.149 [Cryptococcus neoformans var. neoformans JEC21] 2181919 - Expressed protein [Cryptococcus AAW41410.1 0.00E+00 no GO terms -2.447 1.149 2183160 neoformans var. neoformans JEC21]
CHR; Coordinates describing the area; Protein ID; E-value; Coordinates of Gene; Predicted Function; Gene Ontology; Average LR (R794, KB3864, KB7892); Standard Deviation of LR (R794, KB3864, KB7892)
GO_component: nucleus; GO_function: transcription factor activity; 2184290 - Hypothetical protein [Cryptococcus GO_process: carbohydrate metabolism; GO_process: regulation of -2.447 1.149 B AAW42701.1 0.00E+00 2187330 neoformans var. neoformans JEC21] transcription, DNA-dependent Mitochondrial carrier protein, GO_component: mitochondrial inner membrane; GO_function: transporter 1.215 0.592 C 79215 - 80061 AAW42549.
1 0.00E+00 77156 - 79384 putative [Cryptococcus neoformans activity; GO_process: transport var. neoformans JEC21] Hypothetical protein [Cryptococcus no GO terms 1.215 0.592 AAW42521.1 0.00E+00 79972 - 82596 neoformans var. neoformans JEC21] 5’: Hypothetical protein [Cryptococcus neoformans var. 5’: no GO terms 619942 - 5’: AAW45165.1; 622428 - neoformans JEC21] 3’: GO_component: nucleus; GO_component: cytoplasm; GO_function: 0.00E+00 1.582 0.430 625162 3’: AAW45164.1 624977 3’: Branched-chain-amino-acid branched-chain-amino-acid transaminase activity; GO_process: amino acid transaminase, putative [Cryptococcus catabolism; GO_process: branched chain family amino acid biosynthesis neoformans var. neoformans JEC21] Conserved hypothetical protein 1111495 - 1110485 - AAW42273.1 0.00E+00 [Cryptococcus neoformans var. no GO terms -3.179 1.373 1115320 1111629 neoformans JEC21] 1112906 - Hypothetical protein [Cryptococcus no GO terms -3.179 1.373 AAW46799.1 1.00E-109 1114542 neoformans var. neoformans JEC21] Conserved hypothetical protein 1111495 - 1110485 - AAW42273.1 0.00E+00 [Cryptococcus neoformans var. no GO terms -2.308 1.772 1116838 1111629 neoformans JEC21] 1112906 - Hypothetical protein [Cryptococcus no GO terms -2.308 1.772 AAW46799.1 1.00E-109 1114542 neoformans var. neoformans JEC21] Transcription factor, putative 1115270 - AAW41856.1 0.00E+00 1119115 [Cryptococcus neoformans var. no GO terms -2.308 1.
772 neoformans JEC21] GO_component: nucleus; GO_function: transcription factor activity; 1186444 - 1187084 - Hypothetical protein [Cryptococcus GO_process: carbohydrate metabolism; GO_process: regulation of -0.676 0.531 AAW42701.1 0.00E+00 1190328 1190125 neoformans var. neoformans JEC21] transcription, DNA-dependent Conserved hypothetical protein 1421571 - 1420281 - AAW42711.1 0.00E+00 [Cryptococcus neoformans var. no GO terms -1.051 0.689 1422475 1421727 neoformans JEC21] 5’: Conserved hypothetical protein [Cryptococcus neoformans var. 5’: no GO terms 5’: AAW42711.1; 1422006 - neoformans JEC21] 3’: GO_component: mitochondrial inner membrane peptidase complex; 4.00E-148 -1.051 0.689 3’: AAW42714.1 1422587 3’: Signal peptidase 1, putative GO_function: peptidase activity; GO_process: mitochondrial protein [Cryptococcus neoformans var. processing neoformans JEC21] 1: Conserved hypothetical protein 1: GO_component: nucleus; GO_component: cytoplasm; GO_function: [Cryptococcus neoformans var. 101699 - 1: AAW46647.1; 104942 - neoformans JEC21] protein carrier activity; GO_process: protein-nucleus import 0.00E+00 2: GO_component: actin cap (sensu Saccharomyces); GO_function: SNARE -2.192 1.145 105171 2: AAW46648.1 114213 2: Hypothetical protein binding; GO_process: exocytosis; GO_process: vesicle docking during [Cryptococcus neoformans var. exocytosis; GO_process:
vesicle fusion neoformans JEC21] 1: Conserved hypothetical protein 1: GO_component: nucleus; GO_component: cytoplasm; GO_function: [Cryptococcus neoformans var. protein carrier activity; GO_process: protein-nucleus import 101699 - 1: AAW46647.1; 104942 - neoformans JEC21] 2: GO_component: actin cap (sensu Saccharomyces); GO_function: SNARE -2.029 1.224 0.00E+00 105418 2: AAW46648.1 114213 2: Hypothetical protein binding; GO_process: exocytosis; GO_process: vesicle docking during [Cryptococcus neoformans var. exocytosis; GO_process: vesicle fusion neoformans JEC21] 105089 Conserved hypothetical protein AAW46728.1 0.00E+00 - 106333 [Cryptococcus neoformans var. no GO terms -2.029 1.224 neoformans JEC21]
CHR; Coordinates describing the area; Protein ID; E-value; Coordinates of Gene; Predicted Function; Gene Ontology; Average LR (R794, KB3864, KB7892); Standard Deviation of LR (R794, KB3864, KB7892)
1: Conserved hypothetical protein [Cryptococcus neoformans 1: GO_component: nucleus; GO_component: cytoplasm; GO_function: 101699 - 1: AAW46647.1; 104942 - neoformans JEC21] protein carrier activity; GO_process: protein-nucleus import 0.00E+00 105219 2: AAW46648.1 114213 2: Hypothetical protein 2: GO_component: actin cap (sensu Saccharomyces); GO_function: SNARE -2.656 1.
282 [Cryptococcus neoformans binding; GO_process: exocytosis; GO_process: vesicle docking during exocytosis; GO_process: vesicle fusion neoformans JEC21] Conserved hypothetical protein AAW46728.1 0.00E+00 - 106333 [Cryptococcus neoformans var. no GO terms -2.656 1.282 neoformans JEC21] 5’: Conserved hypothetical protein [Cryptococcus neoformans var. 5’: no GO terms 2134065 - 5’: AAW45247.1; 2134450 - 0.00E+00 neoformans JEC21] 3’: GO_component: nuclear pore; GO_component: cytoplasm; GO_function: -2.356 -2.341 1.133 1.197 2137603 3’: AAW45026.1 2137208 3’: MMS2, putative [Cryptococcus structural constituent of nuclear pore; GO_process: protein-nucleus import neoformans var. neoformans JEC21] 2222105 - 2221211 - Expressed protein [Cryptococcus no GO terms 0.799 0.409 AAW45746.1 0.00E+00 2223869 2222459 neoformans var. neoformans JEC21] 2223799 - Hypothetical protein [Cryptococcus GO_component: integral to plasma membrane; GO_function: nicotinamide AAW44410.1 0.00E+00 mononucleotide permease activity; GO_process: nicotinamide mononucleotide 0.709 0.409 2225781 neoformans var. neoformans JEC21] transport 2222237 - 2221211 - Expressed protein [Cryptococcus no GO terms 0.717 0.372 AAW45746.1 0.00E+00 2223869 2222459 neoformans var. neoformans JEC21] 2223799 - Hypothetical protein [Cryptococcus GO_component: integral to plasma membrane; GO_function: nicotinamide AAW44410.1 0.
00E+00 mononucleotide permease activity; GO_process: nicotinamide mononucleotide 0.717 0.372 2225781 neoformans var. neoformans JEC21] transport Hypothetical protein [Cryptococcus no GO terms -3.619 0.839 E 89 - 9769 XP_389258.1 6.00E-13 509 - 2656 neoformans var. neoformans JEC21] Alpha-glucoside:hydrogen symporter, AAW47006.1 1.00E-15 3116 - 5524 putative [Cryptococcus neoformans GO_component: membrane fraction; GO_function: alpha-glucoside:hydrogen -3.619 0.839 symporter activity; GO_process: alpha-glucoside transport var. neoformans JEC21] Hypothetical protein [Paramecium no GO terms -3.619 0.839 XP_001444420.1 1.90E+00 6405 - 8715 tetraurelia strain d4-2] Hypothetical protein [Cryptococcus no GO terms -3.968 -3.465 1.188 1.136 89 - 10939 XP_389258.1 6.00E-13 509 - 2656 neoformans var. neoformans JEC21] Alpha-glucoside:hydrogen symporter, AAW47006.1 1.00E-15 3116 - 5524 putative [Cryptococcus neoformans GO_component: membrane fraction; GO_function: alpha-glucoside:hydrogen -3.968 -3.465 1.188 1.136 symporter activity; GO_process: alpha-glucoside transport var. neoformans JEC21] Hypothetical protein [Paramecium no GO terms -3.968 -3.465 1.188 1.136 XP_001444420.1 1.90E+00 6405 - 8715 tetraurelia strain d4-2] 5’: Hypothetical protein [Cryptococcus neoformans var. 5’: no GO terms 5’: AAW45165.
1; neoformans JEC21] 3’: GO_component: nucleus; GO_component: cytoplasm; GO_function: 11411 - 16648 0.00E+00 12232 - 16451 1.067 0.654 3’: AAW45164.1 3’: Branched-chain-amino-acid branched-chain-amino-acid transaminase activity; GO_process: amino acid transaminase, putative [Cryptococcus catabolism; GO_process: branched chain family amino acid biosynthesis neoformans var. neoformans JEC21] 5’: Hypothetical protein [Cryptococcus neoformans var. 5’: no GO terms 5’: AAW45165.1; neoformans JEC21] 3’: GO_component: nucleus; GO_component: cytoplasm; GO_function: 15485 - 22697 0.00E+00 12232 - 16451 -0.854 0.436 3’: AAW45164.1 3’: Branched-chain-amino-acid branched-chain-amino-acid transaminase activity; GO_process: amino acid transaminase, putative [Cryptococcus catabolism; GO_process: branched chain family amino acid biosynthesis neoformans var. neoformans JEC21] Aspartic-type endopeptidase, putative no GO terms -0.854 0.436 XP_001258763.1 1.60E-01 17522 - 18204 [Neosartorya fischeri NRRL 181] L-arabinitol 4-dehydrogenase no GO terms -0.854 0.436 XP_001546302.1 2.00E-04 17989 - 19485 [Botryotinia fuckeliana B05.10] Hypothetical protein [Cryptococcus no GO terms -0.854 0.436 XP_660562.1 3.00E-22 20069 - 21989 neoformans var.
neoformans JEC21]
CHR; Coordinates describing the area; Protein ID; E-value; Coordinates of Gene; Predicted Function; Gene Ontology; Average LR (R794, KB3864, KB7892); Standard Deviation of LR (R794, KB3864, KB7892)
E 43595 - 45130 AAW43410.1 3.00E-47 43219 - 43682 Hypothetical protein [Cryptococcus -0.892 0.795 neoformans var. neoformans JEC21] Conserved hypothetical protein AAW43778.1 0.00E+00 43838 - 45346 [Cryptococcus neoformans var. no GO terms -0.892 0.795 neoformans JEC21] 88716 - 90611 AAW43789.1 0.00E+00 88149 - 89048 Expressed protein [Cryptococcus no GO terms -0.642 0.781 neoformans var. neoformans JEC21] Conserved hypothetical protein AAW43819.1 0.00E+00 90521 - 93790 [Cryptococcus neoformans var. no GO terms -0.642 0.781 neoformans JEC21] 354195 - 352140 - Asparagine-tRNA ligase, putative GO_component: cytoplasm; GO_function: asparagine-tRNA ligase activity; AAW43427.1 0.00E+00 -2.065 -2.075 1.044 1.097 358121 354250 [Cryptococcus neoformans var. GO_function: ATP binding; GO_process: asparaginyl-tRNA aminoacylation neoformans JEC21] 5’: Conserved hypothetical protein 5’: AAW45247.1; 354970 - [Cryptococcus neoformans var. 5’: no GO terms 0.00E+00 neoformans JEC21] 3’: GO_component: nuclear pore; GO_component: cytoplasm; GO_function: -2.065 -2.075 1.044 1.097 3’: AAW45026.
1 357734 3’: MMS2, putative [Cryptococcus structural constituent of nuclear pore; GO_process: protein-nucleus import neoformans var. neoformans JEC21] Conserved hypothetical protein GO_component: intracellular; GO_component: cytoplasm; GO_function: 1086387 - 1086301 - AAW43774.1 0.00E+00 1087703 1089504 [Cryptococcus neoformans var. peptide alpha-N-acetyltransferase activity; GO_process: protein amino acid 1.290 1.239 0.967 0.959 neoformans JEC21] acetylation Conserved hypothetical protein F 10686 - 14356 AAW44147.1 4.00E-175 10965 - 14121 [Cryptococcus neoformans var. no GO terms -1.213 -1.220 0.412 0.439 neoformans JEC21] ATP-dependent permease, putative GO_component: cytoplasm; endoplasmic reticulum; integral to membrane; XP_571629.1 0.00E+00 14289 - 18373 [Cryptococcus neoformans var. GO_function: ATP-binding cassette (ABC) transporter activity; GO_process: -1.213 -1.220 0.412 0.439 neoformans JEC21] transport Vesicle-mediated transport-related GO_component: integral to membrane; GO_process: vesicle-mediated 259277 - 256640 - AAW44051.1 0.00E+00 -1.739 1.386 261037 259339 protein, putative [Cryptococcus transport neoformans var. neoformans JEC21] 5’: Biotin synthase, putative [Cryptococcus neoformans var. 5’: GO_component: mitochondrion; GO_function: biotin synthase activity; 5’: AAW44375.1; 259572 - neoformans JEC21] GO_process: biotin biosynthesis -1.739 1.386 0.00E+00 3’: AAW44051.
1 261441 3’: Vesicle-mediated transport-related 3’: GO_component: integral to membrane; GO_process: vesicle-mediated protein, putative [Cryptococcus transport neoformans var. neoformans JEC21] 5’: Biotin synthase, putative [Cryptococcus neoformans var. 5’: GO_component: mitochondrion; GO_function: biotin synthase activity; 259428 - 5’: AAW44375.1; 259572 - neoformans JEC21] GO_process: biotin biosynthesis -1.824 1.415 0.00E+00 261037 3’: AAW44051.1 261441 3’: Vesicle-mediated transport-related 3’: GO_component: integral to membrane; GO_process: vesicle-mediated protein, putative [Cryptococcus transport neoformans var. neoformans JEC21] 5’: Expressed protein [Cryptococcus neoformans var. neoformans JEC21] 481302 - 5’: AAW44068.1; 482098 - 0.00E+00 481603 3’: AAW44069.1 486220 3’: Hypothetical protein no GO terms -1.579 0.930 [Cryptococcus neoformans var. neoformans JEC21] 5’: Expressed protein [Cryptococcus 482098 neoformans var. neoformans JEC21] 481302 - 5’: AAW44068.1; - 0.00E+00 502779 3’: AAW44069.1 486220 3’: Hypothetical protein no GO terms -1.417 0.984 [Cryptococcus neoformans var. neoformans JEC21] Telomere length control protein, GO_component: nucleus; GO_function: inositol or phosphatidylinositol kinase 757895 - 749027 - AAW44228.1 0.00E+00 759412 758643 putative [Cryptococcus neoformans activity; GO_process: response to DNA damage stimulus; GO_process: -1.219 -1.215 1.054 1.057 var.
neoformans JEC21] telomerase-dependent telomere maintenance
CHR; Coordinates describing the area; Protein ID; E-value; Coordinates of Gene; Predicted Function; Gene Ontology; Average LR (R794, KB3864, KB7892); Standard Deviation of LR (R794, KB3864, KB7892)
5’: GO_component: membrane fraction; GO_function: alpha- 5’: Alpha-glucoside:hydrogen glucoside:hydrogen symporter activity; GO_process: alpha-glucoside transport symporter, putative [Cryptococcus 3’: GO_component: plasma membrane; GO_function: two-component sensor 5’: AAW47006.1 neoformans var. neoformans JEC21] molecule activity; GO_function: protein-histidine kinase activity; 1.346 0.671 7.00E-112 319 - 2006 (385 - 1791 3’: AAW47007.1 3’: Protein-histidine kinase, putative GO_function: osmosensor activity; GO_process: protein amino acid [Cryptococcus neoformans var. phosphorylation; GO_process: osmosensory signaling pathway via two- neoformans JEC21] component system; GO_process: response to hydrogen peroxide 25919 - 27980 AAW44836.1 0.00E+00 25945 - 27573 Expressed protein [Cryptococcus no GO terms 1.070 0.408 neoformans var. neoformans JEC21] 477453 - 477638 - Expressed protein [Cryptococcus no GO terms -2.332 1.632 AAW44884.1 0.00E+00 479017 479593 neoformans var. neoformans JEC21] 5’: Inositol oxygenase, putative [Cryptococcus neoformans var.
516520 neoformans JEC21] 517740 - 5’: AAW44687.1; - 0.00E+00 3’: Glutaryl-CoA dehydrogenase, no GO terms -3.473 0.962 532795 3’: AAW44689.1 518010 mitochondrial precursor, putative [Cryptococcus neoformans var. neoformans JEC21] 518493 Inositol oxygenase, putative AAW44687.1 0.00E+00 - 519670 [Cryptococcus neoformans var. no GO terms -3.473 0.962 neoformans JEC21] 521220 Conserved hypothetical protein AAW44683.1 0.00E+00 - 522363 [Cryptococcus neoformans var. no GO terms -3.473 0.962 neoformans JEC21] 5’: Myo-inositol transporter 2, putative [Cryptococcus neoformans 5’: AAW44680.1; 522995 - var. neoformans JEC21] 5’: GO_component: membrane; GO_function: myo-inositol transporter 0.00E+00 3’: AAW44682.1 524382 3’: Conserved hypothetical protein activity; GO_process: myo-inositol transport -3.473 0.962 3’: no GO terms [Cryptococcus neoformans var. neoformans JEC21] 525278 - Myo-inositol transporter 2, putative GO_component: membrane; GO_function: myo-inositol transporter activity; -3.473 0.962 AAW44680.1 0.00E+00 527626 [Cryptococcus neoformans var. GO_process: myo-inositol transport neoformans JEC21] 529413 - Expressed protein [Cryptococcus no GO terms -3.473 0.962 AAW44883.1 0.00E+00 532175 neoformans var. neoformans JEC21] 532596 Conserved hypothetical protein AAW44677.1 0.00E+00 - [Cryptococcus neoformans var. no GO terms -3.473 0.
96 2 53 37 66 n eo fo rm an sJ EC 2l ] 5’: In os ito lo x yg en as e, p u ta ti v e [C ry pto co cc us n eo fo rm an s va r. n eo fo rm an s JE C2 I] 51 77 40 - 5’ .A A W 44 68 7. l, 51 65 20 - 0 .O O E + 0 0 3’ : G lu ta ry l. C 0 A d eh y d ro g en as e, n o GO te rm s - 3. 32 7 1. 02 2 53 31 76 3’ :A A W 44 68 9. l 51 80 10 m it o c h o n d ri a l p re c u rs o r, p u ta ti v e [C ry pt oc oc cu s n e o fo rm a n s va r. n e o fo rm a n s JE C2 IJ 51 84 93 In o si to l o x y g e n a se , p u ta ti v e A A W 44 68 7. 1 0.O OE +0 0 - 51 96 70 IC ry p to co cc u s n e o fo rm a n s v a r . n o GO te rn ss - 3. 32 7 1. 02 2 n eo fo rm an sJ EC 2I ] 52 12 20 C o n se rv e d h y p o th e ti c a l p ro te in A A W 44 68 3. 1 0.O OE +0 0 - [C cy pt oc oc cu s a e o fo rm a m v a r . n o GO te rm s - 3. 32 7 1. 02 2 52 23 63 n eo fo rm an tJ EC 2I ] 5’: M y o -i n n n it o l tr an sp or te r2 , p u ta ti v e [C sy p to c o c c u s n e o fo rm a n s 5’ A A W 44 68 0 I; 52 29 95 - v a r n e o fo rn m n s JE C2 IJ 5’ :G O _ c o m p o n e n t: m e m b ra n e ; G O _ fu n c ti o n : m y o .i n o si to l tr a n s p o n e r 0.O OE +0 0 3’ :A A W 44 68 2. 1 52 43 82 3’ :C o n se rv e d hy po th et ic al p ro te in a c o ’ G O _p ro ce ss .m yo -in os ito ltr an sp or t - 3. 32 7 1. 02 2 3’ :n o G O te rm s [C ryp toc oc cu sn eo fo rm an s va r. n eo fo rm sn sJ EC 21 ] t’ J _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ A v e ra g e L R S ta n d a rd D e v ia ti o n o f E R C o o rd in a te s C o o rd in a te s P re d ic te d F u n c ti o n G e n e O n to lo g y R 7 9 4 K B 3 8 6 4 1 C B 7 8 9 2 R 7 9 4 K B 3 8 6 4 K B 7 8 9 2 C H R d e sc ri b in g P ro te in 1 D b E - v a h ie o f G e n e ’ th e a re a ’ M yo -in os ito l t ra ns po rte r2 , p u ta ti v e 52 52 78 - 52 76 26 (C ryp toc oc cu sn eo fo rm an s G O _c om po ne nt m em br an e; G O _f un ct io n: m yo -in os ito lt ra ns po rte ra ct iv ity ; - 3. 32 7 1. 
02 2 0 A A W 4 4 6 8 0 .1 0.O OE +0 O G O _p ro ce ss : m yo -in os ito lt ra ns po rt n e o fo rm a n .s JE C2 1] 52 94 t3 - Ex pr es se d pr ot ei n [C ry pto co cc us no GO te r m s - 3. 32 7 1. 02 2 A A W 4 4 8 8 3 . I O. OO E+ 00 53 21 75 n eo fo rm an s va r. n eo fo rm an sJ EC 2I ] Co ns er ve d hy po th et ic al p ro te in 5 3 2 5 9 6 - A A W 44 67 7. 1 0.O OE 40 0 [C iyp toc oc cu s n eo fo rm an s v a r . no G O te r m s - 3. 32 7 1. 02 2 53 37 66 n eo fo rn ia ns JE C2 IJ 5’: Ex pr es se d pr ot ei n [C ryp toc oc cu s n eo fo rm an sv ar . n eo fo rm an sJ EC 2I J 12 43 24 8- 5’: A A W 44 06 8. I, 12 43 67 7- 0. 00 5+ 00 12 44 77 5 3’ :A A W 44 06 9. l 12 44 15 5 3’ :H y p o th e ti c a l pr ot ei n no GO te r m s - 0. 52 5 0. 55 3 [C ry pto co cc us n eo fo rm an s v ar n eo fo m ia ns JE C2 I] 5’: Ex pr es se d p ro te in IC iy pt oc oc cu s n eo fo rm an s va r. n eo fo rm an sJ EC 2I ] 12 43 48 0- 5’: A A W 44 06 8. t; 12 43 67 7- 0.O OE +0 0 12 44 77 5 3’ :A A W 44 06 9. 1 12 44 15 5 3’ :H y p o th e ti c a l p ro te m n o GO te rm s - t.0 12 0. 63 4 [C ry pto co cc us n e o fo rm a m v at n eo fo rm an s JE C2 IJ — — Co ns er ve d hy po th et ic al p ro te in H 13 - 13 84 7 A A W 44 14 7. t 2.O OE -1 06 14 3- 17 58 [C ry pto co cc us n eo fo rm an s v ar no GO te r m s - 2. 78 1 - 2. 79 5 0 99 4 1 .0 0 7 n eo fo rm an s JE C 2I j H ex os e tr an sp or t-r el at ed pr ot ei n, 6. 00 5- I 6 31 34 - 54 04 pu ta tiv e [C ry plo co cc us n eo fo rm an s n o ta pp lic ab le - 2. 78 1 - 2. 79 5 0. 99 4 1. 00 7 va r. _n eo fo rm an s JE C2 II H yp ot he tic al pr ot ei n [C ry pto co cc us X P_ 77 l 7 94 .1 7.O OE -0 3 59 31 - it It n e o fo rm a n a v a r . n e o fo rn ia n s B- no GO te r m s - 2. 78 1 - 2. 79 5 0. 99 4 1. 00 7 35 01 A] H yp ot he tic al pr ot ei n [C ry pto co cc us X P_ 77 63 33 .t 1. 80 5- 01 92 03 -9 96 2 n e o fo rm a n s v a r . n e o fo rm a n s B- n o GO te rm s - 2. 78 1 - 2. 79 5 0. 99 4 1. 
00 7 35 01 A] Pu ta tiv e m e m b ra n e p ro te m n o GO te rm s - 2. 78 1 - 2 79 5 0. 99 4 1. 00 7 B A C7 42 81 .l 5. 90 E- 02 10 43 0- 11 30 8 [S tre pto my ce sa v tr m iti lis M A -4 68 0] Co ns er ve d hy po th et ic al p ro te in n o GO te rm s - 2. 78 1 - 2. 79 5 0. 99 4 1. 00 7 X P_ 74 90 01 .1 t.2 0E +0 0 11 99 3- 13 37 3 [A sp erg illu s_ fum iga tus A f2 93 ] Co ns er ve d hy po th et ic al p ro te in 13 - 18 85 9 A A W 44 14 7. 1 2 OO E- 10 6 14 3- 17 58 [C ry pto co cc us n eo fo rm an s v at no GO te n n s - 2. 76 8 13 84 n e o fo im an sJ EC 2l ] H ex os e tr an sp or t-r el at ed p ro te in , 6. 00 E- l6 31 34 -5 40 4 p u ta ti v e (C ry p to co cc u s n e o fo n n a n s n o t a p p li c a b le - 2. 76 8 1. 38 4 va r. ne of or m an sJ EC 2I ] H yp ot he tic al pr ot ei n (C ryp toc oc cu s X P_ 77 1 79 4.1 7.O OE -0 3 59 31 - 71 1 t n eo fo rm an s va r. n eo fo rm an s B- no GO te rm s - 2. 76 8 t.3 84 35 01 A] H yp ot he tic al pr ot ei n [C ry p to c o c c u s X P7 76 33 3. I I 8 0 E - 92 03 - 99 62 n e o fo n n a n s v a r n e o fo rm a n s B- n o GO te rm s - 2. 76 8 1. 38 4 35 01 A1 Pu ta tiv e m em br an ep ro te in BA C7 42 81 .1 5. 90 E- 02 t0 43 0- 11 30 8 n o G O te rm s - 27 68 1. 38 4 [S tr e p to m y c e s av er m iti lis M A -4 68 0] Co ns er ve d hy po th et ic al p ro te m , GO m s - 2. 76 8 1. 38 4 X P_ 74 90 01 .1 l.2 0E +0 0 11 99 3- 13 37 3 [A sp ex gil lus fu m ig at us A f2 93 ] 5’ Co ns er ve d hy po th et ic al p ro te in [C typ toc oc cu sn eo fo rm an a v ar 5’: G O _ c o m p o n e n t: m e m b ra n e ; G O _f un ct io n tr an sp or te r a ct iv ity , 5’: A A W 41 25 3. 1; n e o fo rm a n s JE C2 I] G O _ p ro c e ss : tr a n s p o rt 0.O OE +0 0 17 47 0- 19 22 0 3’ A A W 41 25 4 1 3’ H ex os e tr an sp or t-r el at ed pr ot ei n, 3’ : G O _ c o m p o n e n t: pl as m a m em br an e; G O _f un ct io n: fr u c to s e tr a n s p o rt e r - 2. 
76 8 t.3 84 p u ta ti v e (C ry p to co cc u s n e o fo rm a n s a c ti v it y ; G O _f un ct io n: gl uc os e tr an sp or te ra c ti v it y ; G O _f un ct io n: m an n o se v ar n eo fo rm an s JE C2 I] tr an sp or te r a c ti v it y ; G O _ p ro c e ss : h e x o se tr an sp or t [‘3 A ve ra ge L P S tr ”a rd D ev ia t. ” o fL R Co or di na te s C H It de sc rib in g P ro te in ID ” £- va lu e Co or di na te s P re di ct ed F un ct io n” G en e O nt ol og y R 79 4 K E 38 64 K B 78 92 R 79 4 K B3 86 4 K B 78 92 o fG en e’ th e ar ea ’ 5’ : Co ns er ve d hy po th et ic al pr ot ei n , GO Co IS IP On en t m em br an e; GO _f un cti on :t ra ns po rte ra ct iv ity ; [C ryp toc oc cu nn eo fo rm an s va r. 5’ : A A W 41 25 3. 1; n eo fo nn an tJ EC 2I J GO _p ro ce ss .t ra ns po rt H 15 37 5- 21 76 7 0.O OE +0 0 17 47 0- 19 22 0 3’ : G O _c om po ne nt : pl as m a m e m br an e; GO _f un cti on :f ru ct os e tr an sp or te r - 1. 51 9 - 1. 50 4 1. 01 4 1. 03 6 3’ :A A W 41 25 4. l 3’ :H ex ot e tr an sp or t-r el at ed pr ot ein , pu ta tiv e[ Cr yp toc oe cu sn eo fo rin an s ac tiv ity ;G O _f un ct io n: gl uc os e tr an sp or te r ac tiv ity ;G O_ fu nc tio n: m an n o se JE CZ I] tr an sp or te ra ct iv ity ; G O pr oc cs s: he xo se tr an sp or t 5’: Co ns er ve d hy po th et ic al pr ot ei n 5’ G O _c om po ne nt :m em br an e; G O _f un ct io n: tr an sp or te r a ct iv ity ; [C ry pto co cc ss n eo fo nn an s‘ a G O _p ro ce ss : tr an sp or t 5’: A A W 41 25 3. 1; n eo lb rm an sl EC 2l ] 6. O O E- 45 19 47 2- 19 68 7 3’ : G O _c on sp on en t pl as m a m em br an e; GO _f un cti on :f ru ct os et ra sn po rtu - 15 19 - 1. 50 4 1. 01 4 1. 03 6 3’ :A A W 41 25 4.1 3’ :h ex os et ra ns po rt- re la te dp co te in , [C ry pt oc oc cu s n eo fo rm am ac tiv ity ;G O_ fu nc tio n: gh ic os et ra ns po rte ra ct iv ity ; G O fu nc tio n: m an n o se va r. 
n eo lit rm an s tr an sp or te ra ct iv ity ; G O_ pr oc es s: he xo se tr an sp or t 5’ .C on se rv ed hy po th et ic al pr ot ei n 5’: G O _c om po ne nt : m em br an e; G O _f un ct io n: tr an sp or te r a c tiv ity ; [C njp toc oc cu sn eo fo rn ia ns GO _p ro ce ss :t ra ns po rt 5’ .A A W 41 25 3. 1, n eo fu rm an ai EC 2l l 3’ :G O _c om po ne nt :p la am am em br an e; G O _f im nc tio n: fru ct os et ra ns po rte t - 1. 51 9 - 1. 50 4 1. 01 4 1. 03 6 3. O O E- 30 20 57 5 - 20 84 7 3’ :A A W 41 25 4. 1 3’ :H er co ne tr an sp or t-r el at ed pr ot ein , pu ta tiv e [C ryp toc oc cu sn eo fo rm an s ac tiv ity ;G O_ fu nc tio n: gl uc os e tr an sp or te ra c tiv ity ;G O _f un ct io n: m a n n o se v ar n eo fo nn an s J E C 2I I tr an sp or te ra c tiv ity ;G O _p ro ce ss : he xo se tr an sp or t 50 08 39 RN A he lic as e, pu ta tiv e 50 66 15 - - A A W 45 48 7. 1 0. O O E+ 00 IC ry pt oc oc cu s n ro fo nn an sv a r. G O _f un ct io n: RN A he li ca sr a c tiv ity ; GO _p ro ce ss : re gu la tio n o ft ra ns la tio n - 1. 27 3 0. 98 0 51 40 12 50 71 92 n e o fo ri na ns JE C 2I J 50 76 20 - St er ol -b in di ng pr ot ei n IC ry pt oc oc cu s GO fu o rt in st er ol c a rr ie r ac tiv ity - 1 27 3 0 98 0 A A W 45 50 7. I 0. O O E+ 00 50 84 02 n eo fo rm an s v a r. n e o fo r, na ns JE C 2I I 50 95 55 - Ex pr es se d pr ot ei n [C ryp toc oc cu s no GO te rm s - t 27 3 0 98 0 A A W 45 66 2. 1 0.O OE +0 0 51 14 31 n ro fo rn sa ns v a r. n ro fo rm an s JE C2 I] 5’ : Co ns er ve d hy po th et ic al pr ot ei n [C ry pt oc oc cu s n eo fo rn ia ns va r. 5’ : n o G O te rm s 5’ : A A W 45 24 7. 1; 51 36 48 - 0.O OE -f0 0 n eo fo rm am JE C2 I] 3’ : GO _c om po ne nt :n u cl ea rp or t; GO _c om po ne nt .c o pl as m GO _f un cti on : - 1. 27 3 0. 98 0 3’ : A A W 45 02 6. 1 51 69 07 3’ :M M S2 , p ut at iv e [C ry pto co cc us st ru ct ur al co n st itu en to fn u cl ea rp or e; G O _p ro ce ss : pr ot ei n- nu cl eu s im po rt n eo fo rs na ns va r. 
n eo fo rn ia ns JE C2 I) 55 89 42 - 55 91 63 - Ex pr es se d pr ot ei n [C ry pto co cc us no G O te rm s - 1 42 9 1 24 6 A A W 45 65 9. I 0.O OE +0 0 56 28 67 . 56 10 23 n eo fo rm an sv ar . ise of or m an sJ EC 2I ] 56 20 93 - Ex pr es se d pr ot ei n [C ryp toc oc cu a (1 0t er m s - 1 42 9 1 24 6 A A W 45 66 2. 1 0.O OE +0 0 56 39 51 n eo fo rm an sv ar . n ro fo rm an s JE C2 I] 10 39 61 1- N O G EN E - 1. 11 7 0. 82 9 10 42 67 7 10 39 80 1- N O G EN E - 1. 37 5 0. 58 2 10 40 50 1 5’ :H yp ot he tic al pr ot ei n 11 20 68 9- 5’: A A W 47 16 2. t; tt 2l 25 2. fC ry pt oc oe cu sn ro fo rm an sv ar . 5’ : n o G O trm m s - 2. 91 2 1. 22 2 0.O OE +0 0 n eo fo rm an s JE C2 IJ 3 . G O fu nc tio n: am id as e ac tiv ity 11 26 25 5 3’ : A. AW 47 16 4.1 11 22 90 2 3’ : A m id as e, pu ta tiv e [C ry pto co cc us n eo fo rm am va r. n eo fo rm an s J E C 21 I 5’: H yp ot he tic al pr ot ei n 5’ : A A W 47 16 2. 1; 11 23 04 5 - [C ry pto co cc us n eo fo nn ai vs V & 5’ n o G O te rm s - 2. 91 2 1. 22 2 0.O OE +0 0 n eo fo rm ai n JE C2 IJ 3 G O fu nc tio n: a m id as e ac tiv ity 3’ : A A W 47 16 4. 1 11 25 05 3 3’ :A m id as e, pu ta tiv e [C ry pto co cc us n eo fo rm as s_ va r._ ne oi bn na ns JE C2 I} 11 26 04 9 - H yp ot he tic al pr ot ei n [C ry pto co cc ua no GO te rm s - 2 91 2 I2 22 A A W 44 06 9. 1 2.O OE -1 73 11 27 16 1 n eo fo rin an s va r. n eo fo rm an sJ EC 2I [ 5’: H yp ot he tic al pr ot ei n 11 20 68 9- 5’: A A W 47 16 2. 1; 11 21 25 2 - [C ry pto co cc us n eo fo rm an sv ar . 5’: no GO te rm s - 2. 73 1 1. 29 0 0.O OE +0 0 n eo fo rn sa ns JE C2 I] 3’ :G O _f un ct io n: am id as r a ct iv ity 11 26 93 9 3’ :A A W 47 16 4. 
1 11 22 90 2 3’ :A m id as e, pu ta tiv e [C ry pto co cc us ne of om m an sv ar ._ n eo fo rm as sJ EC 21 J _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ A v e r a g e L R S t a n d a r d D e v ia ti o n o f L R C o o r d in a te s Co or di na te s Pr ed ic te d F u n c t i o n G e n e O n to lo g y R 7 9 4 K B3 86 4 K B 7 8 9 2 R 7 9 4 K B3 86 4 K B 7 8 9 2 CH R d e s c r ib in g P r o t e i n W a E- va hi e o f G en e’ t h e a re a 5 ’: H y p o th e ti c a l pr ot ei n [C ry p to c o c c u s n e o f o r n ia n s v a r . 5 ’ n o G O te rm s 5 ’: A A W 4 7 1 6 2 .l ; 1 1 2 3 0 4 5 - - 2. 73 1 1. 29 0 H 3’ : A A W 47 16 4. 1 11 25 05 3 0.O OE +0 0 n eo fo rm an sJ E C 2I ] 3’ : G O fu nc tio n: am id as e ac tiv ity 3’ : A m id as e, pu ta tiv e IC ry pt oc oc cu s — n eo fo rm an sv ar . n eo fo rm an s JE C2 I] 11 26 04 9 - H yp ot he ti ca lp ro te in IC ry pt oc oc cu s n o G O te rm s - 2. 73 1 1. 29 0 A A W 44 06 9. I 2. O O E -l 73 11 27 16 1 n eo fo rin an sv ar . n eo fo rm an s JE C 2I ] 5’ : H yp ot he tic al pr ot ei n 11 20 73 2- 5’ : A A W 47 16 2. l; 11 21 25 2- (C ryp toc oc cu sn eo fo rm an sv ar . 5’ : n o GO te rm s 1. 49 2 0. 39 2 0. O O E+ 00 n e o fo rm an s JE C 21 ] 3’ : G O fu nc tio n: am id as e a c ti vi ty 11 28 02 7 3’ : A A W 47 16 4. I 11 22 90 2 3’: A m id as e, pu ta ti ve [C ryp toc oc cu s — n eo fo rin an sv ar . n eo fo rm an s JE C2 I] 5’ : H yp ot he ti ca lp ro te in [C ry pt oc oc cu s n eo fo rm an sv ar . 5’ n o G O te rm s 5’ : A A W 47 16 2. 1; 11 23 04 5- 1. 49 2 0. 39 2 0. O O E+ 00 n e o fo ri na ns JE C 2I ] 3’ : G O fu nc tio n: am id as e a c ti vi ty 3’ : A A W 47 16 4. 1 11 25 05 3 3’: A m id as e, pu ta tiv e [C ryp toc oc cu s n eo fo rm an s va r. n eo fo rm an sJ EC 2I I 11 26 04 9- H yp ot he ti ca l pr ot ei n [C ry pt oc oc cu s G O 1. 49 2 0. 39 2 A A W 44 06 9. 1 2. O O E- 17 3 11 27 16 1 n e o fo rn sa ns va r. 
n e o fo rm an s JE C 2I ] 12 51 69 3 - 12 51 84 2- K in as e, pu ta ti ve (C pt oc oc cu s G O _c om po ne nt : m it oc ho nd ri on ; G O _f un ct io n: ki na se a c ti vi ty 0. 85 5 0. 34 6 A A W 45 59 8. l 0. O O E+ 00 12 54 17 2 12 53 91 7 n eo fo rin an s va r. n e o fo rm an s JE C 2I ] H yp ot he ti ca l pr ot ei n (C N BF I8 3O ) 12 57 64 1 - 12 58 49 5 - X I’ 77 50 20 .1 0. O O E+ 00 12 63 54 7 — 12 60 10 9 ii A lC ry p to co cc u sn eo fo rm an s n o G O te rm s - 1. 89 1. 06 9 v a r. _ n e o fo rm an s B -3 SO lA l 12 60 30 3 - Co ns er ve d hy po th et ic al pr ot ei n A A W 42 04 4. l 0. O O E- I0 0 12 61 72 3 [C ryp toc oc cu sn eo fo rm an sv ar . n o G O te rn is - 1. 89 1. 06 9 n e o fo rm an s JE C 2I I — — U nn am ed pr ot ei n pr od uc t, pr ed ic te d n o G O te rm s - 3. 33 8 - 3. 39 3 0. 80 1 0. 81 6 8 5 -2 3 6 9 B A E 55 59 8. l 4. O O E- 40 68 1 - 2 58 2 pr ot ei n [A sp er gi lln s o ry za e] U nn am ed pr ot ei n pr od uc t; pr ed ic te d n o G O te rm s - 3. 35 8 1. 19 0 8 5- 40 66 B A E 55 59 8. 1 4. O O E- 40 68 1 - 25 82 pr ot ei n_ [A sp er gi llu s_ or yn se ] 5’ : M yo -i no si to l tr an sp or te r, pu ta ti ve [C ryp toc oc cu sn e o fo rt na ns va r. 5’ : G O _c om po ne nt : m e m br an e; G O _f un ct io n: m yo -i no si to lt ra n sp or te r 5’ : A A W 43 04 0. 1; n e o fo rm an s JE C 21 ] a c tiv ity ; G O _p ro ce ss : m yo -i no si to l tr an sp or t 0. 92 3 0. 36 8 4 82 4- 57 32 7. O O E- 34 53 04 -5 66 4 3’ : A A W 43 04 8. 1 3’ : M al to se 0- ac et yl tr an sf er as e. 3’ : G O _f un ct io n: m a lt os e O -a ce ty lt ra ns fe ra se a c ti vi ty ; G O _f un ct io n: pu ta ti ve [C iy pt oc oc cn s n e o fo rm an s a c e ty lt ra ns fe ra se a c ti vi ty va r. n eo fo rm an s JE C 21 ] U nn am ed pr ot ei n pr od uc t; pr ed ic te d n o G O te rm s 1 12 4 0 44 6 24 40 - 53 31 B A E 55 59 8. 1 4. 
O O E- 40 68 1 - 25 82 pr ot ei n [A sp er gi llu s_ or yz ae ] 5’: M yo -i no si to lt ra n sp or te r, pu ta ti ve [C ry pt oc oc cu s n e o fo rm sn s va r. 5’ :G O _c om po ne nt : m e m br an e; G O _f un ct io n: m yo -i no si to lt ra n sp or te r 5’ :A A W 43 04 0. 1; n eo fo rm an sJ E C 2I ] a c tiv ity ; G O _p ro ce ss : m yo -i no si to l tr an sp or t 7. O O E- 34 53 04 -5 66 4 - 0. 82 6 1. 06 2 53 49 - 11 01 8 3’ : A A W 43 04 8. 1 3’ : M al to se 0- ac et yl tr an sf er as e, 3’ :G O _f un ct io n: m a lt os e 0- ac et yl tr an sf er as e a c ti vi ty ; G O _f un ct io n: pu ta ti ve IC ry pt oc oc cu s n e o fo rm an s a c e ty lt ra ns fe ra se a c ti vi ty v a r. _ n e o fo rm an s JE C 21 I H yp ot he ti ca l pr ot ei n (C lia et om iu m n o G O te rm s - 0. 82 6 1 06 2 X P _0 01 22 09 54 .1 l.5 0E + 00 89 06 - 94 56 gl ob os um CB S_ 14 8. 51 ] 5’ : M yo -i no si to lt ra n sp or te r, pu ta ti ve IC ry pt oc oc cu s n e o fo rm an s v a r. 5’ :G O _c om po ne nt : m e m br an e; G O _f un ct io n: m yo -i no si to lt ra n sp or te r 5’: A A W 43 04 0. 1; n e o fo rm am JE C 2I ] a c ti vi ty ; G O _p ro ce ss : m yo -i no si to l tr an sp or t 0 85 5 0 78 4 7. O O E- 34 53 04 -5 66 4 57 79 - 11 96 2 3’ : A A W 43 04 8. 1 3’ : M al to se O- ac ety ltr an sfe rn se , 3’ :G O _f un ct io n: m a lt os e 0- ac et yl tr an sf er as e a c ti vi ty ; G O _f un ct io n: pu ta ti ve IC ry pt oc oc cu s n e o fo rm an s a c e ty lt ra ns fe ra se a c ti vi ty v a r. _ n e o fo rm an s JE C 21 ] H yp ot he ti ca l pr ot ei n [C ha et om iu m n o G O te rm s - 0 85 5 0 78 4 X P _0 01 22 09 54 .1 t. 
50 E + 00 89 06 - 94 56 gl ob os um CB S_ 14 8.5 11 () C _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ A ve ra ge L R St an da rd D ev ia ti on o fL R C oo rd in at es C oo rd in at es R 79 4 K B 38 64 K B 78 92 R 79 4 K B3 86 4 K B 78 92 P re di ct ed F un ct io n’ G en e O nt ol og y C B R de sc ri bi ng P ro te in 1D b E -v al ue o fG en e’ th e a re a H yp ot he tic al pr ot ei n [C ry pt oc oc cu s n o G O te rm s - 0 85 5 0 78 4 A A W 43 79 6. 1 0.O OE -I- 00 11 80 2- 13 27 6 n e o fo rm an s v ar n eo fo rm an s JE C 2I ] Ex pr es se d pr ot ei n (C ryp toc oc cu s n o G O te rm s - 0. 91 2 0. 53 7 37 07 1 - 39 22 2 A A W 43 24 6. l 0.O OE +0 0 36 82 4- 38 09 0 n e o fo rm an s va r n e o fo rm an s JE C2 I] 5’: Su lf ite tr an sp or te r, pu ta tiv e [C r3 ’P to co cc us n eo fu rm an sv a r. 5’ :G O c o m po ne nt : pl as m a m e m br an e; G O _f un ct io n: su lf ite tr an sp or te r 15 95 80 - 5’: A A W 40 99 O .1 ; 3 00 E 43 16 00 95 - n e o fo rm an s JE C 2I ] a c tiv ity ; G O _p ro ce ss su lf ite tr an sp or t - 2. 87 7 - 2. 50 7 1. 45 4 1. 44 8 16 67 39 3’ : A A W 40 99 2. 1 16 12 12 3’ : Co ns er ve d hy po th et ic al pr ot ei n 3’ n o G O te rn is [C ryp toc oc cu sn eo fo rm an s v a r. n e o fo rm an s JE C 2I ] 5’ . Sp ec ifi ct ra ns cr ip tio na lr e pr es so r, 5’ : G O c o m po ne nt n u c le us ; G O _f un ct io n: sp ec if ic tr an sc ri pt io na lr e pr es so r pu ta ti ve LC r3 ’l, toc oc cu sn eo fo rm an o ac tiv ity ;G O_ pr oc es s: n e ga tiv e re gu la tio n o ft ra n sc ri pt io n fr om Po lI I 5’ : A A W 45 26 1. 1; 16 25 78 - va r. n eo fo rm an sJ EC 2I ] pr om ot er ; G O _p ro ce ss D N A re pa ir - 2. 87 7 - 2. 50 7 1. 45 4 1. 44 8 4. O O E- 58 3’ : A A W 45 26 2. 1 16 40 69 3’ : D ih yd ro fo la te sy nt ha se ,p ut at iv e 3’ : G O c o m po ne nt : c yt op la sm ; G O _f un ct io n: di hy dr of ol at e sy nt ha se a c tiv ity ; [C ryp toc oc cu on eo fo rm an s v a r. 
GO _p ro ce ss :f ol ic a c id an d de ri va tiv e bi os yn th es is n eo fo rm an s JE C 21 I 5’ : Sp ec if ic tr an sc ri pt io na l re pr es so r. 5’: G O c o m po ne nt : n u c le us ; G O _f un ct io n: s pe ci fi c tr an sc ri pt io na lr ep re ss or pu ta ti ve [C ryp toc oc cu sn eo fo rm an s a c tiv ity ; G O _p ro ce ss : n e ga tiv e re gu la tio n o ft ra n sc ri pt io n fr om Po lI I 5’ : A A W 45 26 1. 1; 16 41 49 - v a r. n e o fo rm an s JE C 2I ] pr om ot er ; G O _p ro ce ss D N A re pa ir - 2. 87 7 - 2. 50 7 1. 45 4 1. 44 8 0. O O E+ 00 3’ . A A W 45 26 2. 1 16 57 62 3’ . D ih yd ro fo la te sy nt ha se ,p ut at iv e 3’ : G O c o m po ne nt : cy to pl as m ;G O _f un ct io n: di hy dr of ol st e sy nt ha se a c tiv ity ; [C ryp toc oc cu sn eo fo rn sa ns va r. GO _p ro ce ss :f ol ic a c id a n d de ri va tiv e bi os yn th es is n e o fo rm an s JE C 21 ] 5’ . Sp ec ifi c tr an sc rip tio na lr ep re ss or , 5’ : GO co m po ne nt :n u cle us ;G O_ fu nc tio n: sp ec ifi ct ra n sc ri pt io na lr e pr es so r pu ta tiv e [C ry ist oc oc cu sn e o fo rm an o a c tiv ity ; G O _p ro ce ss : n e ga tiv e re gu la tio n o ft ra n sc ri pt io n fr om Po lH 5’ : A A W 45 26 1. 1; 16 61 08 - va r. n e o fo rm an s JE C 21 J pr om ot er ;G O _p ro ce ss : D N A re pa ir - 2. 87 7 - 2. 50 7 1. 45 4 1. 44 8 6. O O E- 58 3’ . A A W 45 26 2. 1 16 63 56 3’ . D ih yd ro fo la te sy nt ha se , pu ta tiv e 3’ : G O c o m po ne nt : cy to pl as m ;G O _f un ct io n: di hy dr of ol ate sy nt ha se a c ti vi ty ; IC ry pt oc oc cu s n eo fo rm an s va r. GO _p ro ce ss :f ou ra c id a n d de ri va tiv e bi os yn th es is n e o fo rm an s JE C 2I ] 5’ : Su lfi te tr an sp or te r, pu ta tiv e [C ryp toc oc cu sn eo fo rin an s va r. 5’ : GO co m po ne nt :p la sm a m em br an e; GO _f un cti on :s u lfi te tr an sp or te r 15 97 65 - 5’ : A A W 40 99 0. 1; 3 00 E -1 3 16 00 95 - n eo fo rm am JE C2 I] a c ti vi ty , GO _p ro ce ss :s u lfi te tr an sp or t - 2. 
43 3 1. 42 1 16 68 70 3’ : A A W 40 99 2. 1 16 12 12 3’ : Co ns er ve d hy po th et ic al pr ot ei n 3’ : n o G O te rm s [C ry pt oc oc cu s n e o fo rm an s v a r. n eo fo rm an sJ EC 2I ] 5’ .S pe cif ic tr an sc rip tio na lr ep re ss or , 5’ : G O c o m po ne nt : n u c le us ; G O _f un ct io n: sp ec if ic tr an sc ri pt io na lr e pr es so r pu ta ti ve [ C r 3’p to co cc us n eo fo rm an s ac tiv ity ;G O_ pr oc es s: n eg at iv er eg ul at io n o f t ra ns cr ip tio n fro m Po t 1 1 5’ : A A W 45 26 1. 1; 16 25 78 - v a r. n e o fo rm an s JE C 2I ] pr om ot er ; G O _p ro ce ss D N A re pa ir - 2 43 3 14 21 4.O OE -5 8 3’ . A A W 45 26 2. 1 16 40 69 3’ : D ih yd ro fo la te sy nt hs se ,p ut at iv e 3 ’ G O c o m po ne nt : cy to pl as m ;G O _f un ct io n: di hy dr of ol at e sy nt ha se a c ti vi ty ; [C ryp toc oc cu sn eo fo rm an sv a r. GO _p ro ce ss :f ol ic a c id a n d de ri va tiv e bi os yn th es is n eo fo rm an sJ EC 21 J 5’ . Sp ec ifi ct ra n sc ri pt io na l re pr es so r, 5’ : GO co m po ne nt :n u cle us ;G O_ fu nc tio n: sp ec ifi c tr an sc rip tio na lr ep re ss or pu ta ti ve [C ryp toc oc cu sn eo fo rm an s a c tiv ity ; G O _p ro ce ss : n e ga ti ve re gu la ti on o ft ra n sc ri pt io n fr om P ol U 5’ : A A W 45 26 1. 1; 16 41 49 - v a r. n e o fo rm an s JE C 2I ] pr om ot er ; G O _p ro ce ss D N A re pa ir - 2 43 3 1 42 1 0. O O E+ 00 3’ A A W 45 26 2. 1 16 57 62 3’ : D ih yd ro fo la te sy nt ha se ,p ut at iv e 3’ : G O c o m po ne nt : cy to pl as m ;G O _f un ct io n: di hy dr of ol at e sy nt ha se a c tiv ity ; [C ryp toc oc cu sn e o fo rm an s v a r. 
G O _p ro ce ss : fo lic a c id a n d de ri va tiv e bi os yn th es is n e o fo rm an s JE C 2I J 5’: Sp ec ifi c tr an sc ri pt io na lr e pr es so r, 5’ : G O c o m po ne nt : n u c le us ;G O _f un ct io n: sp ec if ic tr an sc rip tio na lr e pr es so r pu ta tiv e [C ryp toc oc cu sn eo fo rm an s ac tiv ity ;G O _p ro ce ss : n e ga tiv e re gu la tio n o ft ra n sc ri pt io n fr om Po lI I 5’ : A A W 45 26 1. 1; 16 61 08 - v a r. n eo fo rm an sJ E C 2I J pr om ot er ; G O _p ro ce ss D N A re pa ir - 2 43 3 1 42 1 6. O O E- 58 3’ . A A W 45 26 2. 1 16 63 56 3’ : D ih yd ro fo la te sy nt ha se , pu ta tiv e 3’ : G O c o m po ne nt cy to pl as m ;G O _f un ct io n: di hy dr of ol at e sy nt ha se a c tiv ity ; [C ry pt oc oc cu s n e o fo rm an s v a r. G O _p ro ce ss : fo lic a c id an d de ri va tiv e bi os yn th es is n e o fo rn sa ns JE C 21 ] C on se rv ed hy po th et ic al pr ot ei n G O _c om po ne nt c yt os ol ic la rg e ri bo so m al su bu ni t (s em u E uk ar yo ta ); 16 68 52 - 1. 42 1 A A W 43 23 3. I O. OO E- 1- 00 16 74 53 [C ry pt oc oc cu s n e o fo rm an s v a r. G O _f un ct io n: st ru ct u ra lc o n st itu en t o fr ib os om e; G O _p ro ce ss : pr ot ei n - 2. 43 3 n e o fo rm an s JE C 21 I bi os yn th es is — Ex pr es se d pr ot ei n [C ry pt oc oc cu s n o GO te rm s 0. 75 9 0. 38 7 J 52 8- 21 82 A A W 47 21 8. I 0. O O E+ 00 11 29 -2 64 3 n eo fo rm an s v a r. n eo fo rm an sJ E C 2I ] A v e ra g e L R S ta n d a rd D ev ia t,o . o fL R C o o rd L a te s C U R d ts c n b m g Pr ot ei n ID h E -v a lu e C o o rd in a te s Pr ed ic te d Fu nc tio n’ G o n e O n to lo g y R 7 9 4 K B 3 8 6 4 K B 78 92 R 7 9 4 K B 3 8 6 4 K 87 89 2 o fG en e’ th e a r e a M u c in -a s s o c ia le d su rfu ce p ro te in 3 2 5 0 6 0 - 2 6 1 1 2 X P _ 8 2 0 7 4 2 . 
1 0 .6 3 2 5 4 2 1 - 25 66 6 (M A S P ) [T ry p a n o so m a c r u z i st ra in n o G O te r m s - 1 .0 0 2 0 .6 8 9 CL Br en er ] 1 3 8 6 1 2 - 14 01 40 - Co ns er ve dh yp ot he tic al pr ot ei n 0 .9 6 5 0 7 6 3 X P 3 8 0 5 6 3 .1 0 .7 3 1 4 2 5 0 1 — 1 4 1 0 3 0 [G ibh ere lh ze ae P H -I l 1 4 1 9 7 5 - C y to p la sm p ro te in , pu ta tiv e A A W 4 6 O S L I 0. OO Et 00 j4 4 3 [C ryi *o co cc us n eo fo ci na m va r. G O co m po ne u8 c y to p la sm 0. 96 5 0 .7 6 3 n eo fu rm ar uJ EC 2I I 3 9 7 9 3 1 - N O G EN E - 1 .3 4 5 - 1 .3 3 0 0 .6 9 4 0 .6 9 5 4 0 1 1 4 8 3 9 7 8 6 0 - N O G EN E - 1 .5 5 8 0 ,7 6 7 4 0 1 1 4 8 Co ns er ve d hy po th et ic ei pr ot ei n 50 00 74 - 49 93 84 - A A W 44 31 4. 1 4.O OE -1 80 5 0 8 0 9 7 5 0 0 1 2 7 (C typ toc oc cu sn rof orm ain va r. n o G O te rn o - 2 .1 2 5 - 2 .1 5 3 1 .2 7 8 1 .3 1 7 n eo fu rm az Lt JE CZ Il 50 07 48 - H yp ot hr tic al pr ot nh i[C uy pto co cc us n o G O te im s - 2. 12 5 - 2 ,1 5 3 1 .2 7 8 1 .3 1 7 A A W 47 21 1. 1 2.O OE -1 22 50 17 24 ne of isr m an sv ar . n eo fo rin an sJ EC 2I I 5 0 2 4 0 4 - H yp ot he tic al pr ot ei n[ Cr yp toc oc cu s n o G ot ei no - 2. 12 5 . 2. 15 3 1. 27 8 1. 31 7 A A W 44 84 1. 1 3.O OE -1 10 50 28 85 n eo fo rm an sv ar . n eo fb rm an sJ EC 2I ] 50 39 85 - Co ns er ve d hy po th et ic al pr ot ei n n o G O te rn is - 2, 12 5 - 2. 15 3 1. 27 8 1. 31 7 X P _ 0 0 1 2 1 2 0 9 8 . I 3. 60 E. 02 50 58 18 EA sp er gi llu st e ir m is N1 H2 62 41 50 03 55 - 50 07 48 - H yp ot he tic al pr ot ei n IC ry pt oc oc cu n G o t - 3 31 3 1. 27 8 A A W 47 21 1. 1 2, 00 E- 12 2 52 26 13 50 11 24 n eo fo rm ai n va r. o eo fo rn ia ns JE C2 I] 50 24 04 - H yp ot he tic al pr ot ei n [C ryp toc oc cu s n o G o te rm s - 3. 31 3 1.2 78 A A W 44 84 1. 1 3, 00 E- 11 0 5 0 2 8 8 5 n e o fo rn ia n a v a r . n e o fi tr m a n s JE C2 IJ 50 39 85 - C o n se rv e d hy po th et ic al p ro te in n o G o te rm s - 3. 31 3 1. 27 8 X P_ 00 12 12 09 8. 
1 3, 60 E- 02 50 58 18 [A sp erg ill us ter reu sN fll 26 24 ] 50 86 26 Co ns er ve d hy po th et ic al pr ot ei n A A W 44 93 1. 1 5.O OE -4 2 - 50 91 32 [C uy pto co cc ua n eo fo rm an sv ar . n o G o te cin s - 3. 31 3 1. 27 8 n e o fo rm am jE C2 ll 51 12 23 - M iio ge n. ac tiv at ed pr ot ei n ki tm se -li ke X P_ 00 16 86 66 3. I 7, 60 E+ 00 pr ot ei ti[ Le ish ma nia ma jor stx ain n o G O te rn is - 3, 31 3 1. 27 8 Fr ie dl in ] 51 21 59 - H yp ot he tic al pr ot ei n L0 C2 07 80 6 G O _c om po ne nt =m em br an e; G O _ ft m c ti o n = c a lc iu m io n bi nd in g; m an no sy l- - 3. 31 3 1. 27 8 N P_ 00 10 25 06 0. 1 9. 40 E- 01 51 25 06 [M en m u sc u lu sj o lig os ac ch ar id e 1, 2- au ha -n ia nn os id as e ac tiv ity 51 44 27 Co ns er ve d hy po th et ic al pr ot ei n A A W 44 92 9. 1 1.O OE -1 9 - 51 57 48 [C iyp toc oc cu sn eo fo rm an sv ar. n o G ot er m s - 3. 31 3 1. 27 8 n eo fo rm an a JE C2 I) Pr ot ei n co di ng [C ryp toc oc cu s X P_ 77 68 71 .I 1.O OE -6 0 51 62 83 - n eo fo tm an u va r. n eo fo rm an sB - no G o te rm s - 3. 31 3 1. 27 8 51 84 87 35 01 A] 51 90 77 - In hi bi to r o f gr ow th fa m ily ,m em be r no G o te rm s - 3, 31 3 1. 27 8 X P_ 51 93 34 .2 1. IO E- 02 51 92 65 (Pa n t ro gl od yt ni l M yo -in os ito lt ra ns po rte r, pu ta tiv e 52 09 75 - GO _c om po ne nt :m em br an e; GO _f un cti on :m yo -in ou ito lt ra ns po rte ra ct iv ity ; - 3 31 3 1. 27 8 A A W 43 04 0. I 0.O OE +0 0 [C ryp toc oc cu sn eo fo rm am va r. 52 26 82 G Qp ro ce ss: m yo -in os ito lt ra ns po rt n e o fo rm a m JE C2 I] Co ns er ve d hy po th ec ica lp ro te in 50 98 65 - 50 86 26 - A A W 44 93 1. 1 5.O OE -4 2 51 43 87 50 97 32 [C ryp toc oc cu sn eo fu rm am va r. no G o te rm s - 2. 93 9 - 3. 09 4 0. 91 3 0. 87 8 n eo fo rn ia in JE C2 II 51 12 23 - M it o g e n -a c ti v a te d p ro te in ki na se -li ke X P_ 00 16 86 66 3. 1 7. 60 E+ 00 51 14 65 pr ot ei n [L cis hm an io m ajo rs tr ai n no G o te rm s - 2. 93 9 - 3. 09 4 0. 
91 3 0. 87 8 Fr ie dl in ] M on oo xy ge ns se pr ot ei n, pu ta tiv e 51 76 02 - 51 62 83 - X P 56 96 93 .1 1.O OE -6 0 52 26 13 — 51 84 87 IC uy pt oc oc cu sn eo fo rm am va r. n o G o te rm s - 3. 25 6 0. 96 6 n e o fo rm am JE C 2l j 51 90 77 - In hi bi to ro fg ro w th fa m ily ,m em be r3 no G o te rm s - 3. 25 6 0. 96 6 X P_ 51 93 34 .2 l.I O E- 02 51 92 65 [P an tr og lo dy te si - A _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ A v e ra g e L R S ta n d a r d D ev ia tio n o f L R C o o rd in a te s C H R de sc rib in g Pr ot ei n 1D b v a lu e Co or di na te s P r e d ic te d P u n c ti o n d G e n e O n to lo g y R 7 9 4 K B 38 64 K B 78 92 R 7 9 4 K B 38 64 K B 78 92 o f G en e’ th e a re ? 52 09 75 - M y o -i n o s it o l tra ns po rte r, p u ta ti v e G O _c om po ne nt : m e m br an e; G O _f un ct io n: m yo -i no si to l tr an sp or te r a c tiv ity ; J A A W 43 04 0. 1 O. OO E+ 00 - 3. 25 6 0. 96 6 52 26 82 [C ry pto co cc us n e o fo rm an s G O _p ro ce ss : m yo -m os it ol tr an sp or t n eo fo nn an s JE C 2I ] 51 81 14 - 51 62 83 - M on oo xy ge na se pr ot em , pu ta ti ve X P _5 69 69 3. 1 I.O O E- 60 52 26 13 51 84 87 [C ry pto co cc us ne of or ma ns va r. n o G O te rm s - 3. 40 9 0 8 9 n eo fo m sa ns JE C 2I ] 51 90 77 - ln lt ib it or of gr ow th fa m il y, m ei nb er 3 n o t3 O te rn is . 34 09 0 8 9 X P _5 19 33 4. 2 1. IO E -0 2 51 92 65 IP an tr og lo dy te s] 52 09 75 - M yo -m os ito l t ra ns po rte r, pu ta tiv e G O _c om po ne nt m em br an e; G O _f un ct io n: m yo -i no si to l t ra n sp or te r a c tiv ity ; A A W 43 04 0. 1 0. O O E+ 00 - 3. 40 9 0. 89 52 26 82 [C ry pto co cc us n eo fo rm am ‘ a r G O _p ro ce ss : m yo -i no si to l tr an sp or t n eo fo rm an s JE C 2I ] — — C on se rv ed hy po th et ic al pr ot ei n IC 28 90 -6 44 7 A A W 45 74 3. 1 0, 00 0+ 00 31 09 -4 21 5 [C ry pto co cc us ne of on sta ns va r. n o G O te nn s - 1. 68 1 1. 37 3 n eo fo nn an s J E C 2 1 I X P _3 83 69 2. 1 2. 
O O E- 05 52 63 - 73 90 H yp ot he tic al pr ot ei n [G ib be re lla ze ae n o G O te lm s . 1. 68 1 - 1. 37 3 P H -I ] Co ns er ve d hy po th et ic al pr ot ei n 2 8 9 0 - 7 7 8 6 A A W 4 5 7 4 3 .1 0. 00 0+ 00 31 09 -4 21 5 [C ry pt oc oc cu sn eo fo rm an sv ar . n o G O te rm s - 1. 28 9 - 2. 65 0 1. 33 3 1. 51 8 n eo fo rm an s JE C 2I ] X P _3 83 69 2. l 2. O O E- 05 52 63 -7 39 0 H yp ot he ti ca l p ro te m (G ib be re lla ze ae n o G O te rm s - 1 2 89 - 2. 65 0 1. 33 3 1. 51 8 P H -I ] C on se rv ed hy po th et ic al pr ot ei n 79 67 7 - 80 3t I A A W 46 24 7, t 0. 00 0+ 00 79 46 3 - 80 00 1 (C ryp toc oc cu s n eo fo rm at is v ar . n o G O te rm s - 2. 43 1 09 94 n eo fo rm an s JE C 21 ] C on se rv ed hy po tls ut ic al pr ot ei n 7 9 6 7 7 - 80 42 1 A A W 46 24 7. l 0. O O E+ 00 78 46 3- 80 00 1 [C ry pto co cc us n eo fo rtn an sv ar . n o G O te rm s - 2. 21 6 1. 20 6 n eo fo rm an s JE C 2I ] Co ns er ve d hy po th et ic al pr ot ei n 79 67 7 - 80 97 0 A A W 46 24 7. 1 0. 00 0+ 00 78 46 3 - 80 00 1 [C ry pt oc oc cu s n eo fo rm an s v a r. n o G O te rm s - 1. 72 9 1 13 7 n e o fo rm sn s JE C 2I ] G O _c om po ne nt : n u c le us ; G O fu nc tio n: tr an sc ri pt io n fa ct or a c tiv ity ; 63 05 73 - 63 19 15 - H yp ot he ti ca l pr ot ei n [C ry pt oc oc cu s G O _p ro ce ss : re gu la ti on o f t ra ns cr ip tio n, D N A -d ep en de nt ; G O _p ro ce ss : u ra c il - 1. 08 4 0. 78 4 A A W 46 21 6. I 0. O O E+ 00 63 20 70 63 47 61 n e o fo rn sn ns v a r. n e o fo rm am JE C 2I ] bi os yn th es is 70 18 94 - 70 07 68 - E xp re ss ed pr ot ei n [C ry pt oc oc cu n n o G O te rm s - 0 91 5 0 62 4 A A W 46 26 9. I 0. O O E+ 00 70 36 01 70 30 73 n eo fo rtn an s va r. n e o fo rm an s JE C2 1] C on se rv ed hy po th et ic al pr ot ei n 70 33 36 - G O _c om po ne nt : m e m br an e; G O _f un ct io n: v -S N A R E a c tiv ity ; G O _p ro ce ss : A A W 46 26 8. 1 1. 00 0- 16 8 - 0. 91 5 06 24 70 42 86 [C ry pt oc oc cu s n e o fo rm an s va r. 
in tr a- G ol gi tr an sp or t; G O _p ro ce ss : v e s ic le fu si on n e o fo rm an s JE C 2I ] 83 46 78 - N O G EN E - 0. 92 2 06 02 83 61 31 5 ’: Ex pr es se d pr ot ei n [C ry pto co cc us 5 ’: n o G O te on s n eo fo rm an s va r. n eo fo rn ia su JE C2 1] 3 ’: G O _ c o m p o n e n t: in n e r p la q u e o f s p in d le p o le bo dy ;G O _c om po ne nt : o u te r 95 07 88 - 5: A A W 46 O 71 .l; 9 4 9 5 3 3 - 0. 00 0+ 00 3 ’: Tu bu lin ga m m a ch ai n( Ga mm a pl aq ue o fs pi nd le po le bo dy ;G O _f un ct io n st ru ct ur al co n st itu en to f - 3. 16 1 - 3. 39 3 - 3. 19 4 0. 06 8 1. 24 5 0. 99 5 95 35 15 3 ’: A A W 46 21 3. l 95 09 42 tu bu lin ), pu ta tiv e [C ry pt oc oc cu s c yt os ke le to n; GO _p eo ce ss : m it ot ic s pi nd le a s s e m bl y (s en su Sa cc ha ro m yc es ); n e o fo rm an s va r. n e o fo rm an s JE C2 I] G O _p ro ce ss m ic ro tu bu le n u c le at io n 95 40 99 - 95 53 17 - H yp ot he ti ca lp ro te in [C ry pt oc oc cu s n o G O t 04 11 03 28 A A W 46 07 O. l 5. O O E- 15 0 95 5g 49 95 63 91 n eo fo rm an s v a r. n e o fo rm an s JE C2 I] C on se rv ed hy po th et ic al pr ot ei n 9 7 0 3 0 8 - 96 85 24 - A A W 46 23 9. I 0. O O E+ 00 97 23 69 97 12 44 E C ry pt oc oc cu s n e o fo rm an s v a r. n o G O te rm s - 0. 65 2 0. 52 1 n e o fo m in ns JE C 2l ] B et a- fr ts ct of ur an os id as e, pu ta tiv e G O _c om po ne nt : ex tr ac el lu la r re gi on ; G O _c om po ne nt : c yt op la sm ; 98 16 13 - 97 98 52 - A A W 46 25 8. t 0.O OE +0 0 98 41 46 98 16 30 [C ry pto co cc us n eo fo rtn an s v ar . G O _f un ct io n: he ta -f ru ct of is ra no si da se a c ti vi ty ; G O _p ro ce ss : s u c ro s e - 2. 17 1 0. 95 5 n e o fo rm an s JE C 2I ] c a ta bo lis m A cs en ite tr an sp or te r, pu ta ti ve 98 37 87 - 98 56 84 [C ry pt oc oc cu s n e o fo rm an s v a r G O _c om po ne nt : in te gr al to pl as m a m e m br an e; G O _f un ct io n: a rs e n it e A A W 46 25 6. l 0. O O E+ 00 - 2. 17 1 0. 
95 5 n eo fo rm an s JE C 2I I tr an sp or te r a c tiv ity ; G O _p ro ce ss : a rs e n it e tr an sp or t A v e ra g e Li i S ta n d a r d D e v ia ti o n o f L R C o o rd in a te s d R d e sc ri b in g P ro te in 1 D b I - v a ln e C o o rd in a te s P re d ic te d F u n c ti o n s G e n e O n to lo g y R 7 9 4 K B 3 8 6 4 K B 7 8 9 2 R 7 9 4 K B 3 8 6 4 K B 7 8 9 2 o f G e n e ’ th e a r e a 99 51 04 - N O GE NE - 1 .2 4 5 0 .5 8 8 K 9 9 5 9 3 0 — Co ns er vt d h y p o th e ti c a l p ro te in L 41 85 - 12 20 9 A A W 43 79 4. 1 0. 00 0+ 00 2 4 2 1 - 53 00 [C ry pto co cc us ne of or ma ns va r. n o G O te e rn s - 1 .8 0 1 - 1 .8 0 9 0 .9 8 0 1. 00 0 n eo fo rm an s J E C 2 I] 5’: Ex pr es se d p ro te in [C ry pto co cc us n eo fo rm an s va r. n eo fo nn an s JE C2 I] 5’ : A A W 47 18 4. l, 4 .O O E -1 6 6 2 4 5 - 9 1 3 9 3 ’: E p o x id e hy dr ol as e 1, p u ta ti v e n o G O te em s - 1 .8 0 1 - 1 .8 0 9 0 .9 8 0 1. 00 0 3 ’: A A W 4 7 1 9 1 .1 [C ry pto co cc us n eo fo nn at m v a r . n e o fo rm a n s JE C2 IJ O x id o re d u c ia se , G O _c om po ne nt :n u cl eu s; G O _c om po ne nt . cy to pl as m ;G O _ fu n c ti o n : al de hy de r e d u c ta se ac tiv ity ; G O _f un ct io n: al do -k et o re du ct as e ac tiv ity ;G O _f un ct io n: - 1. 80 1 - 1. 80 9 0. 98 0 1. 00 0 A A W 41 72 7. 1 0.O OE +0 0 98 17 - 11 29 6 [C ry pto co cc us n eo fo rm an s va r. o x id or ed uc ta se ac tiv ity ; G O _p ro ce ss : ar ab in os e m et ab ol ism ;G O _p ro ct ss : D n eo fo nn an s JE C2 IJ x yl os e m et ab ol ism Co ns er ve d hy po th et ic al pr ot ei n G O _c om po ne nt :p er ox iso m al m at rix ; G O fu nc tio n: 2, 4- di en oy l-C oA A A W 43 77 2. I 0.O OE +0 0 12 05 1 - 13 16 3 IC ry pt oc oc cu s n eo fo rm an sv ar . re du ct as e (N AD PH )a ct iv ity ;G O _p ro ce ss :s po ru la tio n (se ns u - 1. 80 1 - 1. 80 9 0. 98 0 1. 
00 0 n e o fo rm a n s JE C 2 I] S a c c lt a ro m y c e s) ; G O p ro c e ss : fa tt y a c id c a ta b o li sm Co ns er ve d hy po th et ic al pr ot ei n 41 85 - 12 12 1 A A W 43 79 4. t 0.0 00 -tO O 24 21 - 53 00 [C ry pto co cc us n eo fo rm an s va r. no GO te rm s - 2. 22 3 0. 97 5 n eo fo rn ta ns JE C2 IJ 5’: Ex pr es se d pr ot ei n [C ry pto co cc us n eo fo rm an s va r. n eo fo rm an sJ EC 2I J 5’ A .A W 47 18 4. l; 4. 00 0- 16 62 45 - 91 39 3’ :E p o x id e h y d ro la se I, pu ta tiv e no GO te rm s - 2. 22 3 0 97 5 3 ’: A A W 47 19 1. t [C ry pto co cc us n e o fo rm a n s v a r . n eo fo rm an s J E C 2 II O x id o re d u c ta s e , p u ta ti v e G O _ c o m p o n e n t: n u cl eu s; G O _ c o m p o n e n t: c y to p la s m ; G O _ fu n c ti o n : a ld e h y d e re du cta se ac tiv ity ; G O _ fu n c ti o n : a ld o -k e to re du cta se a c ti v it y ; G O _ fu n c ti o n - 2, 22 3 0. 97 5 A A W 41 72 7 I 0.O OE +O 0 98 17 - 11 29 6 [C ryp toc oc cu s n eo fo rm an s va r. o x id or ed uc ta se ac tiv ity ; G O _p ro ce ss ar ab ia os e m e ta b o li s m ; G O _p ro ce ss 1)- n e o fo rm a n s JE C Z J] x yl os e m et ab ol ism C o n se rv e d hy po th et ic al pr ot ei n G O _c om po ne nt :p er ox iso m al m at rix ;G O _f un ct io n: 2, 4- di en oy l-C 0A A A W 43 77 2. I 0. 00 0+ 00 12 05 1 - 13 16 3 [C ryp toc oc cu n n eo fo rm an s va r. re du ct as e (N AD PH )a ct iv ity ;G O _p ro ce ss :s po ru la tio n (se ns u - 2. 22 3 0. 97 5 n eo fo rm an s JE C2 IJ Sa cc ha ro m yc es ); GO pr oc es s: fa tty ac id ca ta bo lis m Tr an sk et ol as e, pu ta tiv e G O _ c o m p o n e n t: c y to p la sm ; G O _f un ct io n: tr a n s k e to la se ac tiv ity ;G O _ p ro c e ss : - 1. 68 7 0. 79 8 15 74 3 - 25 85 7 A A W 43 39 2. I 0.O OE +0 0 13 80 8 - 16 46 8 [C ryp toc oc cas ne of or m aa sv ar . 
pe nt os e- ph os ph ete sh un t n eo fo rm am JE C2 1] X P_ 66 05 62 .1 3.O OE -2 3 17 83 6- 19 74 7 H yp ot he tic al pr ot em [A sp erg ill us no GO te rm s - 1. 68 7 0. 79 8 n id ul an s F G S C A 4 1 1. Ex pr es se d pr ot ei n (C ryp toc oc ca t 1. A A W 46 84 4. 1, n eo fo rm an s v ar . n eo fo rm sn s JE C2 IJ n o G O te rm s - 1. 68 7 0. 79 8 3. 00 0- 44 21 23 5- 22 15 0 2. A A W 46 97 3. l 2. H yp ot he tic al pr ot ei n [C ry pto co cc us n eo fo rm an s va r. n e o fo rn s s n s JE C2 t] 5’ : G O _c om po ne nt :m ito ch on dr in l 5’: Iro n io n tr a n s p o rt -r e la te d pr ot ei n, pu ta tiv e [C iyp toc oc cu sn eo fo rn ia ns va r. 5’: A A W 43 75 1. l; 0. 00 0+ 00 22 50 5 - 23 56 8 m at rix , G O _p ro ce ss . tro n io n n eo fo rm an sJ EC 2I ] - 1. 68 7 0. 79 8 3’ :A A W 43 75 3. I tr an sp or t 3’ :C on se rv ed hy po th et ic al pr ot ei n [C ry pto co cc us n eo fo rm an sv at . 3’ :n o GO te rm s n eo fo rm an sJ EC 2I ] L -a ra b in it o l 4 _ d e h y d ro g e a u s e no GO te rm s - 1 68 7 0 79 8 X P_ 00 15 46 30 2. l 2 .O O E -0 4 23 92 9- 25 42 7 [B o tr y o ti n ia fu ck el in na B 0 5 . 10 1 T ra n s k e to la s e , p u ta ti v e 15 83 9- 27 28 5 A A W 4 3 3 9 2 .1 0. 00 0+ 00 13 10 8- 16 46 8 [C ry p to co cc u s n eo fo rm am va r. G O _ c o m p o n e n t. cy to pl as m ,G O _ fu n c ti o n . tr an sk et ol as e ac tiv ity ;G O _p ro ce ss . - 1 .9 1 3 0. 89 9 pe ato se -p ho sp ha te sh un t n eo fo rm an sjE C2 l] X P_ 66 05 62 .1 3.O OE -2 3 17 83 6- 19 74 7 H yp ot he tic al pr ot ei n [A sp erg ill us no GO te rm s - 1. 91 3 0. 89 9 n id ul an s F G SC A4 ] 1. Ex pr es se d pr ot ei n [C ry p to co cc as 1. A A W 46 84 4. I; n eo fo rm an s va r. n eo fo rm an s JE C2 I] n o G O te rn is - 1. 91 3 0. 89 9 3.O OE -4 4 21 23 5- 22 15 0 2. A A W 46 97 3. 1 2. H yp ot he tic al pr ot ei n [C ry pto co cc us n eo fo rm an s va r. 
n eo fo rm an sJ EC 21 ] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ A ve ra ge LR St an da rd D ev ia tio n o fL R C oo rd in at es C U R de sc rib in g Pr ot ei n fl ) E- va ln e Co or di na te s P re d ic te d Pn nc tio n” G en e O nt ol og y R 79 4 K B 38 64 K B 78 92 R 79 4 K B 38 64 K B 78 92 o fG en e’ th e_ ar ea ’ 5’ : G O _c om po ne nt : m it o ch o n dr ia l 5’ : Ir o n io n tr a n sp o rt -r el at ed pr ot ei n, pu ta ti ve [C ry pt oc oc cu s n e o fo rm an s v a r. 5’: A A W 43 75 1. 1; m a tr ix ; GO _p ro ce ss : ir on io n n eo fo rm an s J E C 2I ] O. OO E+ O0 22 50 5 - 23 56 8 - 19 13 0. 89 9 3’ : A A W 43 75 3. 1 tr an sp or t 3’ : Co ns er ve d hy po th et ic al p ro te in [C ryp toc oc cu s n eo fo rn ia ns va t. 3’ : n o G O ttr m s n e o fo rm an si E C 2l ] L- ar ab in ito l4 -d eh yd ro ge na se n o GO tei -m s - 1. 91 3 0. 89 9 X P_ 00 1 5 46 30 2. 1 2. O O E- 04 23 92 9- 25 42 7 IB ot ry ot in ia fli ck el ia na B O S. 10 ] T ra ns ke to la se , pu ta ti ve 16 16 3- 25 85 7 A A W 43 39 2. l 0.O OE +0 0 1 3 8 0 8 - 16 46 8 [C ryp toc oc cu s n eo fo rn ia ns va r. G O _ c o m 1s o n e n t . c’ to pl as m , G O_ fu nc tio n. tra ns ke to la se ac tiv ity ; G O_ pr oc es s: - 1. 72 6 0. 77 6 pe nt os e- ph os ph at e s hu nt n e o fo rm an s IE C 2I ] X P _6 60 56 2. I 3. O O E- 23 17 83 6- 19 74 7 H yp ot he tic al pr ot ei n [A sp trg ill us n o GO te rm s - 1 .7 2 6 0. 77 6 n id ul an s FG SC _A 4] 1. Ex pr es se d pr ot ei n [C ry pto co cc us 1. AA W 46 84 4. I; n eo fo rm an s va r. n eo fo rm an s JE C2 1J 3. O O E -4 4 2 1 2 3 5 - 2 2 1 50 n o G O te rm s - 1 .7 2 6 0. 77 6 2. A A W 46 97 3. 1 2. H y p o th et ic al p ro te in [C ry pt oc oc cu s n e o fo rm an s v a r. _ n e o fo rm an s JE C 2I ] 5’ : G O _c om po ne nt : m it o ch o n dr ia l 5’ : Ir o n io n tr an sp or t- re la te d pr ot ei n, pu ta tiv e [C ry pt oc oc cu s n e o fo rm an s v a r. 5’ : A A W 43 75 1. 
1; m a tr ix ; GO _p ro ce ss : ir on io n n eo fo m m aa sJ EC 2l ] 0 OO E+ 00 2 2 5 0 5 - 2 35 68 - 1 .7 2 6 0. 77 6 3’ : A A W 4 3 7 5 3 .t tr an sp or t 3’ : Co ns er ve d hy po th et ic al pr ot ei n [C ry pto co cc us n eo fo rin an s va r. 3’ : n o G O te rm s n eo fo rm an s JE C2 1] L- ar ab in ito l4 -d eh yd ro ge na se - 1. 72 6 0. 77 6 X P_ 00 l5 46 30 2. 1 2.O OE -0 4 23 92 9- 25 42 7 [B otr yo tin ia_ fuc ke lin na B0 5_ 10 ] 37 29 2- 39 18 2 A A W 43 73 6. 1 0. 00 E+ 00 34 42 3- 37 47 5 Ex pr es se d pr ot ei n [C ry pto co cc us G O t e 0. 76 7 0. 31 7 n eo fo rm an s v a r. n e o fo rn ta ns )E C 2I ] 5’: H yp ot he ti ca l p ro te in [C ry pto co cc us n eo fo rin an s va r. 5’ : n o G O te rm s 5’ : A A W 4S 16 5. 1; n e o fo nn am JE C 2I J 3’ : G O _c om po ne nt : n u c le us ; G O _c om po ne nt : c yt op la sm ; G O _f un ct io n: 0. O O E+ 00 38 28 3 - 39 90 5 0. 76 7 0. 31 7 3’ : A A W 45 16 4. 1 3’ : br an ch ed -c ha in -a m in o- ac id br an ch ed -c ha in -a m in o- ac id tr an sa m in as e a c ti vi ty ; G O _p ro ce ss : a m in o a c id tr an sa m in as e, pu ta ti ve [C ry pt oc oc cu s ca ta bo lis m ; G O _p ro ce ss : br an ch ed ch ai n fu m ily a m in o a c id bi os yn th es is n e o fo rm an s va r. n e o fo rm an s JE C 2I ] I. H yp ot he ti ca l pr ot ei n IC ry pt oc oc cu s n eo fo rm at in va r. 83 80 t6 n eo fo nn an s JE C2 1J 1. n o GO te nn s I. A A W 46 09 7. l - 59 29 5 - 61 83 8 0.O OE +0 0 2. 35 S pr im ar y tr an sc rip t p ro ce ss in g- 2. G O _c om po ne nt : n u cl eu s; G O _c om po ne nt : c yt op la sm ; G O _p ro ce ss :3 5S - 0. 88 0 0. 43 6 2. A A W 46 09 8. 1 87 17 35 re la te d pr ot ei n, pu ta ti ve pr im ar y tr an sc ri pt pr oc es sin g [C ry pto co cc us n eo fo rm an s va r. n eo fo rm an s JE C2 I] — M 18 32 4- 36 71 5 A A W 46 86 8. l 0.O OE +0 0 16 50 9- 18 44 0 Ex pr es se d pr ot ei n [C ry pto co cc us n o GO te rm s - 2. 73 7 1. 36 3 n eo fo rm am va r. 
_n eo fo rm an s JE C 2 II tR N A (5 -m et hy la m in om et hy l-2 - th io ur id yl nt e) -m et hy ltr an sf er us e, G O _c om po ne nt : m ito ch on dr io n; G O _f un ct io n: lE N A (5 -m et hy la m in om et hy l- A A W 46 86 3. l 0. O O E+ 00 18 75 5- 20 63 6 - 2 .7 37 1. 36 3 pu ta tiv e [C ry pto co cc us n eo fo rm an s 2 -t h io u ri d y la te )- m et hy ltr an sf er as e a c ti v it y va r. n eo fo rm an s JE C2 I] A A W 43 04 2. l 0 OO E+ 00 22 28 9- 22 86 9 H yp ot he tic al pr ot ei n [C ry pt oc oc cu s n o G O te rm s - 2 .7 37 1. 36 3 n eo fo rm an s v a r. _ n e o fo rm an s JE C 2I ] A B 38 13 58 .l 7. 30 E -0 l 24 56 7- 25 45 7 Gu t p ro te in [S oli ba cte r u sit at us n o GO te rm s - 2 .7 37 1. 36 3 E 11 in 60 76 ] P ut at iv e D N A -b in di ng pr ot ei n n o GO te rm s - 2 73 7 1 36 3 A B W IO 7O 5. l 8. O O E- 02 2 7 7 7 2 -2 89 44 [F ra nk ia ip . E A N ip ec ] 5” c A M P -d ep en de nt pr ot ei n ki na se , 5’: G O _c om po ne nt , c yt op la sm ; G O _c om po ne nt : c A M P- de pe nd en t p ro te in pu ta ti ve [C ry pto co cc us n eo fo rm an s ki na se c o m pl ex ; G O _f un ct io n: pr ot ei n s e ri ne /th re on in e ki na se n c tiv lty ; 5’: A A W 44 72 0. 1; va r. n eo fo rm an s JE C 2I J G O _f un ct io n: c A M P -d ep en de nt pr ot ei n ki nn se a c ti vi ty ; G O _p ro ce ss : pr ot ei n 2. O O E- 28 29 41 6- 30 80 1 - 2. 73 7 1. 36 3 3’ :A A W 44 72 3. I 3’ :h yp ot he tic al pr ot ei n a m in o n c id ph os ph or yl at io n; G O _p ro ce ss : ps eu do hy ph al gr ow th ; G O _p ro ce ss : (C ryp toc oc cu s n e o fo rm an s v a r. Ra s pr ot ei n s ig na l tr an sd uc tio n n e o fo rm an s JE C2 1I 3’ : n o GO te rm s A A W 46 83 7. 1 8 ,0 0 E -l ll 33 61 7- 34 08 7 H yp ot he ti ca l pr ot ei n [C ry pt oc oc cu s n o GO te rm s - 2. 73 7 1.3 63 n eo fo rm an s v a r. _ n e o fo rm an s JE C 2I J _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ A ve ra ge LR St an da rd D ev ia tio n o fL R C o. 
rd ia at es C U R de sc ai bi ng Pr ot ei n li ) E- va lu e C oo rd in at es Pr ed ic te d Fa nc tio nd G en e O nt ol og y R 79 4 K B 38 64 K B 78 92 R 79 4 K B 38 64 K B 78 92 o fG en e’ th e ar ea ’ H yp ot he tic al pr ot ei n (C ryp toc oc cu s no GO te rm s - 2. 73 7 1. 36 3 M A A W 46 86 4. 1 0.O OE +0 0 34 68 0- 36 18 5 n eo fo rm an s va r. n eo fo rm an sJ EC 2I ] H yp ot he tic al pr ot ei n tC ry pt oc oc cu s G O _c om po ne nt :s m al ln u cl eo la rr ib on uc le op ro te in co m pl ex ;G O _f un ct io n: - 2 73 7 1 36 3 A A W 46 86 2. 1 0.O OE +0 0 36 36 7- 39 52 0 n eo fo rm an s va r. n eo fo rm an s JE C2 I] sn o RN A bi nd in g; GO pr oc es s: pr oc es sin g o f2 0S pr e- rR N A 5’: Co ns er ve d hy po th et ic al pr ot ei n 5’: G O _c om po ne nt :s o lu bl e fra ct io n; G O _f un ct io n: ch ap er on e ac tiv ity ; (C ryp toc oc cu sn eo fo rm an sv ar . 10 91 85 - 5’: A A W 42 13 9. I; 11 03 85 - n eo fo rm an a JE C2 I] GO _f lsn ctl of l. cy ste in e- ty pe pe pt id as e ac tiv ity 2.O OE -2 l 11 21 41 3’ A A W 42 14 1 I 11 06 57 3’ Co ns er ve d hy po th et ic al pr ot ei n 3’ :G O _c om po ne nt :n u cl eu s; G O _c om po ne nt : cy to so l; G O _f un ct io n: pr ot ei n - 1. 41 0 0. 96 2 [C ry pto co cc us n eo fo rm an s ki na ne ac tiv ity ;G O _f un ct io n: n u cl eo cy to pl as m ic tr an sp or te ra ct iv ity ; G O _p ro ce ss : pr oc es sin g o f2 0S pr e- rR N A n eo fo rm nn sJ EC 2I ] 5’: Co ns er ve d hy po th et ic al pr ot ei n 5’: G O _c om po ne nt :s o lu bl e fra ct io n; G O _f un ct io n: ch ap er on e ac tiv ity ; [C ry pto co cc us n eo fo rm an s va r. 10 91 85 - 5’: A A W 42 13 9. I; 11 03 85 - n eo fo rn sa ns JE C2 I] G O _f un ct io n. cy ste m e- ty pe pe pt id as e ac tiv ity 2.O OE -2 1 13 63 12 3’ A A W 42 14 1 1 11 06 57 3’ Co ns er ve d hy po th et ic al pr ot ei n 3’ :G O _c om po ne nt :n u cl eu s; G O _c om po ne nt : cy to no l; G O _f un ct io n: pr ot ei n - 1. 41 1 - 1 41 3 0. 89 6 0. 
88 8 [C ry pto co cc us n eo fo rn ia ns v ar ki na se ac tiv ity ;G O _f un ct io n: n u cl eo cy to pl as m ic tr an sp or te ra ct iv ity : G O _p ro ce ss : pr oc es sin g o f2 0S pr e- rR N A n eo fo rm un sJ EC 2I ] 29 95 22 - 29 73 57 - H yp ot he tic al pr ot ei n [C ry pto co cc us A A W 46 77 8. 1 4. OO E- l6 7 n o G O te rm s - 0. 85 4 - 0. 99 7 0. 48 4 0. 50 7 30 14 68 29 96 86 n eo fo nn an a va r. n eo fo rm an sJ EC 2I ] 29 99 79 - Ex pr es se d pr ot ei n [C ry pto co cc us no GO tm -m s - 0. 85 4 - 0. 99 7 0. 48 4 0. 50 7 A A W 46 91 6. 1 2.O OE -6 2 30 05 99 n eo fo rs na ns va r. n eo fo rm an sJ EC 2I ] St er ol m et ab ol ism -re la te d pr ot ei n, 30 09 93 - A A W 46 76 4. 1 0.O OE +0 0 pu ta tiv e [C ry pto co cc us n eo fo rm an s G O _c om po ne nt :m em br an e; G O _p ro ce ss : st er ol m et ab ol ism - 0. 85 4 - 0. 99 7 0. 48 4 0. 50 7 30 19 82 va r. n eo fo rm an s JE C2 1] — 5’ :E xp re ss ed pr ot ei n [C ryp toc oc cu s n eo fo rm an s va r. n eo fo rm an sJ EC 2I ] 10 67 66 - 5’: A A W 47 04 5. l; - 5’: n o G O te rm s - 0. 91 3 0. 63 2 l.0 0E -5 5 10 64 13 3’ .C on se rv ed hy po th et ic al pr ot ei n 3’ :G O _c om po ne nt :c yt op la sm :G O _f un ct io n: pr ot ei n ki na se ac tiv ity N 10 93 10 3’ :A A W 47 04 7. 1 10 73 13 (C ryp toc oc cu sn eo fo rm an sv ar . n eo fo rm am _J EC 2I ) 5’: Ex pr es se d pr ot ei n (C ryp toc oc cu s 10 81 48 n eo fo rm as s va r. n eo fo rm an sJ EC 2I ] 5’: no GO te rm s 5’: A A W 47 04 5. l; - 0. 91 3 0. 63 2 3.O OE -l5 - 3’ :C on se rv ed hy po th et ic al pr ot ei n 3’ :A A W 47 04 7. I 10 86 90 3’ :G O _c om po ne nt : cy to pl as m ;G O _f un ct io n: pr ot ei n ki na se ac tiv ity fC ry pt oc oc cu s n eo fo rm an s va r. n co fo rm an s J EC 2I I Co ns er ve d hy po th et ic al pr ot ei n 10 91 09 - A A W 47 04 7. I 0.O OE +0 0 [C ryp toc oc cu sn eo fo rm an sv ar . G O _c om po ne nt :c yt op la sm ;G O _f un ct io n: pr ot ei n ki na se ac tiv ity - 0. 91 3 0. 
632 113874 neoformans JEC21]
674609-678906 | AAW45902.1 | 0.00E+00 | 673986-676134 | Hexose transport-related protein, putative [Cryptococcus neoformans var. neoformans JEC21] | GO_component: plasma membrane; GO_function: fructose transporter activity; GO_function: glucose transporter activity; GO_function: mannose transporter activity; GO_process: hexose transport | -1.888 | 0.929
 | AAW43760.1 | 0.00E+00 | 676799-678780 | Hypothetical protein [Cryptococcus neoformans var. neoformans JEC21] | no GO terms | -1.888 | 0.929
674728-678906 | AAW45902.1 | 0.00E+00 | 673986-676134 | Hexose transport-related protein, putative [Cryptococcus neoformans var. neoformans JEC21] | GO_component: plasma membrane; GO_function: fructose transporter activity; GO_function: glucose transporter activity; GO_function: mannose transporter activity; GO_process: hexose transport | -1.568 | 0.941
 | AAW43760.1 | 0.00E+00 | 676799-678780 | Hypothetical protein [Cryptococcus neoformans var. neoformans JEC21] | no GO terms | -1.568 | 0.941
676244-678906 | AAW45902.1 | 0.00E+00 | 673986-676134 | Hexose transport-related protein, putative [Cryptococcus neoformans var. neoformans JEC21] | GO_component: plasma membrane; GO_function: fructose transporter activity; GO_function: glucose transporter activity; GO_function: mannose transporter activity; GO_process: hexose transport | -1.875 | 1.229
 | AAW43760.1 | 0.00E+00 | 676799-678780 | Hypothetical protein [Cryptococcus neoformans var. neoformans JEC21] | no GO terms | -1.875 | 1.229

Table B.1 Regions of difference in the genomes of three serotype B strains compared with the sequenced genome of strain WM276. Regions of difference that overlap are in the same colour. The first column indicates the chromosome (CHR) number. a Nucleotide coordinates of the segment identified by CGH. b Gene ID of top BLAST hit. The E-value of the BLAST result is included in the following column. c Coordinates of the specific gene in the segment identified by CGH. d Functional information about the top BLAST hit. The GO ontology is included in the following column.

CHR | WM276 coordinates describing the area | Protein ID | E-value | Coordinates of WM276 gene | Predicted Function | Gene Ontology | Average LR (RV66095, E566, GFP2) | Standard deviation of LR (RV66095, E566, GFP2)
A | 455-10516 | AAW41688.1 | 0 | 591-6277 | Xenobiotic-transporting ATPase, putative [Cryptococcus neoformans var. neoformans JEC21] | GO_component: plasma membrane; GO_function: xenobiotic-transporting ATPase activity; GO_process: drug transport; GO_process: response to drug | -1.275 | 0.871
 | | AAW42010.1 | 0 | 10501-11527 | Glucose 1-dehydrogenase, putative [Cryptococcus neoformans var. neoformans JEC21] | GO_component: peroxisomal matrix; GO_function: 2,4-dienoyl-CoA reductase (NADPH) activity; GO_process: sporulation (sensu Saccharomyces); GO_process: fatty acid catabolism | -1.275 | 0.871
556719- NO GENE -2.240 1.
22 4 55 77 04 57 55 19 SA R sm a ll m o n o m e ric G TP as e, pu ta tiv e G O c o m po ne nt : C O PE v e si cl e co at ; G O _f un ct io n: SA R sm al l 57 70 90 - - 0. 93 0 0. 71 9 A A W 41 61 O .1 0 [C ry pt oc oc cu sn eo fo rm an sv ar . m o n o m e ric G TP as e a c tiv ity ;G O _p ro ce ss : E R to G ol gi tr an sp or t 58 04 92 57 65 39 n e o fo rm an s JE C 21 ] 5’: C on se rv ed hy po th et ic al pr ot ei n 5’: [C ryp toc oc cu sn e o fo rm am v ar . 5’ n o G O te nn s 57 74 16 - n e o fo rm an s JE C 2I ]; 3’ :G O _c om po ne nt : C O PE v e si cl e co at ; G O _f un ct io n: SA R sm a ll - 0. 93 0 0. 71 9 A A W 4I 60 9. 1; 5. OO E- 20 57 80 78 3’: SA R sm al lm o n o m e ric G TP as e, 3’ :A A W 4I 6I O .1 m o n o m e ric G TP as e a c tiv ity ;G O _p ro ce ss : E R to G ol gi tr an sp or t pu ta tiv e [C ry pt oc oc cu sn e o fo rm an s va r. n e o fo rm an s JE C 2I ] 1.O OE - 57 83 85 - H yp ot he tic al pr ot ei n IC ry pt oc oc cu s n o G O te rm s - 0. 93 0 0. 71 9 X P_ 77 72 90 .1 14 2 58 04 38 n e o fo rm an s va r. n eo fo rm on s B -3 50 lA ] 84 26 47 - 9.O OE - 84 02 30 - E R o rg an iz at io n an d bi og en es is -r el at ed G O _c om po ne nt : e n do pl as m ic re tic ul um m e m br an e; G O _p ro ce ss : E R - 0 93 8 0 66 6 A A W 40 95 0. I 84 38 20 13 2 84 28 47 pr ot ei n, pu ta tiv e (C ry pto co cc us n e o fo rm am v ar n e o fo rm as s JE C 2I ] o rg an iz at io n an d bi og en es is Ty pe 2C Pr ot ei n Ph os ph at as e, pu ta tiv e 84 29 58 - A A W 40 95 2. I 0, 00 E+ O 0 84 49 70 [C ry pt oc oc cu sn e o fo rm am v ar . G O _f un ct io n: pr ot ei n ph os ph at as e ty pe 2C a c tiv ity - 0. 93 8 0. 66 6 n eo fo rm an sJ E C 2I ] 97 06 02 - 9 6 9 - R PN 1O -li ke pr ot ei n, pu ta tiv e G O _c om po ne nt : pr ot ea so m e re gu la to ry pa rti cl e (se m u Eu ka ry ot a); A A W 41 02 4. 1 0. O O E+ 00 97 15 29 97 07 02 [C ry pt oc oc cu sn e o fo rm an s v ar . 
G O _f un ct io n: e n do pe pt id as e a c tiv ity ;G O _p ro ce ss : u bi qu iti n. de pe nd en t 2. 17 9 1. 30 9 n e o fo rm an s JE C 21 I pr ot ei n ca ta bo lis m G O _c om po ne nt : hi st on e ac et yl tra ns fe ra se c o m pl ex ;G O _c om po ne nt : a c tin c a bl e (se m u Sa cc ha ro m yc es ); G O _c om po ne nt : c o n tr ac til e rin g (se ns u Sa cc ha ro m yc es ); G O _c om po ne nt : ac tin c o rt ic al pa tc h (se ns u Sa cc ha ro m yc es ); G O _c om po ne nt : ac tin fil am en t; G O _f un ct io n: st ru ct ur al c o n st itu en to fc yt os ke le to n; G O _p ro ce ss : m ito ch on dr io n in he rit an ce ;G O _p ro ce ss :v a c u o le in he rit an ce ;G O _p ro ce ss : e st ab lis hm en to fc el lp ol ar ity (se us u Sa cc ha ro m yc es ); G O _p ro ce ss : c yt ok in es is ;G O _p ro ce ss : re gu la tio n o ft ra ns cr ip tio n fr om P ol E pr om ot er ;G O _p ro ce ss : e x o c yt os is ;G O _p ro ce ss : e n do cy to si s; G O _p ro ce ss : re sp on se to o sm o tic st re ss ; G O _p ro ce ss : ce ll w al l o rg an iz at io n an d bi og en es is ;G O _p ro ce ss : bu dd in g ce ll a pi ca lb ud gr ow th ;G O _p ro ce ss : bu dd in g ce ll is ot ro pi c bu d gr ow th ;G O _p ro ce ss : sp or ul at io n (se us u Sa cc ha ro m yc es ); G O _p ro ce ss : pr ot ei n se c re tio n; G O _p ro ce ss : hi st on e a c e ty la tio n; G O _p ro ce ss :a c tin fil am en t re o rg an iz at io n da rin g ce ll cy cl e; G O _p ro ce ss :v e si cl e tr an sp or ta lo ng ac tin fil am en t; G O _p ro ce ss : m ito tic sp in dl e o rie nt at io n (se m u r n N O GE NE A A W 41 O 26 .1 0. O O E+ 00 - A ct in [C ry pt oc oc cu sn e o fo rm an s v ar . n e o fo rm an s JE C 21 ] 12 77 23 9- 12 78 63 7 2. 17 9 1. 
30 9 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ A ve ra ge LR S ta nd ’d de vi ati v— o fL R C oo rd in at es C oo rd in at es o f W M 27 6 W M 27 6 C H It de sc ri bi ng th e Pr ot ei n ID E- va lu e G en e Pr ed ic te d Fu nc tio n G en e O nt ol og y R V 66 09 5 € 56 6 G FP 2 R V 66 09 5 € 56 6 G FP Z a re a GO _c om po ne nt : n u cl eu s; G O _f lin ct io ir tla ,n cr ip tio n fa cto ra ct ivi ty; B 16 56 - 69 36 AA W 42 7O L. 1 0. 00 E4 00 93 3 - 39 73 H yp ot he tic al pr ot ei n [C ryp toc oc cU s G O _ p ra ’ ca rb oh yd ra te m et ab ol ism ; G O_ pr oc es s: re gu la tio n o f - 2. 59 2 1. 70 4 n ir ia n s v ar . n e o fo rm am JE C2 IJ tra ns cr ip tio n, D N A -d ep en de nt A A W 4I 4I O .I 0.O OE +0 0 51 11 - 63 53 Ex pr es se d pr ot ei n [C ryp toc oc cu s , G O t e - 2. 59 2 1. 70 4 ne of om ma ns va r. n eo fo rm an a JE C2 IJ A A W 40 61 6. 1 0. 00 E+ 00 68 32 -8 62 4 EX ptO S5 Cd pr ot ei n [C 00 0c cu s GO te rm s - 25 92 1. 70 4 n eo fo em ai n va r. n eo fo rm an s JE C2 I} 28 87 16 - 5.O OE - Cr yp to co cc us ga tti ist ra in E5 6t iM A Ta n/ a - J 93 8 0. 76 5 A Y 71 04 29 29 03 65 17 5 lo cu s, co m pl et e se qu en ce ,N O G EN E 34 84 10 - 34 79 84 - H yp ot he tic al pr ot ei n p eo co cc u s n o GO te nn s - 3. 14 8 0. 81 7 X P 77 81 23 .1 5, 00 E- 65 34 95 05 - 34 82 08 n eo fo en ian sv ar . n e o fo rm an s B -3 50 1A ) N A D H de hy dr og en as e su bu ni ts 34 97 03 - [m ito ch on dri on C ry pt oc oc cu s n e o ln ai n n o G O - 3 t4 8 0 8t 7 A A N 37 58 4. 1 1, 00 E- 7l 34 99 06 va r. gz ub ii (F ilo be sid iel la n e o fo rm an s se ro ty pe A ) G O _c om po ne nt : n u cl eu s; G O _c om po ne nt : c yt op la sm ;G O _f un ct io n: 59 38 72 - 59 17 55 - H yp ot he tic al pr ot em [C ry pt oc oc cu s tr an sc rip tio n fa ct or ac tiv ity ;G O _p ro ce ss : tr an sc rip tio n; G O _p ro ce ss : - 1. 09 2 0. 71 1 A A W 40 80 4. 1 0. O O E+ 00 59 43 93 59 39 09 n e o fo rm an s v ar . 
n e o fo rm an s JE C 2I ] re sp on se to o x id at iv e st re ss ; G O _p ro ce ss : re sp on se to dr ug 59 42 09 - H yp ot he tic al pr ot ei n [C ry pt oc oc cu s A A W 41 33 0. 1 0. O O E+ 00 n o G O te rn is - 1. 09 2 0. 71 1 59 74 96 n eo fo rm an sv ar . n eo fo rm an sJ E C 2I ] GO _c om po ne nt :p las m a m em br an e; G O _f un ct io n: re ce pt or ac tiv ity ; 10 21 52 2- 10 22 09 4- Gl uc os e tr an sp or te r. pu ta tiv e GO _f un cti on :g lu co se tr an sp or te ra ct iv ity ; G O_ fu nc tio n: gl uc os e A A W 41 54 8. l 0.O OE +0 0 0. 75 8 0. 47 3 t0 23 97 4 10 23 87 3 00 0c di s n j n ‘ a bi nd in g; GO _p eo ce ss : m pn al tr an sd uc tio n; GO _p ro ce ss : n eo fo rsn an aJ EC 2t ] re sp on se to gl uc os e st im ul us 5’: Co ns er ve d hy po th et ic al pr ot ei n 5’: [C ryp toc oc cu sn eo fo rm an sv ar . 12 38 56 0- 12 38 93 2- A A W 45 24 7. 1; 0.O OE +0 0 12 42 12 5 12 40 91 7 n eo fo rm sn si EC 2l ] n o G O te rm s - 1. 54 2 0. 70 8 3’ :A A W 45 02 6. 1 3’ :M M S2 ,p ut at iv e [C ryp toc oc cu s n eo fo nn as s va r. n eo fo rm an s JE C 2I ] 21 76 14 0- 21 76 76 0- H yp ot lte tic al pr oe ei n[ Cr yp toc oc cu s n o G O te rm s 10 53 03 50 A A W 45 16 7. t 0,0 0E 4’ OO 21 82 56 2 21 78 06 1 n eo fh nn ai n va r. n eo fu rm an s JE C2 I] 5’ :G O _c om po ne nt n u cl eu s; G O _f un ct io n D N A bi nd in g; G O _p ro ce ss : ch ro m at in si le nc in g at te lo m er e; GO _p ro ce ss :a bo rt- ch ain fa tty ac id 5’ : 5’ :b st 3 pr ot ei n, pu ta tiv e m et ab ol um 21 79 35 1- 3’ : T ra ns cr ip tio na la c tiv at or gu nS ,p U ta tiv e 3’ :G O _c om po ne nt : SA GA co m pl ex ; G O _c om po ne nt : A da 2/ Gc nS /A da 3 1. 05 3 0. 35 0 A A W 40 94 3. 1; 0. O O E+ 00 21 80 17 9 [C iyp toc oc cu sn e o fo rm an s v ar . 3’ :A A W 40 83 0. 
1 tr an sc rip tio n a c tiv at or co m pl ex ;G O _f un ct io n: hi sto ne ac et yl tra ns fe ra ae n e o fo rm am JE C 2I ] ac tiv ity ;G O_ pr oc es s: rb ro m at in m o di fic at io n; GO _p ro ce ss :b ist on e ac et yl at io n 5’ :E xp re ss ed pr ot ei n FC ry pt oc oc cu s 5’: n eo fo m m an tv ar . n eo fo rm an sJ E C 2I J; 2. 00 €- 21 81 05 3- A A W 47 I8 4. 1; 15 3 21 81 76 5 3’ :E po xi de hy dr ol as e 1, pu ta tiv e no G O te em s 1. 05 3 0. 35 0 3’ :A A W 47 I9 1. 1 [C ryp toc oc cu sn eo fo em am va r. n eo fo em ar nJ EC 2l l 21 81 91 9- Ex pr ea se dp ro te in [C iyp too oc cu a n o G O te rn n 10 53 0. 35 0 A A W 4I 4I O .1 0.O OE -I- 00 21 83 16 0 n eo fo er na ns v ar . n e o fo rn ia ns JE C2 I] G O _c om po ne nt n u cle us ;G O _f lsn ct io a tr an sc rip tio n fa ct or a c tiv ity ; 21 84 29 0- H yp ot he tic al pr ot ei n [C ryp eo co cc us G O _p ro ce ss :c ar bo hy dr at e m et ab ol ian s; G O _p ro ce ss : re gu la tio n o f 1. 05 3 0, 35 0 A A W 42 70 t,1 0, 00 E4 ’O O 21 87 33 0 n eo fo sn ia ns va r. n eo fo rn ia ns JE C2 I) tra ns cr ip tio n, D N A -d ep en de nt 21 77 32 3- 21 76 76 0- H yp ot he tic al pr ot ei n[ Cr yp toc oc cu s n o G O te nn s - 2. 45 1 1, 23 9 A A W 45 16 7. I 0. O O E+ 00 21 86 55 2 21 78 06 1 n eo fo rm ac av ar . c ie of or m an ai EC 2l l 5’ :G O _c om po ne nt :n u cle us ;G O _f un ct io n: D N A bi nd in g; G O _p ro ce ss : ch ro ni at in si le nc in g a t te lo m er e; G O _p ro ce ss : sh or t-c ha in fa tty a c id , , 5’: hs t3 pr ot ei n, pu ta tiv e m et ab oh em 21 79 35 1 - 3’ .T ra ns cr ip tio na la ct iv at or gu nS ,p ut at iv e 3’ :G O_ co m po ne nt :S A G A co m pl ex ;G O_ co m po ne nt :A da 2/ Gc n5 /A da 3 - 2. 45 1 1. 23 9 A A W 40 94 3. 1; 0, 00 E+ 00 21 80 17 9 [C ryp toc oc cu s n eo fo rm am va r. tr an sc rip tio n ac tiv at or co m pl ex ;G O _f lm ct io a hi sto ne ac et yl tra ns fe ea ae 3’ :A A W 40 83 0. 
1 n eo fo nn an a JE C2 IJ ac tiv ity ;G O_ pr oc es s: ch ro m at in m o di fic ati on ;G O_ pr oc es s: hn to ne ac et yl at io n 00 _ _ _ _ _ _ _ _ _ _ _ _ _ _ A ve ra ge LR St an da rd de vi at io n o f L R C oo rd in at es C oo rd in at es o f W M 27 6 W M 27 6 C H R de sc rib in g th e Pr ot ei n ID E- va lu e Pr ed ic te d Fu nc tio n G en e O nt ol og y R V 66 09 5 £5 66 RV 66 09 5 £5 66 G en e G H ’2 G FP 2 a re a 5’: Ex pr es se d pr ot ei n IC ry pt oc oc cu s 5’: n eo fo rm an sv ar . n eo fo rm an s J EC 2I ); 2.O OE - 21 81 05 3- B A A W 47 18 4. l; 15 3 21 81 76 5 3. Ep ox id e hy dr ol as e 1, pu ta tiv e no GO te rm s - 24 51 1. 23 9 3’ :A A W 47 19 1. l IC ry pt oc oc cu s n eo fo rm an s va r. n eo fo rm an s J EC 2I ] 21 81 91 9- Ex pr es se d pr ot ei n [C ry pto co cc us no 0< )t er m s - 2 45 1 I 23 9 A A W 4I 4I O .1 0.O OE +0 0 21 83 16 0 n eo fo rm an s va r. n eo fo rm an s JE C2 I] 21 84 29 0- H yp ot he tic al pr ot ei n (C ryp toc oc cu s G O _c om po ne nt : n u cl eu s; G O _f un ct io n: tr an sc rip tio n fa ct or ac tiv ity ; A A W 42 70 1 . 1 0.O OE +0 0 21 87 33 0 n eo fo rm an s v ar n eo fo rm an s JE C2 I] G O _p ro ce ss :c ar bo hy dr at e m et ab ol ism ; G O _p ro ce ss : re gu la tio n o f - 2. 45 1 1. 23 9 tr an sc rip tio n, D N A -d ep en de nt — — C 14 83 44 - 14 72 25 - H yp ot he tic al pr ot ei n [C ry pto co cc us A A W 42 62 3. l 8.O OE -6 6 no GO te rm s - 0. 99 0 0. 97 7 14 91 32 14 85 15 n eo fo rm an sv ar . n e o fo rm an si EC 2t l 36 94 50 - 36 85 14 - H yp ot he tic al pr ot ei n [C ry pto co co us no GO te rm s - 0 70 6 0 55 4 A A W 42 12 5. 1 0. 00 E+ 00 37 20 34 36 96 32 n eo fo rn sa ns va r. n eo fo rm an s JE C2 I) 36 96 98 - H yp ot he tic al pr ot ei n [C ry pto co cc us A A W 42 12 7. I 0.O OE -t- 00 no GO te rm s - 0. 70 6 0. 55 4 37 08 54 n eo fo tm an s va r. n eo ib rm an s J EC 2I ] 6 OO E- 37 l1 36 Co ns er ve d hy po th et ic al pr ot ei n A A W 42 65 0. 
1 137 372424 [Cryptococcus neoformans var. no GO terms -0.706 0.554 neoformans JEC21]
5’: no GO terms 5’: Hypothetical protein [Cryptococcus 3’: GO_component: nucleus; GO_component: cytoplasm; GO_function: neoformans var. neoformans JEC21] cyclin-dependent protein kinase activity; GO_process: regulation of cell 539282 - 5’: AAW42216.1 539818 - 1.00E-38 541643 3’: AAW42218.1 540195 3’: Cdc2 cyclin-dependent kinase, putative cycle; GO_process: G1/S transition of mitotic cell cycle; GO_process: S -0.924 0.584 [Cryptococcus neoformans var. phase of mitotic cell cycle; GO_process: G2/M transition of mitotic cell neoformans JEC21] cycle; GO_process: protein amino acid phosphorylation; GO_process: regulation of meiosis
5’: no GO terms 5’: Hypothetical protein [Cryptococcus 3’: GO_component: nucleus; GO_component: cytoplasm; GO_function: neoformans var. neoformans JEC21] cyclin-dependent protein kinase activity; GO_process: regulation of cell 5’: AAW42216.1 540316 - 3.00E-98 3’: AAW42218.1 542317 3’: Cdc2 cyclin-dependent kinase, putative cycle; GO_process: G1/S transition of mitotic cell cycle; GO_process: S -0.924 0.584 [Cryptococcus neoformans var. phase of mitotic cell cycle; GO_process: G2/M transition of mitotic cell neoformans JEC21] cycle; GO_process: protein amino acid phosphorylation; GO_process: regulation of meiosis
5’: Hypothetical protein [Cryptococcus 5’: no GO terms 5’: neoformans var.
neoformans JEC21] 3’: GO_component: nucleus; GO_component: cytoplasm; GO_function: 619942 - 622428 - AAW45165.1; 0.00E+00 3’: Branched-chain-amino-acid branched-chain-amino-acid transaminase activity; GO_process: amino 0.459 0.218 625162 624977 - 3’: AAW45164.1 transaminase, putative [Cryptococcus acid catabolism; GO_process: branched chain family amino acid neoformans var. neoformans JEC21] biosynthesis
Nuclear matrix protein NMP200, putative GO_component: nucleus; GO_component: spliceosome complex; 732225 - 730860 - AAW42213.1 0.00E+00 [Cryptococcus neoformans var. GO_component: cytoplasm; GO_function: pre-mRNA splicing factor -0.993 0.584 733561 732715 neoformans JEC21] activity; GO_process: nuclear mRNA splicing, via spliceosome
824830 - NO GENE -2.017 0.777 825427
GO_component: cytoplasm; GO_component: polysome; GO_function: ATP dependent helicase, putative nucleic acid binding; GO_function: ATP-dependent helicase activity; 934053 - 930192 - AAW42287.1 0.00E+00 [Cryptococcus neoformans var. GO_function: ATPase activity; GO_process: RNA catabolism, 1.391 0.706 934127 934166 neoformans JEC21] nonsense-mediated decay; GO_process: mRNA catabolism; GO_process: regulation of translational termination
5’: GO_component: cytoplasm; GO_function: formaldehyde 1111893 - 1.00E- Hypothetical protein [Cryptococcus AAW46799.
1 NO GENE 1115462 109 neoformans var. neoformans JEC21] dehydrogenase (glutathione) activity; GO_process: formaldehyde -3.561 1.273 assimilation
1960319- 1960876- Conserved hypothetical protein AAW44498.1 3.00E-42 [Cryptococcus neoformans var. GO_component: nucleus; GO_component: cytoplasm 1.440 0.809 1961455 1961099 neoformans JEC21]
CHR; Coordinates describing the area; Coordinates of Gene; Protein ID; E-value; Predicted Function; Gene Ontology; Average LR WM276 RV66095 E566 GFP2; Standard deviation of LR WM276 RV66095 E566 GFP2
1: Conserved hypothetical protein 1: GO_component: nucleus; GO_component: cytoplasm; GO_function: [Cryptococcus neoformans var. protein carrier activity; GO_process: protein-nucleus import 101878 - 1: AAW46647.1; 104942 - 0.00E+00 neoformans JEC21] 2: GO_component: actin cap (sensu Saccharomyces); GO_function: -2.315 1.035 D 105171 2: AAW46648.1 114213 2: Hypothetical protein [Cryptococcus SNARE binding; GO_process: exocytosis; GO_process: vesicle docking neoformans var. neoformans JEC21] during exocytosis; GO_process: vesicle fission
105089 Conserved hypothetical protein AAW46728.1 0.00E+00 - [Cryptococcus neoformans var. no GO terms -2.315 1.035 106333 neoformans JEC21]
158475 - 156350 - Expressed protein [Cryptococcus no GO terms -0.241 0.161 AAW46719.1 0.00E+00 158708 159114 neoformans var.
neoformans JEC21]
574752 - 574477 - Expressed protein [Cryptococcus AAW46491.1 0.00E+00 no GO terms -2.248 1.243 576239 576337 neoformans var. neoformans JEC21]
5’: Conserved hypothetical protein 5’: no GO terms 5’: [Cryptococcus neoformans var. 2134065 - 2134450 - AAW45247.1; 0.00E+00 neoformans JEC21] 3’: GO_component: nuclear pore; GO_component: cytoplasm; -1.519 -0.763 0.689 0.421 2137603 2137208 GO_function: structural constituent of nuclear pore; GO_process: 3’: AAW45026.1 3’: MMS2, putative [Cryptococcus protein-nucleus import neoformans var. neoformans JEC21]
GO_component: small nucleolar ribonucleoprotein complex; U3 small nucleolar ribonucleoprotein GO_component: small nuclear ribonucleoprotein complex; GO_function: 2207190- 2207538- AAW45750.1 0.00E+00 2229499 2208343 protein imp3, putative [Cryptococcus snoRNA binding; GO_process: rRNA modification; GO_process: 35S 0.239 0.201 neoformans var. neoformans JEC21] primary transcript processing; GO_process: ribosome biogenesis; GO_process: processing of 20S pre-rRNA
Ferro-O2-oxidoreductase, putative GO_component: plasma membrane; GO_function: ferroxidase activity; 2209563- AAW45749.1 0.00E+00 [Cryptococcus neoformans var. GO_process: high affinity iron ion transport; GO_process: response to 0.239 0.
201 2211813 neoformans JEC21] copper ion
2212432 - Drug transporter, putative [Cryptococcus GO_component: integral to membrane; GO_function: drug transporter 0.239 0.201 AAW45748.1 0.00E+00 2214800 neoformans var. neoformans JEC21] activity; GO_process: drug transport
2217824- Hypothetical protein [Cryptococcus no GO terms 0.239 0.201 AAW45747.1 0.00E+00 2219334 neoformans var. neoformans JEC21]
2221211 - Expressed protein [Cryptococcus no GO terms 0.239 0.201 AAW45746.1 0.00E+00 2222459 neoformans var. neoformans JEC21]
2223709 - Hypothetical protein [Cryptococcus GO_component: integral to plasma membrane; GO_function: AAW44410.1 0.00E+00 nicotinamide mononucleotide permease activity; GO_process: 0.239 0.201 2225781 neoformans var. neoformans JEC21] nicotinamide mononucleotide transport
Conserved hypothetical protein 2226193 - AAW42044.1 0.00E+00 [Cryptococcus neoformans var. no GO terms 0.239 0.201 2227099 neoformans JEC21]
Conserved hypothetical protein 2228127- AAW43384.1 0.00E+00 [Cryptococcus neoformans var. no GO terms 0.239 0.201 2229082 neoformans JEC21]
2222351 - 2221211 - Expressed protein [Cryptococcus no GO terms 0.878 0.477 AAW45746.1 0.00E+00 2223661 2222459 neoformans var. neoformans JEC21]
2223709 - Hypothetical protein [Cryptococcus GO_component: integral to plasma membrane; GO_function: AAW44410.
1 0.00E+00 nicotinamide mononucleotide permease activity; GO_process: 0.878 0.477 2225781 neoformans var. neoformans JEC21] nicotinamide mononucleotide transport
E 43595 - 45130 AAW43410.1 3.00E-47 43219 - 43682 Hypothetical protein [Cryptococcus no GO terms -1.370 0.844 neoformans var. neoformans JEC21]
Conserved hypothetical protein AAW43778.1 0.00E+00 43838 - 45346 [Cryptococcus neoformans var. no GO terms -1.370 0.844 neoformans JEC21]
113863 - 112805 - Expressed protein [Cryptococcus no GO terms -0.753 0.622 AAW43808.1 0.00E+00 114872 114127 neoformans var. neoformans JEC21]
Cop9 signalosome complex subunit 1, 114330 - AAW43448.1 0.00E+00 116470 putative [Cryptococcus neoformans var. no GO terms -0.753 0.622 neoformans JEC21]
CHR; Coordinates describing the area; Coordinates of Gene; Protein ID; E-value; Predicted Function; Gene Ontology; Average LR WM276 RV66095 E566 GFP2; Standard deviation of LR WM276 RV66095 E566 GFP2
E 115872 - 114330 Cop9 signalosome complex subunit 1, AAW43448.1 0.00E+00 116467 116470 putative [Cryptococcus neoformans var. no GO terms 0.911 0.398 neoformans JEC21]
157621 Conserved hypothetical protein 156646 - - AAW43476.1 0.00E+00 157951 158658 [Cryptococcus neoformans var. no GO terms -1.797 1.366 neoformans JEC21]
Conserved hypothetical protein 181170 - 179059 - AAW43488.1 0.00E+00 181982 182270 [Cryptococcus neoformans var. no GO terms -0.700 0.
401 neoformans JEC21]
250026 - NO GENE -0.738 0.475 250463
354096 - 352140 - Asparagine-tRNA ligase, putative GO_component: cytoplasm; GO_function: asparagine-tRNA ligase AAW43427.1 0.00E+00 [Cryptococcus neoformans var. activity; GO_function: ATP binding; GO_process: asparaginyl-tRNA -0.420 0.374 358425 354250 neoformans JEC21] aminoacylation
5’: Conserved hypothetical protein 5’: no GO terms 5’: [Cryptococcus neoformans var. 3’: GO_component: nuclear pore; GO_component: cytoplasm; 354970 - -0.420 0.374 AAW45247.1; 0.00E+00 neoformans JEC21] GO_function: structural constituent of nuclear pore; GO_process: 357734 3’: AAW45026.1 3’: MMS2, putative [Cryptococcus protein-nucleus import neoformans var. neoformans JEC21]
354195 - 352140 - Asparagine-tRNA ligase, putative GO_component: cytoplasm; GO_function: asparagine-tRNA ligase AAW43427.1 0.00E+00 358121 354250 [Cryptococcus neoformans var. activity; GO_function: ATP binding; GO_process: asparaginyl-tRNA -1.493 0.643 neoformans JEC21] aminoacylation
5’: Conserved hypothetical protein 5’: no GO terms 5’: [Cryptococcus neoformans var. 3’: GO_component: nuclear pore; GO_component: cytoplasm; -1.493 0.643 354970 - AAW45247.1; 0.00E+00 neoformans JEC21] GO_function: structural constituent of nuclear pore; GO_process: 357734 3’: AAW45026.
1 3’: MMS2, putative [Cryptococcus protein-nucleus import neoformans var. neoformans JEC21]
Conserved hypothetical protein GO_component: mitochondrial inner membrane; GO_function: protein 698468 - 697576 - AAW43905.1 0.00E+00 699436 699083 [Cryptococcus neoformans var. transporter activity; GO_process: mitochondrial inner membrane protein -0.647 0.326 neoformans JEC21] import
Conserved hypothetical protein 699297 - AAW43681.1 0.00E+00 [Cryptococcus neoformans var. no GO terms -0.647 0.326 700233 neoformans JEC21]
1109975 - 1106694- Hypothetical protein [Cryptococcus AAW43786.1 0.00E+00 no GO terms -2.853 0.978 1110730 1111509 neoformans var. neoformans JEC21]
Conserved hypothetical protein F 2660 - 4348 AAW42044.1 0.00E+00 3024 - 4462 [Cryptococcus neoformans var. no GO terms 1.467 0.611 neoformans JEC21]
Conserved hypothetical protein 4.00E- 10686 - 14356 AAW44147.1 10965 - 14121 [Cryptococcus neoformans var. no GO terms -1.167 0.506 175 neoformans JEC21]
ATP-dependent permease, putative GO_component= cytoplasm; endoplasmic reticulum; integral to XP_571629.1 0.00E+00 14289 - 18373 [Cryptococcus neoformans var. membrane; GO_function= ATP-binding cassette (ABC) transporter -1.167 0.506 neoformans JEC21] activity; GO_process= transport
Conserved hypothetical protein 4.00E- 11063 - 14014 AAW44147.1 10965 - 14121 [Cryptococcus neoformans var. no GO terms -0.658 0.
277 175 neoformans JEC21]
5’: Hypothetical protein [Cryptococcus 644294 - 5’: AAW44032.1 1.00E- 644380 - neoformans var. neoformans JEC21] 5’: no GO terms -2.840 1.210 645063 3’: AAW44034.1 114 644760 3’: Expressed protein [Cryptococcus 3’: no GO terms neoformans var. neoformans JEC21]
106722 Conserved hypothetical protein 108576 - - AAW44332.1 0.00E+00 114737 108919 [Cryptococcus neoformans var. no GO terms -0.944 0.255 neoformans JEC21]
109124 - Hypothetical protein [Cryptococcus no GO terms -0.944 0.255 AAW44105.1 0.00E+00 110901 neoformans var. neoformans JEC21]
CHR; Coordinates describing the area; Coordinates of Gene; Protein ID; E-value; Predicted Function; Gene Ontology; Average LR WM276 RV66095 E566 GFP2; Standard deviation of LR WM276 RV66095 E566 GFP2
GO_component: cytoplasm; GO_function: protein serine/threonine 112460 - kinase activity; GO_process: protein amino acid phosphorylation; -0.944 0.255 F AAW44103.1 0.00E+00 Ran1-like protein kinase, putative GO_process: glucose transport; GO_process: positive regulation of 113855 transcription from Pol II promoter
1166735- 1166557- Expressed protein [Cryptococcus no GO terms 0.703 0.595 AAW46935.1 3.00E-94 1169604 1167289 neoformans var. neoformans JEC21]
1168658 - Expressed protein [Cryptococcus no GO terms 0.703 0.595 AAW46861.1 0.00E+00 1169723 neoformans var. neoformans JEC21]
Vesicle-mediated transport-related protein,
GO_component: integral to membrane; GO_process: vesicle-mediated 259277 - 256640 - AAW44051.1 0.00E+00 261037 259339 putative [Cryptococcus neoformans var. -1.630 1.320 transport neoformans JEC21]
5’: Biotin synthase, putative [Cryptococcus 5’: GO_component: mitochondrion; GO_function: biotin synthase 5’: neoformans var. neoformans JEC21] AAW44375.1; 0.00E+00 259572 - 3’: Vesicle-mediated transport-related activity; GO_process: biotin biosynthesis -1.630 1.320 261441 3’: GO_component: integral to membrane; GO_process: vesicle- 3’: AAW44051.1 protein, putative [Cryptococcus mediated transport neoformans var. neoformans JEC21]
5’: GO_component: membrane fraction; GO_function: alpha- 5’: Alpha-glucoside:hydrogen symporter, glucoside:hydrogen symporter activity; GO_process: alpha-glucoside putative [Cryptococcus neoformans var. transport 7.00E- neoformans JEC21] 3’: GO_component: plasma membrane; GO_function: two-component G 85 - 2145 AAW47006.1; 319 - 2006 1.772 0.704 112 3’: Protein-histidine kinase, putative sensor molecule activity; GO_function: protein-histidine kinase activity; 3’: AAW47007.1 [Cryptococcus neoformans var. GO_function: osmosensor activity; GO_process: protein amino acid neoformans JEC21] phosphorylation; GO_process: osmosensory signaling pathway via two-component system; GO_process: response to hydrogen peroxide
5’: Inositol oxygenase, putative [Cryptococcus neoformans var.
5’: neoformans JEC21] 517988 - 516520 - AAW44687.1; 0.00E+00 3’: Glutaryl-CoA dehydrogenase, no GO terms -3.306 0.948 533333 518010 3’: AAW44689.1 mitochondrial precursor, putative [Cryptococcus neoformans var. neoformans JEC21]
518493 - Inositol oxygenase, putative [Cryptococcus AAW44687.1 0.00E+00 no GO terms -3.306 0.948 519670 neoformans var. neoformans JEC21]
521220 - Conserved hypothetical protein AAW44683.1 0.00E+00 [Cryptococcus neoformans var. no GO terms -3.306 0.948 522363 neoformans JEC21]
5’: Myo-inositol transporter 2, putative 5’: [Cryptococcus neoformans var. 5’: GO_component: membrane; GO_function: myo-inositol transporter 522995 - neoformans JEC21] AAW44680.1; 0.00E+00 524382 3’: Conserved hypothetical protein activity; GO_process: myo-inositol transport -3.306 0.948 3’: no GO terms 3’: AAW44682.1 [Cryptococcus neoformans var. neoformans JEC21]
525278 - Myo-inositol transporter 2, putative GO_component: membrane; GO_function: myo-inositol transporter -3.306 0.948 AAW44680.1 0.00E+00 527626 [Cryptococcus neoformans var. activity; GO_process: myo-inositol transport neoformans JEC21]
529413 - Expressed protein [Cryptococcus no GO terms -3.306 0.948 AAW44883.1 0.00E+00 532175 neoformans var. neoformans JEC21]
Conserved hypothetical protein 532596 - AAW44677.1 0.00E+00 533766 [Cryptococcus neoformans var. no GO terms -3.306 0.948 neoformans JEC21]
5’: Expressed protein [Cryptococcus 1243480- 1243677- neoformans var. neoformans JEC21] AAW44068.1; 0.00E+00 no GO terms -0.938 0.596 1244775 1244155 3’: Hypothetical protein [Cryptococcus 3’: AAW44069.1 neoformans var. neoformans JEC21]
CHR; Coordinates describing the area; Coordinates of Gene; Protein ID; E-value; Predicted Function; Gene Ontology; Average LR WM276 RV66095 E566 GFP2; Standard deviation of LR WM276 RV66095 E566 GFP2
Conserved hypothetical protein 2.00E- H 13 - 13847 AAW44147.1 143 - 1758 [Cryptococcus neoformans var. no GO terms -2.818 0.930 106 neoformans JEC21]
Hexose transport-related protein, putative 6.00E-16 3134 - 5404 [Cryptococcus neoformans var. not applicable -2.818 0.930 neoformans JEC21]
XP_771794.1 7.00E-03 5931 - 7111 Hypothetical protein [Cryptococcus no GO terms -2.818 0.930 neoformans var. neoformans B-3501A]
XP_776333.1 1.80E-01 9203 - 9962 Hypothetical protein [Cryptococcus no GO terms -2.818 0.930 neoformans var. neoformans B-3501A]
Putative membrane protein [Streptomyces no GO terms -2.818 0.930 BAC74281.1 5.90E-02 10430 - 11308 avermitilis MA-4680]
XP_749001.1 1.20E+00 11993 - 13373 Conserved hypothetical protein no GO terms -2.818 0.930 [Aspergillus fumigatus Af293]
2.00E- Conserved hypothetical protein 13 - 14735 AAW44147.1 143 - 1758 [Cryptococcus neoformans var. no GO terms -2.453 1.384 106 neoformans JEC21]
Hexose transport-related protein, putative 6.00E-16 3134 - 5404 [Cryptococcus neoformans var.
not applicable -2.453 1.384 neoformans JEC21]
Hypothetical protein [Cryptococcus no GO terms -2.453 1.384 XP_771794.1 7.00E-03 5931 - 7111 neoformans var. neoformans B-3501A]
XP_776333.1 1.80E-01 9203 - 9962 Hypothetical protein [Cryptococcus no GO terms -2.453 1.384 neoformans var. neoformans B-3501A]
Putative membrane protein [Streptomyces no GO terms -2.453 1.384 BAC74281.1 5.90E-02 10430 - 11308 avermitilis MA-4680]
XP_749001.1 1.20E+00 11993 - 13373 Conserved hypothetical protein no GO terms -2.453 1.384 [Aspergillus fumigatus Af293]
Rab GTPase activator, putative 5’: GO_component: soluble fraction; GO_function: Rab GTPase 266292 - 5’: AAW45504.1 6.00E- 266633 - 3’: U1 small nuclear ribonucleoprotein, activator activity -2.673 1.594 268219 3’: AAW45455.1 147 267396 3’: GO_component: commitment complex; GO_component: snRNP U1; putative GO_function: mRNA binding; GO_process: mRNA splicing
507949 - 500839 - RNA helicase, putative [Cryptococcus GO_function: RNA helicase activity; GO_process: regulation of -1.918 0.955 AAW45487.1 0.00E+00 508952 507192 neoformans var. neoformans JEC21] translation
507620 - Sterol-binding protein [Cryptococcus GO_function: sterol carrier activity -1.918 0.955 AAW45507.1 0.00E+00 508402 neoformans var. neoformans JEC21]
595064 - NO GENE 1.392 0.857 597809
Conserved hypothetical protein 972990 - 970664 - AAW45306.1 0.00E+00 [Cryptococcus neoformans var. no GO terms 1.224 0.602 973673 973167 neoformans JEC21]
1089758- NO GENE -0.921 0.
233 1090328
5’: 5’: Hypothetical protein [Cryptococcus 1120732- 1121252- neoformans var. neoformans JEC21] 5’: no GO terms AAW47162.1; 0.00E+00 1.874 0.373 1128027 1122902 3’: Amidase, putative [Cryptococcus 3’: GO_function: amidase activity 3’: AAW47164.1 neoformans var. neoformans JEC21]
5’: 5’: Hypothetical protein [Cryptococcus 1123045 - neoformans var. neoformans JEC21] 5’: no GO terms AAW47162.1; 0.00E+00 1.874 0.373 1125053 3’: Amidase, putative [Cryptococcus 3’: GO_function: amidase activity 3’: AAW47164.1 neoformans var. neoformans JEC21]
2.00E- 1126049- Hypothetical protein [Cryptococcus no GO terms 1.874 0.373 AAW44069.1 173 1127161 neoformans var. neoformans JEC21]
1260465 - 1260303 - Conserved hypothetical protein AAW42044.1 0.00E+00 [Cryptococcus neoformans var. no GO terms 0.666 0.690 1265385 1261723 neoformans JEC21]
CHR; Coordinates describing the area; Coordinates of Gene; Protein ID; E-value; Predicted Function; Gene Ontology; Average LR WM276 RV66095 E566 GFP2; Standard deviation of LR WM276 RV66095 E566 GFP2
Unnamed protein product; predicted no GO terms 1.140 0.311 85 - 9278 BAE55598.1 4.00E-40 681 - 2582 protein [Aspergillus oryzae]
5’: Myo-inositol transporter, putative [Cryptococcus neoformans var. 5’: GO_component: membrane; GO_function: myo-inositol transporter neoformans JEC21] activity; GO_process: myo-inositol transport AAW43040.
1; 7.00E-34 5304 - 5664 1.140 0.311 3’: Maltose O-acetyltransferase, putative 3’: GO_function: maltose O-acetyltransferase activity; GO_function: 3’: AAW43048.1 [Cryptococcus neoformans var. acetyltransferase activity neoformans JEC21]
Hypothetical protein [Chaetomium no GO terms 1.140 0.311 XP_001220954.1 1.50E+00 8906 - 9456 globosum CBS 148.51]
Sexual development regulator 154506 - 155578 - AAW42798.1 0.00E+00 [Cryptococcus neoformans var. GO_process: sex determination -3.346 1.158 157849 156937 neoformans JEC21]
5’: Sulfite transporter, putative 5’: [Cryptococcus neoformans var. 5’: GO_component: plasma membrane; GO_function: sulfite transporter 159747 - 160095 - neoformans JEC21] AAW40990.1; 3.00E-13 161212 3’: Conserved hypothetical protein activity; GO_process: sulfite transport 0.903 0.461 162746 3’: AAW40992.1 3’: no GO terms [Cryptococcus neoformans var. neoformans JEC21]
5’: Specific transcriptional repressor, 5’: GO_component: nucleus; GO_function: specific transcriptional putative [Cryptococcus neoformans var. 162578 - neoformans JEC21] repressor activity; GO_process: negative regulation of transcription from AAW45261.1; 4.00E-58 Pol II promoter; GO_process: DNA repair 0.903 0.461 164069 3’:
Dihydrofolate synthase, putative 3’: GO_component: cytoplasm; GO_function: dihydrofolate synthase 3’: AAW45262.1 [Cryptococcus neoformans var. activity; GO_process: folic acid and derivative biosynthesis neoformans JEC21]
5’: GO_component: nucleus; GO_function: specific transcriptional 5’: Specific transcriptional repressor, repressor activity; GO_process: negative regulation of transcription from 166228 - 5’: AAW45261.1 166108 - 7.00E-58 198414 3’: AAW45262.1 166356 putative Pol II promoter; GO_process: DNA repair -2.168 1.310 3’: Dihydrofolate synthase, putative 3’: GO_component: cytoplasm; GO_function: dihydrofolate synthase activity; GO_process: folic acid and derivative biosynthesis
Conserved hypothetical protein GO_component: cytosolic large ribosomal subunit (sensu Eukaryota); 166852 - AAW43233.1 0.00E+00 [Cryptococcus neoformans var. GO_function: structural constituent of ribosome; GO_process: protein -2.168 1.310 167453 neoformans JEC21] biosynthesis
5’: Conserved hypothetical protein [Cryptococcus neoformans var. 5’: no GO terms 5’: AAW43235.1 167492 - -2.168 1.310 2.00E-20 neoformans JEC21] 3’: AAW42803.1 168156 3’: no GO terms 3’: Expressed protein [Cryptococcus neofor