UBC Faculty Research and Publications

Characterization of the human RFX transcription factor family by regulatory and target gene analysis
Sugiaman-Trapman, Debora; Vitezic, Morana; Jouhilahti, Eeva-Mari; Mathelier, Anthony; Lauter, Gilbert; Misra, Sougat; Daub, Carsten O; Kere, Juha; Swoboda, Peter
Mar 6, 2018

RESEARCH ARTICLE Open Access

Characterization of the human RFX transcription factor family by regulatory and target gene analysis

Debora Sugiaman-Trapman1, Morana Vitezic2, Eeva-Mari Jouhilahti1, Anthony Mathelier3,4,5, Gilbert Lauter1, Sougat Misra6, Carsten O. Daub1,7, Juha Kere1,8,9† and Peter Swoboda1*†

Abstract

Background: Evolutionarily conserved RFX transcription factors (TFs) regulate their target genes through a DNA sequence motif called the X-box. Thereby they regulate cellular specialization and terminal differentiation. Here, we provide a comprehensive analysis of all the eight human RFX genes (RFX1–8), their spatial and temporal expression profiles, potential upstream regulators and target genes.

Results: We extracted all known human RFX1–8 gene expression profiles from the FANTOM5 database derived from transcription start site (TSS) activity as captured by Cap Analysis of Gene Expression (CAGE) technology. RFX genes are broadly (RFX1–3, RFX5, RFX7) and specifically (RFX4, RFX6) expressed in different cell types, with high expression in four organ systems: immune system, gastrointestinal tract, reproductive system and nervous system. Tissue type specific expression profiles link defined RFX family members with the target gene batteries they regulate. We experimentally confirmed novel TSS locations and characterized the previously undescribed RFX8 to be lowly expressed. RFX tissue and cell type specificity arises mainly from differences in TSS architecture. RFX transcript isoforms lacking a DNA binding domain (DBD) open up new possibilities for combinatorial target gene regulation. Our results favor a new grouping of the RFX family based on protein domain composition. We uncovered and experimentally confirmed the TFs SP2 and ESR1 as upstream regulators of specific RFX genes.
Using TF binding profiles from the JASPAR database, we determined relevant patterns of X-box motif positioning with respect to gene TSS locations of human RFX target genes.

Conclusions: The wealth of data we provide will serve as the basis for precisely determining the roles RFX TFs play in human development and disease.

Keywords: Cell differentiation, Cilia, Spermatogenesis, Immune cell proliferation, Neuronal development, Cell cycle control, Tumor suppression

* Correspondence: peter.swoboda@ki.se
† Equal contributors
1 Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
Full list of author information is available at the end of the article

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Sugiaman-Trapman et al. BMC Genomics (2018) 19:181 https://doi.org/10.1186/s12864-018-4564-6

Background

RFX (Regulatory Factor binding to the X-box) transcription factors (TFs) share and are defined by a conserved, specialized winged-helix type DNA binding domain (DBD) [1]. RFX genes have been identified in all animals within the Unikont branch of eukaryotes, which excludes algae, plants and various protozoan branches [2]. Metazoan genomes encode one to several RFX genes. C. elegans possesses one, Drosophila has two [3, 4], mammals have eight and – due to genome duplication – fishes have nine RFX genes [2, 5–10]. Human RFX1–7 have previously been described [9], while RFX8 (ENSG00000196460, www.ensembl.org) has not been characterized.

In different organisms, RFX TFs have been shown to regulate genes involved in various and seemingly disparate cellular and developmental processes [7] like the cell cycle and DNA repair [11, 12], or aspects of cellular differentiation, like the functional maturation of cells of the immune response [13] and the development of cilia on the surface of polarized cells [14–16]. As a consequence of these roles in development, mutations in RFX genes can lead to severe disease states. Mutations in RFX5 cause autosomal recessive Bare Lymphocyte Syndrome (OMIM #209920), characterized by severe combined immunodeficiency due to failure in HLA expression. Mutations in RFX6 cause autosomal recessive Mitchell-Riley Syndrome, characterized by neonatal diabetes and malformations of the gut (OMIM #615710). Rfx mutant mice exhibit a plethora of mild to fatal phenotypes, ranging from male sterility [17] to brain abnormalities [18]. These phenotypes are often attributed to cilia dysfunction [17, 19–21]. Of note, many human ciliopathy genes are strongly assumed to be RFX TF targets, given that their orthologs have been shown to be RFX TF targets in several different organisms, ranging from C. elegans to mouse [22–24].

In addition to the DBD, RFX TFs may contain other conserved domains like activation (AD) and dimerization (DIM) domains and the domains B and C of unknown function [7, 9]. The RFX DBD recognizes an imperfect inverted repeat sequence, the X-box motif, to which it binds [1]. RFX TF binding to the X-box motif has repeatedly been demonstrated by using methods ranging from in vitro binding studies, in vivo expression and mutation analyses to SELEX and ChIP sequencing approaches [8, 25–28].
Combined, these approaches led to the discovery of large batteries of RFX target genes [16].

By contrast, very little is known about upstream regulators of RFX genes. So far only a few studies in mice, zebrafish and flies have identified TFs of the bHLH class, Neurog3 and Atonal, as well as the homeobox protein Noto as upstream regulators of RFX genes [29–31]. In the yeast S. cerevisiae an upstream phosphorylation cascade controls expression of the RFX gene Crt1 [32].

In this study – using extensive analysis of data from the FANTOM5 database followed by experimental validations – we present an in-depth characterization of the entire human RFX gene family (RFX1–8), including the previously undescribed RFX8 and RFX transcript isoforms that encode TFs without DBD. We provide an updated grouping of human RFX TFs and show that RFX functional domain composition is independent of expression profile. Our exhaustive analysis of RFX expression in many different human tissues and cell types suggests that RFX tissue and cell type specificity arises mainly from differences in TSS architecture and not from different transcript isoforms. We determined with high precision the positioning of X-box motifs with respect to TSS locations of human RFX target genes. Using cluster analysis based on tissue and cell type specific expression profiles we link defined RFX family members with the target gene batteries they regulate. Further, we provide a first list of candidate upstream regulators of human RFX genes. The wealth of data we provide will serve as the basis for future studies of the role of RFX TFs in human development and disease.

Results

Expression of human RFX genes in different tissue types

Detailed expression profiles of the human RFX1–8 genes have not been described. We used data from the FANTOM5 database that is based on experimental expression profiling by CAGE technology across a wide spectrum of human biological samples.
The expression level of a given CAGE TSS location is defined by an arbitrary unit, tags per million (TPM) [33]. We extracted 37 CAGE TSS locations for RFX1–8 from the FANTOM5 database and shortlisted these to 30 TSS locations by merging those which are in close proximity to each other and have similar expression profiles (cf. Methods). We then named these 30 TSS locations alphabetically, whereby promoter A (pA) is the highest expressed TSS. Expression of each RFX TSS is described in detail for human tissues, primary cells and cell lines (Additional file 1). The wealth of biological samples allows classifying the expression profiles for human RFX1–8 in different cell types.

A given RFX TSS location is considered as being expressed broadly if it is expressed at TPM > 5 in a large number and variety of tissues (n > 10). Conversely, a given RFX TSS location is considered as being expressed specifically if it is expressed at TPM > 5 in a small number of tissues of the same organ (n < 10). We found most TSS locations of RFX1–3, 5 and 7 to be expressed broadly in many tissue types whereas the TSS locations of RFX4 and RFX6 are all expressed in specific tissue types. pA@RFX4 is highly specific in brain and spinal cord tissues, while pB and pC@RFX4 are highly specific in testis. RFX6 TSS locations are all specifically expressed in the gastrointestinal tract (GI) (Additional file 2). We performed hierarchical clustering of the 30 RFX TSS locations based on their expression values (TPM) across 135 human tissue samples. Thereby we identified four major tissue clusters, namely immune system [34], gastrointestinal tract [35, 36], testis [37] and brain and spinal cord [18, 38–41], and two minor clusters, namely uterus and lung [42] (Additional file 2).

The expression of RFX8 is very low, making it the most elusive member of the human RFX family that has hitherto avoided detection.
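The broad versus specific rule stated above (TPM > 5, tissue count above or below 10, same organ) can be written down as a short sketch. The Python below is illustrative only and is not the authors' pipeline: the function name `classify_tss`, the tissue-to-organ mapping and the toy TPM values are assumptions. The text does not say how a TSS with exactly 10 tissues, or with no tissue above the cutoff, is handled, so the sketch reports those cases as "unclassified".

```python
from typing import Dict

TPM_CUTOFF = 5.0     # expression cutoff taken from the text (TPM > 5)
TISSUE_CUTOFF = 10   # tissue-count boundary taken from the text (n > 10 / n < 10)


def classify_tss(tpm_by_tissue: Dict[str, float], organ_of: Dict[str, str]) -> str:
    """Label one TSS as 'broad', 'specific' or 'unclassified'.

    tpm_by_tissue maps tissue name -> TPM for a single TSS.
    organ_of maps tissue name -> organ (a hypothetical annotation).
    """
    expressed = [t for t, tpm in tpm_by_tissue.items() if tpm > TPM_CUTOFF]
    n = len(expressed)
    if n > TISSUE_CUTOFF:
        return "broad"
    organs = {organ_of[t] for t in expressed}
    if 0 < n < TISSUE_CUTOFF and len(organs) == 1:
        return "specific"
    # n == 0, n == 10, or few tissues spread over several organs:
    # not covered by the rule as stated, so left undecided here.
    return "unclassified"


# Toy example with made-up values (not FANTOM5 data).
toy_organs = {"testis": "reproductive system", "cerebellum": "nervous system"}
toy_profile = {"testis": 42.0, "cerebellum": 0.8}
print(classify_tss(toy_profile, toy_organs))
```

Keeping the undecided cases explicit avoids silently assigning a label that the stated rule does not support.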
Here we identified RFX8 TSS locations with highest expressions in some tissues of the immune system (pA, pC, pD) and the gastrointestinal tract (pB, pE). However, the tissue expression values for pC, pD and pE were hard to distinguish from background noise (TPM < 1). In primary cells and cell lines, RFX8 TSS locations had higher expression values, with the most prominent expression in a Schwannoma cell line (Additional file 2).

Connecting TSS expression profiles to protein-coding transcript isoforms

FANTOM5 data allowed us to determine TSS locations and expression profiles. In order to connect the 30 RFX TSS locations described above to known transcript isoforms, we set a maximum distance limit of 50 nt between the TSS location and the nearest Ensembl protein-coding transcript with a complete open reading frame. We found that 18 of these 30 RFX TSS locations matched Ensembl protein-coding transcripts. The remainder (12) of the 30 RFX TSS locations were treated as novel transcript isoforms in human tissues. We selected seven of these as representatives for experimental validation by RT-PCR and sequencing (Table 1, Additional file 3: Table S1). The seven novel transcripts consist of (i) testis specific pC@RFX1 and pE@RFX3, (ii) broadly expressed and highest in brain pC@RFX3, pC@RFX5, pA@RFX7, pC@RFX7, and (iii) lowly expressed pA@RFX8.

Next, we assessed the full transcript sequences (from 5′ to 3′ UTRs) including their coding potential (from start to stop codons) from both the matched Ensembl protein-coding transcripts and the novel sequence-verified transcripts (Table S2 in Additional file 3). Representatives of the RFX1–8 transcripts are shown in Fig.
1a. We found that the majority of RFX transcript isoforms originating from the same gene encode identical proteins, suggesting that tissue and cell type specificity arises mainly from differences in TSS architecture and regulation through their corresponding promoters. The exceptions are RFX1, RFX4 and RFX8 with isoforms encoding different protein variants. The testis specific pC@RFX1 transcript isoform encodes a shortened N-terminal region upstream of the activating domain (AD). RFX4 transcript isoforms have been extensively studied [43–45] and thus complement our results, where we found isoforms encoding different RFX4 protein variants. The RFX8 gene encodes a TF protein lacking a DNA binding domain (DBD) (ENSP00000401536, www.ensembl.org). Here, we experimentally validated RFX8 transcripts by sequencing cDNA from human brain total RNA and uncovered novel splicing patterns leading to alternative RFX8 protein variants, with and without DBD (Table S1 in Additional file 3).

Table 1 RFX1–8 expression data and novel transcripts

Gene (chromosome) | TSS | Matched Ensembl transcript | Tissue profile summary: Expression | Tissue profile summary: Highest in
RFX1 (chr19) | pA@RFX1 | ENST00000254325 | Broad | Cerebellum (brain)
 | pB@RFX1 | ENST00000254325 | |
 | pC@RFX1* | Novel transcript* | Specific | Testis
RFX2 (chr19) | pA@RFX2 | ENST00000303657 | Broad | Uterus
 | pB@RFX2 | ENST00000303657 | | Testis
 | pC@RFX2 | ENST00000303657 | | Medulla oblongata (brain)
RFX3 (chr9) | pA@RFX3 | ENST00000382004 | Broad | Cerebellum (brain)
 | pB@RFX3 | ENST00000382004 | | Lung, fetal
 | pC@RFX3* | Novel transcript* | | Cerebellum (brain)
 | pD@RFX3 | ENST00000382004 | | Lung, fetal
 | pE@RFX3* | Novel transcript* | Specific | Testis
RFX4 (chr12) | pA@RFX4 | ENST00000392842 | Specific | Spinal cord
 | pB@RFX4 | ENST00000229387 | Specific | Testis
 | pC@RFX4 | ENST00000357881 | |
RFX5 (chr1) | pA@RFX5 | ENST00000290524 | Broad | Blood (immune system)
 | pB@RFX5 | ENST00000290524 | | Tonsil (immune system)
 | pC@RFX5* | Novel transcript* | | Brain, fetal
 | pD@RFX5 | Novel transcript | | Duodenum, fetal (GI)
RFX6 (chr6) | pA@RFX6 | ENST00000332958 | Specific | Duodenum, fetal (GI)
 | pB@RFX6 | ENST00000332958 | |
 | pC@RFX6 | ENST00000332958 | |
RFX7 (chr15) | pA@RFX7* | Novel transcript* | Broad | Cerebellum (brain)
 | pB@RFX7 | ENST00000559447 | |
 | pC@RFX7* | Novel transcript* | |
 | pD@RFX7 | Novel transcript | |
RFX8 (chr2) | pA@RFX8* | Novel transcript* | Lowly expressed (TPM < 5) | Thymus (immune system)
 | pB@RFX8 | ENST00000428343 | | Medial frontal gyrus (brain)
 | pC@RFX8 | Novel transcript | Noise (TPM < 1) | Heart
 | pD@RFX8 | Novel transcript | | Breast
 | pE@RFX8 | Novel transcript | | Rectum, fetal

Thirty TSS locations from eight human RFX genes and their respective tissue profile summaries are presented (cf. Methods; GI = gastrointestinal tract). For an expanded summary and the analysis of functional domains, see Tables S1 and S2 in Additional file 3, respectively. Novel transcripts are marked in bold and those selected for experimental validation are marked with an asterisk. For RT-PCR verified sequences of novel transcripts, see Table S9 in Additional file 3

RFX functional domain composition is independent of expression profile

Human RFX TFs were previously categorized through phylogenetic analysis of their four functional domains outside the DBD: activating domain (AD), domain B, domain C and dimerization domain (DIM) [9, 10, 16]. Given the variation in coding potential of all 30 RFX1–8 transcript isoforms, we investigated whether there is a correlation between the presence or absence of certain RFX functional domains and the CAGE TSS expression profiles in human tissues. First, we categorized all RFX1–8 transcripts into four groups based on their functional domain structure: (i) Group 1: RFX1–3 with all known domains, (ii) Group 2: RFX4, RFX6 and RFX8 lacking the AD, (iii) Group 3: RFX5 and RFX7 with only the DBD, (iv) Group 4: RFX4 and RFX8 lacking the DBD (Fig. 1a).
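The four-group scheme above can be expressed as a small lookup on the set of domains an isoform encodes. The following Python is a sketch under stated assumptions, not the authors' procedure: the function name `rfx_group` and the domain labels are hypothetical, an isoform without a DBD is assigned to Group 4 regardless of its other domains, and any domain combination not named in the text returns `None`.

```python
from typing import Iterable, Optional

# The five domains named in the text.
KNOWN_DOMAINS = frozenset({"AD", "DBD", "B", "C", "DIM"})


def rfx_group(domains: Iterable[str]) -> Optional[int]:
    """Return the group number (1-4) for a set of encoded domains.

    Returns None for combinations that the four groups do not describe.
    """
    present = frozenset(domains)
    unknown = present - KNOWN_DOMAINS
    if unknown:
        raise ValueError(f"unknown domain label(s): {sorted(unknown)}")
    if "DBD" not in present:
        return 4  # assumption: a missing DBD takes precedence over other features
    if present == KNOWN_DOMAINS:
        return 1  # all known domains
    if present == KNOWN_DOMAINS - {"AD"}:
        return 2  # all domains but the AD
    if present == {"DBD"}:
        return 3  # only the DBD
    return None


print(rfx_group(["DBD", "B", "C", "DIM"]))
```

Working on domain sets rather than gene names lets isoforms of the same gene (for example RFX4 or RFX8) fall into different groups, which is the point the text makes.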
When we then compared these four groups to their respective TSS expression profiles, we did not find any indication that RFX TFs with similar domain composition would be expressed broadly or specifically in a certain tissue cluster, suggesting that the RFX functional domain composition is independent of expression profile. We analyzed RFX4 and RFX8 in more detail to illustrate this point (Fig. 1b, c).

Based on the FANTOM5 expression profiles, the gene RFX4 is highly tissue specific compared to other RFX genes. In our analysis, we connected three RFX4 TSS locations to three different Ensembl protein-coding transcripts (Fig. 1b). The longest, highest expressed transcript falls into Group 2 (lacking the AD); it is specifically expressed in the brain and spinal cord. The other two less expressed transcripts are both testis specific, whereby one belongs to Group 2 (lacking the AD) and the other one to Group 4 (lacking the DBD). The newly described gene RFX8 is the least expressed of all the RFX genes. We connected the two highest expressed RFX8 TSS locations with three possible transcripts (Fig. 1c). The same TSS can lead to different transcripts encoding protein variants, which either fall into Group 2 (lacking the AD) or Group 4 (lacking the DBD), suggesting an additional layer of gene regulation on top of the TSS architecture itself. Interestingly, RFX8 transcripts with DBD revealed that the RFX8 DBD is slightly shorter than the DBDs in RFX1–7. Multiple sequence alignments of RFX DBD amino acid sequences reveal that it is the least conserved N-terminal amino acids of the DBD that are missing in RFX8 (Figure S1 in Additional file 3).

Fig. 1 Representative RFX transcripts grouped according to their functional domain compositions. a Representative RFX transcripts (to scale in nucleotides / nt) can be categorized based on the presence or absence of functional domains. Group 1 consists of RFX1, RFX2, and RFX3, which have all the domains. Group 2 consists of RFX4, RFX6 and RFX8, which have all domains but the AD. Group 3 consists of RFX5 and RFX7, which have only the DBD. Group 4 is novel, consisting of isoforms of RFX4 and RFX8, which lack the DBD. The start of the black bar marks the TSS position. Green and red arrows mark start and stop codon positions, respectively. The RFX protein domains encoded by these transcripts are AD (activation domain), DBD (DNA binding domain), B (domain B), C (domain C), and DIM (dimerization domain). They are indicated using color-coded boxes. The DBD (red box), which typically spans 222–225 nt (cf. Table S2 in Additional file 3) serves as a size marker. b, c RFX4 and RFX8 TSS locations illustrate best that RFX functional domain composition is independent of expression profile. They are connected to Ensembl protein-coding transcripts or shown as novel, validated transcripts (in red). Exon numbers refer to those in the corresponding Ensembl transcript IDs (distance and positions are not to scale). pA@RFX4 (red) belongs to the brain and spinal cord cluster, whereas pB and pC@RFX4 (green) belong to the testis cluster (cf. Additional file 2). The highest expressed tissues for pA and pB@RFX8 are thymus and medial frontal gyrus, respectively, and they are not color-coded because of their low expression levels in tags per million (TPM < 5) (Table 1)

Tissue and cell type specific clustering of RFX family members with the target genes they regulate

To correlate and eventually predict which RFX family member regulates which target gene in which human tissue and cell type, we compared and clustered the expression profiles of all RFX family members with direct RFX target genes.
We selected from the literature a large number of validated direct RFX target genes in humans, as demonstrated either by a biochemical interaction between an RFX TF and the respective X-box promoter motif or by confirmation of the X-box function by mutation analysis (Table S3 in Additional file 3). We then extracted the CAGE TSS expression values (TPM) of these genes from the FANTOM5 database and performed unsupervised heat map clustering based on the correlations of expression values across 135 tissue types (Fig. 2).

We identified strong tissue specific RFX and target gene clusters, namely for testis (RFX1–4 and target genes GPR56, ALMS1, RFX1) and the gastrointestinal tract (RFX5–6 and target gene INS/insulin). We also observed strong differences between clusters of the immune system (RFX5) and the brain (RFX1, 3, 4, 7). This underscores that (i) the respective RFX family members regulate different sets of target genes as they are not co-expressed in a given tissue type, and (ii) for a given target gene RFX TFs can act as activators or as inhibitors. For brain tissue and cell types, RFX1, 3, 4 and 7 clustered tightly together, indicating a preference for these RFX family members to (co-) regulate the expression of brain-specific genes such as the ciliopathy/Alström syndrome gene ALMS1, the dyslexia candidate gene KIAA0319 or the gene MAP1A. Interestingly, another member of the brain cluster, pC@RFX2 (Table 1, Additional file 2), in the context of target genes clustered separately (Fig. 2), suggesting that in the brain RFX2 regulates a distinct set of target genes. Alternatively, RFX2 may interact with other RFX family members or other co-factors without preference as long as they are co-expressed in a given tissue and cell type.

Fig. 2 Heat map of tissue expression clusters of RFX1–8 and their experimentally confirmed target genes in humans. Heat map of unsupervised hierarchical clustering of 30 TSS locations of RFX genes and 185 TSS locations of validated RFX target genes (with shorthand p for promoter) based on the expression values in tags per million (TPM) across 135 human tissue samples extracted from FANTOM5. The heat map color-code represents Pearson correlation values with a gradient of −1 in dark blue/blue (negative correlation), 0 in white (zero correlation) and 1 in yellow/orange (positive correlation). The graph was generated by the heatmap.2 function of the gplots R package [95]. The tissue clusters of the RFX TSS locations (y-axis) are color-coded as described in Additional file 2. The tissue cluster divisions of RFX target genes (x-axis) are based on groups of tissues with the highest expression values (TPM) of the respective TSS locations. The term “other tissues” includes adipose, kidney, lung, seminal vesicle, skeletal muscle, throat and uterus

X-box motif positioning in the human genome

RFX TFs regulate their target genes by binding to a conserved X-box motif in the promoter region. Previous X-box motif searches have typically been carried out using 1–3 kb sequence windows upstream of the TSS or ATG, such as in C. elegans [27], D. melanogaster [23], mouse and human [46]. To our knowledge, precise X-box motif positioning has not been characterized in the human genome. Thus, we determined the most likely positioning of functional X-box motifs in the promoter region, defined as 5000 bp upstream (−5000) and 2000 bp downstream (+2000) in relation to TSS locations, of experimentally validated human RFX target genes.

To facilitate the search, we used two curated TF binding profiles for human RFX available in the JASPAR (2018) database [47]: RFX2 (MA0600.1) and RFX5 (MA0510.1) (Table S4 in Additional file 3). As a control, we selected a 10-fold larger random set of TSS locations across the human genome.
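Scanning a promoter window with a binding profile can be illustrated with a minimal position-matrix scan. This Python sketch is not the software used in the study and makes several assumptions: `TOY_MATRIX` is an invented 4-column matrix (real JASPAR RFX profiles such as MA0600.1 are longer and have graded weights), the 80% threshold mentioned in the Fig. 3 legend is assumed to be a relative score of the form (score − min) / (max − min), and a hit position is reported as the simple offset of the first base of the site from the TSS.

```python
from typing import Dict, List, Tuple

# Invented toy matrix with consensus GTTA; NOT a JASPAR profile.
TOY_MATRIX: List[Dict[str, float]] = [
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
    {"A": 0.0, "C": 0.0, "G": 0.0, "T": 1.0},
    {"A": 0.0, "C": 0.0, "G": 0.0, "T": 1.0},
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def relative_score(site: str, matrix: List[Dict[str, float]]) -> float:
    """Scale the summed column scores of a site to the range 0..1."""
    score = sum(column[base] for column, base in zip(matrix, site))
    lowest = sum(min(column.values()) for column in matrix)
    highest = sum(max(column.values()) for column in matrix)
    return (score - lowest) / (highest - lowest)


def scan_window(seq: str, matrix: List[Dict[str, float]],
                window_start: int, threshold: float = 0.80) -> List[Tuple[int, str]]:
    """Return (offset from TSS, strand) for every site at or above threshold.

    window_start is the offset of seq[0] relative to the TSS,
    e.g. -5000 for a -5000..+2000 window.
    """
    width = len(matrix)
    hits: List[Tuple[int, str]] = []
    for i in range(len(seq) - width + 1):
        site = seq[i:i + width].upper()
        if set(site) - set("ACGT"):
            continue  # skip sites containing N or other ambiguity codes
        for strand, oriented in (("+", site), ("-", reverse_complement(site))):
            if relative_score(oriented, matrix) >= threshold:
                hits.append((window_start + i, strand))
    return hits


print(scan_window("AAGTTAAA", TOY_MATRIX, window_start=-3))
```

Running the same scan on target-gene windows and on a larger random control set, then comparing the position distributions of the hits, mirrors the comparison described in the text.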
Our search effort revealed that X-box hits are typically located very close to the TSS locations of RFX target genes (Fig. 3, Table S5 in Additional file 3). Based on search and find statistics the X-box positioning window can be further subdivided into a robust window of −500 to +500 bp and a permissive window of −2300 to +1400 bp. Using an independent search approach (the MEME suite FIMO software) [48] we confirmed these overall search and find parameters for human X-box motifs. Our analysis enhances the prediction power of future searches for functional X-box motifs: it relates motif position to sequence both upstream and downstream of the TSS locations of candidate human RFX target genes and pinpoints the likely motif locations. Functional X-box motifs at larger distances from TSS locations (e.g. at distal enhancers) are likely to be the exception rather than the norm (Table S6 in Additional file 3).

Prediction of upstream RFX regulators using transcription factor binding site (TFBS) analysis

Identifying the upstream regulators of RFX genes will allow predicting the developmental and cellular niche that RFX TFs occupy. Thus, we searched for TF binding profiles over-represented in the promoter and enhancer regions of all 8 human RFX genes. We used search windows of −5000 to +2000 bp in relation to 30 RFX TSS locations and −200 to +200 bp around the midpoints of 13 significantly correlated candidate RFX enhancer sequences (extracted from Andersson et al. [49]; Table S7 in Additional file 3). We then scanned these regions with all the core vertebrate TF binding profiles present in the JASPAR 2016 database [47]. The enrichment for TF binding profiles was assessed against a 10-fold larger random set of human promoter and enhancer regions using the oPOSSUM3 tool [50].

We identified 19 over-represented TF binding profiles (Fig.
4) associated with the TFs SP2 (specificity protein 2) (JASPAR profile MA0516.1), E2F4 (E2 factor 4) (MA0470.1), KLF16 (Kruppel like factor 16) (MA0741.1), SP8 (specificity protein 8) (MA0747.1), SP3 (specificity protein 3) (MA0746.1), EGR3 (early growth response 3) (MA0732.1), ESR1 (estrogen receptor alpha) (MA0112.3), Creb5 (cAMP responsive element binding protein 5) (MA0840.1), ZNF740 (zinc finger protein 740) (MA0753.1), ATF7 (activating transcription factor 7) (MA0834.1), SOX21 (sex determining region Y-box 21) (MA0866.1), MZF1 (myeloid zinc finger 1) (MA0056.1 and MA0057.1), Tcfl5 (transcription factor like 5) (MA0632.1), KLF5 (Kruppel like factor 5) (MA0599.1), SP1 (specificity protein 1) (MA0079.3), EGR1 (early growth response 1) (MA0162.2), TFAP2C (transcription factor AP-2 gamma) (MA0815.1) and JDP2 (Jun dimerization protein 2) (MA0656.1). The full list of the TF binding profiles can be found in Additional file 4.

Fig. 3 X-box motif position with respect to TSS locations. Density frequency of X-box motif positions with respect to the TSS locations of experimentally proven direct RFX target genes in humans (shown in blue) and a set of 10× random TSS locations from FANTOM5 (shown in red): TSS −5000 to +2000 bp windows were scanned with two JASPAR RFX motifs (RFX2 MA0600.1 and RFX5 MA0510.1) with 80% threshold. We define a sequence window as “robust” by the area where the two curves with 95% C.I. smoothing do not overlap. We define a sequence window as “permissive” by the area where the two curves intersect

siRNA validation of RFX regulators

To test if any of the over-represented TF binding profiles can be linked to functional upstream regulation of RFX genes, we selected two TFs within the high-scoring TF binding profiles. We used siRNA knockdown of SP2 and ESR1 followed by qRT-PCR measuring the fold change of mRNA expression levels of RFX genes.
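The qRT-PCR fold-change readout can be sketched in a few lines. The text gives no explicit formula, so the Python below assumes a common efficiency-corrected scheme: each gene's ratio is E^(Ct_control − Ct_test), and the target ratio is divided by the geometric mean of the reference-gene ratios. The function names, the data layout and the toy Ct values are hypothetical and are not taken from the study.

```python
from math import prod
from typing import Dict, Sequence


def efficiency_ratio(efficiency: float, ct_control: float, ct_test: float) -> float:
    """Relative quantity of one gene, test versus control.

    efficiency is the amplification factor per cycle (2.0 = perfect doubling).
    """
    return efficiency ** (ct_control - ct_test)


def fold_change(target: str, references: Sequence[str],
                ct_control: Dict[str, float], ct_test: Dict[str, float],
                efficiency: Dict[str, float]) -> float:
    """Efficiency-adjusted fold change of target, normalized to the
    geometric mean of the reference-gene ratios (assumed scheme)."""
    target_ratio = efficiency_ratio(efficiency[target], ct_control[target], ct_test[target])
    ref_ratios = [
        efficiency_ratio(efficiency[r], ct_control[r], ct_test[r]) for r in references
    ]
    geometric_mean = prod(ref_ratios) ** (1.0 / len(ref_ratios))
    return target_ratio / geometric_mean


# Toy Ct values (made up): the target crosses threshold one cycle earlier
# after knockdown, while both reference genes are unchanged.
eff = {"RFX7": 2.0, "HPRT1": 2.0, "HSPCB": 2.0}
scr = {"RFX7": 25.0, "HPRT1": 20.0, "HSPCB": 22.0}
kd = {"RFX7": 24.0, "HPRT1": 20.0, "HSPCB": 22.0}
print(fold_change("RFX7", ["HPRT1", "HSPCB"], scr, kd, eff))
```

A value of 1 corresponds to an unchanged expression level, values above 1 to up-regulation after knockdown and values below 1 to down-regulation.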
We could successfully demonstrate knockdown of SP2 and ESR1 both at the mRNA level by qRT-PCR and at the protein level by immunoblotting (Fig. 5). The TF binding profiles of SP2 and ESR1 both scored clearly above the z-score threshold (Fig. 4). We selected the human MCF7 breast cancer cell line for which data are available in FANTOM5. In this cell line the genes SP2, ESR1, RFX1–3, -5, and -7 are expressed at sufficiently high levels (TPM > 5), while the genes RFX4, -6 and -8 are not expressed (TPM = 0). We used efficiency-adjusted fold change quantification against scrambled (Scr) control siRNA normalized to the geometric mean of HPRT1 and HSPCB as two independent reference genes [51]. All the Ct levels of the test siRNA and Scr control siRNA can be found in Additional file 5.

We observed that siRNA knockdown of SP2 and ESR1 in human MCF7 cells resulted in a significant fold change in the mRNA expression levels of at least one of the RFX genes (Fig. 5). siRNA knockdown of SP2 resulted in both activating and inhibiting effects on RFX genes, whereby only RFX7 showed significant up-regulation. In contrast, siRNA knockdown of ESR1 revealed consistent inhibitory effects on all the RFX genes analyzed, with RFX2, -3, -5, and -7 being significantly up-regulated. These data show that computational TFBS analyses of the promoter regions of RFX1–8 correctly identified functional upstream regulators of these RFX genes. Depending on the individual RFX gene these upstream regulators act either as activators or as repressors.

Fig. 4 TF binding profiles in the promoter and enhancer regions of RFX genes. Distribution of all the z-scores of all the core vertebrate transcription factor binding site (TFBS) profiles in JASPAR 2016, with the search areas consisting of −5000 to +2000 bp with respect to the 30 RFX TSS locations and −200 bp to +200 bp from the mid-points of the RFX enhancers, against a background of a set of 10× random TSS locations and enhancers with identical window size and matching %GC distribution from FANTOM5. High-scoring or over-represented TF binding site profiles were computed as having z-scores above the mean + 2 × standard deviation (red dotted line)

Discussion

We have exhaustively analyzed all eight members of the human RFX TF gene family (RFX1–8). By extracting and computationally analyzing large-scale experimental data sets, we were able to describe in detail RFX gene expression as well as the RFX gene regulatory landscape in many different human tissues and cell types, including in parts the experimental validation thereof. We provide the first detailed experimental characterization of RFX8 and of RFX isoforms without DBD. Further, we provide insight into upstream regulators of human RFX genes and determined the sequence windows in which – in most cases – human RFX TFs act as direct regulators of their target genes. Thereby we provide an in-depth catalogue and key resource for future work on the roles that RFX TFs play in human development and disease.

Our extensive survey of all the human RFX (1–8) gene expression profiles enabled us to carefully analyze all the transcript isoforms and determine the potential protein variants encoded by these isoforms. We ordered all expression profiles from low to high, from broad to tissue specific, made tissue and cell type assignments, including isoform correlations and non-correlations, and thereby were able to cluster the expression profiles for all the isoforms of all the human RFX genes. We found and experimentally validated that – typically – RFX gene TSS locations (of the same gene) would lead to the same protein variant, suggesting that it is mostly promoter and TSS architecture that gives rise to diversity in gene expression profiles.
These results highlight the importance of studying non-coding regulatory regions of key genes involved in developmental processes such as cell type specification and differentiation. Exceptions include TSS locations for the gene RFX4 that were spread across a large genomic distance leading to transcript isoforms encoding different tissue specific protein variants.

Our work led to an updated grouping of human RFX TFs and showed that RFX functional domain composition is independent of expression profile. We identified two RFX genes, RFX4 and RFX8, which can encode protein variants without DBD. The function of RFX protein variants without DBD is unclear. Possibly, they act as tissue-specific co-repressors, similar to SHP proteins [52]. This potential role is clearly inferred for RFX4, where such competitive co-repression may occur in the testis but not in the nervous system [44]. Transcript validation for the newly described gene RFX8 revealed the possibility for encoding protein variants with and without a DBD. For the protein variant with DBD, this domain would be slightly shorter as it is missing the least conserved N-terminal 20 amino acids. Together with the overall low expression level of RFX8, this raises the question of RFX8 functionality. RFX8 was most prominently expressed in Schwannoma cells, suggesting a role for RFX8 in Schwann cell proliferation.

Given the central role that RFX TFs play during development (e.g. in the differentiation of cilia), we were interested in finding candidate upstream regulators of RFX genes. We used computational predictions based on over-represented TF binding profiles to find candidate upstream regulators of RFX genes and thereby infer the developmental pathways that RFX1–8 are part of.
The over-represented TF binding profiles that our analysis uncovered are associated with TFs involved in (i) neural development (SP2, ESR1, Creb5, SOX21) [53–56] and neurite outgrowth (KLF16, EGR3) [57, 58], (ii) cognitive functions (EGR3, EGR1) [59], (iii) craniofacial development (SP8) [60], (iv) proliferation of immune cells (EGR3, KLF5) [61, 62], platelet formation (SP3, SP1) [63] and innate immunological memory (ATF7) [64], (v) cell cycle control (E2F4) [65, 66] and tumor suppression (ZNF40, MZF1, TFAP2C, JDP2) [67–70], and (vi) reproductive functions (ESR1, Tcfl5) [54, 71]. TF binding profiles for RFX TFs themselves were not over-represented, suggesting that autoregulation is not a common feature for the expression of RFX1–8 genes and that RFX1 autorepression may be the only exception [72].

Fig. 5 siRNA validation of candidate RFX regulators. The genes SP2 and ESR1 represent the high-scoring group of candidate RFX regulators (cf. Fig. 4 and Additional file 4). In the MCF7 breast cancer cell line, amplification efficiency-adjusted mRNA fold change quantifications of RFX1, 2, 3, 5 and 7 were normalized to the geometric mean of HPRT1 and HSPCB, whereby a fold change equaling 1 describes an unchanged expression level. For this we used (a) SP2 siRNA versus scrambled (Scr) control siRNA knockdown, and (b) ESR1 siRNA versus scrambled (Scr) control siRNA knockdown. In (a, b) error bars represent SEM and fold-change statistical significance was calculated using Student's two-sample t-test (***p-value ≤0.01, **p-value ≤0.05, *p-value ≤0.1). (a, c) SP2 siRNA knockdown was confirmed at both the mRNA and protein level and a significant up-regulation of RFX7 and down-regulation of RFX5 were observed. (b, c) ESR1 siRNA knockdown was confirmed at both the mRNA and protein level and significant up-regulations of RFX2, 3, 5 and 7 were observed. (c) Immunoblotting band intensities were quantified using ImageJ and normalized with the indicated loading controls

We validated SP2 and ESR1 by siRNA knockdown and qRT-PCR and found that they act as inhibitors of the RFX genes. We assume that these candidate upstream regulators act directly, given the over-representation of their TF binding site profiles in RFX1–8 promoter and candidate RFX enhancer regions. The cellular context we used for experimental validation, human MCF7 breast cancer cells, very likely does not represent all human tissues. Thus, more exact mechanisms of RFX gene regulation remain to be analyzed in different cell-type specific environments. At present, there is little evidence for preferences in RFX dimerization patterns [43].

The discovery of new RFX target genes typically starts with searching for X-box motifs, the binding site for RFX TFs. X-box searches have mostly focused on upstream promoter sequences, e.g. upstream of the first exon or of the ATG [23, 27, 73]. Here we expand on this by relating X-box position to both upstream and downstream of human gene TSS locations. X-box position, motif sequence and conservation across species (cf. Henriksson et al. [74]) allow for a precise ranking of hits. With respect to a given gene TSS location we have assigned these hits to a permissive window (−2300 to +1400) and a robust window (−500 to +500) for the higher ranks. Our data strengthen and extend previous work in other organisms where functional X-box motifs were found close to the gene start sites [75]. Our type of analysis will enhance the predictive power of future searches for functional X-box motifs, because relating X-box motifs to both upstream and downstream of TSS locations of candidate human RFX target genes adds another level of precision to the search procedure.
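As an illustration of the two windows described above, the following minimal Python sketch assigns a candidate X-box hit to the robust or the permissive window from its offset to the TSS. The function names, the inclusive window bounds and the strand handling are assumptions made for this example; they are not taken from the analysis pipeline used in this study.

```python
# Window bounds in bp relative to the TSS (offset 0), as quoted in the text.
# Treating both bounds as inclusive is an assumption of this sketch.
ROBUST_WINDOW = (-500, 500)
PERMISSIVE_WINDOW = (-2300, 1400)


def relative_position(motif_pos: int, tss_pos: int, strand: str) -> int:
    """Signed offset of a motif from the TSS; negative values are upstream."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    offset = motif_pos - tss_pos
    # On the minus strand, larger genomic coordinates lie upstream.
    return offset if strand == "+" else -offset


def classify_xbox_hit(offset: int) -> str:
    """Return 'robust', 'permissive' or 'outside' for a TSS-relative offset."""
    if ROBUST_WINDOW[0] <= offset <= ROBUST_WINDOW[1]:
        return "robust"
    if PERMISSIVE_WINDOW[0] <= offset <= PERMISSIVE_WINDOW[1]:
        return "permissive"
    return "outside"
```

For example, under these assumptions a hit 100 bp upstream of a plus-strand TSS would be labeled "robust", whereas a hit at +1401 would fall outside both windows.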
Functional X-box motifs at larger distances from TSS locations (e.g. at distal enhancers) are likely to be the exception rather than the norm. The presence of X-box motifs was shown to contribute to the activity of both promoters and enhancers, whereby distal enhancers that harbor X-box motifs exhibited greater promoter activity than enhancers that lack them [76]. This phenomenon would fit a model where (as found in Xenopus laevis) Rfx2 and Foxj1 coordinately regulate ciliary gene expression, with Rfx2 stabilizing Foxj1 binding at chromatin loops [77].

Comparative tissue and cell type specific expression profile clustering represents a complementary approach to X-box searches for the ascertainment of cross-connections between RFX genes and candidate sets of downstream target genes. We have used this approach successfully to describe the key roles that defined RFX family members play by regulating only certain target genes in e.g. human testis and the gastrointestinal tract. Combining both methods, X-box searches and expression profile clustering, will be very helpful for the discovery of precise sets of RFX target genes in many different human tissue and cell types.

Studies in mammals suggest that RFX TFs function in terminal cell differentiation or in the maintenance of certain functional specializations. Examples include the differentiation and maintenance of pancreatic β-cells as insulin producers [78], the repression of collagen formation during adult life [79], the maintenance of testis cord integrity [80], the regulation of spermiogenesis and sperm flagellum assembly [17], the maintenance of postnatal auditory hair cells [81], and the regulation of ciliary genes involved in the assembly and maintenance of functional cilia [16]. Interestingly, RFX TFs seem to exert their function on structures connected to polarized cell surfaces, e.g.
cilia, immune synapse, neuronal synapse and the vascular face of β-cells [82].

Given such a range of RFX TF functions in different tissue and cell types, elucidating their role in disease will be facilitated when more precise connections can be established between specific RFX protein isoforms, RFX target gene sets and quantity or cell type of expression. So far only RFX5 and RFX6 mutations have been linked to defined diseases, while mutations in other RFX genes may cause more complex, pleiotropic disease symptoms. Embryonic lethality in Rfx1−/− mice suggests that Rfx1 function cannot be compensated for [83]. RFX mutations may cause ciliopathies, as RFX TFs directly regulate many ciliary genes in different cell and tissue types. The complexity of ciliopathies arises due to primary cilia being present on most human cell types [84]. Very recently, X-box motifs were shown to overlap with type 2 diabetes risk alleles [85], elevating the importance of understanding X-box motif sequence and position, and X-box containing promoter activity in connection to RFX target gene regulation.

Our exhaustive and in-depth characterization of the functional domain composition and the expression profiles of all the eight human RFX genes, including upstream regulatory and downstream target gene analysis, in connection with mammalian studies, e.g. investigating Rfx mouse mutants, will serve as the basis for uncovering and understanding phenotypes or pathologies of RFX mutations in humans. For example, one might expect male sterility to be associated with mutations in testis specific RFX1–4 gene isoforms, or with dysregulation of testis specific RFX target genes (e.g. GPR56 [86], ALMS1 [87] and RFX1), or with the role upstream RFX regulators (e.g. ESR1 [88, 89]) play in ciliogenesis.

Conclusions
We provide a comprehensive and systematic characterization of the expression profiles of all the eight human RFX genes, including the previously undescribed RFX8.
We open the window to their potential upstream regulators during development. We advance knowledge on how human RFX TFs regulate their target genes. Thereby, our study contributes to the understanding of the different functions for RFX TFs in their specific spatial and temporal context in the different tissue and cell types of humans. Our work will greatly help in uncovering their cell-type specific target gene batteries, essential for elucidating RFX-associated aspects of cellular specialization and terminal functional differentiation. In turn, this will aid in understanding disease mechanisms and outcome.

Methods
Extraction and analysis of CAGE TSS locations from the FANTOM5 database
CAGE TSS locations and expression profiles were extracted from FANTOM5 Phase I as downloaded from SSTAR [90]: http://fantom.gsc.riken.jp/5/sstar/Main_Page. FANTOM5 TSS data represent expression profiles from 889 biological samples with assigned detection levels in arbitrary units “tags per million” (TPM) [33]. We categorized all samples into three separate groups: human tissues (135 samples; 80% adult and 20% fetal), human primary cells (170 samples; here represented as the average of the donor replicates) and human cell lines (255 samples), and excluded the time course samples. TSS data were extracted and analyzed, and then named with shorthand p (for promoter) in alphabetical order (pA, pB, pC, etc.)
based on the following criteria: (1) if the tissue correlation is equal to or greater than 0.7 and individual TSS locations fall within 100 bp of each other, they were merged into one TSS; (2) if the highest tissue sample TPM is < 1, this TSS was disregarded unless the highest primary cell (in any donor replicate) or cell line TPM is ≥ 5; (3) the alphabetical order of TSS locations is based on the descending order of its total sum of TPM values in all 889 biological samples after conditions (1) and (2) are met.

A given TSS location is considered as being expressed broadly if it is expressed at TPM > 5 in a large number and variety of tissues (n > 10). Conversely, a given TSS location is considered as being expressed specifically if it is expressed at TPM > 5 in a small number of tissues of the same organ (n < 10). The exceptions are: (i) pA of RFX4 displays high expression in many tissues (n > 10) but specifically in the brain and spinal cord; (ii) RFX8 TSS locations are either lowly expressed (TPM < 5) or at background noise levels (TPM < 1). Additional information about these and other CAGE TSS locations present in the FANTOM5 database (e.g. the presence or absence of TATA boxes, CpG islands, etc.) has been described by Lizio et al. 2015 [90]. All the genomic coordinates are stated in BED format.

Transcript validation from novel TSS locations by RT-PCR
A given TSS location was deemed to be a novel transcript isoform for experimental validation when it does not overlap with or is not found within +/− 50 bp of the start site (indicated as exon 1) of known protein-coding transcripts with complete open reading frame description in the Ensembl database (release 81, July 2015, http://www.ensembl.org/). We designed forward primers to bind either within the novel TSS sequence, or overlapping with the 3′ end of the TSS, or at the most 50 bp downstream from the TSS. Reverse primers were designed to always bind downstream of the ATG of the respective reference Ensembl transcript.
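The novelty criterion stated at the start of this subsection can be sketched in a few lines of Python. This is a minimal illustration only: the function name is hypothetical, coordinates are assumed to be BED-style (0-based, half-open) and on the same chromosome and strand, and the known start sites are assumed to be supplied as single positions.

```python
def is_novel_tss(tss_start: int, tss_end: int, known_starts: list[int],
                 flank: int = 50) -> bool:
    """True if no known transcript start lies in, or within `flank` bp of, the TSS.

    The TSS is the half-open interval [tss_start, tss_end) in BED-style
    coordinates (an assumption of this sketch).
    """
    lower = tss_start - flank
    upper = tss_end + flank  # exclusive upper bound
    return not any(lower <= start < upper for start in known_starts)
```

Under these assumptions, a TSS spanning positions 1000 to 1020 would count as novel if the nearest annotated start sites were at 900 and 1100, but not if one were at 960.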
In the case of the RFX8 gene, we designed additional primers that flanked the DBD exonic region to confirm the presence or absence of a DBD-encoding exon. We reverse transcribed 1 μg commercial human testis total RNA (Clontech, Cat. No. 636533) and human whole brain total RNA (Clontech, Cat. No. 636530) using Invitrogen SuperScript III First-strand Synthesis Super Mix for qRT-PCR (Cat. No. 11752–050). We used undiluted cDNA and 40 PCR cycles with the exception of 45 PCR cycles for RFX8. 2 μl of the PCR product were cloned using a TOPO TA Cloning Kit (Invitrogen Dual Promoter PCR II-TOPO Vector, Cat. No. 450640). Then, 4–10 white colonies from AMP + IPTG/X-gal plates were screened by PCR with M13 vector primers, out of which 2–4 independent samples were sequenced with T7 and SP6 universal primers. Sequencing results were analyzed using the BLAT Tool (UCSC Genome Browser, http://genome-euro.ucsc.edu/). In the case of the RFX8 DBD transcript validation, at least 100 white colonies were screened by PCR with M13 vector primers prior to sequencing, given the overall low expression of the RFX8 gene. Sequences of primers and verified transcripts are listed in Tables S8 and S9, respectively, in Additional file 3.

Determination of RFX protein domains
Peptide sequences of human RFX1–3 protein domains (AD, DBD, B, C and DIM) as described previously [9, 10] were used to determine the corresponding domains in human RFX4–8 using the T-coffee protein sequence alignment program [91] (http://www.tcoffee.org/). Visualization of the RFX transcripts in Fig. 1a with the protein domain composition was done using IBS software [92].

Positional X-box motif scanning
We scanned for candidate X-box motifs using two known X-box motifs deposited in the JASPAR database [47] (http://jaspar.genereg.net/): human RFX2 (motif MA0600.1, representing a full-site X-box) and RFX5 (motif MA0510.1, representing a half-site X-box).
For these scans we used DNA regions of 5000 bp upstream (−5000) and 2000 bp downstream (+2000) as search windows relative to the TSS locations. We selected X-box motifs in the promoter regions, which were captured by the JASPAR built-in scan function (version 5.0_ALPHA) with an 80% threshold. Previously validated X-box motifs were found with these criteria and also independently using the MEME Suite FIMO software (version 4.10.0) [48] (http://meme-suite.org/tools/fimo) with a p-value < 0.0001. Positional motif enrichment was ascertained by analyzing in the same way 10 times random TSS sets from all the CAGE TSS locations present in FANTOM5. The graphical smoothing method employed was local polynomial regression fitting (loess) constructed by the R package ggplot2::geom_smooth [93] with a confidence interval (C.I.) level = 0.95.

Multiple TF binding profile analysis for the prediction of candidate RFX regulators
We performed TF binding profile enrichment analyses using the oPOSSUM3 tool [50] with the CORE vertebrate TF binding profiles present in the JASPAR 2016 database [47]. DNA regions of −5000 to +2000 bp of the 30 RFX TSS locations and −200 to +200 bp from the midpoints of 13 candidate RFX enhancers were used as search windows (foreground). Candidate RFX enhancers were chosen by selecting enhancers present within −500 kb to +500 kb of the 30 RFX TSS locations, as extracted from Andersson et al. [49] (http://fantom.gsc.riken.jp/5/datafiles/latest/extra/Enhancers/), and whose expressions were significantly correlated (Spearman correlation with multiple testing correction, False Discovery Rate < 0.05) with RFX TSS locations based on FANTOM5 CAGE expression values (TPM) in 889 biological samples.
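The text above does not name the multiple testing procedure behind the False Discovery Rate cutoff. As a hedged illustration, the sketch below assumes the widely used Benjamini-Hochberg step-up procedure and applies it to a set of correlation p-values; the function names and the enhancer labels in the usage note are hypothetical, and the p-values are assumed to come from a separate Spearman correlation test.

```python
def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def select_correlated(pvalue_by_enhancer: dict[str, float],
                      fdr: float = 0.05) -> list[str]:
    """Names of enhancers whose adjusted p-value is below the FDR cutoff."""
    names = list(pvalue_by_enhancer)
    qvalues = benjamini_hochberg([pvalue_by_enhancer[n] for n in names])
    return [n for n, q in zip(names, qvalues) if q < fdr]
```

Called as `select_correlated({"enh_A": 0.01, "enh_B": 0.04, "enh_C": 0.03, "enh_D": 0.20})`, only the hypothetical `enh_A` would pass the 0.05 cutoff under this procedure, because adjustment raises the other three p-values above the threshold.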
As background we considered 10-fold larger sets of DNA regions with %GC matching that of the foreground sequences, derived from regions surrounding all phase 1.3 CAGE peak coordinates (http://fantom.gsc.riken.jp/5/datafiles/phase1.3/extra/CAGE_peaks/hg19.cage_peak_coord_permissive.bed.gz; −5000 bp and +2000 bp) and phase 2.0 enhancer coordinates (http://fantom.gsc.riken.jp/5/datafiles/phase2.0/extra/Enhancers/human_permissive_enhancers_phase_1_and_2.bed.gz; +/− 200 bp) using BiasAway (https://www.ncbi.nlm.nih.gov/pubmed/24927817). We computed the mean (m) and standard deviation (sd) of the distribution of all the z-scores (considering the enrichment of the total number of predicted TFBSs) obtained from oPOSSUM3 and put a threshold at m + 2 × sd.

Validation of candidate RFX regulators by siRNA knockdown and qRT-PCR
SP2 and ESR1 siRNA concentrations (Table S10 in Additional file 3) were optimized for knockdown efficiency (cutoff: more than 2-fold) using qRT-PCR. siRNA and qPCR primer sequences (obtained from Eurofins Genomics: https://www.eurofinsgenomics.eu/) were selected to target all the known protein-coding transcript isoforms. Primer specificities were tested first by common PCR and later by qPCR analyses of the melting curves using two negative controls, a water sample and a cDNA sample without reverse transcriptase. Sequences of qPCR primers with their amplification efficiencies determined in a standard control setup are listed in Table S11 in Additional file 3. The MCF7 breast cancer cell line (Michigan Cancer Foundation) was used as the human cell line listed in the FANTOM5 database as having sufficiently high expression of both the candidate and the RFX genes (TPM > 5). MCF7 cells were maintained using DMEM 1 g/L D-glucose with added pyruvate, 10% FBS, 1% Penicillin/Streptomycin and 1% L-glutamine at 37 °C at 5% CO2. Cells were seeded 24 h prior to transfection (150,000 cells in 2 ml in a 6-well plate format). Lipofectamine RNAiMAX (Invitrogen, Cat.
No. 13778–030) was mixed with siRNA according to the manufacturer’s instructions. RNA was extracted (RNeasy Mini Kit and DNase Set, QIAGEN) 24 h after transfection and we used 2 biological replicates repeated on three different days of transfection. We converted 1 μg RNA to cDNA (Invitrogen SuperScript III First-strand Synthesis Super Mix for qRT-PCR, Cat. No. 11752–050). qPCR was performed for 40 cycles in singleplex technical triplicates using FastStart Universal SYBR Green Master with ROX reference dye (Roche, Cat. No. 04913914001) on an AB7500 Fast machine. We used 2 μl of 1:3 diluted cDNA from a biological replicate in 10 μl total. Ct levels with automatic threshold were obtained (Additional file 5) and efficiency-adjusted fold-changes were calculated against scrambled (Scr) control siRNA with the geometric mean of HPRT1 and HSPCB as two independent reference genes for normalization [51]. Graphs were produced and statistical tests were performed in R using the ggplot2 package [93] and a two-tailed one-sample Student’s t-test [94].

siRNA knockdown confirmation by immunoblotting
At 24 h after transfection with siRNAs, MCF7 cells were collected and washed twice with PBS. Whole cell lysates were prepared upon sonicating the cells in RIPA buffer (Sigma-Aldrich, St. Louis, MO, USA) containing PMSF (1 mM, final concentration) and a protease inhibitor cocktail (Sigma-Aldrich, St. Louis, MO, USA). The protein content of these cell lysates was determined using the BCA protein assay kit (Thermo Scientific, Sweden). A total of 40–80 μg of protein was loaded per well, separated on a 12% SDS-PAGE gel (Bio-Rad, Stockholm, Sweden) and transferred to a 0.45 μm pore-sized PVDF membrane (Bio-Rad, Stockholm, Sweden).
After transfer, membranes were incubated overnight at 4 °C with primary antibodies (rabbit polyclonal ESR1, Catalog # sc-543, dilution - 1:500, Santa Cruz Biotech; rabbit polyclonal SP2 (A-8), Catalog # sc-17,814, Lot # D0605, dilution - 1:100, Santa Cruz Biotech; rabbit polyclonal beta-tubulin, Catalog # ab6046, dilution - 1:5000, Abcam; mouse monoclonal vinculin, clone V284, Lot # 2627627, dilution - 1:5000, Millipore) diluted in 5% milk. Subsequently, blots were washed and incubated with either horseradish peroxidase-conjugated secondary antibody (polyclonal rabbit anti-mouse/HRP, Lot # 00054403, dilution - 1:3000, Dako Chemicals) or Li-Cor donkey anti-mouse IRDye 800CW (Catalog # 926–32,212, for vinculin) or Li-Cor donkey anti-rabbit IRDye 680LT (Catalog # 926–68,023, for ESR1) for 1 h at room temperature. Imaging was performed using a Li-Cor Odyssey Fc system. An enhanced chemiluminescence technique (WesternBright Sirius ECL substrate, Advansta) was applied for developing the SP2 blot due to low abundance of the target protein. In all other cases, fluorescence signals were acquired. Band intensities were quantified using ImageJ and normalized with the indicated loading controls.

Human reference sequence
The human reference sequence used is the Human Feb. 2009 (GRCh37/hg19) Assembly.

Additional files
Additional file 1: Detailed expression values (TPM) for RFX TSS locations. Expression values in tags per million (TPM) for all 30 RFX TSS locations in all 889 biological samples and their categorization into tissues (135), primary cells (473 donor replicates and 170 merged replicates from the average TPM value of the donor replicates), cell lines (255) and time courses (26). (XLSX 355 kb)

Additional file 2: Hierarchical clustering, expression plots and top 10 tissues, primary cells and cell lines of RFX TSS locations.
Hierarchical clustering of 30 RFX TSS locations (with shorthand p for promoter) based on expression values (TPM) across 135 human tissue samples, using a 1-Pearson correlation distance measure and average linkage method, as computed by the pvclust R package with nboot = 1000 with the numbers representing approximately unbiased (au) p-values (Suzuki and Shimodaira, 2006). Tissue clusters are color-coded and represent the groups of tissues with the highest overall expression values: immune system (teal), gastrointestinal tract (purple), testis (green), brain and spinal cord (red), and two minor clusters, uterus and lung (black). RFX TSS locations without color code have low expression values (TPM < 5). This is followed by the expression profiles of 30 RFX TSS locations in human tissues, primary cells and cell lines, whereby for every one of the eight human RFX genes (1–8), summarized TSS profile data are presented vertically (“top-down”), starting with a tissue plot, followed by a table of the top 10 tissues, a table of the top 10 primary cells and a table of the top 10 cell lines (highest expression levels are listed first, respectively). The tissue plot is the expression level in log (base 10) TPM against tissues that are sorted from the highest to the lowest expressed from 135 tissues, whereby the plot only includes the first 100 tissues. The arbitrary unit for detection of expression is tags per million (TPM) as defined by FANTOM5. We consider TPM < 5 to be lowly expressed and TPM < 1 to be background noise. (PDF 3276 kb)

Additional file 3: Supporting tables, figures and supplementary references. Table S1. Summary of RFX1–8 expression data and novel transcript validation. Table S2. Positions of functional domains encoded by RFX transcripts. Table S3. Experimentally proven, direct RFX target genes in humans from the literature. Table S4. Human X-box motifs selected from the JASPAR database. Table S5.
Experimentally validated human X-box motif sequences in promoter regions that were captured by the scanning criteria. Table S6. Experimentally validated human X-box motif sequences that were either in distal regions or that were not captured by the scanning criteria. Table S7. RFX correlated enhancers within +/− 500 kb of RFX TSS locations. Table S8. Primer sequences for novel RFX transcripts validation. Table S9. Verified novel RFX transcript sequences. Table S10. siRNA sequences for candidate RFX regulators. Table S11. qPCR primer sequences and amplification efficiencies for validation of candidate RFX regulators. Figure S1. Human RFX1–8 DBD protein sequence alignment. Supplementary references. (DOCX 424 kb)

Additional file 4: Detailed candidate RFX regulator oPOSSUM3 scanning results using JASPAR 2016 core vertebrate TF binding profiles. Transcription factor binding sites (TFBS) scanning results from oPOSSUM3 within the promoter and enhancer regions of RFX1–8 using the CORE vertebrate TF binding profiles in JASPAR 2016. Included are the DNA regions that were considered as foreground and the following TF binding site details: SP2 (specificity protein 2) (JASPAR profile MA0516.1) and ESR1 (estrogen receptor alpha) (MA0112.3). (XLSX 50 kb)

Additional file 5: Ct levels of qRT-PCR, used for validation of candidate RFX regulators by siRNA knockdown. Individual Ct levels with automatic threshold obtained on an AB7500 Fast machine for SP2 and ESR1 as candidate RFX regulators and their respective test siRNA and scrambled (Scr) control siRNA knockdown data on RFX genes (RFX1, RFX2, RFX3, RFX5, RFX7) and the two reference genes (HPRT1, HSPCB).
(XLSX 33 kb)

Abbreviations
AD: Activating domain; B: B domain; C: C domain; CAGE: Cap Analysis of Gene Expression; DBD: DNA binding domain; DIM: Dimerization domain; FANTOM5: Functional Annotation of Mammalian Genome 5; RFX: Regulatory Factor binding to the X-box; TF: Transcription factor; TFBS: Transcription factor binding site; TPM: Tags per million; TSS: Transcription start site

Acknowledgements
We express gratitude toward the FANTOM Consortium and its public database (http://fantom.gsc.riken.jp/). We thank Min Jia from the Karolinska Institute (KI) Department of Biosciences and Nutrition for providing the MCF7 cell line and Arun Selvam from the KI Department of Laboratory Medicine for assistance in protein work.

Funding
We acknowledge financial support from the Swedish Research Council (Vetenskapsrådet), from the Swedish Brain Foundation (Hjärnfonden), and from the Torsten Söderberg and Åhlén Foundations. DST received support from the KI in the form of a PhD student (KID) scholarship. MV was supported by the EU Horizon 2020 Marie Curie Individual Fellowship. AM was supported by a Genome Canada Large Scale Applied Research Grant (No. 174CDE), by funding provided by the Child and Family Research Institute and the British Columbia Children’s Hospital Foundation (Vancouver, BC, Canada), by funding from the Norwegian Research Council (Helse Sør-Øst) and the University of Oslo through the Centre for Molecular Medicine Norway (NCMM) and the Oslo University Hospital (Radiumhospitalet). GL acknowledges fellowship support from the Swedish Society for Medical Research (Svenska Sällskapet för Medicinsk Forskning), the Lars Hierta Memorial Foundation (Stiftelsen Lars Hiertas Minne) and from the Thuring Foundation. JK was the recipient of a KI Distinguished Professor Award and a Royal Society Wolfson Research Excellence Award.
PS received support from the KI Strategic Neurosciences Program.

Availability of data and materials
All data generated or analysed during this study are included in this published article (and its additional files).

Authors’ contributions
PS, JK and COD conceived and supervised the study. DST performed the database work, data analysis, experimental validation work and made all the Figures. MV contributed to the database work and data analysis presented in Table 1, Fig. 3 and Fig. 4. EMJ contributed to the experimental validation work presented in Table 1 and Fig. 5. AM performed the database work and data analyses presented in Fig. 4. GL contributed to the database work and data analysis presented in Fig. 2. SM and GL performed and analyzed all the protein work. DST, JK and PS wrote and edited the manuscript. All authors read and approved the final version of the manuscript.

Ethics approval and consent to participate
For work with mammalian cell cultures the authors are in possession of the applicable permits for carrying out studies with genetically modified micro-organisms / GMMs (Dnr 5.8.18–1012/14; Dnr 5.5.18–6998/15).

Consent for publication
This section is not applicable.

Competing interests
The authors declare that they have no competing interests.

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author details
1 Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden. 2 Department of Biology, Bioinformatics Centre, Section for Computational and RNA Biology, University of Copenhagen, Copenhagen, Denmark. 3 Department of Medical Genetics, Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, University of British Columbia, Vancouver, Canada.
4 Centre for Molecular Medicine Norway (NCMM), Nordic EMBL partnership, University of Oslo, Oslo, Norway. 5 Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, Oslo, Norway. 6 Department of Laboratory Medicine, Karolinska Institutet, Huddinge, Sweden. 7 Science for Life Laboratory, Karolinska Institutet, Stockholm, Sweden. 8 School of Basic and Medical Biosciences, King’s College London, London, UK. 9 Folkhälsan Institute of Genetics and Molecular Neurology Research Program, University of Helsinki, Helsinki, Finland.

Received: 3 November 2017 Accepted: 21 February 2018

References
1. Gajiwala KS, Chen H, Cornille F, Roques BP, Reith W, Mach B, Burley SK. Structure of the winged-helix protein hRFX1 reveals a new mode of DNA binding. Nature. 2000;403:916–21.
2. Piasecki BP, Burghoorn J, Swoboda P. Regulatory factor X (RFX)-mediated transcriptional rewiring of ciliary genes in animals. Proc Natl Acad Sci. 2010;107(29):12969–74.
3. Durand B, Vandaele C, Spencer D, Pantalacci S, Couble P. Cloning and characterization of dRFX, the Drosophila member of the RFX family of transcription factors. Gene. 2000;246(1–2):285–93.
4. Otsuki K, Hayashi Y, Kato M, Yoshida H, Yamaguchi M. Characterization of dRFX2, a novel RFX family protein in Drosophila. Nucleic Acids Res. 2004;32(18):5636–48.
5. Reith W, Herrero-Sanchez C, Kobr M, Silacci P, Berte C, Barras E, Fey S, Mach B. MHC class II regulatory factor RFX has a novel DNA binding domain and functionally independent dimerization domain. Genes Dev. 1990;4:1528–40.
6. Reith W, Ucla C, Barras E, Gaud A, Durand B, Herrero-Sanchez C, Kobr M, Mach B. RFX1, a transactivator of hepatitis B virus enhancer I, belongs to a novel family of homodimeric and heterodimeric DNA-binding proteins. Mol Cell Biol. 1994;14(2):1230–44.
7. Emery P, Durand B, Mach B, Reith W. RFX proteins, a novel family of DNA binding proteins conserved in the eukaryotic kingdom. Nucleic Acids Res. 1996;24:803–7.
8. Swoboda P, Adler HT, Thomas JH. The RFX-type transcription factor DAF-19 regulates sensory neuron cilium formation in C. elegans. Mol Cell. 2000;5(3):411–21.
9. Aftab S, Semenec L, Chu J, Chen N. Identification and characterization of novel human tissue-specific RFX transcription factors. BMC Evol Biol. 2008;8(1):226.
10. Chu J, Baillie D, Chen N. Convergent evolution of RFX transcription factors and ciliary genes predated the origin of metazoans. BMC Evol Biol. 2010;10(1):130.
11. Zaim J, Speina E, Kierzek AM. Identification of new genes regulated by the Crt1 transcription factor, an effector of the DNA damage checkpoint pathway in Saccharomyces cerevisiae. J Biol Chem. 2005;280(1):28–37.
12. Garg A, Futcher B, Leatherwood J. A new transcription factor for mitosis: in Schizosaccharomyces pombe, the RFX transcription factor Sak1 works with forkhead factors to regulate mitotic expression. Nucleic Acids Res. 2015;43(14):6874–88.
13. Reith W, Mach B. The bare lymphocyte syndrome and the regulation of MHC expression. Annu Rev Immunol. 2001;19:331–73.
14. Senti G, Swoboda P. Distinct isoforms of the RFX transcription factor DAF-19 regulate ciliogenesis and maintenance of synaptic activity. Mol Biol Cell. 2008;19(12):5517–28.
15. Senti G, Ezcurra M, Löbner J, Schafer WR, Swoboda P. Worms with a single functional sensory cilium generate proper neuron-specific behavioral output. Genetics. 2009;183(2):595–605.
16. Choksi SP, Lauter G, Swoboda P, Roy S. Switching on cilia: transcriptional networks regulating ciliogenesis. Development. 2014;141(7):1427–41.
17. Wu Y, Hu X, Li Z, Wang M, Li S, Wang X, Lin X, Liao S, Zhang Z, Feng X, et al. Transcription factor RFX2 is a key regulator of mouse spermiogenesis. Sci Rep. 2016;6:20435.
18. Magnani D, Morle L, Hasenpusch-Theil K, Paschaki M, Jacoby M, Schurmans S, Durand B, Theil T. The ciliogenic transcription factor Rfx3 is required for the formation of the thalamocortical tract by regulating the patterning of prethalamus and ventral telencephalon. Hum Mol Genet. 2015;24(9):2578–93.
19. Baas D, Meiniel A, Benadiba C, Bonnafe E, Meiniel O, Reith W, Durand B. A deficiency in RFX3 causes hydrocephalus associated with abnormal differentiation of ependymal cells. Eur J Neurosci. 2006;24:1020–30.
20. Ait-Lounis A, Baas D, Barras E, Benadiba C, Charollais A, Nlend Nlend R, Liegeois D, Meda P, Durand B, Reith W. Novel function of the ciliogenic transcription factor RFX3 in development of the endocrine pancreas. Diabetes. 2007;56:950–9.
21. El Zein L, Ait-Lounis A, Morlé L, Thomas J, Chhin B, Spassky N, Reith W, Durand B. RFX3 governs growth and beating efficiency of motile cilia in mouse and controls the expression of genes involved in human ciliopathies. J Cell Sci. 2009;122(17):3180–9.
22. Chen N, Mah A, Blacque OE, Chu J, Phgora K, Bakhoum MW, Hunt Newbury CR, Khattra J, Chan S, Go A, et al. Identification of ciliary and ciliopathy genes in Caenorhabditis elegans through comparative genomics. Genome Biol. 2006;7(12):R126.
23. Laurencon A, Dubruille R, Efimenko E, Grenier G, Bissett R, Cortier E, Rolland V, Swoboda P, Durand B. Identification of novel regulatory factor X (RFX) target genes by comparative genomics in Drosophila species. Genome Biol. 2007;8(9):R195.
24. Thomas J, Morlé L, Soulavie F, Laurençon A, Sagnol S, Durand B. Transcriptional control of genes involved in ciliogenesis: a first step in making cilia. Biol Cell. 2010;102(9):499–513.
25. Emery P, Strubin M, Hofmann K, Bucher P, Mach B, Reith W. A consensus motif in the RFX DNA binding domain and binding domain mutants with altered specificity. Mol Cell Biol. 1996;16(8):4486–94.
26. Blacque OE, Perens EA, Boroevich KA, Inglis PN, Li C, Warner A, Khattra J, Holt RA, Ou G, Mah AK, et al. Functional genomics of the cilium, a sensory organelle. Curr Biol. 2005;15:935–41.
27. Efimenko E, Bubb K, Mak HY, Holzman T, Leroux MR, Ruvkun G, Thomas JH, Swoboda P. Analysis of xbx genes in C. elegans. Development. 2005;132(8):1923–34.
28. Jolma A, Yan J, Whitington T, Toivonen J, Nitta Kazuhiro R, Rastas P, Morgunova E, Enge M, Taipale M, Wei G, et al. DNA-binding specificities of human transcription factors. Cell. 2013;152(1–2):327–39.
29. Beckers A, Alten L, Viebahn C, Andre P, Gossler A. The mouse homeobox gene Noto regulates node morphogenesis, notochordal ciliogenesis, and left–right patterning. Proc Natl Acad Sci. 2007;104(40):15765–70.
30. Soyer J, Flasse L, Raffelsberger W, Beucher A, Orvain C, Peers B, Ravassard P, Vermot J, Voz ML, Mellitzer G, et al. Rfx6 is an Ngn3-dependent winged helix transcription factor required for pancreatic islet cell development. Development. 2010;137(2):203–12.
31. Cachero S, Simpson TI, zur Lage PI, Ma L, Newton FG, Holohan EE, Armstrong JD, Jarman AP. The gene regulatory cascade linking proneural specification with differentiation in Drosophila sensory neurons. PLoS Biol. 2011;9(1):e1000568.
32. Huang M, Zhou Z, Elledge SJ. The DNA replication and damage checkpoint pathways induce transcription by inhibition of the Crt1 repressor. Cell. 1998;94(5):595–605.
33. FANTOM Consortium, RIKEN PMI, CLST (DGT). A promoter-level mammalian expression atlas. Nature. 2014;507(7493):462–70.
34. Rousseau P, Masternak K, Krawczyk M, Reith W, Dausset J, Carosella ED, Moreau P. In vivo, RFX5 binds differently to the human leucocyte antigen-E, -F, and -G gene promoters and participates in HLA class I protein expression in a cell type-dependent manner. Immunology. 2004;111(1):53–65.
35. Smith SB, Qu H-Q, Taleb N, Kishimoto NY, Scheel DW, Lu Y, Patch A-M, Grabs R, Wang J, Lynn FC, et al. Rfx6 directs islet formation and insulin production in mice and humans. Nature. 2010;463(7282):775–80.
36. Piccand J, Strasser P, Hodson David J, Meunier A, Ye T, Keime C, Birling M-C, Rutter Guy A, Gradwohl G. Rfx6 maintains the functional identity of adult pancreatic β cells. Cell Rep. 2014;9(6):2219–32.
37. Kistler WS, Horvath GC, Dasgupta A, Kistler MK. Differential expression of Rfx1-4 during mouse spermatogenesis. Gene Expr Patterns. 2009;9(7):515–9.
38. Feng C, Li J, Zuo Z. Expression of the transcription factor regulatory factor X1 in mouse brain. Folia Histochemica et Cytobiologica / Polish Academy of Sciences, Polish Histochemical and Cytochemical Society. 2011;49(2):344–51.
39. Manojlovic Z, Earwood R, Kato A, Stefanovic B, Kato Y. RFX7 is required for the formation of cilia in the neural tube. Mech Dev. 2014;132(0):28–37.
40. Chung M-I, Peyrot SM, LeBoeuf S, Park TJ, McGary KL, Marcotte EM, Wallingford JB. RFX2 is broadly required for ciliogenesis during vertebrate development. Dev Biol. 2012;363(1):155–65.
41. La Manno G, Gyllborg D, Codeluppi S, Nishimura K, Salto C, Zeisel A, Borm Lars E, Stott Simon RW, Toledo Enrique M, Villaescusa JC, et al. Molecular Diversity of Midbrain Development in Mouse, Human, and Stem Cells. Cell. 2016;167(2):566–80. e519
42. Didon L, Zwick R, Chao IW, Walters M, Wang R, Hackett N, Crystal R. RFX3 modulation of FOXJ1 regulation of cilia genes in the human airway epithelium. Respir Res. 2013;14(1):70.
43. Morotomi-Yano K, Yano K-I, Saito H, Sun Z, Iwama A, Miki Y. Human regulatory factor X 4 (RFX4) is a testis-specific dimeric DNA-binding protein that cooperates with other human RFX members. J Biol Chem. 2002;277(1):836–42.
44. Matsushita H, Uenaka A, Ono T, Hasegawa K, Sato S, Koizumi F, Nakagawa K, Toda M, Shingo T, Ichikawa T, et al. Identification of glioma-specific RFX4-E and -F isoforms and humoral immune response in patients. Cancer Sci. 2005;96(11):801–9.
45. Zhang D, Zeldin DC, Blackshear PJ. Regulatory factor X4 variant 3: a transcription factor involved in brain development and disease. J Neurosci Res. 2007;85:3515–22.
46. Zhang D, Stumpo DJ, Graves JP, DeGraff LM, Grissom SF, Collins JB, Li L, Zeldin DC, Blackshear PJ. Identification of potential target genes for RFX4_v3, a transcription factor critical for brain development. J Neurochem. 2006;98(3):860–75.
47. 
Mathelier A, Fornes O, Arenillas DJ, Chen C-Y, Denay G, Lee J, Shi W, Shyr C,Tan G, Worsley-Hunt R, et al. JASPAR 2016: a major expansion and updateof the open-access database of transcription factor binding profiles. NucleicAcids Res. 2016;44(D1):D110–5.48. Grant CE, Bailey TL, Noble WS. FIMO: scanning for occurrences of a givenmotif. Bioinformatics. 2011;27(7):1017–8.49. Andersson R, Gebhard C, Miguel-Escalada I, Hoof I, Bornholdt J, Boyd M,Chen Y, Zhao X, Schmidl C, Suzuki T, et al. An atlas of active enhancersacross human cell types and tissues. Nature. 2014;507(7493):455–61.50. Kwon AT, Arenillas DJ, Hunt RW, Wasserman WW: oPOSSUM-3: AdvancedAnalysis of Regulatory Motif Over-Representation Across Genes or ChIP-SeqDatasets. G3: Genes|Genomes|Genetics 2012, 2(9):987–1002.51. Hellemans J, Mortier G, De Paepe A, Speleman F, Vandesompele J. qBase relativequantification framework and software for management and automated analysisof real-time quantitative PCR data. Genome Biol. 2007;8(2):R19.52. Båvner A, Sanyal S, Gustafsson J-Å, Treuter E. Transcriptional corepression bySHP: molecular mechanisms and physiological consequences. TrendsEndocrinol & Metabolism. 2005;16(10):478–88.53. Liang H, Xiao G, Yin H, Hippenmeyer S, Horowitz JM, Ghashghaei HT. Neuraldevelopment is dependent on the function of specificity protein 2 in cellcycle progression. Development. 2013;140(3):552–61.54. Bondesson M, Hao R, Lin C-Y, Williams C, Gustafsson J-Å. Estrogen receptorsignaling during vertebrate development. Biochimica et Biophysica Acta(BBA) - Gene Regulatory Mechanisms. 2015;1849(2):142–51.55. Lonze BE, Ginty DD. Function and regulation of CREB family transcriptionfactors in the nervous system. Neuron. 2002;35(4):605–23.56. Whittington N, Cunningham D, Le T-K, De Maria D, Silva EM. Sox21regulates the progression of neuronal differentiation in a dose-dependentmanner. Dev Biol. 2015;397(2):237–47.57. 
Wang J, Galvao J, Beach KM, Luo W, Urrutia RA, Goldberg JL, Otteson DC.Novel roles and mechanism for Krüppel-like factor 16 (KLF16) regulation ofneurite outgrowth and Ephrin receptor A5 (EphA5) expression in retinalganglion cells. J Biol Chem. 2016;291(35):18084–95.58. Quach DH, Oliveira-Fernandes M, Gruner KA, Tourtellotte WG. Asympathetic neuron autonomous role for Egr3-mediated generegulation in dendrite morphogenesis and target tissue innervation.J Neurosci. 2013;33(10):4570–83.59. Poirier R, Cheval H, Mailhes C, Garel S, Charnay P, Davis S, Laroche S. Distinctfunctions of Egr gene family members in cognitive processes. Front Neurosci.2008;2(1):47–55.60. Kasberg AD, Brunskill EW, Steven Potter S. SP8 regulates signaling centersduring craniofacial development. Dev Biol. 2013;381(2):312–23.61. Li S, Miao T, Sebastian M, Bhullar P, Ghaffari E, Liu M, Symonds Alistair LJ,Wang P. The transcription factors Egr2 and Egr3 are essential for the controlof inflammation and antigen-induced proliferation of B and T cells.Immunity. 2012;37(4):685–96.62. Shahrin NH, Diakiw S, Dent LA, Brown AL, D’Andrea RJ. Conditionalknockout mice demonstrate function of Klf5 as a myeloid transcriptionfactor. Blood. 2016;128(1):55–9.63. Meinders M, Kulu DI, van de Werken HJG, Hoogenboezem M, Janssen H,Brouwer RWW, van Ijcken WFJ, Rijkers E-J, Demmers JAA, Krüger I, et al. Sp1/Sp3 transcription factors regulate hallmarks of megakaryocyte maturationand platelet formation and function. Blood. 2015;125(12):1957–67.64. Yoshida K, Maekawa T, Zhu Y, Renard-Guillet C, Chatton B, Inoue K,Uchiyama T, Ishibashi K-I, Yamada T, Ohno N, et al. The transcription factorATF7 mediates lipopolysaccharide-induced epigenetic changes inmacrophages involved in innate immunological memory. Nat Immunol.2015;16(10):1034–43.65. Ma L, Quigley I, Omran H, Kintner C. Multicilin drives centriole biogenesisvia E2f proteins. Genes Dev. 2014;28(13):1461–71.66. Chen H-Z, Tsai S-Y, Leone G. 
Emerging roles of E2Fs in cancer: an exit fromcell cycle control. Nat Rev Cancer. 2009;9(11):785–97.67. Jen J, Wang Y-C. Zinc finger proteins in cancer progression. J Biomed Sci.2016;23(1):53.68. Eguchi T, Prince T, Wegiel B, Calderwood SK. Role and regulation of myeloidzinc finger protein 1 in cancer. J Cell Biochem. 2015;116(10):2146–54.69. Schemmer J, Araúzo-Bravo MJ, Haas N, Schäfer S, Weber SN, Becker A,Eckert D, Zimmer A, Nettersheim D, Schorle H. Transcription factor TFAP2Cregulates major programs required for murine fetal germ cell maintenanceand Haploinsufficiency predisposes to Teratomas in male mice. PLoS One.2013;8(8):e71113.70. Heinrich R, Livne E, Ben-Izhak O, Aronheim A. The c-Jun dimerizationprotein 2 inhibits cell transformation and acts as a tumor suppressor gene.J Biol Chem. 2004;279(7):5708–15.71. Shi Y, Zhang L, Song S, Teves ME, Li H, Wang Z, Hess RA, Jiang G,Zhang Z. The mouse transcription factor-like 5 gene encodes a proteinlocalized in the manchette and centriole of the elongating spermatid.Andrology. 2013;1(3):431–9.72. Lubelsky Y, Reuven N, Shaul Y. Autorepression of Rfx1 gene expression:functional conservation from yeast to humans in response to DNAreplication arrest. Mol Cell Biol. 2005;25(23):10665–73.73. Tammimies K, Bieder A, Lauter G, Sugiaman-Trapman D, Torchet R,Hokkanen M-E, Burghoorn J, Castrén E, Kere J, Tapia-Páez I, et al. Ciliarydyslexia candidate genes DYX1C1 and DCDC2 are regulated by regulatoryfactor (RF) X transcription factors through X-box promoter motifs. FASEB J.2016;30(10):3578–87.74. Henriksson J, Piasecki BP, Lend K, Bürglin TR, Swoboda P: Chapter Sixteen -Finding Ciliary Genes: A Computational Approach. In: Methods inEnzymology. Edited by Wallace FM, vol. 525. Amsterdam: Academic Press;2013: 327–350.75. Burghoorn J, Piasecki BP, Crona F, Phirke P, Jeppsson KE, Swoboda P. The invivo dissection of direct RFX-target gene promoters in C. Elegans reveals anovel cis-regulatory element, the C-box. Dev Biol. 
2012;368(2):415–26.76. Nguyen TA, Jones RD, Snavely AR, Pfenning AR, Kirchner R, Hemberg M,Gray JM. High-throughput functional comparison of promoter andenhancer activities. Genome Res. 2016;26(8):1023–33.77. Quigley IK, Kintner C. Rfx2 stabilizes Foxj1 binding at chromatin loopsto enable multiciliated cell gene expression. PLoS Genet. 2017;13(1):e1006538.78. Chandra V, Albagli-Curiel O, Hastoy B, Piccand J, Randriamampita C, VaillantE, Cavé H, Busiah K, Froguel P, Vaxillaire M, et al. RFX6 regulates insulinsecretion by modulating Ca2+ homeostasis in human β cells. Cell Rep.2014;9(6):2206–18.79. Sengupta PK, Fargo J, Smith BD. The RFX family interacts at the collagen (COL1A2)start site and represses transcription. J Biol Chem. 2002;277(28):24926–37.80. Wang B, Qi T, Chen S-Q, Ye L, Huang Z-S, Li H. RFX1 maintains testis cordintegrity by regulating the expression of Itga6 in male mouse embryos. MolReprod Dev. 2016;83(7):606–14.81. Elkon R, Milon B, Morrison L, Shah M, Vijayakumar S, Racherla M, Leitch CC,Silipino L, Hadi S, Weiss-Gayet M, et al. RFX transcription factors are essentialfor hearing in mice. Nat Commun. 2015;6:8549.Sugiaman-Trapman et al. BMC Genomics  (2018) 19:181 Page 14 of 1582. Low JT, Zavortink M, Mitchell JM, Gan WJ, Do OH, Schwiening CJ, GaisanoHY, Thorn P. Insulin secretion from beta cells in intact mouse islets istargeted towards the vasculature. Diabetologia. 2014;57(8):1655–63.83. Feng C, Xu W, Zuo Z. Knockout of the regulatory factor X1 gene leads toearly embryonic lethality. Biochem Biophys Res Commun. 2009;386(4):715–7.84. Reiter JF, Leroux MR. Genes and molecular pathways underpinning ciliopathies.Nat Rev Mol Cell Biol. 2017;18(9):533–47.85. Varshney A, Scott LJ, Welch RP, Erdos MR, Chines PS, Narisu N, AlbanusRDO, Orchard P, Wolford BN, Kursawe R, et al. Genetic regulatory signaturesunderlying islet gene expression and type 2 diabetes. Proc Natl Acad Sci.2017;114(9):2301–6.86. 
Bae B-I, Tietjen I, Atabay KD, Evrony GD, Johnson MB, Asare E, Wang PP,Murayama AY, Im K, Lisgo SN, et al. Evolutionarily dynamic alternativesplicing of GPR56 regulates regional cerebral cortical patterning. Science.2014;343(6172):764–8.87. Purvis TL, Hearn T, Spalluto C, Knorz VJ, Hanley KP, Sanchez-Elsner T, HanleyNA, Wilson DI. Transcriptional regulation of the Alström syndrome geneALMS1 by members of the RFX family and Sp1. Gene. 2010;460(1–2):20–9.88. Nanjappa MK, Hess RA, Medrano TI, Locker SH, Levin ER, Cooke PS.Membrane-localized estrogen receptor 1 is required for normal malereproductive development and function in mice. Endocrinology. 2016;157(7):2909–19.89. Hess RA. Small tubules, surprising discoveries: from efferent ductules in theturkey to the discovery that estrogen receptor alpha is essential for fertilityin the male. Anim Reprod. 2015;12(1):7–23.90. Lizio M, Harshbarger J, Shimoji H, Severin J, Kasukawa T, Sahin S, Abugessaisa I,Fukuda S, Hori F, Ishikawa-Kato S, et al. Gateways to the FANTOM5 promoterlevel mammalian expression atlas. Genome Biol. 2015;16(1):22.91. Notredame C, Higgins DG, Heringa J. T-coffee: a novel method for fast andaccurate multiple sequence alignment. J Mol Biol. 2000;302(1):205–17.92. Liu W, Xie Y, Ma J, Luo X, Nie P, Zuo Z, Lahrmann U, Zhao Q, Zheng Y, ZhaoY, et al. IBS: an illustrator for the presentation and visualization of biologicalsequences. Bioinformatics. 2015;31(20):3359–61.93. Wickham H: ggplot2: Elegant Graphics for Data Analysis, vol. VIII, 213: Springer-Verlag New York; 2009.94. R Development Core Team: R: A Language and Environment for StatisticalComputing. www.R-project.org/:. R Foundation for Statistical Computing; 2016.Accessed 20 Sept 2017.95. Warnes GR, Bolker B, Bonebakker L, Gentleman R, Huber W, Liaw A, LumleyT, Maechler M, Magnusson A, Moeller S et al: gplots: Various R ProgrammingTools for Plotting Data. R package version 3.0.1. https://CRAN.R-project.org/package=gplots; 2016. 
Accessed 28 Mar 2017.•  We accept pre-submission inquiries •  Our selector tool helps you to find the most relevant journal•  We provide round the clock customer support •  Convenient online submission•  Thorough peer review•  Inclusion in PubMed and all major indexing services •  Maximum visibility for your researchSubmit your manuscript atwww.biomedcentral.com/submitSubmit your next manuscript to BioMed Central and we will help you at every step:Sugiaman-Trapman et al. BMC Genomics  (2018) 19:181 Page 15 of 15






