- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Faculty Research and Publications /
- Improving analysis of transcription factor binding...
Open Collections
UBC Faculty Research and Publications
Improving analysis of transcription factor binding sites within ChIP-Seq data based on topological motif enrichment Worsley Hunt, Rebecca; Mathelier, Anthony; del Peso, Luis; Wasserman, Wyeth W.
Abstract
Background: Chromatin immunoprecipitation (ChIP) coupled to high-throughput sequencing (ChIP-Seq) techniques can reveal DNA regions bound by transcription factors (TF). Analysis of the ChIP-Seq regions is now a central component in gene regulation studies. The need remains strong for methods to improve the interpretation of ChIP-Seq data and the study of specific TF binding sites (TFBS). Results We introduce a set of methods to improve the interpretation of ChIP-Seq data, including the inference of mediating TFs based on TFBS motif over-representation analysis and the subsequent study of spatial distribution of TFBSs. TFBS over-representation analysis applied to ChIP-Seq data is used to detect which TFBSs arise more frequently than expected by chance. Visualization of over-representation analysis results with new composition-bias plots reveals systematic bias in over-representation scores. We introduce the BiasAway background generating software to resolve the problem. A heuristic procedure based on topological motif enrichment relative to the ChIP-Seq peaks’ local maximums highlights peaks likely to be directly bound by a TF of interest. The results suggest that on average two-thirds of a ChIP-Seq dataset’s peaks are bound by the ChIP’d TF; the origin of the remaining peaks remaining undetermined. Additional visualization methods allow for the study of both inter-TFBS spatial relationships and motif-flanking sequence properties, as demonstrated in case studies for TBP and ZNF143/THAP11. Conclusions Topological properties of TFBS within ChIP-Seq datasets can be harnessed to better interpret regulatory sequences. Using GC content corrected TFBS over-representation analysis, combined with visualization techniques and analysis of the topological distribution of TFBS, we can distinguish peaks likely to be directly bound by a TF. The new methods will empower researchers for exploration of gene regulation and TF binding.
Item Metadata
Title |
Improving analysis of transcription factor binding sites within ChIP-Seq data based on topological motif enrichment
|
Creator | |
Contributor | |
Publisher |
BioMed Central
|
Date Issued |
2014-06-13
|
Description |
Background:
Chromatin immunoprecipitation (ChIP) coupled to high-throughput sequencing (ChIP-Seq) techniques can reveal DNA regions bound by transcription factors (TF). Analysis of the ChIP-Seq regions is now a central component in gene regulation studies. The need remains strong for methods to improve the interpretation of ChIP-Seq data and the study of specific TF binding sites (TFBS).
Results
We introduce a set of methods to improve the interpretation of ChIP-Seq data, including the inference of mediating TFs based on TFBS motif over-representation analysis and the subsequent study of spatial distribution of TFBSs. TFBS over-representation analysis applied to ChIP-Seq data is used to detect which TFBSs arise more frequently than expected by chance. Visualization of over-representation analysis results with new composition-bias plots reveals systematic bias in over-representation scores. We introduce the BiasAway background generating software to resolve the problem. A heuristic procedure based on topological motif enrichment relative to the ChIP-Seq peaks’ local maximums highlights peaks likely to be directly bound by a TF of interest. The results suggest that on average two-thirds of a ChIP-Seq dataset’s peaks are bound by the ChIP’d TF; the origin of the remaining peaks remaining undetermined. Additional visualization methods allow for the study of both inter-TFBS spatial relationships and motif-flanking sequence properties, as demonstrated in case studies for TBP and ZNF143/THAP11.
Conclusions
Topological properties of TFBS within ChIP-Seq datasets can be harnessed to better interpret regulatory sequences. Using GC content corrected TFBS over-representation analysis, combined with visualization techniques and analysis of the topological distribution of TFBS, we can distinguish peaks likely to be directly bound by a TF. The new methods will empower researchers for exploration of gene regulation and TF binding.
|
Subject | |
Genre | |
Type | |
Language |
eng
|
Date Available |
2015-11-15
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
Attribution 4.0 International (CC BY 4.0)
|
DOI |
10.14288/1.0228397
|
URI | |
Affiliation | |
Citation |
BMC Genomics. 2014 Jun 13;15(1):472
|
Publisher DOI |
10.1186/1471-2164-15-472
|
Peer Review Status |
Reviewed
|
Scholarly Level |
Faculty
|
Copyright Holder |
Worsley Hunt et al.; licensee BioMed Central Ltd.
|
Rights URI | |
Aggregated Source Repository |
DSpace
|
Item Media
Item Citations and Data
Rights
Attribution 4.0 International (CC BY 4.0)