UBC Faculty Research and Publications
Pegasys: software for executing and integrating analyses of biological sequences
Shah, Sohrab P.; He, David Y.; Sawkins, Jessica N.; Druce, Jeffrey C.; Quon, Gerald; Lett, Drew; Zheng, Grace X.; Xu, Tao; Ouellette, B. F. F.
Abstract
Background
We present Pegasys – a flexible, modular and customizable software system that facilitates the execution and data integration from heterogeneous biological sequence analysis tools.
Results
The Pegasys system includes numerous tools for pair-wise and multiple sequence alignment, ab initio gene prediction, RNA gene detection, masking repetitive sequences in genomic DNA as well as filters for database formatting and processing raw output from various analysis tools. We introduce a novel data structure for creating workflows of sequence analyses and a unified data model to store its results. The software allows users to dynamically create analysis workflows at run-time by manipulating a graphical user interface. All non-serial dependent analyses are executed in parallel on a compute cluster for efficiency of data generation. The uniform data model and backend relational database management system of Pegasys allow for results of heterogeneous programs included in the workflow to be integrated and exported into General Feature Format for further analyses in GFF-dependent tools, or GAME XML for import into the Apollo genome editor. The modularity of the design allows for new tools to be added to the system with little programmer overhead. The database application programming interface allows programmatic access to the data stored in the backend through SQL queries.
Conclusions
The Pegasys system enables biologists and bioinformaticians to create and manage sequence analysis workflows. The software is released under the Open Source GNU General Public License. All source code and documentation is available for download at http://bioinformatics.ubc.ca/pegasys/.
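The abstract's statement that analyses without serial dependencies run in parallel can be illustrated with a small sketch. This is not the Pegasys implementation: the step names, the `run_workflow` function, and the use of a local thread pool in place of a compute cluster are all assumptions made for illustration. The workflow is modelled as a dependency graph, and every step whose prerequisites have finished is dispatched in the same batch.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(dependencies, run_step, max_workers=4):
    """Run each step after its prerequisites; independent steps run concurrently.

    `dependencies` maps a step name to the set of steps it depends on.
    Returns the batches of step names in the order they were dispatched.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every step returned here has all of its prerequisites done.
            ready = sorted(sorter.get_ready())
            batches.append(ready)
            # Simplification: wait for the whole batch before moving on.
            list(pool.map(run_step, ready))
            sorter.done(*ready)
    return batches


# Hypothetical workflow; step names are illustrative, not Pegasys identifiers.
workflow = {
    "mask_repeats": set(),
    "blast_search": {"mask_repeats"},
    "gene_prediction": {"mask_repeats"},
    "trna_detection": set(),
    "export_gff": {"blast_search", "gene_prediction", "trna_detection"},
}

completed = []
batches = run_workflow(workflow, completed.append)
for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {', '.join(batch)}")
```

In this sketch the masking and tRNA steps share the first batch because neither depends on the other, while the export step waits for all three analyses that feed it. A real scheduler would likely submit each step as soon as its own prerequisites finish rather than waiting for a full batch.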
Item Metadata
Title |
Pegasys: software for executing and integrating analyses of biological sequences
|
Publisher |
BioMed Central
|
Date Issued |
2004-04-19
|
Language |
eng
|
Date Available |
2015-10-24
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
Attribution 4.0 International (CC BY 4.0)
|
DOI |
10.14288/1.0228391
|
Citation |
BMC Bioinformatics. 2004 Apr 19;5(1):40
|
Publisher DOI |
10.1186/1471-2105-5-40
|
Peer Review Status |
Reviewed
|
Scholarly Level |
Faculty
|
Copyright Holder |
Shah et al
|
Aggregated Source Repository |
DSpace
|