Software


BenchDW

BenchDW is a generic benchmark framework for evaluating biological data warehousing systems.

Link: http://warehousebenchmark.fungalgenomics.ca/benchmark/benchdw/index.html

Release: v1.1 (September 2012)

License: GNU GPLv3

Abstract: The rapid development of -omics techniques has provided an unprecedented amount of data, enabling system-wide biological research. However, the success of systems biology is contingent on the ability to integrate a wide variety of biological data types in order to automatically predict and assign functional annotations of proteins and to perform comparative analyses. Although each biological data integration system offers a number of desirable features, none of them meets all the requirements for effective integration of system-wide data.
BenchDW is a generic and flexible benchmark framework that aims at facilitating the evaluation and quantification of the capabilities of biological data warehouses. It currently comprises 13 different metrics, ranging from documentation quality to accuracy and response times, which may be recorded for different hardware configurations. Each metric can be weighted to better suit the user's specific needs and compared against a gold standard. BenchDW has been successfully used to benchmark 5 data integration systems using 20 typical biological queries on 4 different hardware configurations.
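
The weighting idea described above can be illustrated with a minimal sketch. This is not BenchDW's actual scoring code: the function names, the metric names, and the choice to normalise each metric to a ratio against the gold standard (capped to the range 0 to 1) are all assumptions made for illustration.

```python
from typing import Dict


def normalise_against_gold(value: float, gold: float,
                           higher_is_better: bool = True) -> float:
    """Map a raw measurement to [0, 1], where 1 means it matches
    (or beats) the gold standard. Hypothetical helper."""
    if gold <= 0:
        raise ValueError("gold standard value must be positive")
    if higher_is_better:
        ratio = value / gold
    else:
        # For metrics such as response time, a smaller value is better.
        if value <= 0:
            raise ValueError("value must be positive when lower is better")
        ratio = gold / value
    return max(0.0, min(1.0, ratio))


def weighted_score(scores: Dict[str, float],
                   weights: Dict[str, float]) -> float:
    """Weighted average of normalised metric scores.
    Metrics without an explicit weight default to a weight of 1."""
    total_weight = sum(weights.get(name, 1.0) for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(score * weights.get(name, 1.0)
                       for name, score in scores.items())
    return weighted_sum / total_weight


# Illustrative values only, not results from the benchmark.
example_scores = {
    "accuracy": 0.9,
    # 4 s measured against a 2 s gold standard, lower is better.
    "response_time": normalise_against_gold(4.0, 2.0, higher_is_better=False),
}
example_weights = {"accuracy": 3.0, "response_time": 1.0}
overall = weighted_score(example_scores, example_weights)
```

A user who cares mostly about accuracy would raise that metric's weight, as in the example, so that a slow but accurate system still ranks well.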


EnzymeTracker

The EnzymeTracker is a web-based laboratory information management system for sample tracking.

Link: http://cubique.fungalgenomics.ca/enzymedb/index.html

Release: v1.5 (September 2011)

License: GNU GPLv3

Abstract: In many laboratories, researchers store experimental data on their own workstations using spreadsheets. However, this approach poses a number of problems, ranging from sharing issues to inefficient data-mining. Standard spreadsheets are also error-prone, as data do not undergo any validation process. To overcome the inherent limitations of spreadsheets, a number of proprietary systems have been developed, for which laboratories must pay expensive license fees. Those costs are prohibitive for most laboratories and prevent scientists from benefiting from more sophisticated data management systems.
The EnzymeTracker, a web-based laboratory information management system for sample tracking, is an open-source and flexible alternative that aims at facilitating entry, mining and sharing of experimental biological data. The EnzymeTracker features online spreadsheets and tools for monitoring numerous experiments conducted by several collaborators to identify and characterize samples. It also provides libraries of shared data such as protocols, and administration tools for data access control using OpenID and user/team management. Our system relies on a database management system for efficient data indexing and management, and on a user-friendly AJAX interface that can be accessed over the Internet. The EnzymeTracker facilitates data entry by dynamically suggesting entries, and provides smart data-mining tools to retrieve data effectively. Our system features a number of tools to visualize and annotate experimental data, and to export highly customizable reports. It also supports QR matrix barcoding to facilitate sample tracking.
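
As a rough illustration of what "dynamically suggesting entries" could look like, the sketch below returns previously entered values that match a typed prefix. It is not taken from the EnzymeTracker code base: the function name `suggest`, the sample values, and the use of a sorted in-memory list with binary search are all assumptions, and a real system would more likely query its database.

```python
import bisect
from typing import Iterable, List


def suggest(entries: Iterable[str], prefix: str, limit: int = 5) -> List[str]:
    """Return up to `limit` entries starting with `prefix`
    (case-insensitive), in alphabetical order. Hypothetical helper."""
    needle = prefix.lower()
    # Sort (lowercased key, original value) pairs so that all matches
    # for a given prefix form one contiguous run.
    ordered = sorted((entry.lower(), entry) for entry in set(entries))
    keys = [key for key, _ in ordered]
    # Binary search for the first key that is >= the prefix.
    start = bisect.bisect_left(keys, needle)
    matches: List[str] = []
    for key, original in ordered[start:]:
        if len(matches) >= limit or not key.startswith(needle):
            break
        matches.append(original)
    return matches


# Hypothetical values a user might already have entered in a column.
known_values = ["Cellulase", "Cellobiohydrolase", "Xylanase", "Chitinase"]
suggestions = suggest(known_values, "ce")
```

Reusing earlier values in this way also reduces the spelling variants that make spreadsheet data hard to mine.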


PROFESS

PROFESS is a biology database system that integrates databases describing PROtein Functions, Evolution, Structures and Sequences. PROFESS aims at assisting researchers in the functional and evolutionary analysis of the abundant number of novel proteins continually identified from whole-genome sequencing.

Link: http://cse.unl.edu/~profess/

Release: v1.5 (April 2010)

Abstract: The proliferation of biological databases and the easy access enabled by the Internet are having a beneficial impact on the biological sciences and transforming the way research is conducted. There are more than 1,100 molecular biology databases dispersed throughout the Internet. To assist in the functional, structural and evolutionary analysis of the large number of novel proteins continually identified from whole-genome sequencing, we introduce the PROFESS (PROtein Function, Evolution, Structure and Sequence) database. Our database is designed to be versatile and expandable, and does not confine analysis to a pre-existing set of data relationships. A fundamental component of this approach is an intuitive query system that incorporates a variety of similarity functions capable of generating data relationships not conceived during the creation of the database. The utility of PROFESS is demonstrated by the analysis of the structural drift of homologous proteins and by the identification of potential pancreatic cancer therapeutic targets based on the observation of protein-protein interaction networks.