The mission of the Bioinformatics and Genomics (BG) program is to train a new class of scientists whose primary identity is as computational biologists/bioinformaticians, and whose disciplinary core draws from an emerging set of principles on how to generate, analyze, and use large biological data sets. The program develops a cadre of young scientists who, in addition to being fluent in the contemporary life sciences, can think algorithmically and statistically and can use computational and statistical tools. These young scientists will generate both biological discoveries and innovations in computational and statistical methods to keep pace with the quickly evolving landscape of high-throughput “omics” technologies.
The BG program focuses on comprehensive analysis of biological data. Its core principles are:
- Transparency: All steps in data acquisition, processing and analysis must be clearly described using documented methods and freely available tools.
- Reproducibility: All steps in data acquisition, processing and analysis must be repeatable by other parties, and must generate equivalent results.
- Statistical Soundness: Results must be reached by means of sound and appropriate statistical methodology.
- Robustness: Results must be robust to arbitrary choices in the data acquisition, processing and analysis pipeline. This is established by varying parameters to identify results that remain consistent over reasonable parameter ranges.
- Biological Insight and Validation: Conclusions drawn from results must improve our understanding of biological processes and mechanisms, and be amenable to experimental validation in biological systems, using genetic, biochemical, evolutionary, or ecological approaches as appropriate.
- Efficiency: Computational and statistical tools intended for community use must be optimized for running time and memory usage, whenever such optimization affects their applicability.
The main objectives of the BG program are to:
- Provide students with comprehensive training in the use and development of advanced computational and statistical techniques needed to collect, process, analyze, integrate, and interpret large and complex “omics” data;
- Provide students with an in-depth understanding of the use of these techniques for addressing basic and applied research questions in the life sciences, and with the skills to communicate within interdisciplinary work teams, as well as to a broad target audience;
- Enhance the collaborative environment and facilitate productive interactions and creative efforts in “omics” research.
These objectives will be achieved through a set of required and elective courses and training activities designed to cover the following areas:
- Foundations of genomics and molecular genetics
- Sequencing technologies, genome assembly, alignments, read mapping
- Basic programming and scripting for bioinformatics
- Algorithm development in bioinformatics
- Applied statistics, and statistical methods for “omics” data
- Competence in R and equivalent statistical software
- Transcriptome analyses and techniques (microarray, RNA-seq)
- Comparative genomics, molecular evolution, function inferred from signatures of negative and positive selection
- Finding and functional analysis of protein-coding genes
- Genome variation, mutagenesis, connections to phenotypes
- Genome mapping, Mendelian inheritance of genes and DNA markers.