Over the past two decades, computational biology has evolved into a scientific discipline in its own right, with its own textbooks, college degrees and placement opportunities. Our workshop will cover material fundamental to this field that is accessible at the undergraduate level, with the aim of providing a solid background and a glimpse of research avenues in this area. The hope is that this material may be incorporated into relevant undergraduate courses in discrete mathematics and computer science, both to motivate fundamental concepts and to expose undergraduates to this potentially exciting career track.
The thrust of this Reconnect program will be problems involving the integration of information from genetic sequences and evolutionary heritage. Each day of the week will focus on one topic, with three lectures and a group problem-solving session per day. Details follow:
Day 1: Sequence alignment and the dynamic programming technique: (i) Maximum Score Segment and 2-sequence alignment (ii) Local versus global alignments, multiple sequences and objective functions (iii) Gaps, affine penalties, parametric variants.
Day 2: Character-based methods for reconstructing phylogeny: (i) Perfect phylogeny - the boolean case (ii) General case and Parsimony (iii) Gene trees versus species trees.
Day 3: Distance-based methods for reconstructing phylogeny: (i) Ultrametric and additive reconstruction algorithms (ii) Finding closest fit evolutionary trees (iii) Maximum likelihood method.
Day 4: Integrating alignments and phylogeny reconstruction: (i) Tree alignment (ii) Multiple sequence alignments via trees (iii) Lawler's taxonomy of problems involving trees and alignment.
Day 5: Problems in SNP analysis: (i) Haplotyping via perfect phylogeny (ii) Blocks and their determination (iii) Glimpse of research problems and methods.
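As a small illustration of the flavor of the Day 1 material, the maximum score segment problem (find the contiguous run of scores with the largest total) can be solved by a one-pass dynamic program. The following is a minimal sketch in Python, not taken from the workshop itself: the function name `max_score_segment` and the sample scores are assumptions made purely for illustration.

```python
def max_score_segment(scores):
    """Return (best_sum, start, end) such that scores[start:end] is a
    non-empty contiguous segment with the maximum possible total."""
    if not scores:
        raise ValueError("scores must be non-empty")

    # cur_sum is the best total of a segment ending at the current position.
    best_sum = cur_sum = scores[0]
    best_start = cur_start = 0
    best_end = 1

    for i in range(1, len(scores)):
        if cur_sum < 0:
            # A negative running total can only hurt: start a new segment here.
            cur_sum = scores[i]
            cur_start = i
        else:
            # Otherwise extend the current segment by one position.
            cur_sum += scores[i]

        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i + 1

    return best_sum, best_start, best_end


# Hypothetical per-position scores along a sequence.
example = [2, -3, 4, -1, 2, -5, 3]
total, start, end = max_score_segment(example)
print(total, example[start:end])
```

The same idea of keeping only the best value "ending here" carries over to the two-dimensional tables used for pairwise alignment, where each cell depends on a constant number of neighboring cells.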
Prerequisites: Basic knowledge of graph theory (e.g., Eulerian vs. Hamiltonian cycles), algorithms (e.g., Kruskal's and Dijkstra's algorithms), data structures (e.g., recursion and analysis), linear algebra (e.g., linear programming) and probability (e.g., Bayes' rule).
Every participant is expected to join a group that will write up an educational module (5-10 pages) by the end of the week; acceptable modules will be recommended for publication in the DIMACS module series.