I will present modeling and algorithmic designs for two challenging problems in biology and argue that efficient computational methods enable significant advances in our understanding of cell machinery and genome evolution. The first problem is the assembly of full-length transcripts -- the collection of expressed gene products in cells -- from noisy and highly fragmented data obtained through RNA sequencing. I first formulate this as a graph decomposition problem and then design an efficient algorithm for it that is guaranteed to preserve all long-range information. Integrating and assembling 7,000 RNA-seq samples with this algorithm yields a more complete human transcriptome and reveals many novel transcripts. The second problem is the reconstruction of a large phylogeny -- the evolutionary history of a large collection of extant species -- based on genome structures obtained from whole-genome sequencing. A basic computational problem here is to define and compute an evolutionary distance between whole genomes. I will describe our efficient exact and approximation algorithms for comparing genomes under various evolutionary models. These algorithms can uncover the evolutionary relationships between genes across many genomes, even very large mammalian genomes.
Mingfu Shao is currently a Lane Fellow in the Computational Biology Department, School of Computer Science, Carnegie Mellon University. He obtained his Ph.D. from EPFL (Swiss Federal Institute of Technology, Lausanne), Switzerland. His research interests include the development of efficient algorithms for combinatorial optimization and machine-learning problems, and their applications to computational biology and precision medicine. At CMU, he works on large-scale transcriptomics and has developed a new transcript assembler called Scallop. His Ph.D. research focused on comparative genomics. He was awarded the prestigious Dimitris N. Chorafas Foundation Award for his contributions to the design of innovative algorithms for problems in genome evolution.