GUEST SPEAKER: Simon Kasif, University of Illinois at Chicago & Johns Hopkins University
TALK TITLE: DATASCOPE: A Computational Learning Approach for Whole-Genome Interpretation
Abstract: In 1995 the first complete genome, H. influenzae, was sequenced and published by The Institute for Genomic Research (TIGR). Since then, the genomes of more than 20 organisms have been completely sequenced by TIGR, the Sanger Centre, Berkeley-Stanford, and other genome centers. At the same time, progress on sequencing the human genome is accelerating, with completion of the entire 3.3 billion base sequence expected soon. This breakthrough in genomic technology has vast implications for the life sciences and health care. Our research aims to develop novel approaches for learning from complex data that help us understand the fundamental connections between genetic sequence and the biological function of living organisms. We can take advantage of the fortunate confluence of the problem of understanding DNA and basic research questions in machine learning and computational modeling. DNA encodes the rules of life in sequence patterns, and many of these patterns can be deciphered using advanced computational methods. In this talk we will describe our gene identification system, Glimmer (http://www.tigr.org/softlab). The system combines new theoretical developments in learning with careful integration of biological knowledge to achieve over 97% accuracy in finding genes previously annotated by biologists. Glimmer has already been used to identify thousands of genes in many whole-genome sequencing efforts at TIGR and other genome centers. These include the genomes of organisms that cause Lyme disease, syphilis, tuberculosis, and malaria, as well as many others that are currently in progress. We will also describe a recent project for high-resolution alignment of complete genomes.
This project was supported by a new large-scale genomic comparison tool that can also be used for comparing human and mouse DNA, identifying SNPs in human DNA, finding tandem repeats, and other fundamental genomics tasks. We will conclude by discussing the important role computational genomics is likely to play in future medical research and treatment. This talk describes joint work with The Institute for Genomic Research (http://www.tigr.org).
Background for the talk: "Computational Methods in Molecular Biology", eds. Steven Salzberg, David Searls, and Simon Kasif, Elsevier Publ., 2nd Ed., 1999.