Pierre Baldi, University of California, Irvine, USA
From Genomes to Drug Leads: Integrative Systems Biology Approaches
These three lectures will be organized around three themes at three different
scales: Genomes and Omic Data, Integrative Systems Biology, and
Chemoinformatics.
In the Genomes and Omic Data theme, we will provide a brief historical
overview of genomics and survey our understanding of the human genome and
current challenges and opportunities in relation to human diseases and P4
(preventive, personalized, predictive, participatory) medicine. This will be
followed by an overview of the different kinds of omic data that are available
today.
In the Integrative Systems Biology theme, we will describe how omic and other
data can be efficiently integrated. We will demonstrate the Crick expert system,
which uses databases of transcription factors and Bayesian statistical
approaches to derive comprehensive, genome-wide maps of regulatory elements.
These maps are used to infer core regulatory circuits and loops. These
inferences in turn are augmented and refined by integrating them with other
data, such as: (1) Gene ontology; (2) Protein-protein interaction; (3) RNA; (4)
Gene expression; (5) Epigenetic modifications; (6) Chromatin and DNA 3D
structure; (7) SNPs; (8) Drugs; (9) Metabolites; and (10) Scientific
literature. This approach enables the identification of new regulatory
mechanisms and targets. Examples of collaborative projects based on the
predictions made by Crick will be described together with a relatively new
high-throughput technology for probing the response of the immune system, with
applications to antigen discovery and vaccines.
In the Chemoinformatics theme, we will provide an overview of the area,
including the available data on small molecules and their similarity measures,
and how to build databases of small molecules with the underlying efficient
compression, storage, search, and statistically significant retrieval
algorithms. We will also present machine learning methods for the prediction of
the physical, chemical, and biological properties of small molecules. Finally,
we will tie the three themes together and show how computational methods can
support the identification of novel drug leads.
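As a small illustration of the similarity-search idea mentioned above, the following is a minimal sketch, not the lecturer's implementation. It assumes that molecules are represented as binary fingerprints stored as sets of "on" bit indices and that similarity is measured with the Tanimoto coefficient, a common choice for such fingerprints. The names tanimoto, threshold_search, and the toy database are hypothetical. The search skips any molecule whose bit count alone rules out reaching the threshold, using the elementary bound Tanimoto(A, B) <= min(|A|, |B|) / max(|A|, |B|).

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a fingerprint is the set of indices of its "on" bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient |A & B| / |A | B|; 0.0 if both are empty."""
    inter = len(a & b)
    union = len(a) + len(b) - inter
    return inter / union if union else 0.0


def threshold_search(
    query: Fingerprint,
    database: Dict[str, Fingerprint],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (name, similarity) pairs with similarity >= threshold.

    Molecules are pruned without computing the intersection when the
    bit-count bound min(|A|, |B|) / max(|A|, |B|) is below the threshold.
    """
    hits = []
    for name, fp in database.items():
        lo, hi = sorted((len(query), len(fp)))
        bound = lo / hi if hi else 0.0
        if bound < threshold:
            continue  # cannot reach the threshold, skip the exact computation
        score = tanimoto(query, fp)
        if score >= threshold:
            hits.append((name, score))
    # Highest similarity first; ties broken by name for determinism.
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Toy data, invented purely for illustration.
toy_db = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({1, 2, 3, 5}),
    "mol_c": frozenset({9}),
    "mol_d": frozenset({1, 2}),
}
toy_query = frozenset({1, 2, 3, 4})
toy_hits = threshold_search(toy_query, toy_db, threshold=0.55)
```

With this toy data, mol_c and mol_d are pruned by the bit-count bound alone, so the exact similarity is computed only for the remaining two candidates.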
Sample of references by our group:
- T. Lin, M. Melgar, S. J. Swamidass, J. Purdon, T. Tseng, G. Gago, D. Kurth, P. Baldi, H. Gramajo, and S. Tsai. Structure-Based Inhibitor Design of AccD5, an Essential acyl-CoA Carboxylase Carboxyltransferase Domain of Mycobacterium tuberculosis. Proceedings of the National Academy of Sciences USA, 103, 9, 3072-3077, (2006).
- M. Brandon, P. Baldi, and D. C. Wallace. Mitochondrial Mutations in Cancer. Oncogene, 25, 4647-4662, (2006).
- C. Azencott, A. Ksikes, S. J. Swamidass, J. Chen, L. Ralaivola, and P. Baldi. One- to Four-Dimensional Kernels for Virtual Screening and the Prediction of Physical, Chemical, and Biological Properties. Journal of Chemical Information and Modeling, 47, 3, 965-974, (2007).
- S. J. Swamidass and P. Baldi. Bounds and Algorithms for Fast Exact Searches of Chemical Fingerprints in Linear and Sub-Linear Time. Journal of Chemical Information and Modeling, 47, 2, 302-317, (2007).
- J. Chen, E. Linstead, S. J. Swamidass, D. Wang, and P. Baldi. ChemDB Update-Full-Text Search and Virtual Chemical Space. Bioinformatics, 23, 2348-2351, (2007).
- P. Baldi, R. W. Benz, D. S. Hirschberg, and S. J. Swamidass. Lossless Compression of Chemical Fingerprints Using Integer Entropy Codes Improves Storage and Retrieval. Journal of Chemical Information and Modeling, 47, 6, 2098-2109, (2007).
- P. Baldi and R. W. Benz. BLASTing Small Molecules-Statistics and Extreme Statistics of Chemical Similarity Scores. Bioinformatics, 24, 13, i357-i365, (2008).
- X. Xie, P. Rigor, and P. Baldi. MotifMap: a human genome-wide map of candidate regulatory motif sites. Bioinformatics, 25, 167-174, (2009).
- P. Felgner, M. Kayala, A. Vigil, C. Burk, R. Nakajima-Sasaki, J. Pablo, D. Molina, S. Hirst, J. Chew, D. Wang, G. Tan, M. Duffield, R. Yang, J. Neel, N. Chantratita, G. Bancroft, G. Lertmemongkolchai, D. Davies, P. Baldi, S. Peacock, and R. Titball. A Burkholderia pseudomallei protein array reveals serodiagnostic and cross-reactive antigens. Proceedings of the National Academy of Sciences USA, 106, 13499-13504, (2009).
- A. B. Mochon, J. Ye, M. A. Kayala, J. R. Wingard, C. J. Clancy, M. H. Nguyen, P. Felgner, P. Baldi, and H. Liu. Serological Profiling of a Candida albicans Protein Microarray Reveals Permanent Host-Pathogen Interplay & Stage-Specific Responses during Candidemia. PLoS Pathogens, published 26 Mar, (2010).
- P. Crompton, M. Kayala, B. Traore, K. Kayentao, A. Ongoiba, G. Weiss, D. Molina, C. Burk, M. Waisberg, A. Jasinskas, X. Tan, S. Doumbo, D. Doumtabe, Y. Kone, D. Narum, X. Liang, O. Doumbo, L. Miller, D. Doolan, P. Baldi, P. Felgner, and S. Pierce. A Prospective Analysis of the Antibody Response to Plasmodium falciparum Before and After a Malaria Season by Protein Microarray. Proceedings of the National Academy of Sciences USA, 107, 15, 6958-6963, (2010).
- P. Baldi and R. Nasr. When is Chemical Similarity Significant? The Statistical Distribution of Chemical Similarity Scores and Its Extreme Values. Journal of Chemical Information and Modeling, 50, 7, 1205-1222, (2010).
- R. Nasr, D. Hirschberg, and P. Baldi. Hashing Algorithms and Data Structures for Rapid Searches of Fingerprint Vectors. Journal of Chemical Information and Modeling, 50, 8, 1358-1368, (2010).
- A. Andronico, A. Randall, R. W. Benz, and P. Baldi. Data-Driven High-Throughput Prediction of the 3D Structure of Small Molecules: Review and Progress. Journal of Chemical Information and Modeling, in press, (2011).