Bio-Health AI International Symposium 2024

Time	Talk	Speaker
14:00-14:50	From Identification to Integration: Leveraging Curated Scientific Information and Quantitative Data for New Discoveries	Prof. Mike Cherry (Stanford University)
14:50-15:20	Deep learning models for biomarker discovery and drug response prediction	Prof. Sun Kim (Seoul National University)
15:20-15:50	AI-Driven Drug Discovery in the LLM Era	Prof. Jaewoo Kang (Korea University)
15:50-16:00	Break
16:00-16:30	Precision Medicine: Omics and Clinical Data	Prof. Daehee Hwang (Seoul National University)
16:30-17:00	Graph Representation Learning for Identifying Gene-Disease Association Using Genomic Resources	Prof. Giltae Song (Pusan National University)
17:00-17:30	Prediction of cancer-associated metabolites by using genome-scale metabolic models and medical data	Prof. Hyun Uk Kim (KAIST)

Mike Cherry

Stanford University

From Identification to Integration:
Leveraging Curated Scientific Information and Quantitative Data for New Discoveries

14:00 - 14:50

Abstract

Artificial Intelligence has revolutionized our ability to decipher intricate relationships within biological entities, providing us with insights previously unattainable. Yet, the foundation of this progress lies in the source information fueling these methods. Enter our lab, where we meticulously maintain biological data resources and knowledgebases, offering validated, consistently processed experimental results alongside expertly curated insights for machine learning endeavors.

What sets us apart is our fusion of quantitative datasets with targeted experimental outcomes, which yields comprehensive databases that help unravel the complexities of scientific phenomena. In this seminar, I will present the resources at our disposal, including the extensive data amassed by the ENCODE Consortium. Spanning over 12 years, this collection covers a wide range of experiments, from RNA-Seq to ChIP-Seq and beyond, each annotated with standardized metadata and freely accessible worldwide.

Additionally, I will shed light on the Saccharomyces Genome Database, a repository of curated experimental findings on the budding yeast. Containing knowledge gleaned from published studies, this resource expands our understanding of genes, their functions, and their interactions, information with profound implications for development and gene expression studies.

By liberating these indispensable databases to the scientific community, the Cherry Lab emboldens researchers and educators alike to delve deeper into the mysteries of the natural world, glean insights from existing research, and collectively propel the boundaries of scientific understanding forward.

Bio

2024 - Present Professor (Emeritus) of Genetics, Stanford University
2013 - 2024 Professor of Genetics, Stanford University
2001 - 2012 Associate Professor of Genetics, Stanford University
1998 - 2001 Director, Stanford Microarray Database (PIs: David Botstein and Pat Brown)
1993 - 2001 Director and Chief Biocurator, Saccharomyces Genome Database (PI: David Botstein)
1993 - 1996 Head, Computing, Stanford DNA Sequence and Technology Center (PI: Ron Davis)
1988 - 1993 Director of Computing, Molecular Biology, Massachusetts General Hospital
1985 - 1988 Research Fellow in Genetics, Harvard University School of Medicine (Advisor: Jack Szostak)
1985 PhD, Molecular Biology, University of California, Berkeley

Dr. J. Michael Cherry, Professor Emeritus of Genetics at Stanford University School of Medicine, leads a diverse team focused on integrating biological knowledge and experimental results into accessible software/database environments. His lab oversees resources like the Saccharomyces Genome Database (SGD) and participates in numerous projects, including the NIH ENCODE Data Coordination Center (DCC), the Gene Ontology Consortium (GOC), and the RegulomeDB project. Prioritizing expert curation of public research, his lab develops comprehensive datasets and analytical tools. Recently, they joined the Human Cell Atlas (HCA) project, contributing to metadata coordination, data wrangling, and outreach. His research spans bioinformatics, genomics, ontology development, and health informatics.

Sun Kim

Seoul National University

Deep learning models for biomarker discovery and drug response prediction

14:50 - 15:20

Abstract

Drug response prediction at the patient level is a very difficult and time-consuming task. Animal models have limited power to translate to the patient level. Thus, there have long been significant efforts to predict drug response at the cell line and molecular levels. Drug response is a huge topic and, in this talk, I will focus on cancer drug response because large molecular, cellular, and patient-level databases are available for computational modeling: LINCS, GDSC, and TCGA. However, these databases have limited information for predicting drug responses. First, GDSC (Genomics of Drug Sensitivity in Cancer) includes data from 722,057 genomic associations tested in terms of cancer cell death as of March 2024. The major hurdle in using GDSC for response prediction is that gene-level responses after drug treatment are not available. We developed two deep learning models for drug response using GDSC, one simulating gene-level responses after drug treatment (Briefings in Bioinformatics 2023) and one providing biological pathway-level interpretation of drug response (IJMS 2023). Fortunately, gene-level responses after drug treatment are measured and available in LINCS (The Library of Integrated Network-based Cellular Signatures). However, translating LINCS data to the cellular level (e.g., GDSC) and to the patient level (e.g., TCGA) remains an unresolved research problem. We recently developed a deep learning attention model for this problem (ISMB 2024). These drug response prediction tools can be paired with computational methods for other demanding problems in drug discovery. One example is biomarker discovery. Given biomarker candidates predicted by computational tools such as GOAT (Bioinformatics 2023), drug response prediction tools can be used to evaluate those candidates, and my group is developing a comprehensive package for this purpose. To close the talk, I will briefly discuss additional research directions.
We show that drug response prediction and biomarker discovery can be done simultaneously in a single computational framework (in submission). Another important problem is patient stratification, for which drug response prediction tools can be useful at the molecular level (in submission).

Bio

2011 - Present Professor of Computer Science and Engineering, Seoul National University
2022 - Present President, MOGAM Institute for Biomedical Research
2001 - 2011 Assistant/Associate Professor/Chair, School of Informatics and Computing, Indiana University
1998 PhD, CS, The University of Iowa
1987 MS, CS, KAIST
1985 BS, Computer and Statistics, Seoul National University

Sun Kim is a Professor in the School of Computer Science and Engineering and an Adjunct Professor of Biological Sciences at Seoul National University (SNU), where he was Director of the Bioinformatics Institute (2011-2021). He is also currently President of the MOGAM Institute for Biomedical Research and a member of the National Academy of Engineering of Korea. Before joining SNU, he was Chair of Faculty Division C, Director of the Center for Bioinformatics Research, and an Associate Professor in the School of Informatics and Computing at Indiana University (IU) Bloomington. Prior to joining IU in 2001, he worked at DuPont Central Research from 1998 to 2001, and at the University of Illinois at Urbana-Champaign from 1997 to 1998. Sun Kim received his B.S., M.S., and Ph.D. degrees in Computer Science from Seoul National University, KAIST, and the University of Iowa, respectively. His research is on machine learning and algorithms for AI drug discovery and bioinformatics.

Jaewoo Kang

Korea University

AI-Driven Drug Discovery in the LLM Era

15:20 - 15:50

Abstract

Large Language Models (LLMs) are showing exceptional potential across various industries. In the data-intensive domain of drug development, LLMs are set to boost productivity substantially by analyzing extensive volumes of research papers and patents, generating scientific hypotheses, and drafting documents for regulatory submissions. Moreover, these models could act as sophisticated copilots for mission-critical tasks such as designing molecular structures and formulating clinical trial strategies. Because the field is still at a very early stage, the path and pace of these innovations are not yet clear. Drawing upon the speaker's extensive experience in applying AI technologies to precision medicine and drug development challenges, this presentation will offer insights into the latest progress in building LLMs specifically for the biomedical sector. Additionally, it will discuss the transformative effects that these models could have on the pharmaceutical industry.

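Abstract (Korean version)

Large language models (LLMs) are showing promise in many fields. In drug development, a knowledge-intensive industry, they are expected to improve productivity across a range of tasks, including analysis of vast numbers of papers and patents, generation of scientific hypotheses, and preparation of documents for regulatory approval. Going a step further, LLMs could develop into copilots that carry out mission-critical tasks such as drug structure design and clinical trial planning. The field is still at a very early stage, so it is hard to predict where it will end up or how long that will take. Starting from the speaker's past experience applying AI to precision medicine and drug development problems, this talk shares the experience of recently developing an LLM specialized for the medical and pharmaceutical domain and discusses how future LLMs can contribute to this industry.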

Bio

2021 - Present CEO/Founder, AIGEN Sciences Inc.
2006 - Present Professor, Korea University
2003 - 2006 Assistant Professor, North Carolina State University
2000 - 2001 CTO/Founder, WISEngine Inc.
1996 - 1997 Senior Tech Staff, AT&T Labs Research
2003 Ph.D., Univ. of Wisconsin-Madison
1996 M.S., Univ. of Colorado at Boulder
1994 B.S., Korea University

Dr. Jaewoo Kang earned his Ph.D. from the University of Wisconsin-Madison in 2003. Following his doctorate, he joined the Computer Science Department at North Carolina State University in Raleigh, NC, USA, as an Assistant Professor. In 2006, he moved to Korea University in Seoul, Korea, where he has since held the position of Professor of Computer Science and Engineering. Dr. Kang has also contributed to industry. Before joining the UW-Madison graduate program, he worked as a member of the technical staff at AT&T Labs Research (formerly Bell Labs), Florham Park, NJ, USA, from 1996 to 1997. In 2000, he founded WISEngine Inc., a meta-search start-up headquartered in Seoul, Korea, with a branch in Santa Clara, CA, USA. In 2021, Dr. Kang founded AIGEN Sciences Inc., a start-up in the field of AI-driven drug discovery, and currently serves as its CEO.

Daehee Hwang

Seoul National University

Precision Medicine: Omics and Clinical Data

16:00 - 16:30

Abstract

As huge amounts of global data (genomic, epigenomic, transcriptomic, proteomic, and metabolomic) generated from a broad spectrum of specimens collected from human patients have accumulated in public repositories, together with electronic health records and drug treatment information, biology is becoming an informational science. Accordingly, there is a significant need for bioinformatic methods that can effectively extract useful information from these data. In this talk, I will present two different precision medicine approaches, one using multi-omics data and the other using clinical big data.

Bio

2019 - Present Professor, Seoul National University
2013 - 2019 Professor, DGIST
2006 - 2013 Assistant/Associate Professor, POSTECH
2003 - 2006 Postdoc/Senior Scientist, Institute for Systems Biology
1999 - 2003 Ph.D., MIT
1996 - 1998 M.S., POSTECH
1990 - 1996 B.S., POSTECH

Giltae Song

Pusan National University

Graph Representation Learning for Identifying Gene-Disease Association Using Genomic Resources

16:30 - 17:00

Abstract

Uncovering biomarker genes, along with therapeutic gene interventions and gene therapies, plays a pivotal role in understanding disease pathogenesis and accelerating the development of targeted drugs. Numerous machine learning methodologies have emerged for the identification of gene-disease associations (GDAs), leveraging existing GDA knowledge from data portals like DisGeNET. This talk explores graph-based learning approaches for identifying unknown biomarker genes associated with diseases, including hypergraph-based representation learning. The complex relationships among biological entities are captured via graphs built from genomic resources such as HumanNet, GO, Disease Ontology, Human Phenotype Ontology, and DisGeNET. Incorporating these graphs improves the performance of machine learning models for determining novel gene-disease associations, which underscores the significance of curated biological information within databases. Utilizing graph-based representation learning with genomic resources can expedite the advancement of targeted drug discovery.
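As a rough illustration of the hypergraph idea mentioned in the abstract (not the speaker's model), the sketch below treats each annotation term as a hyperedge joining all genes that share it, then ranks candidate genes by how strongly their hyperedge membership overlaps with a gene already linked to a disease. The gene names, term names, and the Jaccard scoring rule are all hypothetical stand-ins; a learned representation would replace the hand-written score.

```python
from collections import defaultdict

# Toy hyperedges: each annotation term joins every gene that carries it.
# All identifiers here are made up for illustration.
HYPEREDGES = {
    "term_a": {"G1", "G2", "G3"},
    "term_b": {"G2", "G3", "G4"},
    "term_c": {"G4", "G5"},
}


def gene_memberships(hyperedges):
    """Invert the hyperedge map: gene -> set of hyperedges containing it."""
    memberships = defaultdict(set)
    for edge, genes in hyperedges.items():
        for gene in genes:
            memberships[gene].add(edge)
    return dict(memberships)


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(hyperedges, known_genes):
    """Score each gene outside known_genes by its best Jaccard overlap
    with any known disease gene, highest score first."""
    memberships = gene_memberships(hyperedges)
    scores = {}
    for gene, edges in memberships.items():
        if gene in known_genes:
            continue
        scores[gene] = max(
            (jaccard(edges, memberships.get(k, set())) for k in known_genes),
            default=0.0,
        )
    # Sort by descending score, breaking ties by gene name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_candidates(HYPEREDGES, known_genes={"G2"})
```

In this toy setup, the gene that shares every hyperedge with the known disease gene ranks first, which is the intuition that graph-based methods generalize with learned embeddings.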

Bio

2020 - Present Director, Center for Artificial Intelligence Research, Pusan National University
2016 - Present Associate Professor of Computer Science and Engineering, Pusan National University
2012 - 2016 Post-doctoral scholar in Genetics, Stanford University (Advisor: Mike Cherry)
2011 PhD, Computer Science and Engineering, Pennsylvania State University (Advisor: Webb Miller)
2001 - 2004 Full-time instructor in Computer Science, Korea Naval Academy
2001 MS, Computer Science and Engineering, Seoul National University
1999 BS, Computer Science and Engineering, Seoul National University

Giltae Song, Ph.D., is an associate professor in the School of Computer Science and Engineering at Pusan National University (PNU). Before joining PNU, he was a post-doctoral scholar in Prof. Mike Cherry's group at Stanford University. Dr. Song earned a Ph.D. in computer science and engineering at Pennsylvania State University (advised by Prof. Webb Miller), and a bachelor's degree and a master's degree in computer science and engineering at Seoul National University. His research focuses on machine learning and data mining specialized for analyzing various biomedical data (e.g. genome sequence data, experimental data for drug discovery, and clinical data in hospitals).

Hyun Uk Kim

KAIST

Prediction of cancer-associated metabolites by using genome-scale metabolic models and medical data

17:00 - 17:30

Abstract

Systematic processing of bio big data using computational models can help predict biomarkers and drug targets for a range of diseases. In the case of cancer, a substantial amount of bio big data, including patient-specific omics data (e.g., RNA-seq data) and medical data (e.g., survival data), has been collected and awaits opportunities for computational modeling. Here, bio big data do not always need to be associated with machine learning models; mechanistic models also deserve attention. In this talk, I will elaborate on a computational workflow that uses so-called genome-scale metabolic models (GEMs) along with transcriptome and mutation data from cancer patients to predict oncometabolites. Oncometabolites exhibit pro-oncogenic functions when they accumulate abnormally in cancer cells, and they are generated upon mutations in a metabolic gene. A GEM is a computational model that allows prediction of metabolic reaction fluxes across the entire metabolic network. I will also showcase the application of this computational workflow to predict drug targets effective for high-risk bladder cancer patients with poor prognosis. The predicted drug targets were validated using in vitro and in vivo studies. Ongoing efforts in generating and applying meaningful bio big data, alongside the proper use of computational models, will revolutionize our approaches to addressing medical problems.
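To make the mechanistic-model idea more concrete, here is a minimal sketch of the kind of constraint that stoichiometric metabolic models commonly encode: at steady state, production and consumption of each internal metabolite balance (S · v = 0). The three-reaction pathway, the metabolite names, and the flux values are invented for illustration and are far smaller than any genome-scale model; this is not the workflow presented in the talk, and real analyses add flux bounds and an optimization step on top of this check.

```python
# Rows: internal metabolites; columns: reactions.
# Toy pathway (hypothetical): R_uptake: -> A, R_convert: A -> B, R_export: B ->
METABOLITES = ["A", "B"]
REACTIONS = ["R_uptake", "R_convert", "R_export"]
S = [
    [1, -1, 0],   # A: made by uptake, consumed by conversion
    [0, 1, -1],   # B: made by conversion, consumed by export
]


def net_production(stoich, fluxes):
    """Return S . v: the net production rate of each metabolite."""
    return [sum(coef * v for coef, v in zip(row, fluxes)) for row in stoich]


def is_steady_state(stoich, fluxes, tol=1e-9):
    """True when no internal metabolite accumulates or is depleted."""
    return all(abs(rate) <= tol for rate in net_production(stoich, fluxes))


balanced = [2.0, 2.0, 2.0]        # every step carries the same flux
blocked_export = [2.0, 2.0, 0.5]  # export slowed: B builds up

accumulation = dict(zip(METABOLITES, net_production(S, blocked_export)))
```

With the second flux vector, metabolite B ends up with a positive net production rate, a loose toy analogue of a metabolite building up when a downstream step is impaired.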

Bio

2023 - Present Adjunct Professor, Graduate School of Engineering Biology, KAIST
2018 - Present Assistant Professor/Associate Professor, Department of Chemical and Biomolecular Engineering, KAIST
2014 - 2016 Visiting Senior Researcher, Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark
2013 - 2018 Research Assistant Professor, BioInformatics Research Center, KAIST
2011 - 2013 Postdoctoral Researcher, BioInformatics Research Center, KAIST
2011 Ph.D., Chemical and Biomolecular Engineering, KAIST
2007 M.S., Chemical and Biomolecular Engineering, KAIST
2005 B.S., Biotechnology, Yonsei University

		Early Registration (through May 23)	Late Registration
Academy	Professor	KRW 100,000	KRW 150,000
Academy	Student	KRW 50,000	KRW 70,000
Industry		KRW 100,000	KRW 150,000