Mike Cherry
Stanford University
Artificial Intelligence has revolutionized our ability to decipher intricate relationships
within biological entities, providing us with insights previously unattainable. Yet, the
foundation of this progress lies in the source information fueling these methods. Enter our
lab, where we meticulously maintain biological data resources and knowledgebases, offering
validated, consistently processed experimental results alongside expertly curated insights
for machine learning endeavors.
What sets us apart is our unique fusion of quantitative
datasets with targeted experimental outcomes, crafting comprehensive databases that unravel
the complexities of scientific phenomena. In this seminar, I will unveil the wealth of
resources at our disposal, including the extensive data amassed by the ENCODE Consortium.
Spanning over 12 years, this treasure trove encompasses a plethora of experiments, from
RNA-Seq to ChIP-Seq and beyond, each annotated with standardized metadata and freely
accessible on a global scale.
Additionally, I will shed light on the Saccharomyces Genome
Database, a repository of curated experimental findings on the budding yeast. Containing
knowledge gleaned from published studies, this resource expands our understanding of genes,
their functions, and their interactions: information with profound implications for
development and gene expression studies.
By making these indispensable databases freely available to the
scientific community, the Cherry Lab enables researchers and educators alike to delve
deeper into the mysteries of the natural world, glean insights from existing research, and
collectively push the boundaries of scientific understanding forward.
Dr. J. Michael Cherry, Professor Emeritus of Genetics at Stanford University
School of Medicine, leads a diverse team focused on integrating biological knowledge and
experimental results into accessible software/database environments. His lab oversees
resources like the Saccharomyces Genome Database (SGD) and participates in numerous
projects, including the NIH ENCODE Data Coordination Center (DCC), the Gene Ontology
Consortium (GOC), and the RegulomeDB project. Prioritizing expert curation of public
research, his lab develops comprehensive datasets and analytical tools. Recently, they
joined the Human Cell Atlas (HCA) project, contributing to metadata coordination, data
wrangling, and outreach. His research spans bioinformatics, genomics, ontology development,
and health informatics.
Sun Kim
Seoul National University
Drug response prediction at the patient level is a very difficult and time-consuming task. Animal models have limited power to translate to the patient level. Thus, there have long been significant efforts to predict drug response at the cell line and molecular levels. Drug response is a huge topic and, in this talk, I will focus on cancer drug response because large molecular, cellular, and patient-level databases are available for computational modeling: LINCS, GDSC, and TCGA. However, these databases have limited information for predicting drug responses. First, GDSC (Genomics of Drug Sensitivity in Cancer) includes data from 722,057 genomic associations tested in terms of cancer cell death as of March 2024. The major hurdle in using GDSC for response prediction is that gene-level responses after drug treatment are not available. We developed two deep learning models for drug response using GDSC by simulating gene-level responses after drug treatment (Briefings in Bioinformatics 2023) and by characterizing biological pathway-level interpretation of drug response (IJMS 2023). Fortunately, gene-level responses after drug treatment are measured and available in LINCS (The Library of Integrated Network-based Cellular Signatures). However, translating LINCS data to the cellular level (e.g., GDSC) and to the patient level (e.g., TCGA) remains an unresolved research problem. We recently developed a deep learning attention model for this problem (ISMB 2024). These drug response prediction tools can be paired with computational methods for other demanding problems in drug discovery. One example is biomarker discovery. Given biomarker candidates predicted by computational tools such as GOAT (Bioinformatics 2023), drug response prediction tools can be used to evaluate those candidates, and my group is developing a comprehensive package for this purpose. To close the talk, I will briefly discuss additional research directions.
We show that both drug response prediction and biomarker discovery can be done simultaneously in a single computational framework (in submission). Another important problem is patient stratification, for which drug response prediction tools can be useful at the molecular level (in submission).
Sun Kim is Professor in the School of Computer Science and Engineering, Adjunct
Professor of Biological Sciences, and Director of the Bioinformatics Institute (2011-2021)
at Seoul National University. He is also currently President of the Mogam Institute of
Biomedical Research and a member of the National Academy of Engineering, Korea. Before
joining SNU, he was Chair of Faculty Division C, Director of the Center for Bioinformatics
Research, and an Associate Professor in the School of Informatics and Computing at Indiana
University (IU) Bloomington. Prior to joining IU in 2001, he worked at DuPont Central
Research from 1998 to 2001, and at the University of Illinois at Urbana-Champaign from 1997
to 1998. Sun Kim received B.S., M.S., and Ph.D. degrees in Computer Science from Seoul
National University, KAIST, and the University of Iowa, respectively. His research is on
machine learning and algorithms for AI drug discovery and bioinformatics.
Jaewoo Kang
Korea University
Large Language Models (LLMs) are showing exceptional potential across various industries. In the data-intensive domain of drug development, LLMs are expected to boost productivity substantially by analyzing extensive volumes of research papers and patents, generating scientific hypotheses, and drafting documents for regulatory submissions. Beyond that, these models could act as copilots for mission-critical tasks such as designing molecular structures and formulating clinical trial plans. Because the field is still at a very early stage, it is hard to predict how far these innovations will go or how long they will take. Starting from the speaker's experience in applying AI technologies to precision medicine and drug development problems, this presentation will share lessons from recently developing LLMs specialized for the biomedical domain and discuss how such models could contribute to the pharmaceutical industry.
Dr. Jaewoo Kang earned his Ph.D. from the University of Wisconsin-Madison in
2003. Following his doctorate, he joined the Computer Science Department at North Carolina
State University in Raleigh, NC, USA, as an Assistant Professor. In 2006, he moved to Korea
University in Seoul, Korea, where he has since held the position of Professor of Computer
Science and Engineering. Dr. Kang has also made contributions to the industry. Before
joining the UW-Madison graduate program, he worked as a member of the technical staff at
AT&T Labs Research (formerly Bell Labs), Florham Park, NJ, USA, from 1996 to 1997. In
2000, he founded WISEngine Inc., a meta-search start-up headquartered in Seoul, Korea,
with a branch in Santa Clara, CA, USA. In 2021, Dr. Kang founded AIGEN Sciences Inc., a
cutting-edge start-up in the field of AI-driven drug discovery, and currently serves as its
CEO.
Daehee Hwang
Seoul National University
As huge amounts of global data (genomic, epigenomic, transcriptomic, proteomic, and metabolomic) generated from a broad spectrum of specimens collected from human patients have accumulated in public repositories, together with electronic health records and drug treatment information, biology is becoming an informational science. Accordingly, there is a significant need for bioinformatic methods that can effectively extract useful information from these data. In this talk, I will present two different precision medicine approaches using multi-omics data and clinical big data, respectively.
Giltae Song
Pusan National University
Uncovering biomarker genes, along with therapeutic gene interventions and gene therapies, plays a pivotal role in unraveling the mysteries of disease pathogenesis and accelerating the development of targeted drugs. Numerous machine learning methodologies have emerged for the identification of gene-disease associations (GDAs), leveraging existing GDA knowledge from data portals like DisGeNET. This talk explores graph-based learning approaches for identifying unknown biomarker genes associated with diseases, including hypergraph-based representation learning. The complex relationships among biological entities are captured via graphs built with genomic sources such as HumanNet, GO, Disease Ontology, Human Phenotype Ontology, and DisGeNET. Capturing these relationships improves the performance of machine learning models for determining novel gene-disease associations, which underscores the significance of curated biological information within databases. Utilizing graph-based representation learning with genomic resources can expedite the advancement of targeted drug discovery.
Giltae Song, Ph.D., is an associate professor in the School of Computer
Science and Engineering at Pusan National University (PNU). Before joining PNU, he was a
post-doctoral scholar in Prof. Mike Cherry's group at Stanford University. Dr. Song
earned a Ph.D. in computer science and engineering at Pennsylvania State University (advised
by Prof. Webb Miller), and a bachelor's degree and a master's degree in computer science and
engineering at Seoul National University. His research focuses on machine learning and data
mining specialized for analyzing various biomedical data (e.g. genome sequence data,
experimental data for drug discovery, and clinical data in hospitals).
Hyun Uk Kim
KAIST
Systematic processing of bio big data using computational models can help predict biomarkers and drug targets for a range of diseases. In the case of cancer, a substantial amount of bio big data, including patient-specific omics data (e.g., RNA-seq data) and medical data (e.g., survival data), has been collected and awaits opportunities for computational modeling. Bio big data do not always need to be paired with machine learning models; mechanistic models also deserve attention. In this talk, I will elaborate on a computational workflow that uses genome-scale metabolic models (GEMs) along with transcriptome and mutation data from cancer patients to predict oncometabolites. Oncometabolites are generated upon mutations in a metabolic gene and exhibit pro-oncogenic functions when they accumulate abnormally in cancer cells. A GEM is a computational model that allows prediction of fluxes through the entire network of metabolic reactions. I will also showcase the application of this computational workflow to predict drug targets effective for high-risk bladder cancer patients who show poor prognosis. The predicted drug targets were validated using in vitro and in vivo studies. Ongoing efforts in generating and applying meaningful bio big data, alongside the proper use of computational models, will revolutionize our approaches to addressing medical problems.
| Category | | Early Registration (through May 23) | Late Registration |
|---|---|---|---|
| Academy | Faculty | KRW 100,000 | KRW 150,000 |
| | Student | KRW 50,000 | KRW 70,000 |
| Industry | | KRW 100,000 | KRW 150,000 |