Invited Talks

August 19 (Mon)
13:00 - 13:45
  • Prof. Gunhee Kim (Seoul National University)
  • Giving Large Language Models a Sense of Time (LLMs with Temporal Awareness) (45 min)
13:45 - 14:30
  • Prof. Jae-Pil Heo (Sungkyunkwan University)
  • Semantic Segmentation in Data-Scarce Scenarios and with Foundation Models (45 min)
15:00 - 15:45
  • Prof. Jinyoung Yeo (Yonsei University)
  • Beyond System 1, Towards System 2 in Conversational AI: Persona, Knowledge, Empathy, Commonsense, and Memory (45 min)
15:45 - 16:30
  • Prof. Hanbyul Joo (Seoul National University)
  • Generative Modeling for Photorealistic 3D Digital Humans (45 min)
August 20 (Tue)
13:00 - 13:45
  • Prof. Chulhee Yun (KAIST)
  • Understanding Modern Data Augmentation: Why Do Mixup, Cutout, and CutMix Help? (45 min)
13:45 - 14:30
  • Prof. Chanwoo Kim (Korea University)
  • Speech Model and Language Model and Their Integration into a Single Generative Speech/Language Model (45 min)
15:00 - 15:45
  • Prof. Kimin Lee (KAIST)
  • Are we ready to face the Terminator (or superintelligence)? (45 min)
15:45 - 16:30
  • Prof. Seung-Hoon Na (Jeonbuk National University)
  • Towards Human-Level Knowledge Mastering Agents from Large Language Models (45 min)

Prof. Gunhee Kim
(Seoul National University)

Biography
Gunhee Kim is a full professor in the Department of Computer Science and Engineering at Seoul National University, which he joined in 2015. Before that, he was a postdoctoral researcher with Leonid Sigal at Disney Research for one and a half years. He received his PhD in 2013 from the Computer Science Department of Carnegie Mellon University under the supervision of Eric P. Xing. Prior to starting his PhD studies in 2009, he earned a master’s degree at the Robotics Institute, CMU, under the supervision of Martial Hebert. His research focuses on solving vision and language problems that emerge from big multimodal data shared online by developing scalable machine learning techniques. He is a recipient of the 2014 ACM SIGKDD Doctoral Dissertation Award, the 2015 Naver New Faculty Award, the Best Full Paper Runner-up at ACM VRST 2019, and an Outstanding Paper Award at EMNLP 2023. Please visit his website for more details: https://vision.snu.ac.kr/gunhee/.

Giving Large Language Models a Sense of Time (LLMs with Temporal Awareness) (45 min)

In this talk, I will present some of our recent work on improving the temporal understanding of large language models. First, I discuss the importance of point-in-time role-playing, which lets LLM-based agents accurately represent characters at specific points in time. We introduce TIMECHARA, a new benchmark designed to evaluate point-in-time character hallucination, in which agents display knowledge that contradicts their characters’ identities and historical timelines. Second, we propose GrowOVER, a pair of dynamic open-domain QA and dialogue benchmarks that follow a continuous cycle of updates to keep pace with the rapid evolution of knowledge. It is motivated by the observation that retrieval-augmented language models struggle with knowledge that they were not trained on or that has become outdated. We also propose interactive retrieval-augmented generation, a novel framework in which the language model evaluates its own answers and reflects on them to guide further retrieval. These works were recently published at ACL 2024.


Prof. Jae-Pil Heo
(Sungkyunkwan University)

Biography
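2017.03 - present Associate Professor, Department of Software, Sungkyunkwan University
2015.12 - 2017.02 Researcher, Electronics and Telecommunications Research Institute (ETRI)
2015.03 - 2015.11 Postdoctoral Researcher, KAIST
2013.12 - 2014.03, 2014.07 - 2014.10 Research Intern, Adobe
2015.02 Ph.D. in Computer Science, KAIST
2015.02 M.S. in Computer Science, KAIST
2015.02 B.S. in Computer Science, KAIST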

Semantic Segmentation in Data-Scarce Scenarios and with Foundation Models (45 min)

Semantic segmentation is essential to visual recognition and has been studied extensively over the years. However, training segmentation models typically requires a large amount of labeled data, which can be costly to acquire. This talk introduces recent methodologies that address this challenge by using techniques such as unsupervised learning and few-shot learning to train models in data-scarce scenarios. The presentation concludes with a discussion of the latest semantic segmentation approaches that leverage the knowledge of vision foundation models.


Prof. Jinyoung Yeo
(Yonsei University)

Biography
Jinyoung Yeo is an assistant professor in the Department of Artificial Intelligence at Yonsei University and co-director of the Data & Language Intelligence (DLI) Lab. He received his Ph.D. in Computer Science from Pohang University of Science and Technology, under the supervision of Prof. Seung-won Hwang. He mainly works on large language models, conversational agents, and diverse topics in natural language processing.

Beyond System 1, Towards System 2 in Conversational AI: Persona, Knowledge, Empathy, Commonsense, and Memory (45 min)

In this talk, I will present a series of research projects focused on various aspects of conversational agents. These recent studies handle persona, knowledge, empathy, commonsense, and memory as independent problems. Our current research aims to integrate these distinct areas into a unified, scalable conversational AI, while also considering the development of additional conversational skills. We believe that this integrated approach is crucial for building a foundational model or engine for product-ready conversational agents with powerful communication abilities.


Prof. Hanbyul Joo
(Seoul National University)

Biography
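2022.03 - present Assistant Professor, Department of Computer Science and Engineering, Seoul National University
2019 - 2022.02 Research Scientist, Facebook AI Research (FAIR), Menlo Park
2018 Ph.D., The Robotics Institute, Carnegie Mellon University
2009 M.S. in Electrical Engineering, KAIST
2007 B.S. in Computer Science, KAIST
2009 - 2012 Researcher, ETRI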

Generative Modeling for Photorealistic 3D Digital Humans (45 min)

In this talk, I will present our latest research on developing generative models for creating highly realistic 3D digital humans. Three state-of-the-art approaches will be introduced: NCHO (ICCV 2023) for learning neural 3D composition of humans and clothing, Chupa (ICCV 2023) for creating 3D clothed humans using 2D diffusion probabilistic models, and GALA (CVPR 2024) for generating animatable layered assets from a single 3D scan.

Key challenges, methodologies, and results will be discussed, along with insights into the broader impacts and future directions of this field.


Prof. Chulhee Yun
(KAIST)

Biography
Chulhee Yun is an assistant professor at the KAIST Kim Jaechul Graduate School of AI, where he has directed the Optimization & Machine Learning Laboratory since 2022. He received his PhD from the Laboratory for Information and Decision Systems at MIT, under the joint supervision of Prof. Suvrit Sra and Prof. Ali Jadbabaie. Prior to that, he received his MSc from Stanford University and his BSc from KAIST. Chulhee mainly works on theoretical aspects of optimization algorithms, machine learning, and deep learning, with the driving goal of bridging the gap between theory and practice in these areas.

Understanding Modern Data Augmentation: Why Do Mixup, Cutout, and CutMix Help? (45 min)

In this talk, I will discuss an ongoing line of research on understanding the provable benefits of widely adopted data augmentation techniques such as Mixup, Cutout, and CutMix. In the first part of the talk, we investigate how pairwise data augmentations like Mixup and CutMix affect the sample complexity of finding optimal decision boundaries in a binary linear classification problem. We show that Mixup and CutMix greatly reduce the number of samples required to precisely locate the optimal boundary, whereas vanilla training suffers from the “curse of separability”: the necessary and sufficient number of samples grows exponentially as positive and negative samples become more separable. In the second part, we study the augmentation techniques from a feature learning perspective. By adopting a popular assumption called the feature-noise data distribution, we show that Cutout and CutMix allow a 2-layer convolutional neural network to learn rarer feature vectors than vanilla training does, leading to superior generalization performance.


Prof. Chanwoo Kim
(Korea University)

Biography
Chanwoo Kim is a professor in the Department of Artificial Intelligence at Korea University. Until 2023, he was a corporate Executive Vice President (EVP) at Samsung Research, leading the Language and Voice Team (LVT). He joined Samsung Research in Feb. 2018 as a corporate Vice President (VP) heading the Speech Processing Lab. At Samsung Research, he led research on end-to-end speech recognition, end-to-end text-to-speech (TTS), Natural Language Understanding (NLU), Language Modeling (LM), speech enhancement, keyword spotting, and related topics. Most of these research outcomes have been commercialized in Samsung products. He was a senior software engineer on the Google speech team between Feb. 2013 and Feb. 2018, where he worked on acoustic modeling for speech recognition systems and on enhancing noise robustness using deep learning techniques. While at Google, he contributed to the data augmentation and acoustic modeling of Google speech recognition systems and to the commercialization of various Google AI speakers. He was a speech scientist at Microsoft from Jan. 2011 to Jan. 2013. Dr. Kim received his Ph.D. from the Language Technologies Institute of the School of Computer Science at Carnegie Mellon University in Dec. 2010. He received his B.S. and M.S. degrees in Electrical Engineering from Seoul National University in 1998 and 2001, respectively. Dr. Kim’s doctoral research focused on enhancing the robustness of automatic speech recognition systems in noisy environments. Between 2003 and 2005, Dr. Kim was a Senior Research Engineer at LG Electronics, where he worked primarily on embedded signal processing and protocol stacks for multimedia systems. Prior to his employment at LG, he worked for EdumediaTek and SK Teletech as an R&D engineer.

Speech Model and Language Model and Their Integration into a Single Generative Speech/Language Model (45 min)

Conventionally, speech recognition, Text-to-Speech (TTS), Natural Language Understanding (NLU), and Natural Language Generation (NLG) have been developed as separate models. Recently, with advances in Generative Language Models, it has been demonstrated that various Natural Language Processing (NLP) tasks can be performed using a single generative language model. Furthermore, efforts have been made to enhance these generative language models into multi-modal models by incorporating speech capabilities. In this talk, I will provide an overview of how to build a generative speech model and how it can be integrated with a generative language model to address various multi-modal tasks.


Prof. Kimin Lee
(KAIST)

Biography
Kimin Lee is an assistant professor at the Graduate School of AI at KAIST. He is interested in building safe and capable AI agents. His recent research directions are (1) reinforcement learning from human feedback, (2) decision-making agents, (3) AI safety and (4) world models. Before joining KAIST, Kimin Lee was a research scientist at Google Research in Mountain View. He completed his postdoctoral training at UC Berkeley (advised by Prof. Pieter Abbeel) and received his Ph.D. from KAIST (advised by Prof. Jinwoo Shin). During his Ph.D., he also interned and collaborated closely with Honglak Lee at the University of Michigan.

Are we ready to face the Terminator (or superintelligence)? (45 min)

As foundation models evolve, they are being equipped with an increasing range of modalities and sophisticated tools. This evolution is leading to the creation of more autonomous systems through advanced agent architectures that incorporate elements of planning and memory. As these systems are made more agentic, this could unlock a wider range of beneficial use-cases, but also introduces new challenges in ensuring that such systems are trustworthy. This talk will explore the dual aspects of opportunity and risk presented by agentic systems. We will discuss the necessity for proactive strategies in assessing and mitigating risks associated with these technologies.


Prof. Seung-Hoon Na
(Jeonbuk National University)

Biography
Seung-Hoon Na is a full professor in the Department of Computer Science & Artificial Intelligence, Jeonbuk National University, where he has been leading the Laboratory of Cognitive Computing since 2015. He was a senior researcher at ETRI, after working in the School of Computing, National University of Singapore. He received his PhD degree from the Department of Computer Science at POSTECH in 2008, under the supervision of Prof. Jong-Hyeok Lee. Prior to that, he received his MS degree in computer science from POSTECH in 2003 and his BS degree in information and computer science from Ajou University in 2001. Currently, he serves as a standing reviewer for Computational Linguistics and is the chair of the Special Interest Group of Human and Cognitive Language Technology, Republic of Korea. He also served as a publication co-chair at COLING 2022. His research interests include natural language processing, information retrieval, and machine learning.

Towards Human-Level Knowledge Mastering Agents from Large Language Models (45 min)

In this talk, I will discuss human-level knowledge mastering agents, focusing on extending large language models (LLMs) to perform the task of “knowledge mastering,” which goes far beyond current short-term “knowledge editing” that mainly focuses on factual information or events. First, I will introduce the limitations of LLMs as world models and motivate the need for knowledge injection and manipulation, and for knowledge mastering. Next, I will summarize current studies on representative knowledge updating approaches (knowledge editing and retrieval-augmented generation) and discuss how these methods do not fully achieve knowledge mastering. Then, I will provide an informal or formal definition of the knowledge mastering task, outline its properties such as long-term learning and concept representation tracking, and present a brief sketch for designing benchmarks. Additionally, I will explore potential new technologies for extending LLMs to address knowledge mastering tasks, and conclude by arguing that there are opportunities to substantially improve current unary approaches based on Transformers when working towards knowledge mastering.