Pattern Recognition / Machine Learning Summer School

August 16 (Tuesday)
11:00 - 12:30
  • Seungjin Choi, Director (Intellicode)
  • Predictive Uncertainty in Machine Learning (90 min)
13:30 - 15:00
  • Bohyung Han, Professor (Seoul National University)
  • Continual Learning (90 min)
15:30 - 17:00
  • Jaegul Choo, Professor (KAIST)
  • Debiasing Image Classifiers (90 min)
August 17 (Wednesday)
10:30 - 12:00
  • Sun Kim, Professor, and Sangseon Lee, Ph.D. (Seoul National University)
  • Graph Learning for Personalized Medicine and Drug Discovery (90 min)
13:00 - 14:30
  • Sael Lee, Professor (Ajou University)
  • Deep Trees as Accurate and Interpretable Inference Models (90 min)
15:00 - 16:30
  • Kee-Eung Kim, Professor (KAIST)
  • OptiDICE for Offline Reinforcement Learning (90 min)

Seungjin Choi, Director

2021-2022 Director, Intellicode
2021-2022 Senior Advisor, BARO AI Academy
2019-2020 CTO, BARO AI
2001-2019 Professor, Department of Computer Science and Engineering, POSTECH
2019-2021 President, Artificial Intelligence Society, Korean Institute of Information Scientists and Engineers (KIISE)
2018 Advisory Professor, Samsung Advanced Institute of Technology
2017-2018 Advisory Professor, Samsung Research AI Center
2016-2017 Advisory Professor, Shinhan Card Big Data Center
2014-2016 Founding Chair, Machine Learning Research Group, KIISE
1997-2001 Assistant Professor, School of Electrical and Electronics Engineering, Chungbuk National University
1997 Frontier Researcher, RIKEN, Japan
1996 Ph.D., University of Notre Dame, Indiana, USA

Predictive Uncertainty in Machine Learning (90 min)

Most current machine learning models (deep models in particular) are poor at handling uncertainty in their predictions, which often yields incorrect but overconfident decisions. Thus, for high-risk applications such as medical diagnosis and self-driving cars, it is essential to quantify uncertainty in a model's predictions to avoid costly mistakes. Predictive uncertainty comprises aleatoric uncertainty and epistemic uncertainty. Aleatoric uncertainty arises from measurement errors and sensor noise and is referred to as "data uncertainty". Epistemic uncertainty is due to limited data and knowledge, and is known as "model uncertainty".

In this talk, I will provide an overview of predictive uncertainty estimation in various machine learning models. I begin with predictive uncertainty estimation in tree models such as random forests, extremely randomized trees, and gradient boosted trees. I will also introduce our own recent technique, bagging with oversampling in trees. Then I will introduce various ensemble methods in deep learning for computing predictive uncertainty. Deep ensembles and MC dropout are widely used practical methods that approximately perform Bayesian model averaging. Anchored ensembling based on randomized MAP sampling is another ensemble method. These methods require constructing an ensemble of more than a few neural networks, which often demands more resources than we can afford. Finally, I will introduce "beyond sampling" methods in which a single neural network is trained to meet our goal. Deep evidential learning is an interesting method in which a single non-Bayesian neural network is trained to estimate a target as well as the associated evidence, learning both aleatoric and epistemic uncertainty. SWAG is a simple practical method in which stochastic weight averaging (SWA) solutions are used to fit a Gaussian for computing the predictive uncertainty.
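Among the ensemble methods above, MC dropout is easy to sketch: keep dropout active at prediction time and aggregate several stochastic forward passes. The toy regression network below uses random, untrained weights purely for illustration (all names and sizes are assumptions, not taken from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-hidden-layer regression network with fixed random weights
# (illustrative stand-ins; in practice the weights come from training).
W1 = rng.normal(size=(1, 64))
W2 = rng.normal(size=(64, 1)) / 8.0

def forward(x, drop_rate=0.5):
    """One stochastic forward pass with dropout kept ON at test time."""
    h = np.maximum(x @ W1, 0.0)              # ReLU hidden layer
    mask = rng.random(h.shape) > drop_rate   # fresh dropout mask each pass
    h = h * mask / (1.0 - drop_rate)         # inverted-dropout scaling
    return h @ W2

def mc_dropout_predict(x, T=200):
    """Mean and std over T stochastic passes: a cheap approximation
    to Bayesian model averaging."""
    samples = np.stack([forward(x) for _ in range(T)])
    return samples.mean(axis=0), samples.std(axis=0)

mean, std = mc_dropout_predict(np.array([[0.5]]))
```

The predictive standard deviation here plays the role of the (epistemic) uncertainty estimate; inputs far from the training data typically yield larger spread across passes.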

Bohyung Han, Professor

2018-present Associate Professor, Professor, Department of Electrical and Computer Engineering, Seoul National University
2010-2018 Assistant Professor, Associate Professor, Department of Computer Science and Engineering, POSTECH
2000-2005 Ph.D. in Computer Science, University of Maryland

Continual Learning (90 min)

Continual learning is a learning framework for processing incoming data in an online manner, and there are several variations of its problem formulation. This talk focuses on one of the most popular formulations, class-incremental learning, in which the learner receives a set of new classes with labeled examples for the current task while having only limited memory to store examples from previous tasks. I will introduce recent algorithms for this problem and discuss their potential and limitations.
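A common building block in this setting is a small exemplar memory for previous tasks. The sketch below maintains such a fixed-size buffer with reservoir sampling over the incoming stream (one standard rehearsal strategy, assumed here for illustration; not necessarily one of the algorithms covered in the talk):

```python
import random

class ExemplarMemory:
    """Fixed-size exemplar buffer filled by reservoir sampling, so the
    stored set is a uniform sample over everything seen so far."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []   # stored (example, label) pairs
        self.seen = 0      # total examples observed so far

    def add(self, example, label):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append((example, label))
        else:
            # Keep the new item with probability capacity / seen.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = (example, label)

    def sample(self, k):
        """Mini-batch of exemplars to replay alongside current-task data."""
        return random.sample(self.buffer, min(k, len(self.buffer)))
```

During training on a new task, gradients are typically computed on the union of the current batch and a replayed batch from the buffer, which mitigates forgetting of earlier classes.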

Jaegul Choo, Professor

2020-present Associate Professor, Graduate School of AI, KAIST
2019-2020 Associate Professor, Department of Artificial Intelligence, Korea University
2015-2019 Assistant Professor, Department of Computer Science and Engineering, Korea University
2011-2015 Research Scientist, Georgia Tech

Debiasing Image Classifiers (90 min)

The bias problem in image classification refers to a model performing classification by relying not on the intrinsic attributes of a class (e.g., a bird's wings or beak for the bird class) but on incidental attributes that frequently co-occur with the class in the training data (e.g., the blue sky behind a flying bird, or the tree a bird is perched on). This tutorial introduces various approaches to resolving this image bias problem and discusses future research directions.

Sun Kim, Professor

2022-present Director, Mogam Institute for Biomedical Research
2011-present Professor, Department of Computer Science and Engineering, Seoul National University
2011-2021 Director, Bioinformatics Institute, Seoul National University
2009-2011 Chair, School of Informatics and Computing, Indiana University, USA
2001-2011 Assistant Professor, Associate Professor, School of Informatics and Computing, Indiana University, USA
1998-2001 Senior Researcher, DuPont Central Research and Development, USA
1997 Ph.D. in Computer Science, University of Iowa
1987 M.S. in Computer Science, KAIST
1985 B.S. in Computational Statistics, Seoul National University

Sangseon Lee, Ph.D.

2020-present Postdoctoral Researcher, Institute of Computer Technology, Seoul National University
2014-2020 Ph.D., Department of Computer Science and Engineering, Seoul National University
2010-2013 B.S., Department of Computer Science and Engineering, Seoul National University

Graph Learning for Personalized Medicine and Drug Discovery (90 min)

In this talk, we survey recent developments in graph-level learning. Traditional and widely investigated topics in graph learning mostly concern node-level and edge-/link-level prediction in networks, e.g., social networks. Graph learning has become increasingly important in scientific and medical domains, which is the topic of this tutorial lecture. Graph learning in these domains is fundamentally different from traditional graph learning in two respects. First, in these domains an individual, e.g., a patient or a chemical compound, is a whole graph, and the nodes in each graph are features of the individual, such as atoms or genes; note that in social networks, individuals are nodes. Second, graph learning in these domains aims to discover distinguishing features of patient groups or chemical compound sets and to classify them in terms of annotated characteristics such as cancer metastasis or toxicity, thus mining over many graphs.

There are a number of research questions in this topic. First, graph construction is not straightforward, and constructing graphs from data requires thoughtful strategies. Second, there is usually only a small number of samples in these domains, so graph augmentation is another important research question. Third, in medical domains, graph learning requires decoding complex interactions among features, e.g., genes. Fourth, in the chemical domain, graphs vary significantly in size, and graph mining requires optimizing quite a number of tasks simultaneously, hence multi-task learning with graphs of varying sizes. Lastly, the drug repositioning task must deal with heterogeneous networks whose nodes are completely different entities, such as genes, diseases, and drug compounds; mining associations in these heterogeneous networks requires sophisticated computational strategies.

In this tutorial, we define these research questions and survey recent developments for each, along with the limitations of current technologies and future directions.
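The whole-graph-as-individual view can be made concrete with a minimal readout: aggregate node features into one fixed-size vector per graph, regardless of graph size, so graphs of different sizes become comparable inputs to a classifier. The sketch below (one round of neighbor averaging plus a mean readout) is a simplified illustration of this idea, not a specific method from the tutorial:

```python
import numpy as np

def graph_level_embedding(node_features, adjacency):
    """One round of neighbor averaging followed by a mean readout:
    a minimal graph-level representation."""
    A = adjacency + np.eye(len(adjacency))   # add self-loops
    deg = A.sum(axis=1, keepdims=True)
    h = (A / deg) @ node_features            # average over each node's neighborhood
    return h.mean(axis=0)                    # whole-graph readout vector

# Two toy "molecules": each individual is a whole graph, nodes are atoms.
x1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])        # 3-node graph
a1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
x2 = np.array([[1.0, 1.0], [0.0, 1.0]])                    # 2-node graph
a2 = np.array([[0, 1], [1, 0]], dtype=float)

g1 = graph_level_embedding(x1, a1)
g2 = graph_level_embedding(x2, a2)   # same output size despite different graph sizes
```

Because the readout produces a fixed-size vector for any graph size, downstream classifiers (e.g., for toxicity or metastasis labels) can be trained over collections of graphs directly.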

Sael Lee, Professor

2019-present Associate Professor, Department of Software, Ajou University
2012-2018 Assistant Professor, Department of Computer Science, SUNY Korea (Stony Brook Univ.)
2011-2012 Research Staff Member, Data Analytics Group, Samsung Advanced Institute of Technology
2010-2011 Postdoctoral Researcher, Purdue University
2010 Ph.D. in Computer Science, Purdue University
2005 B.S. in Computer Science, Korea University

Deep Trees as Accurate and Interpretable Inference Models (90 min)

Accurate yet interpretable inference models are needed for various real-world data analyses. For feature-based data, decision trees have been one of the traditional methods for interpretability; however, their generalizability is low. In this talk, I will introduce a series of efforts to improve prediction accuracy while maintaining the interpretability of tree-structured models. More specifically, I will introduce 1) the use of knowledge distillation to improve the interpretability and accuracy of soft decision trees, 2) the use of Gaussian mixture nodes to improve the interpretability of deep trees, and lastly, 3) a general representation of deep tree models to assist automatic model structure search.
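The soft decision tree mentioned in 1) can be illustrated at depth one: an inner node routes an input to both children with sigmoid-gated probabilities, so the prediction is a differentiable mixture of leaf values rather than a hard split. This is a minimal sketch of the general idea only; parameter names are illustrative:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_tree_predict(x, w_root, b_root, leaf_left, leaf_right):
    """Depth-1 soft decision tree: the root routes input x to both
    leaves with probabilities (1 - p, p); the prediction is the
    probability-weighted mixture of the leaf values."""
    p_right = sigmoid(sum(wi * xi for wi, xi in zip(w_root, x)) + b_root)
    return (1.0 - p_right) * leaf_left + p_right * leaf_right

# At the decision boundary the two leaves are weighted equally.
y = soft_tree_predict([0.0, 0.0], [1.0, 1.0], 0.0, leaf_left=0.0, leaf_right=1.0)
```

Because the routing is smooth, the whole tree is trainable by gradient descent, which is what makes distilling a deep network's knowledge into it feasible; interpretability comes from inspecting each node's routing weights.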

Kee-Eung Kim, Professor

2006-present Professor, School of Computing / Kim Jaechul Graduate School of AI, KAIST
2004-2006 Research Staff Member, Samsung Advanced Institute of Technology
2001-2003 Senior Engineer, Samsung SDS
2001 Ph.D., Brown University

OptiDICE for Offline Reinforcement Learning (90 min)

Offline reinforcement learning (RL) refers to the problem setting in which the agent aims to optimize a policy solely from pre-collected data, without further environment interaction. In offline RL, distributional shift becomes the primary source of difficulty, arising from the deviation of the target policy being optimized from the behavior policy used for data collection. This typically causes overestimation of action values, which poses severe problems for model-free algorithms that use bootstrapping. To mitigate the problem, prior offline RL algorithms often use sophisticated techniques that encourage underestimation of action values, which introduces an additional set of hyperparameters that must be tuned properly. In this talk, I present OptiDICE, an offline RL algorithm that prevents overestimation in a more principled way. OptiDICE directly estimates the stationary distribution corrections of the optimal policy and does not rely on policy gradients, unlike previous offline RL algorithms. On an extensive set of benchmark datasets for offline RL, OptiDICE is shown to perform competitively with state-of-the-art methods. This is joint work with Jongmin Lee (UC Berkeley), Wonseok Jeon (Qualcomm), Byung-Jun Lee (Korea U.), and Joelle Pineau (MILA).
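The overestimation problem described above is easy to demonstrate in isolation: even when every action's true value is zero, taking a greedy max over noisy value estimates is biased upward. The toy simulation below illustrates only this bias, not OptiDICE itself:

```python
import random

random.seed(0)

def max_of_noisy_estimates(n_actions=10, noise=1.0, trials=5000):
    """Average of max over n_actions noisy value estimates when every
    true action value is 0. The result is well above 0, illustrating
    why greedy bootstrapped backups overestimate action values."""
    total = 0.0
    for _ in range(trials):
        estimates = [random.gauss(0.0, noise) for _ in range(n_actions)]
        total += max(estimates)   # greedy backup over noisy estimates
    return total / trials

bias = max_of_noisy_estimates()   # substantially positive despite true values of 0
```

With bootstrapping, this positive bias feeds back into subsequent value targets and compounds, which is why offline RL methods either penalize it heuristically or, as in OptiDICE, avoid the greedy backup over out-of-distribution actions altogether.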