September 30 (Thursday) | |
---|---|
10:00 - 11:30 | |
13:30 - 15:00 | |
15:30 - 17:00 | |

October 1 (Friday) | |
---|---|
10:00 - 11:30 | |
13:30 - 15:00 | |
15:30 - 17:00 | |
Prof. Hyunjung Shin
(Ajou University)
Biography | |
---|---|
2006-present | Professor, Department of Industrial Engineering / Department of Artificial Intelligence / Department of Integrative Systems Engineering, Ajou University |
2017-present | Vice President for Education, Artificial Intelligence Society, Korean Institute of Information Scientists and Engineers (KIISE) |
2017-present | Chair, Machine Learning Research Group, KIISE |
2014-present | Lecturer, Expert Program in Genomic/Clinical/Information Analysis, Seoul National University College of Medicine |
2013-present | Organizing staff and lecturer, Certified Health Informatician program, Korean Society of Medical Informatics |
2011-present | Vice Chair, Examination Review Committee, Health Insurance Review and Assessment Service |
2007-present | Registered Director, Korea BI Data Mining Society |
2006 | Research Professor, Seoul National University College of Medicine |
2005-2006 | Senior Researcher, Friedrich Miescher Laboratory, Max Planck Institute (Germany) |
2004-2005 | Researcher, Max Planck Institute for Biological Cybernetics (Germany) |
2000-2005 | Ph.D. in Engineering (Industrial Engineering / Data Mining), College of Engineering, Seoul National University |
Convex optimization (90 minutes)
Convex optimization is an important enough topic that everyone who learns machine learning should know at least a little bit about it. Many problems in machine learning come down to finding parameters that minimize some objective function. Very often, the objective is a weighted sum of two components: a cost term and a regularization term. If both of these components are convex, then the problem is a convex optimization problem. There are great advantages to recognizing or formulating a problem as a convex optimization problem. Most importantly, if the objective is strictly convex, it is guaranteed to have a unique global minimum, and the problem can be solved very reliably and efficiently using standard methods. There are also theoretical and conceptual advantages: the associated dual problem, for example, often has an interesting interpretation in terms of the original problem and sometimes leads to an efficient method for solving it. Typical examples of convex optimization problems in machine learning include support vector machines, semi-supervised learning, and ridge regression with Tikhonov regularization, whereas neural networks and maximum-likelihood mixtures of Gaussians are non-convex problems. In this talk, we give an overview of mathematical optimization, focusing on the special role of convex optimization, and then describe convex programming formulations of widely known machine learning algorithms.
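As a concrete illustration of the cost-plus-regularization structure described above, here is a minimal sketch (not part of the lecture material) that writes ridge regression as an explicit convex program with the CVXPY package and compares the solver's answer with the closed-form solution. The synthetic data, the regularization weight, and the choice of CVXPY are assumptions made only for this example.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d = 100, 5
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

lam = 1.0                                   # regularization weight (assumed)
w = cp.Variable(d)
cost = cp.sum_squares(X @ w - y)            # convex cost term
reg = lam * cp.sum_squares(w)               # convex regularization term
problem = cp.Problem(cp.Minimize(cost + reg))
problem.solve()                             # strictly convex -> unique global minimum

# Same minimizer in closed form for comparison: (X^T X + lam I)^{-1} X^T y
w_closed = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
print("difference from closed form:", np.linalg.norm(w.value - w_closed))
```

Because the objective is strictly convex, both routes recover the same unique minimizer, which is exactly the practical benefit of recognizing the problem as convex.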
Prof. Seungjin Choi
(BARO AI)
Biography | |
---|---|
2021-present | Standing Advisor, BARO AI Academy |
2019-2020 | CTO, BARO AI |
2001-2019 | Professor, Department of Computer Science and Engineering, POSTECH |
2019-2021 | President, Artificial Intelligence Society, Korean Institute of Information Scientists and Engineers (KIISE) |
2018 | Advisory Professor, Samsung Advanced Institute of Technology |
2017-2018 | Advisory Professor, Samsung Research AI Center |
2016-2017 | Advisory Professor, Shinhan Card Big Data Center |
2014-2016 | Founding Chair, Machine Learning Research Group, KIISE |
1997-2001 | Assistant Professor, School of Electrical and Electronics Engineering, Chungbuk National University |
1997 | Frontier Researcher, RIKEN, Japan |
1996 | Ph.D., University of Notre Dame, Indiana, USA |
Bayesian optimization (90 minutes)
Bayesian optimization is a sample-efficient method for finding a global optimum of an expensive-to-evaluate black-box function. It has been widely used in applications such as automated machine learning, hyperparameter optimization, gait optimization, biological sequence design, and material design, to name a few. Bayesian optimization is a rapidly growing, principled method that is useful when we look for a solution with a limited budget or limited resources. In this talk, I begin with standard Bayesian optimization, where the decision variables take real values. I will explain two core ingredients of Bayesian optimization: (1) statistical surrogate models, which estimate the black-box function, and (2) acquisition function optimization, which determines where to sample the objective function next. I will also discuss Bayesian optimization with black-box constraints, referred to as ‘constrained Bayesian optimization’. If time permits, I will illustrate how to handle discrete decision variables.
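The loop below is a minimal sketch (my illustration, not the speaker's code) of the two ingredients listed in the abstract: a Gaussian-process surrogate fit to the evaluations so far, and an expected-improvement acquisition function maximized over a candidate grid to choose the next query point. The toy black-box objective, search grid, kernel settings, and evaluation budget are all assumptions made for illustration; scikit-learn and SciPy are assumed to be available.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_black_box(x):                        # stand-in for the true objective
    return np.sin(3 * x) + 0.5 * x ** 2

grid = np.linspace(-2, 2, 400).reshape(-1, 1)      # candidate points
X = np.array([[-1.5], [0.0], [1.5]])               # a few initial evaluations
y = expensive_black_box(X).ravel()

for _ in range(10):                                # evaluation budget (assumed)
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-6)
    gp.fit(X, y)                                   # (1) statistical surrogate model
    mu, sigma = gp.predict(grid, return_std=True)
    best = y.min()
    # (2) expected improvement for minimization, maximized over the grid
    z = (best - mu) / np.maximum(sigma, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = grid[np.argmax(ei)]
    X = np.vstack([X, [x_next]])
    y = np.append(y, expensive_black_box(x_next))

print("best point found:", X[np.argmin(y)], "value:", y.min())
```

The same loop structure carries over to constrained and discrete settings; only the surrogate and the acquisition function change.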
Prof. Cheongjae Jang
(Hanyang University)
Biography | |
---|---|
2020-present | Research Assistant Professor, Artificial Intelligence Institute, Hanyang University |
2019-2020 | Postdoctoral Researcher, Korea Institute of Science and Technology (KIST) |
2012-2019 | Ph.D. in Engineering, Department of Mechanical and Aerospace Engineering, College of Engineering, Seoul National University |
2008-2012 | B.S. in Engineering, Department of Mechanical and Aerospace Engineering, College of Engineering, Seoul National University |
Stochastic Gradient Descent (90 minutes)
Stochastic gradient descent (SGD) is an iterative method that optimizes an objective function using stochastically approximated gradients (computed from random samples drawn from the training data) rather than the actual gradients (computed from the entire training data). SGD has been a significant workhorse in modern machine learning. In this tutorial, we will cover some theoretical properties of SGD to figure out which of its characteristics have led to its prevalence. A few examples will illustrate interesting behaviors of SGD in modern machine learning problems, summarized into two essential features: (i) more efficient large-scale learning and (ii) better generalization of deep neural networks. The former can be explained by theoretical analyses of the optimal approximation-estimation-optimization error trade-off in the expected risk under a limited computation time budget. The latter is observed in recent empirical results showing that, with small batch sizes, SGD can drive the parameters of deep neural networks toward flat minima of the objective function.
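Below is a minimal sketch (not from the tutorial) of mini-batch SGD on least-squares linear regression, contrasting the stochastic gradient computed from a random mini-batch with the full-batch gradient. The synthetic data, batch size, step size, and iteration count are arbitrary choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 10
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

w = np.zeros(d)
lr, batch_size = 0.05, 32
for step in range(2_000):
    idx = rng.integers(0, n, size=batch_size)        # random mini-batch of samples
    Xb, yb = X[idx], y[idx]
    grad = 2 * Xb.T @ (Xb @ w - yb) / batch_size     # stochastic gradient estimate
    w -= lr * grad                                   # SGD update

full_grad = 2 * X.T @ (X @ w - y) / n                # gradient on the entire data
print("distance to w_true:", np.linalg.norm(w - w_true))
print("full-gradient norm at the SGD iterate:", np.linalg.norm(full_grad))
```

Each update touches only 32 of the 10,000 samples, which is where the large-scale efficiency discussed above comes from; the gradient noise induced by the small batch is also the ingredient behind the flat-minima observations for deep networks.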
Prof. Min-hwan Oh
(Seoul National University)
Biography | |
---|---|
2020-present | Assistant Professor, Graduate School of Data Science, Seoul National University |
2020 | Ph.D. in Operations Research with Data Science specialization, Columbia University |
2016 | M.S. in Operations Research, Columbia University |
2015 | B.A. in Mathematics-Statistics, Columbia University |
His research interests include sequential decision making under uncertainty, reinforcement learning, bandit algorithms, and statistical machine learning.
Bandit optimization (90 minutes)
The multi-armed bandit (MAB) is a powerful and general framework for algorithms that make sequential decisions over time under uncertainty. In the MAB problem, a decision-making agent simultaneously attempts to obtain new information (i.e., "exploration") and to optimize its decisions based on existing information (i.e., "exploitation"). Seeking a suitable balance between exploration and exploitation, the agent tries to find an optimal action. Hence, the MAB problem is a classic reinforcement learning problem that exemplifies the exploration-exploitation trade-off. The MAB literature offers many efficient algorithms with rigorous, near-optimal theoretical guarantees on performance. This talk discusses some of the foundational techniques and recent advances in MAB. Specifically, we discuss progress on techniques based on upper confidence bounds and Thompson sampling for the contextual bandit, a sequential decision-making problem that bridges the basic MAB and reinforcement learning.
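For intuition, here is a minimal sketch (illustrative only, not the talk's material) of the two techniques named above in their simplest, non-contextual form: a UCB index policy and Thompson sampling with Beta posteriors on a Bernoulli bandit. The arm means, horizon, and exploration constant are made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.7])          # unknown to the agent
K, T = len(true_means), 5_000

# --- UCB: pull the arm with the highest upper confidence bound ---
counts = np.ones(K)                              # one initial pull per arm
sums = rng.binomial(1, true_means).astype(float)
for t in range(K, T):
    ucb = sums / counts + np.sqrt(2 * np.log(t + 1) / counts)
    a = int(np.argmax(ucb))                      # optimism drives exploration
    r = rng.binomial(1, true_means[a])
    counts[a] += 1
    sums[a] += r

# --- Thompson sampling: sample from the Beta posterior of each arm ---
alpha, beta = np.ones(K), np.ones(K)
for t in range(T):
    a = int(np.argmax(rng.beta(alpha, beta)))    # posterior sampling
    r = rng.binomial(1, true_means[a])
    alpha[a] += r
    beta[a] += 1 - r

print("UCB pulls per arm:", counts)
print("Thompson posterior means:", alpha / (alpha + beta))
```

Both policies concentrate their pulls on the best arm while keeping enough exploration to retain near-optimal regret guarantees; the contextual versions discussed in the talk replace the per-arm means with models of the reward given side information.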
Prof. Hyun Oh Song
(Seoul National University)
Biography | |
---|---|
2017-present | Professor, Department of Computer Science and Engineering, Seoul National University |
2016-2017 | Research Scientist, Google Research |
2014-2016 | Postdoctoral Fellow, Stanford University |
2014 | Ph.D. in Computer Science, University of California, Berkeley |
Combinatorial optimization (90 minutes)
In this talk, I will first introduce various recent machine learning techniques for data augmentation and data mixup that improve the generalization ability of learned ML models. Then, I will discuss our latest research on state-of-the-art mixup methods, which appeared at ICML 2020 and ICLR 2021 (oral).
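As background for the mixup discussion, the snippet below is a minimal sketch of the basic input-mixup augmentation that the newer methods build on; it is not an implementation of the speaker's ICML 2020 or ICLR 2021 methods. A pair of examples is blended with a Beta-distributed weight, and their one-hot labels are mixed with the same weight; the batch shapes and the Beta parameter are illustrative assumptions.

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Return a mixed batch (x_mix, y_mix) from inputs x and one-hot labels."""
    rng = rng if rng is not None else np.random.default_rng()
    lam = rng.beta(alpha, alpha)               # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))             # pair each example with another
    x_mix = lam * x + (1 - lam) * x[perm]      # blend the inputs
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]  # blend the labels
    return x_mix, y_mix

# usage on a toy batch of 4 "images" with 3 classes
x = np.random.rand(4, 32, 32, 3)
y = np.eye(3)[[0, 2, 1, 0]]
x_mix, y_mix = mixup_batch(x, y)
print(x_mix.shape, y_mix)
```

The combinatorial aspect covered in the talk comes from deciding *how* to mix (which regions and which pairs), which turns this simple blending into a discrete optimization problem.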
Prof. Se-Young Yun
(KAIST)
Biography | |
---|---|
2017-present | Professor, Kim Jaechul Graduate School of AI / Department of Industrial and Systems Engineering, KAIST |
2021-present | Director, KAIST AI Weather Forecasting Research Center |
2016-2017 | Postdoctoral Researcher, Los Alamos National Laboratory (USA) |
2015-2016 | Visiting Researcher, Microsoft Research Cambridge (UK) |
2014-2015 | Researcher, MSR-INRIA Joint Research Center (France) |
2013-2014 | Postdoctoral Researcher, KTH (Sweden) |
2012 | Ph.D. in Engineering, KAIST |
Submodular optimization (90 minutes)
Submodular optimization has many applications in machine learning. Submodularity is a property of set functions in which each element contributes a diminishing increment as the input set expands. For instance, the information gained by placing a new sensor diminishes as more sensors are deployed. There are many other submodular optimization problems in ML, including sparse reconstruction, video analysis, clustering, document summarization, object detection, information retrieval, network inference, and discrete adversarial attacks. Unfortunately, it is extremely difficult to find the exact optimal solution of a submodular optimization problem. Many algorithms therefore do not find the exact optimum but instead approximate it with provable guarantees and practical computational cost. In this talk, we will first introduce various types of submodular optimization problems and their applications. We then cover algorithms for each submodular problem and their theoretical analysis.
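To make the "approximate with provable guarantees" point concrete, here is a minimal sketch (my own illustration, not the talk's code) of the classic greedy algorithm for maximizing a monotone submodular function under a cardinality constraint, which achieves a (1 - 1/e) approximation. The example function is set coverage, a standard monotone submodular function; the toy sets and the budget are assumptions.

```python
from typing import List, Set

def coverage(selected: List[Set[int]]) -> int:
    """Number of distinct elements covered: a monotone submodular set function."""
    covered: Set[int] = set()
    for s in selected:
        covered |= s
    return len(covered)

def greedy_max(candidates: List[Set[int]], k: int) -> List[Set[int]]:
    """Greedily pick k sets, each time adding the one with the largest marginal gain."""
    chosen: List[Set[int]] = []
    remaining = list(candidates)
    for _ in range(k):
        base = coverage(chosen)
        best = max(remaining, key=lambda s: coverage(chosen + [s]) - base)
        chosen.append(best)
        remaining.remove(best)
    return chosen

# toy instance: pick k = 2 "sensors" (sets) that together cover the most locations
sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
picked = greedy_max(sets, k=2)
print(picked, "covers", coverage(picked), "elements")
```

The same greedy template, with the appropriate marginal-gain oracle, underlies many of the algorithms with provable guarantees surveyed in the talk.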