최승진
Intellicode

This tutorial aims to provide a comprehensive introduction to optimization and training techniques for beginners in the field of deep learning. Divided into two parts, the tutorial begins with an exploration of fundamental concepts in gradient descent and its convergence behavior in convex optimization. Part I delves into the evolution of optimization algorithms, transitioning from traditional gradient descent to more sophisticated methods such as stochastic gradient descent (SGD), momentum, AdaGrad, RMSProp, and Adam. In Part II, the tutorial shifts its focus to various training methods essential for building deep neural networks. It introduces dropout, a regularization technique that combats overfitting, and batch normalization, a method designed to stabilize and expedite training. The tutorial also explores transfer learning via fine-tuning, an approach for adapting pre-trained models to new tasks. Finally, it introduces the lottery ticket hypothesis, which concerns finding sparse, trainable subnetworks that match the test accuracy of the original network.
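For readers who want a concrete anchor for Part I, below is a minimal NumPy sketch (not the tutorial's own code) contrasting three of the update rules named above, SGD, momentum, and Adam, on a toy quadratic loss; the hyperparameter values and the loss are illustrative choices.

```python
# Illustrative sketch of SGD, momentum, and Adam updates
# on the toy loss f(w) = ||w||^2 / 2, whose gradient is w.
import numpy as np

def grad(w):
    # Gradient of the toy loss; stands in for a minibatch gradient in SGD.
    return w

w_sgd = w_mom = w_adam = np.array([5.0, -3.0])
v = np.zeros(2)                      # momentum buffer
m, s = np.zeros(2), np.zeros(2)      # Adam first/second moment estimates
lr, beta, b1, b2, eps = 0.1, 0.9, 0.9, 0.999, 1e-8

for t in range(1, 101):
    # Plain (stochastic) gradient descent: w <- w - lr * g
    w_sgd = w_sgd - lr * grad(w_sgd)

    # Momentum: accumulate a velocity, then step along it.
    v = beta * v + grad(w_mom)
    w_mom = w_mom - lr * v

    # Adam: bias-corrected moment estimates give per-coordinate step sizes.
    g = grad(w_adam)
    m = b1 * m + (1 - b1) * g
    s = b2 * s + (1 - b2) * g**2
    m_hat, s_hat = m / (1 - b1**t), s / (1 - b2**t)
    w_adam = w_adam - lr * m_hat / (np.sqrt(s_hat) + eps)

print(w_sgd, w_mom, w_adam)  # all three converge toward the optimum at 0
```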
노영균
Hanyang University

In this lecture, we review how information theory can be put to use in machine learning. As a theory, it provides a comprehensive conceptual reference for practical applications. I will explain the theoretical motivations and justifications behind many machine learning algorithms, covering methods for feature selection and regularization, and in particular recent advances in estimating information content for various applications.
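As a concrete anchor for the estimation theme, here is a short sketch (illustrative, not the lecture's material) of the plug-in estimate of mutual information between discrete variables, the kind of quantity used to rank features in information-theoretic feature selection; the synthetic data is an assumption for the example.

```python
# Plug-in estimate of I(X;Y) = sum_{x,y} p(x,y) log[ p(x,y) / (p(x) p(y)) ]
# for discrete variables, estimated from empirical counts.
import numpy as np
from collections import Counter

def mutual_information(x, y):
    n = len(x)
    pxy = Counter(zip(x, y))            # joint counts
    px, py = Counter(x), Counter(y)     # marginal counts
    mi = 0.0
    for (xi, yi), c in pxy.items():
        # (c/n) * log[ (c/n) / ((px/n)(py/n)) ] = (c/n) * log[ c*n / (px*py) ]
        mi += (c / n) * np.log(c * n / (px[xi] * py[yi]))
    return mi

# Feature x0 determines the label; feature x1 is pure noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=1000)
x0 = y                                  # perfectly informative feature
x1 = rng.integers(0, 2, size=1000)      # uninformative feature
print(mutual_information(x0, y))        # ~ log 2 ≈ 0.693 nats
print(mutual_information(x1, y))        # ~ 0 nats
```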
신현정
Ajou University

The choice of loss functions and performance metrics is a fundamental aspect of machine learning and deep learning, playing a crucial role in model training and evaluation. This presentation will explore commonly used loss functions and performance metrics, evaluating their strengths and limitations. The goal is to offer insights into various loss and performance measures, helping guide the selection of suitable ones for specific tasks.
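To make the trade-offs concrete, the following sketch (illustrative, not part of the presentation) computes binary cross-entropy together with accuracy and F1 on a synthetic imbalanced problem, showing how accuracy can look strong while F1 exposes a degenerate predictor; the data and decision threshold are assumptions for the example.

```python
# Loss vs. metrics on an imbalanced binary problem with a constant predictor.
import numpy as np

y_true = np.array([0] * 95 + [1] * 5)      # 95% negatives, 5% positives
p_hat = np.full(100, 0.05)                 # model always outputs P(y=1)=0.05
y_pred = (p_hat >= 0.5).astype(int)        # so it always predicts class 0

# Binary cross-entropy loss (clipped for numerical stability).
p = np.clip(p_hat, 1e-12, 1 - 1e-12)
bce = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

accuracy = np.mean(y_pred == y_true)       # 0.95, yet the model is useless
tp = np.sum((y_pred == 1) & (y_true == 1))
precision = tp / max(np.sum(y_pred == 1), 1)
recall = tp / max(np.sum(y_true == 1), 1)
f1 = 2 * precision * recall / max(precision + recall, 1e-12)

print(f"BCE={bce:.3f}  accuracy={accuracy:.2f}  F1={f1:.2f}")  # F1 is 0
```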
윤철희
KAIST

Over the last decade, deep learning has redefined the state of the art in many application areas of machine learning and artificial intelligence. However, its success has raised many theoretical questions that remain elusive today. This tutorial aims to provide a crash course on the foundations of, and recent research results in, the mathematical theory of deep learning. The following three central questions are covered. 1) Approximation: what functions can deep neural networks represent and approximate? 2) Optimization: why can gradient-based optimization methods train deep neural networks to global optimality? 3) Generalization: why can deep neural networks interpolate all training data points and, at the same time, generalize to unseen data?
김은솔
Hanyang University

In this lecture, we aim to understand the structure of the Transformer, which is widely used in many large-scale foundation models. Starting with recurrent neural networks, we will explain the background leading to the proposal of the attention mechanism and delve into the structure and learning principles of the Transformer. In particular, we will explain the relationship between attention mechanisms and max-margin token selection through an understanding of the nonconvex optimization dynamics of soft attention. Additionally, we will introduce recent methods proposed to overcome the limitations of attention from a sequential-learning perspective. Specifically, we will present the latest approaches (Mamba, Hyena, S4, etc.) built on the state space model (SSM) framework to address the long-range dependency problem.
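As a concrete reference for the attention mechanism discussed above, here is a minimal sketch (illustrative, not the lecture's code) of single-head scaled dot-product attention without masking; the random inputs and projection matrices Wq, Wk, Wv are assumptions for the example.

```python
# Scaled dot-product attention: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # (n_q, n_k) similarity logits
    weights = softmax(scores, axis=-1)        # each query's soft token selection
    return weights @ V                        # weighted mix of value rows

rng = np.random.default_rng(0)
n, d = 4, 8                                   # sequence length, model dimension
X = rng.normal(size=(n, d))                   # token embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = attention(X @ Wq, X @ Wk, X @ Wv)
print(out.shape)                              # (4, 8): one output per token
```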
Registration fees:

|          |           | Early Registration (through Feb 2) | Late Registration |
|----------|-----------|------------------------------------|-------------------|
| Academia | Professor | KRW 250,000                        | KRW 300,000       |
| Academia | Student   | KRW 150,000                        | KRW 200,000       |
| Industry |           | KRW 250,000                        | KRW 300,000       |