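Industry Session

August 18 (Thursday)
09:30 - 10:15	Saehoon Kim, Group Leader (Kakao Brain) Is Kakao Brain a Place That Does Fun and Meaningful Work? (45 min)
10:15 - 11:00	Chanwoo Kim, Executive Vice President (Samsung Research) An overview of end-to-end speech recognition and text-to-speech algorithms (45 min)
11:00 - 11:45	Jaesik Choi, CEO (INEEJI) AI Technologies for Improving Production Quality and Energy Efficiency in Industrial Settings (45 min)
11:45 - 12:30	Yunkeun Lee, Director (ETRI) ETRI's AI R&D Strategy and Major Achievements (45 min)
13:30 - 14:15	Ig-Jae Kim, Director (KIST) AI and Robotics Research at KIST and an Introduction to Key Technologies (45 min)
14:15 - 15:00	Hawook Jeong, Vice President (RideFlux) Advances in AI and Autonomous Driving Mobility Services (45 min)
15:30 - 16:15	Jaesun Yoon, Managing Director (SELVAS AI) Speech Recognition AI Solutions and Use Cases (45 min)
16:15 - 17:00	Jeong-Ki Yoo, Team Leader (Bear Robotics) Technical Characteristics of Serving Robots and AI from That Perspective (45 min)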

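Saehoon Kim (김세훈), Group Leader
(Kakao Brain)

Biography
2020-present	AI Researcher, Kakao Brain
2017-2019	Research Team Leader, AITRICS
2018	Ph.D. in Computer Science and Engineering, POSTECH
2012	Research Intern, MSRA and MSR
2009	B.S. in Computer Science and Engineering, POSTECH

Is Kakao Brain a Place That Does Fun and Meaningful Work? (45 min)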

Building general-purpose AI systems is undoubtedly a very challenging task, but it enables many interesting and meaningful applications. To develop such powerful models, three factors are equally important: (a) large-scale data, (b) high-performance computing infrastructure, and (c) sophisticated AI models. In this talk, we will present our efforts to overcome the current limitations in these three factors. First, we will briefly give an overview of our billion-scale multi-modal dataset, including the technical challenges encountered while constructing it. Second, we will introduce our easy-to-use GPU cloud system (called BrainCloud) and discuss how this internal service has improved the productivity of ML researchers and engineers. In addition, we will present our implementation of scalable pipeline parallelism, called Torch GPipe, which is one of our many attempts to accelerate the training of large-scale models on BrainCloud. Lastly, we will conclude this talk by sharing our latest models in the field of multi-modal visual understanding and our future research plans.

Chanwoo Kim (김찬우), Executive Vice President
(Samsung Research)

Biography
Chanwoo Kim is a corporate executive vice president at Samsung Research, leading the language and voice team. He joined Samsung Research as a corporate vice president heading the Speech Processing Lab in Feb. 2018. At Samsung Research he has been leading research on end-to-end speech recognition, end-to-end text-to-speech (TTS), machine translation, Natural Language Understanding (NLU), Language Modeling (LM), Question Answering (QA), speech enhancement, keyword spotting, and related areas. Most of these research outcomes have been commercialized in Samsung products. He was a software engineer on the Google speech team between Feb. 2013 and Feb. 2018, where he worked on acoustic modeling for speech recognition systems and on enhancing noise robustness using deep learning techniques. While working for Google, he contributed to data augmentation and acoustic modeling for Google speech recognition systems, and to the commercialization of various Google AI speakers and Google speech recognition systems. He was a speech scientist at Microsoft from Jan. 2011 to Jan. 2013. Dr. Kim received his Ph.D. from the Language Technologies Institute of the School of Computer Science at Carnegie Mellon University in Dec. 2010. He received his B.S. and M.S. degrees in Electrical Engineering from Seoul National University in 1998 and 2001, respectively. Dr. Kim's doctoral research focused on enhancing the robustness of automatic speech recognition systems in noisy environments. Between 2003 and 2005 Dr. Kim was a Senior Research Engineer at LG Electronics, where he worked primarily on embedded signal processing and protocol stacks for multimedia systems. Prior to his employment at LG, he worked for EdumediaTek and SK Teletech as an R&D engineer.

An overview of end-to-end speech recognition and text-to-speech algorithms (45 min)

In this talk, we give an overview of the latest end-to-end Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) algorithms. We also discuss optimization techniques to reduce the model size and obtain better performance. Conventional ASR and TTS systems consist of multiple handcrafted components. However, the introduction of fully neural sequence-to-sequence technologies has greatly simplified the structure while significantly improving performance. We cover the important end-to-end ASR structures: a stack of neural network layers trained with a Connectionist Temporal Classification (CTC) loss, the Recurrent Neural Network Transducer (RNN-T), the Transformer Transducer, the Conformer Transducer (Conformer-T), and models based on the Attention-based Encoder-Decoder (AED). We also describe well-known TTS models including Tacotron, Tacotron 2, and Deep Convolutional Text-To-Speech (DC-TTS).

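Jaesik Choi (최재식), CEO
(INEEJI)

Biography
2019-present	CEO, INEEJI
2020-present	Chair, AI Subcommittee, Science and Technology Innovation Committee, Presidential Committee on the Fourth Industrial Revolution
2019-present	Associate Professor, Kim Jaechul Graduate School of AI, KAIST
2019-present	Member, Samsung Electronics Future Technology Research Society
2017-present	Director, Explainable AI Research Center, Ministry of Science and ICT
2019	Prime Minister's Commendation
2013-2019	Assistant/Associate Professor, Electrical and Computer Engineering, UNIST
2013-2019	Adjunct Professor, Lawrence Berkeley National Laboratory (USA)
2012	Ph.D. in Computer Science, University of Illinois
2004	B.S. in Computer Engineering, Seoul National University

AI Technologies for Improving Production Quality and Energy Efficiency in Industrial Settings (45 min)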

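AI technologies are being applied across many industries, raising production efficiency and improving service performance through automation. Among them, prediction technology based on explainable AI has been applied to manufacturing, where it has delivered results such as higher output and higher production efficiency. Recently, energy costs have risen sharply due to the war in Ukraine and the restructuring of supply chains, adding to the production costs of manufacturers. This talk introduces explainable AI technologies that reduce energy costs and raise production efficiency in fields such as steel, power generation, and chemicals, and looks ahead to how they may develop.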

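Yunkeun Lee (이윤근), Director
(ETRI)

Biography
2021-present	Member, AI Legislation Reform Task Force
2020-present	Chair, Open Source Committee, Electronics and Telecommunications Research Institute (ETRI)
2019-present	Director, Artificial Intelligence Research Laboratory, ETRI
2005-2019	Head of the Speech Processing Research Section; Head of the Automatic Interpretation and Language Intelligence Research Department, ETRI
2000-2004	Director of Research, Voiceware Co., Ltd. (speech recognition/synthesis research)
1988-2000	Principal Research Engineer, LG Electronics Institute of Technology (speech recognition/synthesis research)
1998	Ph.D., Department of Information and Communications Engineering, KAIST
1988	M.S., Department of Electrical and Electronic Engineering, KAIST
1986	B.S., Department of Control and Instrumentation Engineering, College of Engineering, Seoul National University

ETRI's AI R&D Strategy and Major Achievements (45 min)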

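The world is now in an era of unlimited competition in AI, and major advanced countries are establishing and operating national-level AI innovation strategies. This lecture introduces global trends in AI technology, where Korea stands, and ETRI's AI execution strategy. It also introduces major research achievements of the Artificial Intelligence Research Laboratory: core AI technologies such as language intelligence and visual intelligence; infrastructure technologies such as AI computing and semiconductors; and mobility service technologies such as robots and autonomous vehicles.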

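Ig-Jae Kim (김익재), Director
(KIST)

Biography
2020-present	Director, AI and Robotics Institute, KIST
2022	Young Engineer Award, National Academy of Engineering of Korea
2021-present	Adjunct Professor, Yonsei University and Korea University
2020	Order of Industrial Service Merit (Stone Tower)
2020	Top 100 National R&D Excellence Achievements (Best)
2017-2020	Head, Center for Imaging Media Research, KIST
2009-2010	Postdoctoral Researcher, MIT Media Lab
2009	Ph.D. in Electrical and Computer Engineering, Seoul National University

AI and Robotics Research at KIST and an Introduction to Key Technologies (45 min)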

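The KIST AI and Robotics Institute develops advanced AI and robot technologies and conducts a wide range of research so that we can lead more convenient and safer lives in the society of the future. This talk introduces major examples from the institute's ongoing research: health diagnosis for older adults and muscle-strength-assisting wearable robot technology for a super-aged society; identity verification technology based on multimodal perception for public safety; AI technology for smart factories; and AI-robot hand technology modeled on the human hand.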

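Hawook Jeong (정하욱), Vice President
(RideFlux)

Biography
2020-present	Vice President, RideFlux
2021-present	Adjunct Professor, Department of Imaging Science, Graduate School of Advanced Imaging Science, Chung-Ang University
2018-2020	Researcher, RideFlux
2015-2018	Staff Engineer, DMC R&D Center / Mobile Communications Business, Samsung Electronics
2015	Ph.D., Department of Electrical and Computer Engineering, Seoul National University
2011	M.S., Department of Electrical and Computer Engineering, Seoul National University
2009	B.S., School of Electrical Engineering, Seoul National University

Advances in AI and Autonomous Driving Mobility Services (45 min)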

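Autonomous driving is regarded as one of the important technologies of the future because, once highly developed, it can reduce traffic accidents and let drivers spend the time they would otherwise spend at the wheel on more valuable activities. Advances in AI, robotics, sensors, and platforms have rapidly pushed autonomous driving technology forward over the past 10 years, but many challenges remain before fully autonomous driving can be achieved. This lecture reviews the core keywords of autonomous driving technology and discusses ways to secure the reliability needed to achieve fully autonomous driving incrementally.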

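Jaesun Yoon (윤재선), Managing Director
(SELVAS AI)

Biography
Current: Head of STT Lab, SELVAS AI
Former: Speech Recognition Team Leader, DIOTEK
Former: Principal Speech Recognition Engineer, VoiceTech

Speech Recognition AI Solutions and Use Cases (45 min)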

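This talk explains the Selvy STT solution from SELVAS AI, Korea's first dedicated AI company, which is leading digital transformation through the application of AI and data. It also introduces real-world deployments across business areas such as contact centers, healthcare, meeting minutes, and education.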

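Jeong-Ki Yoo (유정기), Team Leader
(Bear Robotics)

Biography
2021-present	Robotics Team Leader/Manager, Bear Robotics Korea
2015-2021	Board Member, Korea Robot Soccer Association
2015-2021	Assistant Professor, Department of Information, Communication and Electronics Engineering, Daejeon University
2012-2015	Principal Research Engineer, DMC R&D Center, Samsung Electronics
2012	Ph.D. in Electronic Engineering, KAIST
2004	M.S. in Electronic Engineering, KAIST
2002	B.S. in Electronic Engineering, Yonsei University

Introduction to Bear Robotics and AI from the Perspective of Serving Robots (45 min)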

This talk introduces the history of Bear Robotics and, based on key technology keywords in the serving robot field, discusses the elements required when applying AI to robotics.