Response type selection for chat-like spoken dialog systems based on LSTM and multi-task learning

Ohta, Kengo; Nishimura, Ryota; Kitaoka, Norihide

doi:10.1016/j.specom.2021.07.003

Please use the following URL to cite this item: https://repo.lib.tokushima-u.ac.jp/116668

ID	116668
Authors	Ohta, Kengo (National Institute of Technology, Anan College); Nishimura, Ryota (Tokushima University); Kitaoka, Norihide (Toyohashi University of Technology)
Keywords	Spoken dialog system; Response type selection; Encoder–decoder model; Multi-task learning
Resource type	Journal article
Abstract	We propose a method of automatically selecting appropriate responses in conversational spoken dialog systems by explicitly determining the correct response type that is needed first, based on a comparison of the user’s input utterance with many other utterances. Response utterances are then generated based on this response type designation (back channel, changing the topic, expanding the topic, etc.). This allows the generation of more appropriate responses than conventional end-to-end approaches, which only use the user’s input to directly generate response utterances. As a response type selector, we propose an LSTM-based encoder–decoder framework utilizing acoustic and linguistic features extracted from input utterances. In order to extract these features more accurately, we utilize not only input utterances but also response utterances in the training corpus. To do so, multi-task learning using multiple decoders is also investigated. To evaluate our proposed method, we conducted experiments using a corpus of dialogs between elderly people and an interviewer. Our proposed method outperformed conventional methods using either a point-wise classifier based on Support Vector Machines, or a single-task learning LSTM. The best performance was achieved when our two response type selectors (one trained using acoustic features, and the other trained using linguistic features) were combined, and multi-task learning was also performed.
Journal title	Speech Communication
ISSN	01676393
CAT bibliographic ID	AA10630135 AA11541653
Publisher	Elsevier
Volume	133
Start page	23
End page	30
Publication date	2021-07-15
Rights	This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
EDB ID	376748
Publisher version DOI	10.1016/j.specom.2021.07.003
Publisher version URL	https://doi.org/10.1016/j.specom.2021.07.003
Full-text file	specom_133_23.pdf 1.05 MB
Language	eng
Author version flag	Publisher version
Department	Science and Technology