
A rich task-oriented dialogue corpus in Vietnamese

Authors: Oanh Thi Tran (57214800352); Phuong Le-Hong (55203653900); Tho Chi Luong (57203504278)
Affiliations: FPT Technology Research Institute, FPT University, Hanoi, Viet Nam; University of Science, Vietnam National University Hanoi, Hanoi, Viet Nam; International School, Vietnam National University Hanoi, Hanoi, Viet Nam

Language Resources and Evaluation, 2022

ISSN: 1574020X

DOI:

Document category:

Article

English

English abstract
This paper introduces a new Vietnamese multi-domain task-oriented dialogue corpus which is fully labeled with rich information on dialogue structure and contextual information. The corpus contains 1910 dialogues, with a total of more than 18,000 turns in four domains (i.e., ProductInfo, OrderInfo, Shipping and Chatchit). To the best of our knowledge, this is the first dialogue corpus towards building automated conversations in e-commerce. We describe the rigorous annotation process of labelling rich information about dialogue segmentation, dialogue acts (DAs, a.k.a. communicative functions), dependency relations, rhetorical relations and slot-values on both user and system sides. This corpus will alleviate the shortage of dialogue datasets in low-resource languages, namely Vietnamese. It can be exploited in diverse contexts to facilitate research toward building complete dialogue systems. The large size and rich annotation of the corpus make it suitable to investigate a variety of different tasks in conversational systems. In this paper, we perform extensive experiments and report preliminary results for future studies in this interesting yet unexplored field. Specifically, we illustrate the usage of the corpus in developing key modules such as natural language understanding, belief tracking, dialogue policy management and natural language generation. © 2022, The Author(s), under exclusive licence to Springer Nature B.V.
