VNDS: A Vietnamese dataset for summarization

Proceedings - 2019 6th NAFOSTED Conference on Information and Computer Science, NICS 2019 (2019), pp. 375-380

Text summarization has seen many interesting developments in recent research. While numerous summarization approaches have been widely studied and applied across domains in English, work on Vietnamese is still at an early stage, with few papers and systems and a lack of benchmark datasets. To help advance Vietnamese language research, we first create a standard dataset for document summarization. To the best of our knowledge, we are the first to formally publish a large benchmark dataset for Vietnamese summarization. Second, we compare traditional and state-of-the-art extractive and abstractive summarization methods on our dataset. We strongly believe that the results of our work will facilitate future studies of text summarization in Vietnamese. © 2019 IEEE.
