Cross lingual transfer learning for sentiment analysis of Italian TripAdvisor reviews
Expert Systems with Applications, 2022 (Volume 209)
ISSN: 0957-4174
DOI:
Document category:
Article
English
Keywords: Convolutional neural networks; Transfer learning; BERT; Cross-lingual; Google+; Italian dataset; Language model; Model-based OPC; Performance; Sentiment analysis; TripAdvisor
English abstract
Over the years, the attention of the scientific world towards sentiment analysis techniques has increased considerably, driven by industry. The arrival of the Google BERT language model has confirmed the superiority of models based on a particular artificial neural network structure called the Transformer, from which many variants have resulted. These models are generally pre-trained on large text corpora and only later specialized for the precise task at hand on much smaller amounts of data. For these reasons, countless versions were developed to meet the specific needs of each language, especially in the case of languages with relatively few datasets available. At the same time, models pre-trained for multiple languages became widespread, providing greater flexibility of use in exchange for lower performance. This study shows how transfer learning from high-resource languages to low-resource languages provides an important performance increase: a multilingual BERT model fine-tuned on a mixed English/Italian dataset (using a dataset from the literature for English and, for Italian, a review dataset created ad hoc from the well-known platform TripAdvisor) provides much higher performance than models specific to Italian. Overall, the results obtained by comparing the different possible approaches indicate which one is the most promising to pursue in order to obtain the best results in low-resource scenarios. © 2022 Elsevier Ltd