
Cross lingual transfer learning for sentiment analysis of Italian TripAdvisor reviews

Catelli, Rosario (57210848004); Bevilacqua, Luca (25928252900); Mariniello, Nicola (57201642807); Scotto di Carlo, Vladimiro (15135832300); Magaldi, Massimo (13404533800); Fujita, Hamido (35611951900); De Pietro, Giuseppe (6508247659); Esposito, Massimo (25926920500)

Affiliations: Institute for High Performance Computing and Networking (ICAR), National Research Council (CNR), Naples, Italy; Engineering Ingegneria Informatica S.p.A., Naples, Italy; i-somet Incorporation Association, Morioka, Japan; National Taipei University of Technology, Taipei, Taiwan; Faculty of Information Technology, Ho Chi Minh City University of Technology (HUTECH), Ho Chi Minh City, Viet Nam

Expert Systems with Applications, 2022 (Vol. 209)

ISSN: 0957-4174

DOI:

Indexed in:

Document type: Article

Language: English

Keywords: Convolutional neural networks; Transfer learning; BERT; Cross-lingual; Google+; Italian dataset; Language model; Model-based OPC; Performance; Sentiment analysis; Tripadvisor
English abstract
Over the years, the attention of the scientific world towards sentiment analysis techniques has increased considerably, driven by industry. The arrival of the Google BERT language model confirmed the superiority of models based on a particular artificial neural network architecture called the Transformer, from which many variants have resulted. These models are generally pre-trained on large text corpora and only later specialized, on much smaller amounts of data, for the precise task to be faced. For these reasons, countless versions were developed to meet the specific needs of each language, especially in the case of languages with relatively few datasets available. At the same time, models pre-trained for multiple languages became widespread, providing greater flexibility of use in exchange for lower performance. This study shows how techniques that transfer learning from high-resource languages to low-resource languages provide an important performance increase: a multilingual BERT model fine-tuned on a mixed English/Italian dataset (using a literature dataset for English and, for Italian, a reviews dataset created ad hoc from the well-known platform TripAdvisor) provides much higher performance than models specific to Italian. Overall, the results obtained by comparing the different possible approaches indicate which one is the most promising to pursue in order to obtain the best results in low-resource scenarios. © 2022 Elsevier Ltd
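As a rough illustration of the recipe the abstract describes (fine-tuning multilingual BERT on mixed English/Italian data, then evaluating on Italian), here is a minimal sketch built on the Hugging Face transformers and datasets libraries. The checkpoint name is a real public model; the CSV file names, label count, and hyperparameters are hypothetical stand-ins, not the authors' actual setup.

```python
# Minimal sketch (not the authors' code) of the cross-lingual transfer idea:
# fine-tune multilingual BERT on a mixed English/Italian sentiment corpus,
# then evaluate on an Italian-only test split. File names are hypothetical.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "bert-base-multilingual-cased"  # multilingual BERT checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

# Hypothetical CSVs with "text" and "label" columns: a large English sentiment
# corpus concatenated with a smaller ad hoc Italian TripAdvisor review set.
data = load_dataset("csv", data_files={
    "train": ["english_sentiment.csv", "italian_tripadvisor_train.csv"],
    "test": "italian_tripadvisor_test.csv",
})

def tokenize(batch):
    # Truncate long reviews; dynamic padding is handled by the Trainer's
    # default DataCollatorWithPadding when a tokenizer is supplied.
    return tokenizer(batch["text"], truncation=True, max_length=128)

data = data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mbert-sentiment",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=data["train"],
    eval_dataset=data["test"],
    tokenizer=tokenizer,
)
trainer.train()
print(trainer.evaluate())  # metrics on the Italian test split
```

Training on the concatenated corpora lets the plentiful English examples shape the shared multilingual representation that the smaller Italian set then refines, which is the cross-lingual transfer effect the study compares against Italian-only models.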
