Improving sentiment classification using a RoBERTa-based hybrid model
Authors:
- Noura A. Semary,
- Wesam Ahmed,
- Khalid Amin,
- Paweł Pławiak,
- Mohamed Hammad
Abstract
Introduction: Several attempts have been made to improve the performance of text-based sentiment analysis, most prominently through better classifiers and word embedding models. This work aims to develop a hybrid deep learning approach that combines the advantages of transformer models and sequence models while eliminating the shortcomings of sequence models. Methods: We present a hybrid model that combines a transformer model with deep learning models to enhance the sentiment classification process. The Robustly Optimized BERT Pretraining Approach (RoBERTa) produces the representative vectors of the input sentences, and a Long Short-Term Memory (LSTM) model in conjunction with a Convolutional Neural Network (CNN) model improves the proposed model’s ability to comprehend the semantics and context of each input sentence. We tested the proposed model on two datasets covering different topics: a Twitter dataset of US airline reviews and the IMDb movie reviews dataset. To overcome the class imbalance of the Twitter dataset, we propose using word embeddings in conjunction with the SMOTE technique. Results: With an accuracy of 96.28% on the IMDb reviews dataset and 94.2% on the Twitter reviews dataset, the proposed hybrid model outperforms the standard methods. Discussion: These results indicate that the proposed hybrid RoBERTa–(CNN+LSTM) method is an effective model for sentiment classification.
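The abstract mentions applying SMOTE to embedding vectors to balance the Twitter classes but does not say how. The sketch below illustrates the general SMOTE idea only: each synthetic minority vector is an interpolation between a minority sample and one of its nearest minority neighbours. It is not the authors' implementation. The function name `smote_oversample`, its parameters, and the toy 2-D vectors are assumptions made for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic vectors interpolated between minority neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base vector (excluding itself)
        neighbours = sorted(
            (v for v in minority if v is not base),
            key=lambda v: math.dist(base, v),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy 2-D stand-ins for embedding vectors; real sentence vectors
# would have hundreds of dimensions.
minority_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_vectors = smote_oversample(minority_vectors, n_new=6, k=2)
```

Because every synthetic point lies on a segment between two existing minority vectors, the new samples stay inside the region the minority class already occupies. In practice, oversampling of this kind is normally applied to the training split only.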
- Record ID
- CUT0c899974716e45cd993268cb27cf1d35
- Journal series
- Frontiers in Human Neuroscience, ISSN 1662-5161
- Issue year
- 2023
- Vol
- 17
- Pages
- [1-10]
- Other elements of collation
- figures; tables; charts; Bibliography (on p.) - 10; Numbering in journal - Vol. 17
- Keywords in English
- sentiment analysis, word embedding, RoBERTa, SMOTE, LSTM, CNN+LSTM
- DOI
- DOI:10.3389/fnhum.2023.1292010
- URL
- https://www.frontiersin.org/articles/10.3389/fnhum.2023.1292010/full
- Language
- eng (en) English
- Score (nominal)
- 100
- Score source
- journalList
- Publication indicators
- Citation count
- 1
- Uniform Resource Identifier
- https://cris.pk.edu.pl/info/article/CUT0c899974716e45cd993268cb27cf1d35/
- URN
- urn:pkr-prod:CUT0c899974716e45cd993268cb27cf1d35
* The presented citation count is obtained through analysis of information on the Internet and is close to the number calculated by the Publish or Perish system.