Reducing the Effect of Imbalance in Text Classification Using SVD and GloVe with Ensemble and Deep Learning

Authors

  • Tajbia Hossain, Department of Computer Science and Engineering, Ahsanullah University of Science and Technology, Dhaka, Bangladesh
  • Humaira Zahin Mauni, Department of Computer Science and Engineering, Ahsanullah University of Science and Technology, Dhaka, Bangladesh
  • Raqeebir Rab, Department of Computer Science and Engineering, Ahsanullah University of Science and Technology, Dhaka, Bangladesh

DOI:

https://doi.org/10.31577/cai_2022_1_98

Keywords:

Deep learning, ensemble learning, machine learning, text classification, imbalanced data, singular value decomposition, global vectors

Abstract

With the rapid recent growth in the amount of text data available and used online, text classification has become a staple tool for data analysts extracting relevant information. Yet machine learning algorithms are susceptible to bias when applied to any large-scale automated task, especially text analysis. As newer branches of machine learning, such as ensemble and deep learning, become popular, we must analyze the potential pitfalls in the common experimental setups built around learning algorithms. Imbalance in text data is one such pitfall: when data is not equally distributed across all categories in a dataset, it can undermine the classification of underrepresented categories. In our research, we propose several techniques and approaches to tackle this obstacle. We prepared four datasets with varying degrees of imbalance for our experiments. We show that the feature extraction techniques singular value decomposition (SVD) and GloVe are key to reducing the effect of imbalance in text classification, especially in ensemble and deep learning. Using the results of our research, we also propose a modified ensemble classifier that can classify imbalanced and balanced data alike.
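The imbalance pitfall described in the abstract can be illustrated with a small, self-contained sketch. It is not taken from the article: the labels ("sports", "politics"), the 90/10 split, and the choice of macro-averaged F1 as the comparison metric are all assumptions made for illustration. The sketch shows how a classifier that always predicts the majority category can score high accuracy while completely failing the underrepresented category.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 is defined as 0 when the class is never predicted correctly.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced dataset: 90 majority documents, 10 minority ones.
y_true = ["sports"] * 90 + ["politics"] * 10
# A degenerate classifier that always predicts the majority category.
y_pred = ["sports"] * 100

ratio = imbalance_ratio(y_true)
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"imbalance ratio: {ratio:.1f}")
print(f"accuracy:        {acc:.3f}")
print(f"macro F1:        {f1:.3f}")
```

Because macro averaging weights each category equally, the missed minority category pulls the score down sharply even though accuracy looks strong, which is why a per-category view is useful when judging classifiers on imbalanced text data.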

Published

2022-04-29

How to Cite

Hossain, T., Zahin Mauni, H., & Rab, R. (2022). Reducing the Effect of Imbalance in Text Classification Using SVD and GloVe with Ensemble and Deep Learning. Computing and Informatics, 41(1), 98–115. https://doi.org/10.31577/cai_2022_1_98

Section

Special Section Articles
