Bengali Cyberbullying Detection in Social Media Using Machine Learning Algorithms Conference

Saha, S, Islam, MS, Alam, MM et al. (2023). Bengali Cyberbullying Detection in Social Media Using Machine Learning Algorithms. 10.1109/STI59863.2023.10464740

cited authors

  • Saha, S; Islam, MS; Alam, MM; Rahman, MM; Hasan Majumder, MZ; Alam, MS; Hossain, MK

abstract

  • Social media has become increasingly prevalent, making it easy to communicate with people online. Social network users have many opportunities to cooperate, interact positively, and exchange information. The same platforms, however, can become toxic environments that enable online abuse and bullying. Young adults and celebrities are especially vulnerable to online abuse. Cyberbullying should therefore be identified and removed from social media, because it can cause significant psychological and emotional suffering. Using Natural Language Processing (NLP), Machine Learning (ML), and Transformer-based deep learning models such as BERT, we can identify patterns in the social media texts written by bullies and build an automated method for detecting abusive text. In this study, we proposed a reliable machine learning model for detecting cyberbullying on social media in the Bengali language. We applied text preprocessing, followed by feature extraction using a TF-IDF vectorizer. We then applied four ML algorithms and one pretrained Transformer-based BERT model and evaluated them using several performance metrics. BERT outperformed the other algorithms, achieving an accuracy of 90% and an AUC (Area under the ROC Curve) of 0.96.
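
The two quantities named in the abstract, TF-IDF features and AUC, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation. The function names `tfidf` and `roc_auc`, the whitespace tokenization, and the smoothed IDF variant are all assumptions made for illustration, and the record does not say which TF-IDF variant or tokenizer the study used.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per whitespace-tokenized document.

    Hypothetical helper: real Bengali text would need proper tokenization
    and the preprocessing steps the study describes.
    """
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Assumed smoothed IDF variant: ln((1 + N) / (1 + df)) + 1.
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

With toy inputs such as `tfidf(["a b", "a c"])`, a term shared by every document (`a`) receives a lower weight than a term unique to one document (`b`), which is the property that makes TF-IDF useful as a text feature. `roc_auc` takes binary labels and classifier scores, so an AUC of 0.96 means a randomly chosen abusive text outscores a randomly chosen benign one about 96% of the time.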

publication date

  • January 1, 2023

Digital Object Identifier (DOI)