Cyberbullying image and text detection using CNN

Project period

02/01/2019 - 03/14/2019

Project Category

Computer Science




This project detects text- and image-based cyberbullying on social networking sites such as Instagram. It uses a Convolutional Neural Network (CNN) to detect bullying images and a Bag of Words (BOW) model to detect words from a bullying word list. The project also describes the process of building a cyberbullying intervention interface driven by a machine-learning-based text-classification service. We make two main contributions. First, we show that cyberbullying can be identified in real time, before it takes place, with available machine learning and natural language processing tools, in particular CNNs. Second, we present a mechanism that gives individuals early feedback about how other people would feel about the wording choices in their messages before they are sent. This interface not only gives the user a chance to revise the text but also provides system-level flagging and intervention in situations related to cyberbullying.

Why: Problem statement

Cyberbullying is an increasingly serious social problem that can negatively affect individuals. It is defined as the use of the internet, cell phones, and other electronic devices to willfully hurt or harass others. With the growth of social media platforms such as Instagram, cyberbullying is becoming more and more prevalent.

How: Solution description

To help reduce cybercrime, we detect bullying images and text from various social media platforms using machine learning and deep learning techniques.

Data collection:

We collected images from social media platforms such as Facebook, Twitter, and Instagram.

Data Preprocessing:

Data pre-processing is an important phase because it determines how the data are represented in the feature space seen by the classifiers. Social network data are noisy, so pre-processing was applied to improve the quality of the collected data.
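The exact cleaning steps are not listed here, so the following is only a minimal sketch of what pre-processing noisy social media text could look like. The function name `clean_text` and the chosen steps (lowercasing, removing URLs and @mentions, dropping non-letter characters) are assumptions for illustration, not the project's actual pipeline.

```python
import re

def clean_text(text):
    """Normalize a raw social media post (assumed steps, for illustration only)."""
    text = text.lower()                          # case-fold so "DUMB" and "dumb" match
    text = re.sub(r"https?://\S+", " ", text)    # drop URLs
    text = re.sub(r"@\w+", " ", text)            # drop @mentions
    text = re.sub(r"[^a-z\s]", " ", text)        # drop digits, punctuation, emoji, '#'
    return re.sub(r"\s+", " ", text).strip()     # collapse repeated whitespace

print(clean_text("@user You are SO dumb!!! http://t.co/xyz #loser"))
# prints: you are so dumb loser
```

Hashtag words are kept (only the `#` symbol is removed) because they may carry bullying-related vocabulary; whether that helps would need to be checked on real data.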

Feature Extraction:

This module extracts the required features from the pre-processed data. The part of speech of every word in the conversation is obtained, and OCR (optical character recognition) is used to extract text from the images.

Bag of words:

The Bag of Words (BOW) model is a baseline text feature in which a given text is represented as a multiset of its words, disregarding grammar and word order. The multiplicity of each word is maintained and stored in a word frequency vector. In the end, we obtain a word vector in which each component corresponds to a word in the dictionary we generated and its value is that word's frequency in the text.
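The word frequency vector described above can be sketched in a few lines of plain Python. The helper names `build_vocabulary` and `bag_of_words` and the toy corpus are hypothetical and chosen only for illustration; a real project would more likely rely on a library vectorizer.

```python
from collections import Counter

def build_vocabulary(documents):
    """Map each distinct word in the corpus to a fixed index (sorted for reproducibility)."""
    words = sorted({word for doc in documents for word in doc.split()})
    return {word: index for index, word in enumerate(words)}

def bag_of_words(document, vocabulary):
    """Return the word-frequency vector of `document` over `vocabulary`."""
    counts = Counter(document.split())
    vector = [0] * len(vocabulary)
    for word, count in counts.items():
        if word in vocabulary:                   # words outside the dictionary are ignored
            vector[vocabulary[word]] = count
    return vector

# Toy, made-up corpus for illustration only.
corpus = ["you are so dumb", "you are so kind", "dumb dumb loser"]
vocab = build_vocabulary(corpus)
print(vocab)
# prints: {'are': 0, 'dumb': 1, 'kind': 2, 'loser': 3, 'so': 4, 'you': 5}
print(bag_of_words("you are dumb dumb", vocab))
# prints: [1, 2, 0, 0, 0, 1]
```

Word order is lost by design: "you are dumb" and "dumb you are" map to the same vector.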

Naive Bayes:

A Naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem (from Bayesian statistics) with strong (naive) independence assumptions between features. We trained a Naive Bayes classifier on bullying and non-bullying text taken from social media and obtained a testing accuracy of 98.2%.
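To make the independence assumption concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier written from scratch. The multinomial variant, the add-one (Laplace) smoothing, the function names, and the toy training sentences are all assumptions made for illustration; they are not the project's dataset or code, and the sketch says nothing about the reported accuracy.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(samples):
    """samples: list of (text, label). Returns class counts, per-class word counts, vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        class_counts[label] += 1
        for word in text.split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary

def predict(text, model):
    """Pick the class with the highest log posterior, using add-one (Laplace) smoothing."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in class_counts.items():
        score = math.log(doc_count / total_docs)             # log prior P(class)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocabulary:
                continue                                      # skip words never seen in training
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)                     # naive independence: sum of logs
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy, made-up training data for illustration only.
training = [
    ("you are so dumb", "bullying"),
    ("nobody likes you loser", "bullying"),
    ("you are so kind", "non-bullying"),
    ("great photo love it", "non-bullying"),
]
model = train_naive_bayes(training)
print(predict("you dumb loser", model))    # prints: bullying
print(predict("love the photo", model))    # prints: non-bullying
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.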

Convolutional Neural Networks:

Convolutional neural networks are used primarily to classify images. They can identify faces, individuals, street signs, tumors, platypuses, and many other aspects of visual data. In this project, image features are extracted using a pre-trained Convolutional Neural Network (CNN), a standard approach for image classification. We trained the model on bullying and non-bullying images taken from social media and obtained a testing accuracy of 97.6%.
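The pre-trained network itself is not specified here, so the sketch below does not reproduce it. It only illustrates, in plain Python, the three operations that a typical CNN layer applies to extract image features: convolution, ReLU activation, and max pooling. The 5x5 toy image and the hand-picked edge kernel are assumptions for illustration; in a real CNN the kernel values are learned during training.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation (what CNN libraries call convolution), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0, value) for value in row] for row in feature_map]

def max_pool2x2(feature_map):
    """2x2 max pooling with stride 2; an odd trailing row or column is dropped."""
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]) - 1, 2)
        ]
        for r in range(0, len(feature_map) - 1, 2)
    ]

# Toy 5x5 grayscale image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

features = relu(conv2d(image, kernel))
pooled = max_pool2x2(features)
print(features)   # prints: [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(pooled)     # prints: [[2, 0], [2, 0]]
```

The non-zero column in the feature map marks where the edge sits in the image. A full CNN stacks many such layers with learned kernels and feeds the resulting features to a classifier.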

How is it different from competition

Our model achieves higher accuracy on bullying and non-bullying images than previous models.

Who are your customers

Law enforcement agencies, such as cybercrime departments and the police, can use this project.

Project Phases and Schedule

Phase 1: Data collection

Phase 2: Data cleaning and feature extraction

Phase 3: Training using Naive Bayes

Phase 4: Implementation of CNN

Resources Required

Anaconda distribution with Python 3.7

Installation of required libraries

Jupyter notebook
