
leen449/Data-Science

Offense Level Based On Account Age




⚙ Project Overview

As online communities grow, it is important to understand what separates toxic from non-toxic comments. This project analyzes Reddit comments, focusing on toxicity labels, account age (in years), and subreddit patterns to uncover what drives negative interactions.



💎 Objectives

We aimed to answer the following key questions:

  • Do toxic comments appear more in certain subreddits?
  • Are older accounts less likely to post toxic comments?
  • What words are most linked to toxic behavior?
  • How balanced is the dataset between toxic and non-toxic comments?
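
The four questions above can be explored with a few pandas aggregations. The following is only a minimal sketch, not the project's actual analysis: the label column name `toxic` (1 = toxic, 0 = non-toxic), the 2-year cutoff for "older" accounts, and the sample rows are all assumptions made for illustration.

```python
import re
from collections import Counter

import pandas as pd

# Made-up rows for illustration only; these are NOT project data.
# The column name "toxic" is an assumption about the label column.
df = pd.DataFrame({
    "subreddit": ["news", "news", "gaming", "gaming"],
    "comment_text": ["you are awful", "great article",
                     "awful awful game", "nice play"],
    "account_age_years": [0.5, 6.0, 1.0, 4.0],
    "toxic": [1, 0, 1, 0],
})

# Q1: share of toxic comments per subreddit (mean of a 0/1 label).
rate_by_subreddit = (
    df.groupby("subreddit")["toxic"].mean().sort_values(ascending=False)
)

# Q2: toxicity rate for newer vs. older accounts.
# The 2-year cutoff is an arbitrary choice for this sketch.
age_group = (df["account_age_years"] >= 2).map({True: "older", False: "newer"})
rate_by_age = df.groupby(age_group)["toxic"].mean()

# Q3: most frequent words in toxic comments (simple regex tokenizer).
toxic_words = Counter(
    word
    for text in df.loc[df["toxic"] == 1, "comment_text"]
    for word in re.findall(r"[a-z']+", text.lower())
)

# Q4: class balance as proportions of each label.
class_balance = df["toxic"].value_counts(normalize=True)

print(rate_by_subreddit)
print(rate_by_age)
print(toxic_words.most_common(3))
print(class_balance)
```

On real data, raw word counts tend to be dominated by common words such as "the" or "you", so a stop-word filter or a comparison against word frequencies in non-toxic comments would likely be needed before drawing conclusions.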



📊 Data Collection

Description                       Source
Raw dataset of Reddit comments    reddit.com


🛠️ Tools Used

  • 🐍 Python Libraries: pandas, numpy, matplotlib, etc.
  • 🧹 Text Cleaning: lowercasing text, removing the username column, and replacing account_age_days with account_age_years
  • 🔎 Features Extracted: subreddit, comment_text, account_age (years), toxicity label
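
The cleaning steps listed above could look roughly like this in pandas. This is a hedged sketch rather than the project's actual code: the helper name `clean_comments`, the `toxic` label column, the 365.25 days-per-year conversion, and the sample rows are assumptions, while `username`, `comment_text`, `account_age_days`, and `account_age_years` follow the names used in the list.

```python
import pandas as pd


def clean_comments(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the three listed cleaning steps to a copy of df (hypothetical helper)."""
    out = df.copy()
    # 1. Lowercase the comment text.
    out["comment_text"] = out["comment_text"].astype(str).str.lower()
    # 2. Remove the username column (silently skipped if it is absent).
    out = out.drop(columns=["username"], errors="ignore")
    # 3. Replace account_age_days with account_age_years.
    #    Assumption: 365.25 days per year; the project may have used 365.
    out["account_age_years"] = out["account_age_days"] / 365.25
    out = out.drop(columns=["account_age_days"])
    return out


# Made-up rows for illustration only; these are NOT project data.
raw = pd.DataFrame({
    "username": ["user_a", "user_b"],
    "subreddit": ["news", "gaming"],
    "comment_text": ["Hello WORLD", "Nice Post"],
    "account_age_days": [730.5, 365.25],
    "toxic": [0, 1],  # assumed name for the toxicity label column
})

cleaned = clean_comments(raw)
print(cleaned)
```

Working on a copy leaves the raw frame untouched, which makes it easier to compare the data before and after cleaning.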


👥 Team & Supervision

  • [leen binmueqal]
  • [Ghalia Alkhaldie]
  • [Rana Alnagashy]
  • [Juri Alghamdi]
  • [Aryam Almutairi]

Supervised by: Dr. [Abeer Aldayel]
