Description

Sentiment Analyzer predicts the sentiment (positive or negative) of online reviews. It uses the Naive Bayes algorithm to analyze text input and determine whether it expresses a positive or negative sentiment.

The model was trained on an IMDB dataset of 50k movie reviews, each pre-labeled with its sentiment, so I didn't have much trouble training it.

Unlike my last ML project, this one has a web-based UI built with Flask, so you can test it in real time.

Uses:

Python
Flask (for web app)
HTML/CSS
Bootstrap
NLTK
Scikit-Learn

Usage

Clone the repository: git clone https://github.com/Shady2kOver/naive-bayers-sentiment-analysis.git
Install the required dependencies: pip install -r requirements.txt
Run the application: python main.py
Open the localhost URL printed in the terminal output in your browser
Enter the review text in the provided text area and click the "Analyze Sentiment" button
The predicted sentiment (positive or negative) will be displayed below the button.

Details

Naive Bayes Classifier:

Bayes' theorem states that the probability of an event A given the occurrence of event B is equal to the probability of event B given event A multiplied by the probability of event A, divided by the probability of event B.

(Note that this is a simple explanation for two events. When considering many events, the denominator becomes the total probability.)

This can be represented as follows:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

This classifier is called the "Naive Bayes" classifier because it applies Bayes' theorem with one simplifying assumption: the presence or absence of a particular feature in a class is independent of the presence or absence of any other feature. In other words, it assumes that the features are conditionally independent given the class.
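Written out, the assumption looks roughly like this. The symbols are my own notation for illustration: $c$ is a class (positive or negative) and $x_1, \dots, x_n$ are the features (words) of a review.

$$P(c \mid x_1, \dots, x_n) \propto P(c) \prod_{i=1}^{n} P(x_i \mid c)$$

The predicted class is the $c$ that maximizes the right-hand side. The denominator $P(x_1, \dots, x_n)$ is the same for every class, so it can be dropped when comparing them.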

This assumption is considered "naive" because it oversimplifies the relationships between features. In reality, many features are often correlated or dependent on each other to some extent.

However, despite this simplification, Naive Bayes has been found to work well in many practical applications and can achieve good results, especially when the independence assumption is reasonably satisfied or when the dependencies between features are not critical for accurate classification. Furthermore, it can be easily implemented with the scikit-learn library.
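As a rough illustration of how little code this takes, here is a minimal sketch using scikit-learn's CountVectorizer and MultinomialNB. It is not the code from train_model.py: the four toy reviews and their labels are made up for the example, and the real model is trained on the IMDB dataset instead.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data, invented for this sketch (not the IMDB dataset).
train_texts = [
    "great movie loved it",
    "wonderful acting great story",
    "terrible movie hated it",
    "awful acting boring story",
]
train_labels = ["positive", "positive", "negative", "negative"]

# Bag-of-words counts feed a multinomial Naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

predictions = model.predict(["great wonderful", "terrible boring"])
print(list(predictions))
```

With data this small the result only shows the mechanics; a real evaluation needs a held-out test set.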

Negation Handling:

One issue I faced (and which still isn't handled well) was negation: when you tokenize words and omit stop-words, you can run into a few problems.

Say there are two reviews as follows:

I dislike this product! It has absolutely ruined my life.

I don't dislike this product at all.

In the second review, the word don't negates the word dislike. The second review expresses a positive sentiment, but after training the algorithm associates the word dislike with negative sentiment, so it might mark this review as negative.

I've incorporated very simple negation handling, where negation words like "not", "neither", and "nor" negate the word after them, so the pair gets tokenized as "not_good", for example. This still doesn't handle cases where suffixes like "-less" are used.

I've also applied word lemmatization before negation handling to make training smoother and to reduce the number of word variations or inflections the model has to learn.
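The idea can be sketched in a few lines of plain Python. This is an illustration under my own assumptions, not the exact code in this repository: the function name mark_negations and the NEGATION_WORDS set are hypothetical, and the input is assumed to be already tokenized and lowercased.

```python
# Hypothetical negation list for this sketch, taken from the examples above.
NEGATION_WORDS = {"not", "neither", "nor"}

def mark_negations(tokens):
    """Merge each negation word with the token that follows it."""
    result = []
    negate_next = False
    for token in tokens:
        if token in NEGATION_WORDS:
            # Remember the negation and apply it to the next token.
            negate_next = True
            continue
        if negate_next:
            result.append("not_" + token)
            negate_next = False
        else:
            result.append(token)
    if negate_next:
        # A trailing negation word has nothing to attach to, so keep it.
        result.append("not")
    return result

print(mark_negations(["this", "is", "not", "good"]))
print(mark_negations(["neither", "fun", "nor", "smart"]))
```

Contractions such as "don't" would need to be expanded to "do not" first for this approach to catch them, and words like "careless" pass through unchanged, which is the "-less" limitation mentioned above.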
