NLP-Based Yelp Sentiment Prediction System
For my 10-301 Introduction to Machine Learning course, I built an NLP system using binary logistic regression to classify Yelp restaurant reviews as positive or negative.
This project looks at restaurant reviews from Yelp, a popular website that crowdsources reviews for many businesses. The data comes from the Yelp Open Dataset.
GloVe word embeddings convert each review's text into numerical vectors, and out-of-vocabulary words are ignored. The logistic regression model, which passes a weighted sum of the features through a sigmoid function to produce a probability, was then trained with gradient descent, iterating over the converted dataset for 1000+ epochs.
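The pipeline described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the project's actual code: the tiny two-dimensional `TOY_EMBEDDINGS` table stands in for real GloVe vectors, averaging the word vectors of a review is one common way to get a fixed-length feature vector (the write-up does not say which method was used), and the function names, learning rate, and per-example update scheme are all hypothetical choices.

```python
import math

# Toy stand-in for a GloVe lookup table (assumption: real GloVe vectors
# are loaded from a file and have far more dimensions).
TOY_EMBEDDINGS = {
    "great": [0.9, 0.1],
    "tasty": [0.8, 0.2],
    "awful": [-0.9, 0.1],
    "bland": [-0.7, 0.3],
    "food": [0.0, 0.5],
}


def featurize(review, embeddings):
    """Average the vectors of in-vocabulary words; skip out-of-vocabulary words."""
    vectors = [embeddings[w] for w in review.lower().split() if w in embeddings]
    dim = len(next(iter(embeddings.values())))
    if not vectors:
        # No known words at all: fall back to a zero vector.
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=1000, learning_rate=0.5):
    """Per-example gradient descent on the negative log-likelihood."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = p - y  # gradient of the loss with respect to the logit
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def predict(review, weights, bias, embeddings):
    """Return 1 (positive) if the predicted probability is at least 0.5, else 0."""
    x = featurize(review, embeddings)
    p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
    return int(p >= 0.5)


# Made-up example reviews; "here" and "service" are out-of-vocabulary.
reviews = ["great tasty food", "awful bland food", "tasty food here", "bland awful service"]
labels = [1, 0, 1, 0]

features = [featurize(r, TOY_EMBEDDINGS) for r in reviews]
weights, bias = train(features, labels)
predictions = [predict(r, weights, bias, TOY_EMBEDDINGS) for r in reviews]
```

Words missing from the lookup table simply drop out of the average, which is what ignoring out-of-vocabulary words amounts to in this sketch. Because the sigmoid output is a probability, thresholding it at 0.5 gives the positive or negative label.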
This project demonstrated the application of feature engineering, sentiment analysis, and logistic regression in NLP.
Further Analysis
Results:
The results above use the large dataset on my GitHub.
Sample Data:
Raw Text Data
Processed Data with Embeddings