Sentiment Analysis Web App
A movie review analysis tool built from scratch using custom logistic regression and TF-IDF features.
Description
I built a sentiment classifier from scratch using only NumPy, with no sklearn shortcuts for the model itself. It processes movie reviews and predicts whether each one is positive or negative.
The system uses TF-IDF with bigrams to turn text into numbers, feeding 5,000 features into a logistic regression model trained with gradient descent. I also built the same thing using sklearn to see how my implementation stacked up. Both hit around 86% accuracy on IMDB reviews.
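To make the feature step concrete, here is a minimal sketch of TF-IDF over unigrams and bigrams with a capped vocabulary. It is an illustration, not the project's actual code: the function names (`ngrams`, `fit_vocabulary`, `transform`), the tokenizer regex, the smoothed IDF formula, and the L2 row normalization are all assumptions chosen because they are common defaults.

```python
import re
from collections import Counter

import numpy as np


def ngrams(text):
    """Lowercase the text and return its unigrams followed by its bigrams."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams


def fit_vocabulary(docs, max_features=5000):
    """Keep the most frequent terms and compute a smoothed IDF for each."""
    term_counts = Counter()  # total occurrences of each term
    doc_freq = Counter()     # number of documents containing each term
    for doc in docs:
        terms = ngrams(doc)
        term_counts.update(terms)
        doc_freq.update(set(terms))
    # Sort by count (descending), then alphabetically so ties are deterministic.
    top = sorted(term_counts.items(), key=lambda kv: (-kv[1], kv[0]))[:max_features]
    vocab = {term: i for i, (term, _) in enumerate(top)}
    n_docs = len(docs)
    idf = np.zeros(len(vocab))
    for term, i in vocab.items():
        # Assumed variant: smoothed IDF, ln((1 + n) / (1 + df)) + 1.
        idf[i] = np.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
    return vocab, idf


def transform(docs, vocab, idf):
    """Build an L2-normalized TF-IDF matrix of shape (n_docs, n_terms)."""
    X = np.zeros((len(docs), len(vocab)))
    for row, doc in enumerate(docs):
        for term, count in Counter(ngrams(doc)).items():
            col = vocab.get(term)
            if col is not None:  # terms outside the vocabulary are ignored
                X[row, col] = count * idf[col]
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows as zeros
    return X / norms


# Toy reviews, made up for illustration only.
docs = ["a great movie", "not a great movie", "a boring movie"]
vocab, idf = fit_vocabulary(docs, max_features=5000)
X = transform(docs, vocab, idf)
```

In this toy run, a term that appears in every review (like "movie") gets the lowest IDF, while rarer terms and bigrams such as "not a" get more weight.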
The whole thing runs through a Flask web app where you can type in a review and get an instant prediction. I also added visualizations that show how the model learns and which features matter most.
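A prediction endpoint for an app like this could look roughly like the sketch below. The route name `/predict`, the JSON field `review`, and the response shape are assumptions, and `predict_sentiment` is a hypothetical keyword-based stand-in so the example runs on its own; the real app would vectorize the text and call the trained model there instead.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_sentiment(text):
    """Hypothetical placeholder: returns a fake probability of 'positive'.

    A real implementation would apply the TF-IDF transform and the trained
    logistic regression weights here.
    """
    positive_words = {"great", "good", "love", "excellent"}
    hits = sum(word in positive_words for word in text.lower().split())
    return 0.9 if hits > 0 else 0.1


@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body like {"review": "some text"}.
    payload = request.get_json(silent=True)
    review = payload.get("review", "") if isinstance(payload, dict) else ""
    review = str(review).strip()
    if not review:
        return jsonify(error="review text is required"), 400
    probability = predict_sentiment(review)
    label = "positive" if probability >= 0.5 else "negative"
    return jsonify(label=label, probability=probability)
```

Returning the probability alongside the label lets the front end show how confident the model is, not just which class it picked.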
Papers Read
• Text Classification Using TF-IDF
Explored TF-IDF (Term Frequency-Inverse Document Frequency) for feature extraction from text data, using unigrams and bigrams to capture word patterns.
• Logistic Regression for Binary Classification
Implemented gradient descent from scratch with NumPy. Used a binary cross-entropy loss with a learning rate of 0.1 over 1,000 iterations.
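The training loop described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the project's code: it uses the stated learning rate (0.1) and iteration count (1000), but the function names, zero initialization, full-batch updates, and the toy two-cluster data are assumptions.

```python
import numpy as np


def sigmoid(z):
    # Clip the input so np.exp does not overflow for large |z|.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))


def binary_cross_entropy(y, p, eps=1e-12):
    # Clip probabilities away from 0 and 1 to keep the logs finite.
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))


def train(X, y, lr=0.1, n_iters=1000):
    """Full-batch gradient descent on the mean binary cross-entropy."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)  # assumed initialization: all zeros
    b = 0.0
    losses = []
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)
        losses.append(binary_cross_entropy(y, p))
        error = p - y  # gradient of the loss with respect to the logits
        w -= lr * (X.T @ error) / n_samples
        b -= lr * error.mean()
    return w, b, losses


def predict(X, w, b, threshold=0.5):
    return (sigmoid(X @ w + b) >= threshold).astype(int)


# Toy data (not IMDB): two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])

w, b, losses = train(X, y, lr=0.1, n_iters=1000)
accuracy = (predict(X, w, b) == y).mean()
```

The recorded `losses` list is the kind of data a training-curve plot would use, and with TF-IDF inputs the largest positive and negative entries of `w` point at the terms that push predictions hardest.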