Husan

Sentiment Analysis Web App

A movie review analysis tool built from scratch using custom logistic regression and TF-IDF features.

Description

I built a sentiment classifier from scratch using only NumPy, with no sklearn shortcuts for the model itself. It processes movie reviews and predicts whether each one is positive or negative.

The system uses TF-IDF with bigrams to turn text into numbers, feeding 5,000 features into a logistic regression model trained with gradient descent. I also built the same thing using sklearn to see how my implementation stacked up. Both hit around 86% accuracy on IMDB reviews.
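For reference, the sklearn side of that comparison might look roughly like the sketch below. It assumes `TfidfVectorizer` with unigrams and bigrams capped at 5,000 features, followed by `LogisticRegression`; the four toy reviews are made up for illustration and stand in for the IMDB data, and the exact settings used in the project are not stated.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for IMDB reviews (hypothetical data, for illustration only).
train_texts = [
    "a wonderful film with great acting",
    "great story and wonderful cast",
    "terrible plot and awful acting",
    "awful film with a terrible script",
]
train_labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

baseline = make_pipeline(
    # Unigrams + bigrams, capped at 5,000 features (assumed to mirror the custom model).
    TfidfVectorizer(ngram_range=(1, 2), max_features=5000),
    LogisticRegression(max_iter=1000),
)
baseline.fit(train_texts, train_labels)

new_reviews = ["great acting", "awful plot"]
predictions = baseline.predict(new_reviews)          # class labels (0 or 1)
probabilities = baseline.predict_proba(new_reviews)  # one row per review
```

Keeping the vectorizer and classifier in one pipeline means the same preprocessing is applied at training and prediction time, which makes the comparison with the hand-written model fairer.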

The whole thing runs through a Flask web app where you can type in a review and get an instant prediction. I also added visualizations that show how the model learns and which features matter most.
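A minimal sketch of what such a prediction endpoint could look like. The `/predict` route, the JSON field names, and `predict_sentiment` are all hypothetical; in particular, the keyword rule inside `predict_sentiment` is only a placeholder for the trained model so the example stays self-contained.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_sentiment(text: str) -> float:
    """Hypothetical placeholder returning P(positive) for a review.

    A real app would vectorize the text and apply the trained weights;
    this keyword rule exists only to keep the sketch runnable.
    """
    return 0.9 if "great" in text.lower() else 0.1


@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body such as {"review": "..."} (assumed request shape).
    payload = request.get_json(silent=True) or {}
    review = payload.get("review", "")
    if not review.strip():
        return jsonify(error="review text is required"), 400
    probability = predict_sentiment(review)
    label = "positive" if probability >= 0.5 else "negative"
    return jsonify(label=label, probability=probability)


if __name__ == "__main__":
    app.run(debug=True)
```

Returning the probability alongside the label lets the front end show how confident the model is, not just which class it picked.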


Papers Read

Text Classification Using TF-IDF

Explored TF-IDF (Term Frequency-Inverse Document Frequency) for feature extraction from text data, using unigrams and bigrams to capture word patterns.
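To make the idea concrete, here is a small pure-Python sketch of TF-IDF over unigrams and bigrams. The function names are illustrative, and the smoothed IDF formula is an assumed variant (similar to sklearn's default), not necessarily the one used in the project.

```python
import math
from collections import Counter


def ngrams(text):
    """Return lowercase unigrams plus adjacent-word bigrams."""
    words = text.lower().split()
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + bigrams


def tfidf(docs):
    """Compute a {term: weight} dict per document using smoothed IDF."""
    tokenized = [ngrams(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency is the count divided by the document's token count.
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors


docs = ["not good at all", "really good movie"]
vectors = tfidf(docs)
```

In this toy corpus the bigram "not good" gets a higher weight than the unigram "good" in the first document, because "good" appears in both documents while "not good" appears in only one. That is the kind of pattern bigrams are meant to capture.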

Logistic Regression for Binary Classification

Implemented gradient descent from scratch with NumPy. Used a binary cross-entropy loss with a learning rate of 0.1 over 1,000 iterations.
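A sketch of that training loop, assuming full-batch gradient descent with zero-initialized weights. The function names and the synthetic data are illustrative; only the loss, the learning rate of 0.1, and the 1,000 iterations come from the notes above.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_logistic_regression(X, y, learning_rate=0.1, n_iterations=1000):
    """Batch gradient descent on mean binary cross-entropy."""
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    bias = 0.0
    losses = []
    eps = 1e-12  # keeps log() away from zero
    for _ in range(n_iterations):
        probs = sigmoid(X @ weights + bias)
        loss = -np.mean(y * np.log(probs + eps) + (1 - y) * np.log(1 - probs + eps))
        losses.append(loss)
        # Gradient of the mean cross-entropy with respect to the logits.
        error = probs - y
        weights -= learning_rate * (X.T @ error) / n_samples
        bias -= learning_rate * error.mean()
    return weights, bias, losses


# Synthetic, linearly separable data standing in for TF-IDF features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = np.array([1.5, -2.0, 0.5, 0.0, 1.0])
y = (X @ true_w > 0).astype(float)

weights, bias, losses = train_logistic_regression(X, y)
accuracy = np.mean((sigmoid(X @ weights + bias) >= 0.5) == y)
```

Recording the loss at every iteration is what makes a learning-curve plot possible: the curve should fall steadily if the learning rate is reasonable.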