Breast Cancer Prediction

Machine Learning Project

Logistic Regression Model

Sklearn library & LogisticRegression methods were used to import this classification algoritham.

Step 1: Load the data

Step 2: Logistic Regression measures the relationship between the dependent. Variable and the independent variables, by estimating probabilities using its underlying logistic function.

Step 3: These probabilities must then be transformed into binary values in order to actually make a prediction

Step 4: The Function takes any real-valued number and map it into a value between the range of 0 and 1.

Logistic Regression Model with Full 30 Features

Logistic Regression Model with 7 BestFeature Selected Features

Logistic Regression Model with 7 Correlation Selected Features

Observations:

1. Model accuracy only declined slightly after reducing feature number from 30 to 7.

2. Although accuracy for both 7 feature models were the same, the model with 7 BestFeature selected features was better at predicting Benign while the model with 7 correlation selected features was better at predicting Malignant.

Northwestern Data Visualization Final Project