Logistic Regression

Introduction

Logistic regression is a statistical method widely used for binary classification tasks. In this blog post, I walk through the theoretical foundations of logistic regression and present a step-by-step implementation in Python. The goal is to understand both the theory and the code that carries out the computations.

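Logistic regression models the probability of success with a sigmoid function, which maps any real number to a value between 0 and 1 and traces an S-shaped curve when plotted.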
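To make the S-shape concrete, here is a minimal sketch of the sigmoid function. The helper name sigmoid and the sample inputs are my own choices for illustration.

```python
import math

def sigmoid(z):
    """Map a real number z to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Evaluate the curve at a few illustrative points.
inputs = [-4, -2, 0, 2, 4]
values = [sigmoid(z) for z in inputs]

for z, v in zip(inputs, values):
    print(f"sigmoid({z:>2}) = {v:.3f}")
```

The outputs rise slowly near 0, climb steeply around z = 0 (where the value is exactly 0.5), and flatten out again near 1.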

Section 1: Data Preparation

In this section, we start by initializing two lists representing the counts of successful and unsuccessful events. We then calculate the total events by summing the corresponding counts.
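A minimal sketch of this step might look as follows. The counts are made-up illustrative values, not the data from the original post, and the list names are assumptions.

```python
# Hypothetical counts for five groups of events (illustrative only).
success = [2, 4, 6, 8, 9]   # successful events per group
failure = [8, 6, 4, 2, 1]   # unsuccessful events per group

# Total events per group: successes plus failures.
total = [s + f for s, f in zip(success, failure)]

print(total)
```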

Section 2: Probability Calculation

Here, we compute the success and failure probabilities for each event, rounding them to three decimal places for clarity.
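As a sketch of this calculation, again using made-up counts and assumed variable names:

```python
# Hypothetical counts per group (illustrative only).
success = [2, 4, 6, 8, 9]
failure = [8, 6, 4, 2, 1]
total = [s + f for s, f in zip(success, failure)]

# Probability of success and failure in each group, rounded to 3 decimals.
p_success = [round(s / t, 3) for s, t in zip(success, total)]
p_failure = [round(f / t, 3) for f, t in zip(failure, total)]

print(p_success)
print(p_failure)
```

Within each group the two probabilities sum to 1, up to rounding.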

Section 3: Odds and Log Odds

This part involves the calculation of odds and log odds based on the previously computed probabilities.
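The odds are the ratio p / (1 - p), and the log odds are the natural logarithm of that ratio. A minimal sketch, assuming the made-up success probabilities below:

```python
import math

# Hypothetical success probabilities per group (illustrative only).
p_success = [0.2, 0.4, 0.6, 0.8, 0.9]

# Odds: probability of success divided by probability of failure.
odds = [p / (1 - p) for p in p_success]

# Log odds (logit): natural log of the odds.
log_odds = [math.log(o) for o in odds]

for p, o, lo in zip(p_success, odds, log_odds):
    print(f"p={p:.1f}  odds={o:.3f}  log_odds={lo:.3f}")
```

Note that a probability of exactly 0 or 1 has no finite log odds, so groups with no successes or no failures would need special handling.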

Section 4: Regression Coefficients

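In this section, we employ the least squares method to determine the regression coefficients b0 (the intercept) and b1 (the slope) by fitting a straight line to the log odds as a function of the predictor x.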
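A sketch of the least squares fit is shown below, using the closed-form formulas for a straight line. The predictor values x and the probabilities are hypothetical, chosen only to make the example runnable.

```python
import math

# Hypothetical predictor levels and success probabilities (illustrative only).
x = [1, 2, 3, 4, 5]
p_success = [0.2, 0.4, 0.6, 0.8, 0.9]
log_odds = [math.log(p / (1 - p)) for p in p_success]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(log_odds) / n

# Sum of cross-deviations and sum of squared deviations of x.
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, log_odds))
s_xx = sum((xi - x_mean) ** 2 for xi in x)

b1 = s_xy / s_xx            # slope
b0 = y_mean - b1 * x_mean   # intercept

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
```

With these coefficients, the fitted log odds for any x is b0 + b1 * x, and applying the sigmoid function to that value gives the predicted probability.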

Section 5: Error Analysis

This part analyzes how well the fitted line matches the observed log odds. We calculate the sum of squared errors (SSE), the total sum of squares (TSS), and R-squared, which equals 1 - SSE / TSS.
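The sketch below computes these three quantities for a least squares line fitted to the log odds. The data and the helper name fit_line are assumptions made for illustration.

```python
import math

def fit_line(x, y):
    """Return (b0, b1) of the least squares line y = b0 + b1 * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_mean - b1 * x_mean, b1

# Hypothetical predictor levels and success probabilities (illustrative only).
x = [1, 2, 3, 4, 5]
p_success = [0.2, 0.4, 0.6, 0.8, 0.9]
log_odds = [math.log(p / (1 - p)) for p in p_success]

b0, b1 = fit_line(x, log_odds)
predicted = [b0 + b1 * xi for xi in x]
y_mean = sum(log_odds) / len(log_odds)

# SSE: squared gaps between observed and fitted log odds.
sse = sum((yi - pi) ** 2 for yi, pi in zip(log_odds, predicted))
# TSS: squared gaps between observed log odds and their mean.
tss = sum((yi - y_mean) ** 2 for yi in log_odds)
# R-squared: share of the variation in log odds explained by the line.
r_squared = 1 - sse / tss

print(f"SSE = {sse:.4f}, TSS = {tss:.4f}, R-squared = {r_squared:.4f}")
```

An R-squared close to 1 indicates that the straight line explains most of the variation in the log odds.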

Section 6: Comparison with LogisticRegression Object

Now, let’s compare our manually implemented logistic regression with scikit-learn’s LogisticRegression() object, highlighting the ease of use and efficiency that libraries offer.

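We first import the LogisticRegression class from scikit-learn. The X variable is reshaped into a two-dimensional column to comply with scikit-learn’s input requirements. One caveat applies to the y variable: LogisticRegression is a classifier whose fit() method expects class labels (for example 0 and 1), so passing the continuous log odds as y raises an error. The grouped counts therefore need to be presented as labelled outcomes, for instance one success row and one failure row per group, weighted by the counts through the sample_weight argument of fit(). We then create and fit the model, and finally read the intercept and coefficient from its intercept_ and coef_ attributes.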
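Here is one way the scikit-learn comparison could be sketched. Because LogisticRegression is a classifier and expects class labels rather than log odds, this sketch represents each group as one success row and one failure row, weighted by the counts. The data are made up, and the large C value is my own choice to make the default L2 penalty negligible.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical grouped data (illustrative only).
x = [1, 2, 3, 4, 5]
success = [2, 4, 6, 8, 9]
failure = [8, 6, 4, 2, 1]

# One row per (x, outcome) pair; reshape X into a 2-D column.
X = np.array(x + x).reshape(-1, 1)
y = np.array([1] * len(x) + [0] * len(x))   # 1 = success, 0 = failure
weights = np.array(success + failure)        # how often each row occurred

# Large C means very weak regularization (assumption for this sketch).
model = LogisticRegression(C=1e6, max_iter=1000)
model.fit(X, y, sample_weight=weights)

b0 = model.intercept_[0]
b1 = model.coef_[0][0]
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
```

The resulting coefficients can then be set side by side with the least squares estimates obtained from the log odds.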

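By comparing the coefficients from our manual implementation with those from the scikit-learn LogisticRegression() object, readers can appreciate how much a library reduces the work of implementing logistic regression from scratch. The two sets of coefficients should be close but not identical, because scikit-learn maximizes a likelihood (with L2 regularization by default) rather than minimizing squared error on the log odds.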
