Introduction
Logistic Regression is a powerful statistical method extensively used for binary classification tasks. In this blog post, I will provide a detailed walkthrough of the theoretical foundations of logistic regression and present a step-by-step implementation in Python. The objective is to not only understand the theoretical underpinnings but also to delve into the code that drives logistic regression computations.
Logistic regression follows the sigmoid (logistic) function, which produces an S-shaped curve when plotted.
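Before diving in, here is a minimal sketch of that sigmoid (the helper name sigmoid is mine, not part of the walkthrough); it maps any real number into the open interval (0, 1):

```python
import math

def sigmoid(z):
    # Squashes any real z into the open interval (0, 1)
    return 1 / (1 + math.exp(-z))

print(sigmoid(-5))  # close to 0
print(sigmoid(0))   # exactly 0.5
print(sigmoid(5))   # close to 1
```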
Section 1: Data Preparation
import numpy as np
import pandas as pd
import math
# Data Preparation
successfull = [20, 22, 38, 15, 28, 17, 10]
unsuccessfull = [7, 18, 22, 20, 32, 23, 20]
n = len(successfull)
total_events = [success + unsuccess for success, unsuccess in zip(successfull, unsuccessfull)]
print(total_events)
In this section, we start by initializing two lists representing the counts of successful and unsuccessful events. We then calculate the total events by summing the corresponding counts.
Section 2: Probability Calculation
# Probability Calculation
success_probability = [round(success / total, 3) for success, total in zip(successfull, total_events)]
unsuccess_probability = [round(unsuccess / total, 3) for unsuccess, total in zip(unsuccessfull, total_events)]
print(success_probability)
print(unsuccess_probability)
Here, we compute the success and failure probabilities for each event, rounding them to three decimal places for clarity.
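As a quick sanity check (my addition, not part of the original walkthrough), each success/failure pair should sum to 1, up to the small error introduced by round(..., 3):

```python
successfull = [20, 22, 38, 15, 28, 17, 10]
unsuccessfull = [7, 18, 22, 20, 32, 23, 20]
total_events = [s + u for s, u in zip(successfull, unsuccessfull)]

success_probability = [round(s / t, 3) for s, t in zip(successfull, total_events)]
unsuccess_probability = [round(u / t, 3) for u, t in zip(unsuccessfull, total_events)]

# Each pair should sum to 1 (within rounding error)
for p, q in zip(success_probability, unsuccess_probability):
    assert abs((p + q) - 1) < 1e-2
print("all probability pairs sum to 1 (within rounding)")
```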
Section 3: Odds and Log Odds
# Odds and Log Odds
odds = [round(success / unsuccess, 3) for success, unsuccess in zip(success_probability, unsuccess_probability)]
logodds = [round(math.log(odd), 3) for odd in odds]
print(odds)
print(logodds)
This part involves the calculation of odds and log odds based on the previously computed probabilities.
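The log odds are the logit transform of the success probability, and the sigmoid is its inverse. A small check, computing the log odds directly from the counts (equivalent to the rounded values above, since p/(1-p) = successes/failures):

```python
import math

successfull = [20, 22, 38, 15, 28, 17, 10]
unsuccessfull = [7, 18, 22, 20, 32, 23, 20]

probs = [s / (s + u) for s, u in zip(successfull, unsuccessfull)]
log_odds = [math.log(p / (1 - p)) for p in probs]          # logit(p)
recovered = [1 / (1 + math.exp(-lo)) for lo in log_odds]   # sigmoid inverts the logit

for p, r in zip(probs, recovered):
    assert abs(p - r) < 1e-12
print("sigmoid(log odds) recovers every success probability")
```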
Section 4: Regression Coefficients
# Regression Coefficients
x = list(range(1, n + 1))
a, b, c, d = 0, 0, 0, 0
for i in range(n):
    a += x[i]
    b += logodds[i]
    c += x[i] * logodds[i]
    d += x[i] * x[i]
b1 = ((n * c) - (a * b)) / ((n * d) - (a * a))
b0 = (b - b1 * a) / n
print(b0)
print(b1)
print(f"Y = {b0} + {b1} * X")
In this section, we employ the least-squares method, fitting a straight line to the log odds against x to determine the regression coefficients b0 (intercept) and b1 (slope).
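Since these closed-form formulas are ordinary least squares for a straight line, np.polyfit should reproduce them. A quick cross-check (computing the log odds directly from the counts rather than via the rounded intermediate lists):

```python
import math
import numpy as np

successfull = [20, 22, 38, 15, 28, 17, 10]
unsuccessfull = [7, 18, 22, 20, 32, 23, 20]
n = len(successfull)
x = list(range(1, n + 1))
logodds = [math.log(s / u) for s, u in zip(successfull, unsuccessfull)]

# Closed-form least squares, as in the walkthrough
a, b = sum(x), sum(logodds)
c = sum(xi * lo for xi, lo in zip(x, logodds))
d = sum(xi * xi for xi in x)
b1 = (n * c - a * b) / (n * d - a * a)
b0 = (b - b1 * a) / n

# np.polyfit with degree 1 solves the same least-squares problem
slope, intercept = np.polyfit(x, logodds, 1)
print(round(b0, 3), round(b1, 3))
```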
Section 5: Error Analysis
# Error Analysis
# Squared residuals and squared deviations from the mean
sse = [round((log - (b0 + b1 * xi)) ** 2, 3) for log, xi in zip(logodds, x)]
tss = [round((log - np.mean(logodds)) ** 2, 3) for log in logodds]
SSE = sum(sse)
TSS = sum(tss)
R_square = 1 - (SSE / TSS)
print('Coefficients:')
print(b0)
print(b1)
print(f"Y = {b0} + {b1} * X")
print('R-square:')
print(round(R_square, 3))
This part analyzes the fit: the sum of squared errors (SSE) measures the residual variation around the regression line, the total sum of squares (TSS) measures the variation of the log odds around their mean, and R-square = 1 - SSE/TSS is the proportion of that variation explained by the line. Note that both the residuals and the deviations must be squared before summing; otherwise the deviations from the mean cancel out and TSS collapses to roughly zero.
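For a simple linear regression with an intercept, R-square equals the squared Pearson correlation between x and the log odds, which gives an independent way to check the computation (again using unrounded log odds):

```python
import math
import numpy as np

successfull = [20, 22, 38, 15, 28, 17, 10]
unsuccessfull = [7, 18, 22, 20, 32, 23, 20]
n = len(successfull)
x = list(range(1, n + 1))
logodds = [math.log(s / u) for s, u in zip(successfull, unsuccessfull)]

# Least-squares fit (same closed form as in Section 4)
a, b = sum(x), sum(logodds)
c = sum(xi * lo for xi, lo in zip(x, logodds))
d = sum(xi * xi for xi in x)
b1 = (n * c - a * b) / (n * d - a * a)
b0 = (b - b1 * a) / n

SSE = sum((lo - (b0 + b1 * xi)) ** 2 for lo, xi in zip(logodds, x))
TSS = sum((lo - np.mean(logodds)) ** 2 for lo in logodds)
r_square = 1 - SSE / TSS

# For simple linear regression, R^2 equals the squared correlation of x and y
r = np.corrcoef(x, logodds)[0, 1]
print(round(r_square, 3), round(r ** 2, 3))
```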
Section 6: Comparison with LogisticRegression Object
Now, let’s compare our manually derived coefficients with those from scikit-learn’s LogisticRegression() object, highlighting the ease of use that libraries offer. One caveat: LogisticRegression expects binary class labels, not continuous log odds, so we first expand the aggregated counts into one outcome per individual trial.
from sklearn.linear_model import LogisticRegression
# Expand the counts into one binary outcome per trial:
# each x value appears once per trial; y is 1 for a success, 0 for a failure
X = np.repeat(x, total_events).reshape(-1, 1)
y = np.concatenate([np.concatenate([np.ones(s), np.zeros(u)])
                    for s, u in zip(successfull, unsuccessfull)])
# Create and fit the Logistic Regression model
# (a large C effectively disables regularization, for a fairer comparison)
model = LogisticRegression(C=1e9)
model.fit(X, y)
# Extract coefficients from the scikit-learn model
sklearn_b0 = model.intercept_[0]
sklearn_b1 = model.coef_[0][0]
# Print scikit-learn coefficients
print("Scikit-learn Coefficients:")
print(sklearn_b0)
print(sklearn_b1)
print(f"Y = {sklearn_b0} + {sklearn_b1} * X")
Here we use the LogisticRegression class from scikit-learn. Because it fits on individual binary outcomes, the X variable repeats each x value once per trial, and y holds 1 for each success and 0 for each failure. We then create and fit the model with the fit() method and extract the intercept and coefficient, showcasing how simple it is to obtain the coefficients from a library. Note that scikit-learn estimates the coefficients by maximum likelihood rather than by least squares on the log odds, so the two sets of coefficients will be close but not identical.
By comparing the coefficients obtained from our manual implementation with those from the scikit-learn LogisticRegression() object, readers can appreciate the streamlined approach that libraries offer, reducing the complexity of implementing logistic regression from scratch.
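One practical follow-up: whichever coefficients you use, predictions come from mapping the fitted log odds back through the sigmoid. A minimal sketch, with placeholder values standing in for the fitted b0 and b1:

```python
import math

# NOTE: b0 and b1 are hypothetical placeholders here; in practice,
# substitute the coefficients fitted above.
b0, b1 = 0.5, -0.2

def predict_probability(x_new):
    # Linear predictor on the log-odds scale, squashed through the sigmoid
    return 1 / (1 + math.exp(-(b0 + b1 * x_new)))

preds = [round(predict_probability(x_new), 3) for x_new in [1, 4, 7]]
print(preds)  # → [0.574, 0.426, 0.289]
```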
