A Hands-On Guide to Implementing SGD Logistic Regression in Python: Writing the For Loop from Scratch

在线计算网 · Published 2025-03-22 05:35:03

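Introduction

Logistic regression is a classic classification algorithm in machine learning, and stochastic gradient descent (SGD) is a common way to optimize its parameters. In this article we start from scratch and walk through writing SGD logistic regression by hand in Python, using plain for loops.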

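What Are Logistic Regression and SGD

Logistic Regression

Logistic regression is a statistical model for binary classification. Its core idea is to pass a linear combination of the input features through the logistic (sigmoid) function, which squashes the result into the range between 0 and 1 so that it can be read as a probability.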

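Stochastic Gradient Descent (SGD)

SGD is an optimization algorithm that updates the model parameters using only one sample per iteration. Each update is therefore much cheaper than a full pass over the data, which makes it particularly well suited to large datasets.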
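
To make the per-sample update concrete, here is a short derivation sketch. It assumes the standard log-loss (cross-entropy) objective, which the article does not state explicitly; the symbols $\theta$, $b$, and $\eta$ stand for `theta`, `bias`, and `learning_rate` in the code that follows.

```latex
\begin{aligned}
h &= \sigma(z) = \frac{1}{1 + e^{-z}}, \qquad z = \theta^{\top} x_i + b \\
\ell_i(\theta, b) &= -\bigl[\, y_i \log h + (1 - y_i) \log(1 - h) \,\bigr] \\
\frac{\partial \ell_i}{\partial \theta} &= (h - y_i)\, x_i, \qquad
\frac{\partial \ell_i}{\partial b} = h - y_i \\
\theta &\leftarrow \theta - \eta\, (h - y_i)\, x_i, \qquad
b \leftarrow b - \eta\, (h - y_i)
\end{aligned}
```

The gradient takes this simple form because $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$ cancels the denominators that come from differentiating the logarithms. The factor $(h - y_i)$ is what the code calls `error`.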

Writing SGD Logistic Regression by Hand

1. Import the required libraries

First, we import the Python libraries we need.


import numpy as np

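2. Define the sigmoid function

The sigmoid function is the core of logistic regression; it converts the result of the linear combination into a probability.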


def sigmoid(x):
    return 1 / (1 + np.exp(-x))
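
For strongly negative inputs, `np.exp(-x)` in the function above can overflow and emit a runtime warning. One possible workaround, sketched below, is to clip the input before exponentiating. The name `stable_sigmoid` and the `clip` value of 500 are illustrative assumptions, not part of the original article.

```python
import numpy as np

def stable_sigmoid(x, clip=500.0):
    """Sigmoid with the input clipped to [-clip, clip] before exponentiating.

    `stable_sigmoid` and the default `clip` are illustrative choices.
    exp(500) is still representable as a float64, so np.exp does not
    overflow for the clipped input.
    """
    x = np.clip(x, -clip, clip)
    return 1.0 / (1.0 + np.exp(-x))

# Extreme scores that would overflow np.exp without clipping
scores = np.array([-1000.0, 0.0, 1000.0])
probs = stable_sigmoid(scores)
print(probs)
```

Clipping changes the result only for inputs whose probability is already indistinguishable from 0 or 1 at any practical precision. SciPy's `scipy.special.expit` is a ready-made alternative if adding that dependency is acceptable.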

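3. Initialize the parameters

We need to initialize the weights and the bias. There is one weight per feature, and everything starts at zero.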


# X is the (m, n) training feature matrix: m samples, n features
theta = np.zeros(X.shape[1])  # one weight per feature
bias = 0.0

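4. Write the SGD algorithm

Next, we implement the SGD algorithm with for loops: the outer loop runs over epochs and the inner loop visits one sample at a time.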


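def train_sgd(X, y, learning_rate=0.01, epochs=100):
    # The default learning_rate and epochs are illustrative, not tuned
    m, n = X.shape                          # m samples, n features
    theta, bias = np.zeros(n), 0.0
    for epoch in range(epochs):
        for i in range(m):
            z = np.dot(X[i], theta) + bias  # linear score for sample i
            h = sigmoid(z)                  # predicted probability
            error = h - y[i]                # gradient of the log loss w.r.t. z
            theta -= learning_rate * error * X[i]
            bias -= learning_rate * error
    return theta, bias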
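
To check that the loop is actually learning, it can help to record the mean log loss after every epoch. The sketch below does this on a tiny made-up dataset; `log_loss`, the `eps` guard, the toy data, and the learning rate of 0.1 are all assumptions added for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def log_loss(X, y, theta, bias, eps=1e-12):
    """Mean cross-entropy of the model on (X, y); eps guards against log(0)."""
    h = np.clip(sigmoid(np.dot(X, theta) + bias), eps, 1 - eps)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

# Tiny made-up dataset, for illustration only
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
theta, bias = np.zeros(1), 0.0
learning_rate = 0.1  # hypothetical value

losses = []
for epoch in range(50):
    # Same per-sample update as in the training loop above
    for i in range(len(y)):
        error = sigmoid(np.dot(X[i], theta) + bias) - y[i]
        theta -= learning_rate * error * X[i]
        bias -= learning_rate * error
    losses.append(log_loss(X, y, theta, bias))

print(f"loss after first epoch: {losses[0]:.4f}, after last epoch: {losses[-1]:.4f}")
```

A loss that stalls or grows from epoch to epoch usually points to a learning rate that is too large or to unscaled features.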

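5. Prediction function

Finally, we define a prediction function. It returns the predicted probability of the positive class for each sample, which can then be thresholded (for example at 0.5) to evaluate the model.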


def predict(X, theta, bias):
    return sigmoid(np.dot(X, theta) + bias)
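
Since `predict` returns probabilities rather than class labels, a small thresholding helper can be convenient. The sketch below is one way to write it; `predict_labels`, its `threshold` parameter, and the hand-picked parameters and inputs are hypothetical additions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def predict(X, theta, bias):
    return sigmoid(np.dot(X, theta) + bias)

def predict_labels(X, theta, bias, threshold=0.5):
    """Hypothetical helper: 1 where the probability reaches the threshold, else 0."""
    return (predict(X, theta, bias) >= threshold).astype(int)

# Hand-picked parameters and inputs, for illustration only
theta = np.array([2.0, -1.0])
bias = 0.5
X_demo = np.array([[1.0, 0.0],    # z = 2.5  -> probability above 0.5
                   [-1.0, 1.0],   # z = -2.5 -> probability below 0.5
                   [0.0, 0.5]])   # z = 0.0  -> probability exactly 0.5
labels = predict_labels(X_demo, theta, bias)
print(labels)  # [1 0 1]
```

Raising the threshold trades recall for precision on the positive class; 0.5 is only the conventional default.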

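Experiment and Validation

We can validate the model on a common dataset such as the Iris dataset. Iris has three classes, so the code below keeps only two of them (versicolor and virginica) to obtain a binary classification problem.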


from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

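# Load the data: keep only classes 1 and 2 (versicolor and virginica)
iris = datasets.load_iris()
X = iris.data[iris.target != 0]
y = iris.target[iris.target != 0]
y = np.where(y == 1, 0, 1)  # relabel: class 1 -> 0, class 2 -> 1

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features, fitting the scaler on the training set only
# so that no test-set statistics leak into training
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)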

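# Train the model
theta, bias = train_sgd(X_train, y_train)

# Predict and compute the accuracy
predictions = predict(X_test, theta, bias)
# Threshold the probabilities first, then compare element-wise with the labels
accuracy = np.mean((predictions >= 0.5).astype(int) == y_test)
print(f'Accuracy: {accuracy * 100:.2f}%')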