可以看到,每次 epoch,我们都会更新数据集里每行的系数。系数的更新是基于模型生成的误差。该误差被算作候选系数的预测值和预期输出值之间的差。 error = prediction - expected 有一个系数用于加权每一个输入属性,这些属性将以连续的方式进行更新,比如 b1(t+1) = b1(t) - learning_rate * error(t) * x1(t) 列表开始的特殊系数,也被称为截距(intercept)或偏差(bias),也以类似的方式更新,但因其不与特定输入值相关,所以无输入值。 b0(t+1) = b0(t) - learning_rate * error(t) 现在我们把所有东西组合在一起。coefficients_sgd() 函数正是用随机梯度下降来计算一个训练集的系数值,下面即是该函数:
此外,我们追踪每个 epoch 的方差(正值)总和从而在循环之后得到一个好的结果。
我们用 0.001 的学习速率训练该模型 50 次,即把整个训练数据集的系数曝光 50 次。运行一个 epoch 系统就将该次循环中的和方差(sum squared error)和以及最终系数集合 print 一次: >epoch=45, lrate=0.001, error=2.650 >epoch=46, lrate=0.001, error=2.627 >epoch=47, lrate=0.001, error=2.607 >epoch=48, lrate=0.001, error=2.589 >epoch=49, lrate=0.001, error=2.573 [0.22998234937311363, 0.8017220304137576] 可以看到误差是如何在历次 epoch 中持续降低的。或许我们可以增加训练次数(epoch)或者每个 epoch 中的系数总量(调高学习速率)。 尝试一下看你能得到什么结果。 现在,我们将这个算法用到实际的数据当中。 3. 葡萄酒品质预测 我们将使用随机阶梯下降的方法为葡萄酒品质数据集训练一个线性回归模型。本示例假定一个名为 winequality—white.csv 的 csv 文件副本已经存在于当前工作目录。 首先加载该数据集,将字符串转换成数字,j2直播,并将输出列从字符串转换成数值 0 和 1. 这个过程是通过辅助函数 load_csv()、str_column_to_float() 以及 dataset_minmax() 和 normalize_dataset() 来分别实现的。 我们将通过 K 次交叉验证来预估得到的学习模型在未知数据上的表现。这就意味着我们将创建并评估 K 个模型并预估这 K 个模型的平均误差。辅助函数 cross_validation_split()、rmse_metric() 和 evaluate_algorithm() 用于求导根均方差以及评估每一个生成的模型。 我们用之前创建的函数 predict()、coefficients_sgd() 以及 linear_regression_sgd() 来训练模型。完整代码如下: # Linear Regression With Stochastic Gradient Descent for Wine Quality from random import seed from random import randrange from csv import reader from math import sqrt # Load a CSV file def load_csv(filename): dataset = list() with open(filename, 'r') as file: csv_reader = reader(file) for row in csv_reader: if not row: continue dataset.append(row) return dataset # Convert string column to float def str_column_to_float(dataset, column): for row in dataset: row[column] = float(row[column].strip()) # Find the min and max values for each column def dataset_minmax(dataset): minmax = list() for i in range(len(dataset[0])): col_values = [row[i] for row in dataset] value_min = min(col_values) value_max = max(col_values) minmax.append([value_min, value_max]) return minmax # Rescale dataset columns to the range 0-1 def normalize_dataset(dataset, minmax): for row in dataset: for i in range(len(row)): row[i] = (row[i] - minmax[i][0]) / (minmax[i][1] - minmax[i][0]) # Split a dataset into k folds def cross_validation_split(dataset, n_folds): dataset_split = list() dataset_copy = list(dataset) fold_size = len(dataset) / n_folds for i in range(n_folds): fold = list() while len(fold) < fold_size: index = randrange(len(dataset_copy)) fold.append(dataset_copy.pop(index)) dataset_split.append(fold) return dataset_split # Calculate root mean squared error def rmse_metric(actual, predicted): sum_error = 0.0 for i in range(len(actual)): prediction_error = predicted[i] - actual[i] sum_error += (prediction_error ** 2) mean_error = sum_error / float(len(actual)) return sqrt(mean_error) # Evaluate an algorithm using a cross validation split def evaluate_algorithm(dataset, algorithm, n_folds, *args): folds = cross_validation_split(dataset, n_folds) scores = list() for fold in folds: train_set = list(folds) train_set.remove(fold) train_set = sum(train_set, []) test_set = list() for row in fold: row_copy = list(row) test_set.append(row_copy) row_copy[-1] = None predicted = algorithm(train_set, test_set, *args) actual = [row[-1] for row in fold] rmse = rmse_metric(actual, predicted) scores.append(rmse) return scores # Make a prediction with coefficients def predict(row, coefficients): yhat = coefficients[0] for i in range(len(row)-1): yhat += coefficients[i + 1] * row[i] return yhat # Estimate linear regression coefficients using stochastic gradient descent def coefficients_sgd(train, l_rate, n_epoch): coef = [0.0 for i in range(len(train[0]))] for epoch in range(n_epoch): for row in train: yhat = predict(row, coef) error = yhat - row[-1] coef[0] = coef[0] - l_rate * error for i in range(len(row)-1): coef[i + 1] = coef[i + 1] - l_rate * error * row[i] # print(l_rate, n_epoch, error) return coef # Linear Regression Algorithm With Stochastic Gradient Descent def linear_regression_sgd(train, test, l_rate, n_epoch): predictions = list() coef = coefficients_sgd(train, l_rate, n_epoch) for row in test: yhat = predict(row, coef) predictions.append(yhat) return(predictions) # Linear Regression on wine quality dataset seed(1) # load and prepare data filename = 'winequality-white.csv' dataset = load_csv(filename) for i in range(len(dataset[0])): str_column_to_float(dataset, i) # normalize minmax = dataset_minmax(dataset) normalize_dataset(dataset, minmax) # evaluate algorithm n_folds = 5 l_rate = 0.01 n_epoch = 50 scores = evaluate_algorithm(dataset, linear_regression_sgd, n_folds, l_rate, n_epoch) print('Scores: %s' % scores) print('Mean RMSE: %.3f' % (sum(scores)/float(len(scores)))) (责任编辑:本港台直播) |