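Building models with the neural network layers and functions of the torch.nn module
The mechanics of automated gradient computation, which is central to gradient-based model training
Using TensorBoard to visualize training progress and other activities
We'll get familiar with the dataset and dataloader abstractions, and how they ease the process of feeding data to your model during a training loop
We'll look at PyTorch optimizers, which implement algorithms that adjust model weights based on the outcome of a loss function
Finally, we'll pull all of these together and see a full PyTorch training loop in action.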
Dataset and DataLoader
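The Dataset and DataLoader classes encapsulate the process of pulling your data from storage and exposing it to your training loop in batches.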
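To make the division of labor concrete, here is a minimal sketch using a hypothetical toy dataset (the `SquaresDataset` class and its contents are illustrative assumptions, not part of this tutorial's data). The Dataset hands out single instances; the DataLoader groups them into batches.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class SquaresDataset(Dataset):
    """Hypothetical toy dataset: item i is the pair (i, i * i)."""
    def __init__(self, n):
        self.n = n

    def __len__(self):
        # Number of individual instances in the dataset
        return self.n

    def __getitem__(self, idx):
        # Return a single (input, label) instance
        x = torch.tensor([float(idx)])
        y = torch.tensor([float(idx * idx)])
        return x, y

toy_set = SquaresDataset(10)
toy_loader = DataLoader(toy_set, batch_size=4, shuffle=False)

# The loader stacks single instances into batches along a new first dimension
batch_shapes = [tuple(x.shape) for x, _ in toy_loader]
print(batch_shapes)  # [(4, 1), (4, 1), (2, 1)]
```

With 10 instances and a batch size of 4, the last batch is smaller than the others; passing `drop_last=True` to the DataLoader would discard it instead.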
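For this tutorial, we'll be using the Fashion-MNIST dataset provided by TorchVision. We use transforms.Normalize() to zero-center and normalize the distribution of the image tile content, and download both the training and validation data splits.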
import torch
import torchvision
import torchvision.transforms as transforms
# PyTorch TensorBoard support
from torch.utils.tensorboard import SummaryWriter
from datetime import datetime
transform = transforms.Compose(
    [transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))])
# Create datasets for training & validation, download if necessary
training_set = torchvision.datasets.FashionMNIST('./data', train=True, transform=transform, download=True)
validation_set = torchvision.datasets.FashionMNIST('./data', train=False, transform=transform, download=True)
# Create data loaders for our datasets; shuffle for training, not for validation
training_loader = torch.utils.data.DataLoader(training_set, batch_size=4, shuffle=True)
validation_loader = torch.utils.data.DataLoader(validation_set, batch_size=4, shuffle=False)
# Class labels
classes = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')
# Report split sizes
print('Training set has {} instances'.format(len(training_set)))
print('Validation set has {} instances'.format(len(validation_set)))
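As a quick check on what Normalize((0.5,), (0.5,)) does, the sketch below applies the same arithmetic, (x - mean) / std, to a few hand-picked pixel values. The sample values are assumptions chosen for illustration; it uses plain tensor math rather than the transform itself.

```python
import torch

# Pixel intensities as they look after conversion to a tensor: in [0, 1]
pixels = torch.tensor([0.0, 0.25, 0.5, 1.0])

mean, std = 0.5, 0.5
# Same arithmetic that a mean/std normalization applies per channel
normalized = (pixels - mean) / std
print(normalized)  # tensor([-1.0000, -0.5000,  0.0000,  1.0000])

# Inverting it: x * std + mean, which for 0.5/0.5 is x / 2 + 0.5
restored = normalized / 2 + 0.5
print(restored)    # tensor([0.0000, 0.2500, 0.5000, 1.0000])
```

The inverse, x / 2 + 0.5, is the expression to use when un-normalizing images for display.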
import matplotlib.pyplot as plt
import numpy as np
# Helper function for inline image display
def matplotlib_imshow(img, one_channel=False):
    if one_channel:
        img = img.mean(dim=0)
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    if one_channel:
        plt.imshow(npimg, cmap="Greys")
    else:
        plt.imshow(np.transpose(npimg, (1, 2, 0)))
dataiter = iter(training_loader)
images, labels = next(dataiter)
# Create a grid from the images and show them
img_grid = torchvision.utils.make_grid(images)
matplotlib_imshow(img_grid, one_channel=True)
print(' '.join(classes[labels[j]] for j in range(4)))
Sandal Sneaker Coat Sneaker
The model we'll use in this example is a variant of LeNet-5. It should be familiar if you've watched the previous videos in this series.
import torch.nn as nn
import torch.nn.functional as F
# PyTorch models inherit from torch.nn.Module
class GarmentClassifier(nn.Module):
    def __init__(self):
        super(GarmentClassifier, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
model = GarmentClassifier()
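The 16 * 4 * 4 input size of the first linear layer follows from the 28x28 Fashion-MNIST tiles. The sketch below traces a dummy all-zeros image (an assumption for illustration) through standalone copies of the same convolution and pooling layers to show where that number comes from.

```python
import torch
import torch.nn as nn

# Dummy batch: 1 image, 1 channel, 28x28 pixels
x = torch.zeros(1, 1, 28, 28)

conv1 = nn.Conv2d(1, 6, 5)
pool = nn.MaxPool2d(2, 2)
conv2 = nn.Conv2d(6, 16, 5)

after_conv1 = conv1(x)            # 5x5 kernel, no padding: 28 -> 24
after_pool1 = pool(after_conv1)   # 2x2 pooling halves it: 24 -> 12
after_conv2 = conv2(after_pool1)  # 12 -> 8
after_pool2 = pool(after_conv2)   # 8 -> 4
flat = after_pool2.view(-1, 16 * 4 * 4)

for name, t in [('conv1', after_conv1), ('pool1', after_pool1),
                ('conv2', after_conv2), ('pool2', after_pool2), ('flat', flat)]:
    print(name, tuple(t.shape))
```

Running it prints the shape at each stage, ending with 16 feature maps of 4x4 that flatten to 256 values per image.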
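For this example, we'll be using a cross-entropy loss. For demonstration purposes, we'll create batches of dummy output and label values, run them through the loss function, and examine the result.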
loss_fn = torch.nn.CrossEntropyLoss()
# NB: Loss functions expect data in batches, so we're creating batches of 4
# Represents the model's confidence in each of the 10 classes for a given input
dummy_outputs = torch.rand(4, 10)
# Represents the correct class among the 10 being tested
dummy_labels = torch.tensor([1, 5, 3, 7])
loss = loss_fn(dummy_outputs, dummy_labels)
print('Total loss for this batch: {}'.format(loss.item()))
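To see what that single number represents, the sketch below recomputes the loss by hand on dummy values of the same shape. It relies on the fact that CrossEntropyLoss takes raw, unnormalized scores, applies log-softmax internally, and by default averages the negative log-probability of the correct class over the batch. The seed and variable names are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so repeated runs use the same dummy values
logits = torch.rand(4, 10)
labels = torch.tensor([1, 5, 3, 7])

# Built-in loss
builtin = torch.nn.CrossEntropyLoss()(logits, labels)

# Manual equivalent: log-softmax over classes, pick the entry for the
# correct class in each row, negate, and average over the batch
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(4), labels].mean()

print(builtin.item(), manual.item())
```

The two printed values should agree to within floating-point tolerance.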
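For this example, we'll be using simple stochastic gradient descent with momentum.
It can be instructive to try some variations on this optimization scheme:
Learning rate determines the size of the steps the optimizer takes. What does a different learning rate do to your training results, in terms of accuracy and convergence time?
Momentum nudges the optimizer in the direction of strongest gradient over multiple steps. What does changing this value do to your results?
Try some different optimization algorithms, such as averaged SGD, Adagrad, or Adam. How do your results differ?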
# Optimizers specified in the torch.optim package
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
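To build some intuition for the two hyperparameters, here is a minimal sketch of the momentum update applied to a hypothetical one-parameter problem (minimizing w squared), compared against torch.optim.SGD. The toy problem and the values lr=0.1, momentum=0.9 are assumptions chosen to keep the arithmetic easy to follow.

```python
import torch

lr, momentum = 0.1, 0.9

# Manual update: the velocity accumulates past gradients, and the
# weight moves along the velocity scaled by the learning rate
w_manual, velocity = 1.0, 0.0
for _ in range(3):
    grad = 2.0 * w_manual              # derivative of w**2
    velocity = momentum * velocity + grad
    w_manual -= lr * velocity

# Same three steps with the built-in optimizer
w = torch.tensor([1.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=lr, momentum=momentum)
for _ in range(3):
    opt.zero_grad()
    loss = (w ** 2).sum()
    loss.backward()
    opt.step()
w_torch = w.item()

print(w_manual, w_torch)  # both close to 0.062
```

Setting momentum to 0 in both versions reduces this to plain gradient descent, which takes visibly smaller steps on the same problem.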
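Below, we have a function that performs one training epoch. It enumerates data from the DataLoader, and on each pass of the loop does the following:
Gets a batch of training data from the DataLoader
Zeros the optimizer's gradients
Performs an inference, that is, gets predictions from the model for an input batch
Calculates the loss for that set of predictions against the labels, and the backward gradients over the learning weights
Tells the optimizer to perform one learning step, that is, adjust the model's learning weights based on the observed gradients for this batch, according to the optimization algorithm we chose
It reports on the loss for every 1000 batches.
Finally, it reports the average per-batch loss for the last 1000 batches, for comparison with a validation run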
def train_one_epoch(epoch_index, tb_writer):
    running_loss = 0.
    last_loss = 0.

    # Here, we use enumerate(training_loader) instead of
    # iter(training_loader) so that we can track the batch
    # index and do some intra-epoch reporting
    for i, data in enumerate(training_loader):
        # Every data instance is an input + label pair
        inputs, labels = data

        # Zero your gradients for every batch!
        optimizer.zero_grad()

        # Make predictions for this batch
        outputs = model(inputs)

        # Compute the loss and its gradients
        loss = loss_fn(outputs, labels)
        loss.backward()

        # Adjust learning weights
        optimizer.step()

        # Gather data and report
        running_loss += loss.item()
        if i % 1000 == 999:
            last_loss = running_loss / 1000 # loss per batch
            print('  batch {} loss: {}'.format(i + 1, last_loss))
            tb_x = epoch_index * len(training_loader) + i + 1
            tb_writer.add_scalar('Loss/train', last_loss, tb_x)
            running_loss = 0.

    return last_loss
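The per-batch steps can also be tried in isolation. The following is a minimal sketch on a hypothetical tiny linear model with random inputs (all names prefixed with `tiny_` and the data are assumptions, not part of the tutorial's model); it runs one batch through the same sequence and confirms that the weights moved.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins for the model, loss, optimizer, and one batch
tiny_model = torch.nn.Linear(3, 2)
tiny_loss_fn = torch.nn.CrossEntropyLoss()
tiny_optimizer = torch.optim.SGD(tiny_model.parameters(), lr=0.1)
inputs = torch.randn(4, 3)
labels = torch.tensor([0, 1, 0, 1])

weights_before = tiny_model.weight.detach().clone()

tiny_optimizer.zero_grad()            # clear gradients left from any earlier batch
outputs = tiny_model(inputs)          # forward pass
loss = tiny_loss_fn(outputs, labels)  # compare predictions with labels
loss.backward()                       # compute gradients of the loss
tiny_optimizer.step()                 # update weights using those gradients

weights_changed = not torch.equal(weights_before, tiny_model.weight.detach())
print(weights_changed)  # True
```

Skipping the zero_grad() call would not raise an error; gradients from successive batches would simply add up in each parameter's .grad, which is why it is called at the start of every iteration.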
Per-Epoch Activity
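There are a couple of things we'll want to do once per epoch:
Perform validation by checking our relative loss on a set of data that was not used for training, and report this
Save a copy of the model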
Here, we'll do our reporting in TensorBoard. This will require going to the command line to start TensorBoard, and opening it in another browser tab.
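If you'd like to try the logging calls on their own first, here is a minimal sketch that writes a few made-up scalar values to a temporary directory. The tag name 'Loss/demo' and the values are assumptions for illustration, and the tensorboard package must be installed.

```python
import os
import tempfile
from torch.utils.tensorboard import SummaryWriter

with tempfile.TemporaryDirectory() as log_dir:
    demo_writer = SummaryWriter(log_dir)
    for step in range(1, 6):
        # tag, value, and the x-axis position for the point
        demo_writer.add_scalar('Loss/demo', 1.0 / step, step)
    demo_writer.close()  # flushes pending events to disk

    # TensorBoard reads event files like these from the log directory
    event_files = [name for name in os.listdir(log_dir)
                   if name.startswith('events.out.tfevents')]

print(len(event_files) >= 1)  # True
```

For the training run in this tutorial, the logs go under runs/, so running `tensorboard --logdir runs` from the working directory and opening the address it prints (http://localhost:6006 by default) should show the curves.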
# Initializing in a separate cell so we can easily add more epochs to the same run
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
writer = SummaryWriter('runs/fashion_trainer_{}'.format(timestamp))
epoch_number = 0

EPOCHS = 5  # number of passes over the training set; adjust as needed

best_vloss = 1_000_000.

for epoch in range(EPOCHS):
    print('EPOCH {}:'.format(epoch_number + 1))

    # Make sure gradient tracking is on, and do a pass over the data
    model.train(True)
    avg_loss = train_one_epoch(epoch_number, writer)

    running_vloss = 0.0
    # Set the model to evaluation mode, disabling dropout and using population
    # statistics for batch normalization.
    model.eval()

    # Disable gradient computation and reduce memory consumption.
    with torch.no_grad():
        for i, vdata in enumerate(validation_loader):
            vinputs, vlabels = vdata
            voutputs = model(vinputs)
            vloss = loss_fn(voutputs, vlabels)
            running_vloss += vloss

    avg_vloss = running_vloss / (i + 1)
    print('LOSS train {} valid {}'.format(avg_loss, avg_vloss))

    # Log the running loss averaged per batch
    # for both training and validation
    writer.add_scalars('Training vs. Validation Loss',
                    { 'Training' : avg_loss, 'Validation' : avg_vloss },
                    epoch_number + 1)
    writer.flush()

    # Track best performance, and save the model's state
    if avg_vloss < best_vloss:
        best_vloss = avg_vloss
        model_path = 'model_{}_{}'.format(timestamp, epoch_number)
        torch.save(model.state_dict(), model_path)

    epoch_number += 1
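The validation pass relies on two separate switches: evaluation mode, which changes how layers such as dropout behave, and torch.no_grad(), which stops autograd from recording operations. The sketch below shows each effect on its own, using a standalone dropout layer and a small tensor as illustrative assumptions (the tutorial's model itself has no dropout or batch-norm layers, so for it evaluation mode is a matter of good habit).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Dropout(p=0.5)
x = torch.ones(1000)

layer.train()
train_out = layer(x)  # roughly half the entries zeroed, survivors scaled to 2.0

layer.eval()
eval_out = layer(x)   # dropout is a no-op in evaluation mode

w = torch.ones(3, requires_grad=True)
with torch.no_grad():
    untracked = (w * 2).sum()  # no computation graph is built here
tracked = (w * 2).sum()        # outside the block, autograd records as usual

print(torch.equal(eval_out, x), untracked.requires_grad, tracked.requires_grad)
# True False True
```

Neither switch implies the other, which is why the loop sets both before computing the validation loss.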
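To load a saved version of the model:
saved_model = GarmentClassifier()
# model_path holds the most recent best checkpoint written by the loop above
saved_model.load_state_dict(torch.load(model_path))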
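Once you've loaded the model, it's ready for whatever you need it for: more training, inference, or analysis.
Note that if your model has constructor parameters that affect model structure, you'll need to provide them and configure the model identically to the state in which it was saved.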
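As a minimal sketch of that last point, consider a hypothetical `TinyNet` whose layer sizes depend on a `hidden_size` constructor argument (the class, the argument, and the file name are all assumptions for illustration). Restoring works when the argument matches the saved model and fails when it does not.

```python
import os
import tempfile
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical model whose structure depends on a constructor argument."""
    def __init__(self, hidden_size):
        super().__init__()
        self.fc1 = nn.Linear(4, hidden_size)
        self.fc2 = nn.Linear(hidden_size, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

original = TinyNet(hidden_size=8)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'tiny_net_state')
    torch.save(original.state_dict(), path)

    # Same constructor argument: the saved tensors fit the new instance
    restored = TinyNet(hidden_size=8)
    restored.load_state_dict(torch.load(path))

    # Different constructor argument: tensor shapes no longer match
    try:
        TinyNet(hidden_size=16).load_state_dict(torch.load(path))
        mismatch_raised = False
    except RuntimeError:
        mismatch_raised = True

sample = torch.randn(3, 4)
same_outputs = torch.equal(original(sample), restored(sample))
print(same_outputs, mismatch_raised)  # True True
```

Because the state dict stores only tensors keyed by parameter name, it is worth recording the constructor arguments alongside the checkpoint.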
