PyTorch TensorBoard Support
Created On: Nov 30, 2021 | Last Updated: May 29, 2024 | Last Verified: Nov 05, 2024
Follow along with the video below or on youtube.
Before You Start
To run this tutorial, you'll need to install PyTorch, TorchVision, Matplotlib, and TensorBoard.
With conda:
conda install pytorch torchvision -c pytorch
conda install matplotlib tensorboard
With pip:
pip install torch torchvision matplotlib tensorboard
Once the dependencies are installed, restart this notebook in the Python environment where you installed them.
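Before restarting, you can check which of the required packages are importable in the current environment. This is a minimal sketch (not part of the tutorial itself) using only the standard library; `missing_packages` is a hypothetical helper name:

```python
# Check that the tutorial's dependencies are importable in this environment.
import importlib.util

def missing_packages(names):
    """Return the subset of `names` that cannot be imported here."""
    return [n for n in names if importlib.util.find_spec(n) is None]

required = ["torch", "torchvision", "matplotlib", "tensorboard"]
print("Missing packages:", missing_packages(required))
```

If the printed list is non-empty, install the missing packages with the conda or pip commands above and restart the notebook.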
Introduction
In this notebook, we'll be training a model on the Fashion-MNIST dataset. Fashion-MNIST is a set of image tiles depicting various garments, with ten class labels indicating the type of garment depicted.
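Each sample carries one of ten integer class labels; mapping an index to its garment name is a simple lookup. A minimal sketch (using the same label set the tutorial defines later; `label_for` is a hypothetical helper):

```python
# The ten Fashion-MNIST garment classes, indexed by their integer label.
classes = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')

def label_for(class_index):
    """Map a Fashion-MNIST integer label (0-9) to its garment name."""
    return classes[class_index]

print(label_for(0))  # T-shirt/top
```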
# PyTorch model and training necessities
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
# Image datasets and image manipulation
import torchvision
import torchvision.transforms as transforms
# Image display
import matplotlib.pyplot as plt
import numpy as np
# PyTorch TensorBoard support
from torch.utils.tensorboard import SummaryWriter
# In case you are using an environment that has TensorFlow installed,
# such as Google Colab, uncomment the following code to avoid
# a bug with saving embeddings to your TensorBoard directory
# import tensorflow as tf
# import tensorboard as tb
# tf.io.gfile = tb.compat.tensorflow_stub.io.gfile
Showing Images in TensorBoard
Let's start by adding sample images from our dataset to TensorBoard:
# Gather datasets and prepare them for consumption
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5,), (0.5,))])

# Store separate training and validations splits in ./data
training_set = torchvision.datasets.FashionMNIST('./data',
                                                 download=True,
                                                 train=True,
                                                 transform=transform)
validation_set = torchvision.datasets.FashionMNIST('./data',
                                                   download=True,
                                                   train=False,
                                                   transform=transform)

training_loader = torch.utils.data.DataLoader(training_set,
                                              batch_size=4,
                                              shuffle=True,
                                              num_workers=2)

validation_loader = torch.utils.data.DataLoader(validation_set,
                                                batch_size=4,
                                                shuffle=False,
                                                num_workers=2)
# Class labels
classes = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')
# Helper function for inline image display
def matplotlib_imshow(img, one_channel=False):
    if one_channel:
        img = img.mean(dim=0)
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    if one_channel:
        plt.imshow(npimg, cmap="Greys")
    else:
        plt.imshow(np.transpose(npimg, (1, 2, 0)))
# Extract a batch of 4 images
dataiter = iter(training_loader)
images, labels = next(dataiter)
# Create a grid from the images and show them
img_grid = torchvision.utils.make_grid(images)
matplotlib_imshow(img_grid, one_channel=True)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz
Extracting ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
Extracting ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
Extracting ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw
Above, we used TorchVision and Matplotlib to create a visual grid of a minibatch of our input data. Below, we use the add_image() call on SummaryWriter to log the image for consumption by TensorBoard, and we also call flush() to make sure it's written to disk right away.
# Default log_dir argument is "runs" - but it's good to be specific
# torch.utils.tensorboard.SummaryWriter is imported above
writer = SummaryWriter('runs/fashion_mnist_experiment_1')
# Write image data to TensorBoard log dir
writer.add_image('Four Fashion-MNIST Images', img_grid)
writer.flush()
# To view, start TensorBoard on the command line with:
# tensorboard --logdir=runs
# ...and open a browser tab to http://localhost:6006/
If you start TensorBoard at the command line and open it in a new browser tab (usually at localhost:6006), you should see the image grid under the IMAGES tab.
Graphing Scalars to Visualize Training
TensorBoard is useful for tracking the progress and efficacy of your training. Below, we'll run a training loop, track some metrics, and save the data for TensorBoard's consumption.
Let's define a model to classify our image tiles, and an optimizer and loss function for training:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
Now let's train a single epoch, and evaluate the training vs. validation set losses every 1000 batches:
print(len(validation_loader))
for epoch in range(1):  # loop over the dataset multiple times
    running_loss = 0.0

    for i, data in enumerate(training_loader, 0):
        # basic training loop
        inputs, labels = data
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 1000 == 999:    # Every 1000 mini-batches...
            print('Batch {}'.format(i + 1))
            # Check against the validation set
            running_vloss = 0.0

            # In evaluation mode some model-specific operations can be omitted, e.g. dropout layers
            net.train(False)  # Switching to evaluation mode, e.g. turning off regularisation
            for j, vdata in enumerate(validation_loader, 0):
                vinputs, vlabels = vdata
                voutputs = net(vinputs)
                vloss = criterion(voutputs, vlabels)
                running_vloss += vloss.item()
            net.train(True)  # Switching back to training mode, e.g. turning on regularisation

            avg_loss = running_loss / 1000
            avg_vloss = running_vloss / len(validation_loader)

            # Log the running loss averaged per batch
            writer.add_scalars('Training vs. Validation Loss',
                               {'Training': avg_loss, 'Validation': avg_vloss},
                               epoch * len(training_loader) + i)

            running_loss = 0.0

print('Finished Training')

writer.flush()
2500
Batch 1000
Batch 2000
Batch 3000
Batch 4000
Batch 5000
Batch 6000
Batch 7000
Batch 8000
Batch 9000
Batch 10000
Batch 11000
Batch 12000
Batch 13000
Batch 14000
Batch 15000
Finished Training
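The training loss logged above is a running average over each window of 1000 mini-batches. A pure-Python sketch of that bookkeeping (`windowed_averages` is a hypothetical helper, not part of the tutorial code):

```python
def windowed_averages(losses, window=1000):
    """Return the average loss over each consecutive window of batches."""
    running = 0.0
    averages = []
    for i, loss in enumerate(losses):
        running += loss
        if i % window == window - 1:  # every `window` mini-batches...
            averages.append(running / window)
            running = 0.0             # reset, as in the loop above
    return averages

print(windowed_averages([1.0, 2.0, 3.0, 4.0], window=2))  # [1.5, 3.5]
```

Note that resetting the accumulator after each report (as the loop above does) gives a per-window average rather than a cumulative one.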
Switch to your open TensorBoard and have a look at the SCALARS tab.
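The x-axis value passed to add_scalars in the loop above is a flat global step, computed as epoch * len(training_loader) + i, so that steps keep increasing across epochs. As a tiny sketch (`global_step` is a hypothetical helper name):

```python
def global_step(epoch, batches_per_epoch, batch_index):
    """Flatten (epoch, batch) into a single monotonically increasing step."""
    return epoch * batches_per_epoch + batch_index

# With 15000 training batches per epoch (as in the run above):
print(global_step(0, 15000, 999))   # 999
print(global_step(1, 15000, 999))   # 15999
```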
Visualizing Your Model
TensorBoard can also be used to examine the data flow within your model. To do this, call the add_graph() method with a model and sample input:
# Again, grab a single mini-batch of images
dataiter = iter(training_loader)
images, labels = next(dataiter)
# add_graph() will trace the sample input through your model,
# and render it as a graph.
writer.add_graph(net, images)
writer.flush()
When you switch over to TensorBoard, you should see a GRAPHS tab. Double-click the "NET" node to see the layers and data flow within your model.
Visualizing Your Dataset with Embeddings
The 28-by-28 image tiles we're using can be modeled as 784-dimensional vectors (28 * 28 = 784). It can be instructive to project this to a lower-dimensional representation. The add_embedding() method will project a set of data onto the three dimensions with highest variance, and display them as an interactive 3D chart.
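The flattening from a 28 x 28 tile to a 784-dimensional vector is just row-major concatenation. A pure-Python sketch (`flatten_tile` is a hypothetical helper; the tutorial code below does the same thing with images.view(-1, 28 * 28)):

```python
def flatten_tile(tile):
    """Flatten a 2-D image tile (a list of rows) into a 1-D row-major vector."""
    return [pixel for row in tile for pixel in row]

# A 28 x 28 tile where each pixel's value encodes its row-major index.
tile = [[r * 28 + c for c in range(28)] for r in range(28)]
vec = flatten_tile(tile)
print(len(vec))  # 784
```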
Below, we'll take a sample of our data, and generate such an embedding:
# Select a random subset of data and corresponding labels
def select_n_random(data, labels, n=100):
    assert len(data) == len(labels)
    perm = torch.randperm(len(data))
    return data[perm][:n], labels[perm][:n]

# Extract a random subset of data
images, labels = select_n_random(training_set.data, training_set.targets)

# get the class labels for each image
class_labels = [classes[label] for label in labels]

# log embeddings
features = images.view(-1, 28 * 28)
writer.add_embedding(features,
                     metadata=class_labels,
                     label_img=images.unsqueeze(1))
writer.flush()
writer.close()
Now if you switch to TensorBoard and select the PROJECTOR tab, you should see a 3D representation of the projection. You can rotate and zoom the model. Examine it at large and small scales, and see whether you can spot patterns in the projected data and the clustering of labels.
For better visibility, it's recommended to:

Select "label" from the "Color by" drop-down on the left.

Toggle the Night Mode icon along the top to place the light-colored images on a dark background.
Other Resources
For more information, have a look at:
The PyTorch documentation on torch.utils.tensorboard.SummaryWriter

The Tensorboard tutorial content in the PyTorch.org Tutorials

For more information about TensorBoard, see the TensorBoard documentation
Total running time of the script: (2 minutes 34.811 seconds)