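torchvision.datasets

All datasets are subclasses of torch.utils.data.Dataset, i.e., they implement the __getitem__ and __len__ methods. Hence, they can all be passed to a torch.utils.data.DataLoader, which can load multiple samples in parallel using torch.multiprocessing workers. For example: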

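import torch
import torchvision

# args.nThreads is the number of worker processes, e.g. parsed with argparse
imagenet_data = torchvision.datasets.ImageNet('path/to/imagenet_root/')
data_loader = torch.utils.data.DataLoader(imagenet_data,
                                          batch_size=4,
                                          shuffle=True,
                                          num_workers=args.nThreads)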

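The following datasets are available:

CelebA, CIFAR, Cityscapes, COCO (Captions, Detection), DatasetFolder, EMNIST, FakeData, Fashion-MNIST, Flickr, HMDB51, ImageFolder, ImageNet, Kinetics-400, KMNIST, LSUN, MNIST, Omniglot, PhotoTour, Places365, QMNIST, SBD, SBU, STL10, SVHN, UCF101, USPS, VOC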

All the datasets have a nearly identical API. They all share two common arguments, transform and target_transform, which transform the input and the target respectively.
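As a rough illustration of the two points above, the following sketch shows a dataset-like class in plain Python that implements __getitem__ and __len__ and applies transform to the input and target_transform to the target. ToyDataset and its fields are hypothetical names used only here; the real classes subclass torch.utils.data.Dataset and may differ in detail.

```python
from typing import Any, Callable, List, Optional, Tuple


class ToyDataset:
    """Hypothetical stand-in for a torchvision-style dataset (not a real class)."""

    def __init__(self,
                 samples: List[Tuple[Any, int]],
                 transform: Optional[Callable] = None,
                 target_transform: Optional[Callable] = None) -> None:
        self.samples = samples
        self.transform = transform
        self.target_transform = target_transform

    def __getitem__(self, index: int) -> Tuple[Any, Any]:
        sample, target = self.samples[index]
        # transform acts on the input only
        if self.transform is not None:
            sample = self.transform(sample)
        # target_transform acts on the target only
        if self.target_transform is not None:
            target = self.target_transform(target)
        return sample, target

    def __len__(self) -> int:
        return len(self.samples)


toy = ToyDataset([([1, 2, 3], 0), ([4, 5, 6], 1)],
                 transform=lambda x: [v * 2 for v in x],
                 target_transform=lambda t: t + 10)
print(len(toy))   # 2
print(toy[1])     # ([8, 10, 12], 11)
```

Any object with these two methods can be indexed and measured with len(), which is what a DataLoader relies on for map-style datasets.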

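CelebA

class torchvision.datasets.CelebA(root: str, split: str = 'train', target_type: Union[List[str], str] = 'attr', transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, download: bool = False) → None

Large-scale CelebFaces Attributes (CelebA) Dataset.

Parameters:

root (string) – Root directory where images are downloaded to.
split (string) – One of {'train', 'valid', 'test', 'all'}. The corresponding dataset split is selected.
target_type (string or list, optional) –
Type of target to use: attr, identity, bbox, or landmarks. Can also be a list, in which case a tuple with all specified target types is returned. The targets represent:
- attr (np.array shape=(40,) dtype=int): binary (0, 1) labels for attributes
- identity (int): label for each person (data points with the same identity are the same person)
- bbox (np.array shape=(4,) dtype=int): bounding box (x, y, width, height)
- landmarks (np.array shape=(10,) dtype=int): landmark points (lefteye_x, lefteye_y, righteye_x, righteye_y, nose_x, nose_y, leftmouth_x, leftmouth_y, rightmouth_x, rightmouth_y)
Defaults to attr. If empty, None will be returned as target.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.ToTensor
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.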

CIFAR

class torchvision.datasets.CIFAR10(root: str, train: bool = True, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, download: bool = False) → None

CIFAR10 Dataset.

Parameters:

root (string) – Root directory of the dataset where the directory cifar-10-batches-py exists, or where it will be saved if download is set to True.
train (bool, optional) – If True, creates the dataset from the training set, otherwise from the test set.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.

__getitem__(index: int) → Tuple[Any, Any]

Parameters:	index (int) – Index
Returns:	(image, target) where target is the index of the target class.
Return type:	tuple

class torchvision.datasets.CIFAR100(root: str, train: bool = True, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, download: bool = False) → None

CIFAR100 Dataset.

This is a subclass of the CIFAR10 Dataset.

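Cityscapes

Note

Requires Cityscapes to be downloaded.

class torchvision.datasets.Cityscapes(root: str, split: str = 'train', mode: str = 'fine', target_type: Union[List[str], str] = 'instance', transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, transforms: Union[Callable, NoneType] = None) → None

Cityscapes Dataset.

Parameters:

root (string) – Root directory of the dataset where the directories leftImg8bit and gtFine or gtCoarse are located.
split (string, optional) – The image split to use: train, test or val if mode="fine", otherwise train, train_extra or val.
mode (string, optional) – The quality mode to use: fine or coarse.
target_type (string or list, optional) – Type of target to use: instance, semantic, polygon or color. Can also be a list, in which case a tuple with all specified target types is returned.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
transforms (callable, optional) – A function/transform that takes the input sample and its target as entry and returns a transformed version.

Examples

Get semantic segmentation target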

dataset = Cityscapes('./data/cityscapes', split='train', mode='fine',
                     target_type='semantic')

img, smnt = dataset[0]

Get multiple targets

dataset = Cityscapes('./data/cityscapes', split='train', mode='fine',
                     target_type=['instance', 'color', 'polygon'])

img, (inst, col, poly) = dataset[0]

Validate on the "coarse" set

dataset = Cityscapes('./data/cityscapes', split='val', mode='coarse',
                     target_type='semantic')

img, smnt = dataset[0]

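__getitem__(index: int) → Tuple[Any, Any]

Parameters:	index (int) – Index
Returns:	(image, target) where target is a tuple of all target types if target_type is a list with more than one item. Otherwise target is a json object if target_type="polygon", else the image segmentation.
Return type:	tuple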

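COCO

Note

These require the COCO API to be installed.

Captions

class torchvision.datasets.CocoCaptions(root: str, annFile: str, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, transforms: Union[Callable, NoneType] = None)

MS Coco Captions Dataset.

Parameters:

root (string) – Root directory where images are downloaded to.
annFile (string) – Path to the json annotation file.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.ToTensor
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
transforms (callable, optional) – A function/transform that takes the input sample and its target as entry and returns a transformed version.

Example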

import torchvision.datasets as dset
import torchvision.transforms as transforms
cap = dset.CocoCaptions(root = 'dir where images are',
                        annFile = 'json annotation file',
                        transform=transforms.ToTensor())

print('Number of samples: ', len(cap))
img, target = cap[3] # load 4th sample

print("Image Size: ", img.size())
print(target)

Output:

Number of samples: 82783
Image Size: (3L, 427L, 640L)
[u'A plane emitting smoke stream flying over a mountain.',
u'A plane darts across a bright blue sky behind a mountain covered in snow',
u'A plane leaves a contrail above the snowy mountain top.',
u'A mountain that has a plane flying overheard in the distance.',
u'A mountain view with a plume of smoke in the background']

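Detection

class torchvision.datasets.CocoDetection(root: str, annFile: str, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, transforms: Union[Callable, NoneType] = None)

MS Coco Detection Dataset.

Parameters:

root (string) – Root directory where images are downloaded to.
annFile (string) – Path to the json annotation file.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.ToTensor
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
transforms (callable, optional) – A function/transform that takes the input sample and its target as entry and returns a transformed version.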

DatasetFolder

class torchvision.datasets.DatasetFolder(root: str, loader: Callable[[str], Any], extensions: Union[Tuple[str, ...], NoneType] = None, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, is_valid_file: Union[Callable[[str], bool], NoneType] = None) → None

A generic data loader where the samples are arranged in this way:

root/class_x/xxx.ext
root/class_x/xxy.ext
root/class_x/[...]/xxz.ext

root/class_y/123.ext
root/class_y/nsdf3.ext
root/class_y/[...]/asd932_.ext

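Parameters:

root (string) – Root directory path.
loader (callable) – A function to load a sample given its path.
extensions (tuple[string]) – A list of allowed extensions. extensions and is_valid_file should not both be passed.
transform (callable, optional) – A function/transform that takes in a sample and returns a transformed version. E.g., transforms.RandomCrop for images.
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
is_valid_file – A function that takes the path of a file and checks whether it is a valid file (used to check for corrupt files). extensions and is_valid_file should not both be passed.

__getitem__(index: int) → Tuple[Any, Any]

Parameters:	index (int) – Index
Returns:	(sample, target) where target is the class_index of the target class.
Return type:	tuple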
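To make the root/class_x/xxx.ext layout concrete, here is a standard-library sketch of how such a tree could be turned into a class-to-index mapping and a list of (path, class_index) pairs. The helper index_folder is hypothetical and is not torchvision's implementation; it only assumes that class folders are sorted alphabetically and that files are filtered by extension.

```python
import os
import tempfile
from typing import Dict, List, Tuple


def index_folder(root: str, extensions: Tuple[str, ...]) -> Tuple[Dict[str, int], List[Tuple[str, int]]]:
    """Hypothetical helper: map class folders to indices and list (path, class_index) pairs."""
    classes = sorted(d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d)))
    class_to_idx = {name: i for i, name in enumerate(classes)}
    samples = []
    for name in classes:
        # os.walk also visits nested folders such as root/class_x/[...]/xxz.ext
        for dirpath, _, filenames in sorted(os.walk(os.path.join(root, name))):
            for fname in sorted(filenames):
                if fname.lower().endswith(extensions):
                    samples.append((os.path.join(dirpath, fname), class_to_idx[name]))
    return class_to_idx, samples


# Build a tiny throwaway tree to exercise the helper.
demo_root = tempfile.mkdtemp()
for rel in ["class_x/xxx.ext", "class_x/sub/xxz.ext", "class_y/123.ext", "class_y/notes.txt"]:
    path = os.path.join(demo_root, *rel.split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "w").close()

class_to_idx, samples = index_folder(demo_root, (".ext",))
print(class_to_idx)                 # {'class_x': 0, 'class_y': 1}
print([idx for _, idx in samples])  # [0, 0, 1]
```

The file notes.txt is skipped because its extension is not in the allowed tuple, which mirrors the role of the extensions argument.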

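EMNIST

class torchvision.datasets.EMNIST(root: str, split: str, **kwargs) → None

EMNIST Dataset.

Parameters:

root (string) – Root directory of the dataset where EMNIST/processed/training.pt and EMNIST/processed/test.pt exist.
split (string) – The dataset has 6 different splits: byclass, bymerge, balanced, letters, digits and mnist. This argument specifies which one to use.
train (bool, optional) – If True, creates the dataset from training.pt, otherwise from test.pt.
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.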

FakeData

class torchvision.datasets.FakeData(size: int = 1000, image_size: Tuple[int, int, int] = (3, 224, 224), num_classes: int = 10, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, random_offset: int = 0) → None

A fake dataset that generates random images and returns them as PIL images.

Parameters:

size (int, optional) – Size of the dataset. Default: 1000 images
image_size (tuple, optional) – Size of the returned images. Default: (3, 224, 224)
num_classes (int, optional) – Number of classes in the dataset. Default: 10
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
random_offset (int) – Offset used when generating each image. Default: 0

Fashion-MNIST

class torchvision.datasets.FashionMNIST(root: str, train: bool = True, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, download: bool = False) → None

Fashion-MNIST Dataset.

Parameters:

root (string) – Root directory of the dataset where FashionMNIST/processed/training.pt and FashionMNIST/processed/test.pt exist.
train (bool, optional) – If True, creates the dataset from training.pt, otherwise from test.pt.
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.

Flickr

class torchvision.datasets.Flickr8k(root: str, ann_file: str, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None) → None

Flickr8k Entities Dataset.

Parameters:

root (string) – Root directory where images are downloaded to.
ann_file (string) – Path to the annotation file.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.ToTensor
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.

__getitem__(index: int) → Tuple[Any, Any]

Parameters:	index (int) – Index
Returns:	Tuple (image, target). target is a list of captions for the image.
Return type:	tuple

class torchvision.datasets.Flickr30k(root: str, ann_file: str, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None) → None

Flickr30k Entities Dataset.

Parameters:

root (string) – Root directory where images are downloaded to.
ann_file (string) – Path to the annotation file.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.ToTensor
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.

__getitem__(index: int) → Tuple[Any, Any]

Parameters:	index (int) – Index
Returns:	Tuple (image, target). target is a list of captions for the image.
Return type:	tuple

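HMDB51

class torchvision.datasets.HMDB51(root, annotation_path, frames_per_clip, step_between_clips=1, frame_rate=None, fold=1, train=True, transform=None, _precomputed_metadata=None, num_workers=1, _video_width=0, _video_height=0, _video_min_dimension=0, _audio_samples=0)

HMDB51 dataset.

HMDB51 is an action recognition video dataset. This dataset considers every video as a collection of video clips of fixed size, specified by frames_per_clip, where the step in frames between each clip is given by step_between_clips.

To give an example, for 2 videos with 10 and 15 frames respectively, if frames_per_clip=5 and step_between_clips=5, the dataset size will be (2 + 3) = 5, where the first two elements come from video 1 and the next three elements from video 2. Note that clips which do not have exactly frames_per_clip elements are dropped, so not all frames in a video might be present.

Internally, it uses a VideoClips object to handle clip creation.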
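The clip arithmetic in the example above can be sketched in plain Python. The helper names clips_per_video and dataset_size are hypothetical; the sketch only models the counting rule stated in the text (fixed-size clips, a fixed step, incomplete clips dropped) and does not use the VideoClips object.

```python
from typing import List


def clips_per_video(num_frames: int, frames_per_clip: int, step_between_clips: int) -> int:
    """Count full clips in one video; incomplete trailing clips are dropped."""
    if num_frames < frames_per_clip:
        return 0
    return (num_frames - frames_per_clip) // step_between_clips + 1


def dataset_size(video_lengths: List[int], frames_per_clip: int, step_between_clips: int) -> int:
    """Total number of clips over all videos."""
    return sum(clips_per_video(n, frames_per_clip, step_between_clips) for n in video_lengths)


print(dataset_size([10, 15], frames_per_clip=5, step_between_clips=5))  # 5
print(clips_per_video(12, frames_per_clip=5, step_between_clips=5))     # 2 (frames 10-11 are unused)
```

The second call shows why some frames may never appear: a 12-frame video yields clips over frames 0-4 and 5-9 only.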

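Parameters:

root (string) – Root directory of the HMDB51 dataset.
annotation_path (str) – Path to the folder containing the split files.
frames_per_clip (int) – Number of frames in a clip.
step_between_clips (int) – Number of frames between each clip.
fold (int, optional) – Which fold to use. Should be between 1 and 3.
train (bool, optional) – If True, creates the dataset from the train split, otherwise from the test split.
transform (callable, optional) – A function/transform that takes in a TxHxWxC video and returns a transformed version.

Returns:

A 3-tuple with the following entries:

video (Tensor[T, H, W, C]): the T video frames

audio (Tensor[K, L]): the audio frames, where K is the number of channels and L is the number of points

label (int): class of the video clip

Return type:

tuple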

ImageFolder

class torchvision.datasets.ImageFolder(root: str, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, loader: Callable[[str], Any] = <function default_loader>, is_valid_file: Union[Callable[[str], bool], NoneType] = None)

A generic data loader where the images are arranged in this way:

root/dog/xxx.png
root/dog/xxy.png
root/dog/[...]/xxz.png

root/cat/123.png
root/cat/nsdf3.png
root/cat/[...]/asd932_.png

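Parameters:

root (string) – Root directory path.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
loader (callable, optional) – A function to load an image given its path.
is_valid_file – A function that takes the path of an image file and checks whether it is a valid file (used to check for corrupt files).

__getitem__(index: int) → Tuple[Any, Any]

Parameters:	index (int) – Index
Returns:	(sample, target) where target is the class_index of the target class.
Return type:	tuple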

ImageNet

class torchvision.datasets.ImageNet(root: str, split: str = 'train', download: Union[str, NoneType] = None, **kwargs) → None

ImageNet 2012 Classification Dataset.

Parameters:

root (string) – Root directory of the ImageNet dataset.
split (string, optional) – The dataset split; supports train or val.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
loader – A function to load an image given its path.

Note

This requires scipy to be installed.

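Kinetics-400

class torchvision.datasets.Kinetics400(root, frames_per_clip, step_between_clips=1, frame_rate=None, extensions=('avi',), transform=None, _precomputed_metadata=None, num_workers=1, _video_width=0, _video_height=0, _video_min_dimension=0, _audio_samples=0, _audio_channels=0)

Kinetics-400 dataset.

Kinetics-400 is an action recognition video dataset. This dataset considers every video as a collection of video clips of fixed size, specified by frames_per_clip, where the step in frames between each clip is given by step_between_clips.

To give an example, for 2 videos with 10 and 15 frames respectively, if frames_per_clip=5 and step_between_clips=5, the dataset size will be (2 + 3) = 5, where the first two elements come from video 1 and the next three elements from video 2. Note that clips which do not have exactly frames_per_clip elements are dropped, so not all frames in a video might be present.

Internally, it uses a VideoClips object to handle clip creation.

Parameters:

root (string) – Root directory of the Kinetics-400 dataset.
frames_per_clip (int) – Number of frames in a clip.
step_between_clips (int) – Number of frames between each clip.
transform (callable, optional) – A function/transform that takes in a TxHxWxC video and returns a transformed version.

Returns:

A 3-tuple with the following entries:

video (Tensor[T, H, W, C]): the T video frames

audio (Tensor[K, L]): the audio frames, where K is the number of channels and L is the number of points

label (int): class of the video clip

Return type:

tuple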

KMNIST

class torchvision.datasets.KMNIST(root: str, train: bool = True, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, download: bool = False) → None

Kuzushiji-MNIST Dataset.

Parameters:

root (string) – Root directory of the dataset where KMNIST/processed/training.pt and KMNIST/processed/test.pt exist.
train (bool, optional) – If True, creates the dataset from training.pt, otherwise from test.pt.
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.

LSUN

class torchvision.datasets.LSUN(root: str, classes: Union[str, List[str]] = 'train', transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None) → None

LSUN dataset.

Parameters:

root (string) – Root directory for the database files.
classes (string or list) – One of {'train', 'val', 'test'} or a list of categories to load, e.g. ['bedroom_train', 'church_outdoor_train'].
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.

__getitem__(index: int) → Tuple[Any, Any]

Parameters:	index (int) – Index
Returns:	Tuple (image, target) where target is the index of the target category.
Return type:	tuple

MNIST

class torchvision.datasets.MNIST(root: str, train: bool = True, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, download: bool = False) → None

MNIST Dataset.

Parameters:

root (string) – Root directory of the dataset where MNIST/processed/training.pt and MNIST/processed/test.pt exist.
train (bool, optional) – If True, creates the dataset from training.pt, otherwise from test.pt.
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.

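Omniglot

class torchvision.datasets.Omniglot(root: str, background: bool = True, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, download: bool = False) → None

Omniglot Dataset.

Parameters:

root (string) – Root directory of the dataset where the directory omniglot-py exists.
background (bool, optional) – If True, creates the dataset from the "background" set, otherwise from the "evaluation" set. This terminology is defined by the authors.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
download (bool, optional) – If true, downloads the dataset zip files from the internet and puts them in the root directory. If the zip files are already downloaded, they are not downloaded again.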

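PhotoTour

class torchvision.datasets.PhotoTour(root: str, name: str, train: bool = True, transform: Union[Callable, NoneType] = None, download: bool = False) → None

Learning Local Image Descriptors Data dataset.

Parameters:

root (string) – Root directory where images are.
name (string) – Name of the dataset to load.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version.
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.

__getitem__(index: int) → Union[torch.Tensor, Tuple[Any, Any, torch.Tensor]]

Parameters:	index (int) – Index
Returns:	(data1, data2, matches)
Return type:	tuple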

Places365

class torchvision.datasets.Places365(root: str, split: str = 'train-standard', small: bool = False, download: bool = False, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, loader: Callable[[str], Any] = <function default_loader>) → None

Places365 classification dataset.

Parameters:

root (string) – Root directory of the Places365 dataset.
split (string, optional) – The dataset split. Can be one of train-standard (default), train-challenge, val.
small (bool, optional) – If True, uses the small images, i.e. resized to 256 x 256 pixels, instead of the high resolution ones.
download (bool, optional) – If True, downloads the dataset components and places them in root. Already downloaded archives are not downloaded again.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
loader – A function to load an image given its path.

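Raises:

RuntimeError – If download is False and the meta files, i.e. the devkit, are not present or corrupted.
RuntimeError – If download is True and the image archive is already extracted.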

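QMNIST

class torchvision.datasets.QMNIST(root: str, what: Union[str, NoneType] = None, compat: bool = True, train: bool = True, **kwargs) → None

QMNIST Dataset.

Parameters:

root (string) – Root directory of the dataset whose processed subdirectory contains torch binary files with the datasets.
what (string, optional) – Can be 'train', 'test', 'test10k', 'test50k', or 'nist' for, respectively, the MNIST-compatible training set, the 60k QMNIST testing set, the 10k QMNIST examples that match the MNIST testing set, the 50k remaining QMNIST testing examples, or all the NIST digits. The default is to select 'train' or 'test' according to the compatibility argument 'train'.
compat (bool, optional) – A boolean that says whether the target for each example is the class number (for compatibility with the MNIST dataloader) or a torch vector containing the full QMNIST information. Default: True.
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
train (bool, optional, compatibility) – When the argument 'what' is not specified, this boolean decides whether to load the training set or the testing set. Default: True.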

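SBD

class torchvision.datasets.SBDataset(root: str, image_set: str = 'train', mode: str = 'boundaries', download: bool = False, transforms: Union[Callable, NoneType] = None) → None

Semantic Boundaries Dataset

The SBD currently contains annotations from 11355 images taken from the PASCAL VOC 2011 dataset.

Note

Please note that the train and val splits included with this dataset are different from the splits in the PASCAL VOC dataset. In particular, some "train" images might be part of VOC2012 val. If you are interested in testing on VOC 2012 val, use image_set='train_noval', which excludes all val images.

Warning

This class needs scipy to load target files from .mat format.

Parameters:

root (string) – Root directory of the Semantic Boundaries Dataset.
image_set (string, optional) – Select the image_set to use: train, val or train_noval. The image set train_noval excludes VOC 2012 val images.
mode (string, optional) – Select the target type. Possible values are 'boundaries' or 'segmentation'. In the case of 'boundaries', the target is an array of shape [num_classes, H, W], where num_classes=20.
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
transforms (callable, optional) – A function/transform that takes the input sample and its target as entry and returns a transformed version. The input sample is a PIL image and the target is a numpy array if mode='boundaries' or a PIL image if mode='segmentation'.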

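SBU

class torchvision.datasets.SBU(root: str, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, download: bool = True) → None

SBU Captioned Photo Dataset.

Parameters:

root (string) – Root directory of the dataset where the tarball SBUCaptionedPhotoDataset.tar.gz exists.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
download (bool, optional) – If True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.

__getitem__(index: int) → Tuple[Any, Any]

Parameters:	index (int) – Index
Returns:	(image, target) where target is a caption for the photo.
Return type:	tuple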

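STL10

class torchvision.datasets.STL10(root: str, split: str = 'train', folds: Union[int, NoneType] = None, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, download: bool = False) → None

STL10 Dataset.

Parameters:

root (string) – Root directory of the dataset where the directory stl10_binary exists.
split (string) – One of {'train', 'test', 'unlabeled', 'train+unlabeled'}. The corresponding dataset split is selected.
folds (int, optional) – One of {0-9} or None. For training, loads one of the 10 pre-defined folds of 1k samples for the standard evaluation procedure. If no value is passed, loads the 5k samples.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.

__getitem__(index: int) → Tuple[Any, Any]

Parameters:	index (int) – Index
Returns:	(image, target) where target is the index of the target class.
Return type:	tuple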

SVHN

class torchvision.datasets.SVHN(root: str, split: str = 'train', transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, download: bool = False) → None

SVHN Dataset. Note: The SVHN dataset assigns the label 10 to the digit 0. However, in this Dataset, the label 0 is assigned to the digit 0 to be compatible with PyTorch loss functions, which expect the class labels to be in the range [0, C-1].
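The label convention in the note above amounts to a single remapping step. The sketch below illustrates it on a plain Python list; remap_svhn_labels is a hypothetical helper written for this illustration, not the code the class runs internally.

```python
from typing import List


def remap_svhn_labels(raw_labels: List[int]) -> List[int]:
    """Map the raw label 10 (digit 0) to 0; labels 1-9 are left unchanged."""
    return [0 if label == 10 else label for label in raw_labels]


labels = remap_svhn_labels([10, 1, 9, 10, 5])
print(labels)  # [0, 1, 9, 0, 5]
```

After remapping, every label lies in [0, 9], i.e. in [0, C-1] with C = 10 classes.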

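Warning

This class needs scipy to load data from .mat format.

Parameters:

root (string) – Root directory of the dataset where the directory SVHN exists.
split (string) – One of {'train', 'test', 'extra'}. The corresponding dataset split is selected. 'extra' is the Extra training set.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.

__getitem__(index: int) → Tuple[Any, Any]

Parameters:	index (int) – Index
Returns:	(image, target) where target is the index of the target class.
Return type:	tuple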

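UCF101

class torchvision.datasets.UCF101(root, annotation_path, frames_per_clip, step_between_clips=1, frame_rate=None, fold=1, train=True, transform=None, _precomputed_metadata=None, num_workers=1, _video_width=0, _video_height=0, _video_min_dimension=0, _audio_samples=0)

UCF101 dataset.

UCF101 is an action recognition video dataset. This dataset considers every video as a collection of video clips of fixed size, specified by frames_per_clip, where the step in frames between each clip is given by step_between_clips.

To give an example, for 2 videos with 10 and 15 frames respectively, if frames_per_clip=5 and step_between_clips=5, the dataset size will be (2 + 3) = 5, where the first two elements come from video 1 and the next three elements from video 2. Note that clips which do not have exactly frames_per_clip elements are dropped, so not all frames in a video might be present.

Internally, it uses a VideoClips object to handle clip creation.

Parameters:

root (string) – Root directory of the UCF101 dataset.
annotation_path (str) – Path to the folder containing the split files.
frames_per_clip (int) – Number of frames in a clip.
step_between_clips (int, optional) – Number of frames between each clip.
fold (int, optional) – Which fold to use. Should be between 1 and 3.
train (bool, optional) – If True, creates the dataset from the train split, otherwise from the test split.
transform (callable, optional) – A function/transform that takes in a TxHxWxC video and returns a transformed version.

Returns:

A 3-tuple with the following entries:

video (Tensor[T, H, W, C]): the T video frames

audio (Tensor[K, L]): the audio frames, where K is the number of channels and L is the number of points

label (int): class of the video clip

Return type:

tuple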

USPS

class torchvision.datasets.USPS(root: str, train: bool = True, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, download: bool = False) → None

USPS Dataset. The data format is: [label [index:value ]*256 \n] * num_lines, where label lies in [1, 10]. The value for each pixel lies in [-1, 1]. Here the label is transformed into [0, 9] and the pixel values into [0, 255].
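The conversion described above can be sketched for a single line of the raw format. parse_usps_line is a hypothetical helper; it assumes whitespace-separated index:value tokens and truncation when converting to integers, and it is not the parser the class itself uses.

```python
from typing import List, Tuple


def parse_usps_line(line: str) -> Tuple[int, List[int]]:
    """Parse one 'label index:value ...' line into (label in [0, 9], pixels in [0, 255])."""
    tokens = line.split()
    label = int(tokens[0]) - 1                          # [1, 10] -> [0, 9]
    values = [float(tok.split(":")[-1]) for tok in tokens[1:]]
    pixels = [int((v + 1) / 2 * 255) for v in values]   # [-1, 1] -> [0, 255]
    return label, pixels


# A real line carries 256 index:value pairs; three are shown for brevity.
label, pixels = parse_usps_line("10 1:-1.0 2:0.0 3:1.0")
print(label, pixels)  # 9 [0, 127, 255]
```

The 256 pixel values of a full line would correspond to one 16 x 16 image.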

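Parameters:

root (string) – Root directory of the dataset in which to store the USPS data files.
train (bool, optional) – If True, creates the dataset from usps.bz2, otherwise from usps.t.bz2.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.

__getitem__(index: int) → Tuple[Any, Any]

Parameters:	index (int) – Index
Returns:	(image, target) where target is the index of the target class.
Return type:	tuple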

VOC

class torchvision.datasets.VOCSegmentation(root: str, year: str = '2012', image_set: str = 'train', download: bool = False, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, transforms: Union[Callable, NoneType] = None)

Pascal VOC Segmentation Dataset.

Parameters:

root (string) – Root directory of the VOC dataset.
year (string, optional) – The dataset year; supports years "2007" to "2012".
image_set (string, optional) – Select the image_set to use: "train", "trainval" or "val". If year=="2007", can also be "test".
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
transforms (callable, optional) – A function/transform that takes the input sample and its target as entry and returns a transformed version.

__getitem__(index: int) → Tuple[Any, Any]

Parameters:	index (int) – Index
Returns:	(image, target) where target is the image segmentation.
Return type:	tuple

class torchvision.datasets.VOCDetection(root: str, year: str = '2012', image_set: str = 'train', download: bool = False, transform: Union[Callable, NoneType] = None, target_transform: Union[Callable, NoneType] = None, transforms: Union[Callable, NoneType] = None)

Pascal VOC Detection Dataset.

Parameters:

root (string) – Root directory of the VOC dataset.
year (string, optional) – The dataset year; supports years "2007" to "2012".
image_set (string, optional) – Select the image_set to use: "train", "trainval" or "val". If year=="2007", can also be "test".
download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again. (default: alphabetic indexing of VOC's 20 classes).
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version. E.g., transforms.RandomCrop
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
transforms (callable, optional) – A function/transform that takes the input sample and its target as entry and returns a transformed version.

__getitem__(index: int) → Tuple[Any, Any]

Parameters:	index (int) – Index
Returns:	(image, target) where target is a dictionary of the XML tree.
Return type:	tuple
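To give a feel for what "a dictionary of the XML tree" can look like, the following standard-library sketch converts a small annotation string into nested dictionaries. The function xml_to_dict and the sample XML are hypothetical; the exact structure torchvision returns may differ, for example in how repeated tags are grouped.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict


def xml_to_dict(node: ET.Element) -> Dict[str, Any]:
    """Hypothetical helper: turn an XML element into nested dicts keyed by tag."""
    children = list(node)
    if not children:
        return {node.tag: (node.text or "").strip()}
    merged: Dict[str, Any] = {}
    for child in children:
        for tag, value in xml_to_dict(child).items():
            if tag in merged:
                # Repeated tags (e.g. several <object> entries) are collected in a list.
                if not isinstance(merged[tag], list):
                    merged[tag] = [merged[tag]]
                merged[tag].append(value)
            else:
                merged[tag] = value
    return {node.tag: merged}


sample_xml = (
    "<annotation><filename>000001.jpg</filename>"
    "<object><name>dog</name></object>"
    "<object><name>person</name></object></annotation>"
)
target = xml_to_dict(ET.fromstring(sample_xml))
print(target["annotation"]["filename"])                      # 000001.jpg
print([o["name"] for o in target["annotation"]["object"]])   # ['dog', 'person']
```

A target_transform could then pick out only the fields a model needs, such as the object names.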
