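Reading/Writing images and videos

The torchvision.io package provides functions for performing IO operations. They are currently specific to reading and writing video and images.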

Video

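read_video(filename[, start_pts, end_pts, ...])	Reads a video from a file, returning both the video frames and the audio frames.
read_video_timestamps(filename[, pts_unit])	Lists the video frame timestamps.
write_video(filename, video_array, fps[, ...])	Writes a 4d tensor in [T, H, W, C] format to a video file.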
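
The [T, H, W, C] layout named in the write_video summary can be sketched with plain torch tensors. This is a minimal illustration, not the library's own example: the clip size and pixel values are hypothetical, and no video file is written.

```python
import torch

# Hypothetical clip: 8 frames of 32x48 RGB. Sizes are for illustration only.
T, H, W, C = 8, 32, 48, 3
video_array = torch.zeros((T, H, W, C), dtype=torch.uint8)

# Give frame t a uniform gray level so the frames are distinguishable.
# The largest value is 7 * 30 = 210, which fits in uint8.
for t in range(T):
    video_array[t] = t * 30

# Channels-first frames [T, C, H, W] are a common model input layout.
# permute returns a view with reordered dimensions, not a copy.
frames_tchw = video_array.permute(0, 3, 1, 2)
```

A tensor shaped like video_array is what the summary above describes as the video_array argument; the fps value passed alongside it would be chosen by the caller.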

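Fine-grained video API

In addition to the read_video function, we provide a high-performance lower-level API for more fine-grained control than read_video offers. It does all this while fully supporting torchscript.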

Warning

The fine-grained video API is currently in Beta stage, and backward compatibility is not guaranteed.

VideoReader([src, stream, num_threads, path])

Fine-grained video-reading API.

Example of inspecting a video:

import torchvision
video_path = "path to a test video"
# Constructor allocates memory and a threaded decoder
# instance per video. At the moment it takes two arguments:
# path to the video file, and a wanted stream.
reader = torchvision.io.VideoReader(video_path, "video")

# The information about the video can be retrieved using the
# `get_metadata()` method. It returns a dictionary for every stream, with
# duration and other relevant metadata (often frame rate)
reader_md = reader.get_metadata()

# metadata is structured as a dict of dicts with following structure
# {"stream_type": {"attribute": [attribute per stream]}}
#
# following would print out the list of frame rates for every present video stream
print(reader_md["video"]["fps"])

# we explicitly select the stream we would like to operate on. In
# the constructor we select a default video stream, but
# in practice, we can set whichever stream we would like
reader.set_current_stream("video:0")
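
The dict-of-dicts metadata structure described in the comments above can be explored without a video file. The sketch below uses a hand-written dictionary with made-up numbers in place of a real get_metadata() result; the helper first_stream_value is hypothetical and not part of torchvision.

```python
# Hypothetical metadata shaped like {"stream_type": {"attribute": [per stream]}}.
# The numbers are invented for illustration, not read from a real file.
reader_md = {
    "video": {"fps": [29.97], "duration": [10.9]},
    "audio": {"framerate": [48000.0], "duration": [10.9]},
}


def first_stream_value(metadata, stream_type, attribute):
    """Return the attribute of the first stream of a type, or None if absent."""
    values = metadata.get(stream_type, {}).get(attribute, [])
    return values[0] if values else None


fps = first_stream_value(reader_md, "video", "fps")
duration = first_stream_value(reader_md, "video", "duration")

# Rough frame-count estimate; real containers may not match it exactly.
approx_frames = round(fps * duration) if fps and duration else None
```

Looking values up through a helper like this avoids a KeyError when a file has no stream of the requested type.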

Image

ImageReadMode(value)	Support for various modes while reading images.

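read_image(path[, mode])	Reads a JPEG or PNG image into a 3-dimensional RGB or grayscale Tensor.
decode_image(input[, mode])	Detects whether an image is a JPEG or PNG and performs the appropriate operation to decode it into a 3-dimensional RGB or grayscale Tensor.
encode_jpeg(input[, quality])	Takes an input tensor in CHW layout and returns a buffer with the contents of its corresponding JPEG file.
decode_jpeg(input[, mode, device])	Decodes a JPEG image into a 3-dimensional RGB or grayscale Tensor.
write_jpeg(input, filename[, quality])	Takes an input tensor in CHW layout and saves it in a JPEG file.
encode_png(input[, compression_level])	Takes an input tensor in CHW layout and returns a buffer with the contents of its corresponding PNG file.
decode_png(input[, mode])	Decodes a PNG image into a 3-dimensional RGB or grayscale Tensor.
write_png(input, filename[, compression_level])	Takes an input tensor in CHW layout (or HW in the case of grayscale images) and saves it in a PNG file.
read_file(path)	Reads and outputs the bytes contents of a file as a uint8 Tensor with one dimension.
write_file(filename, data)	Writes the contents of a uint8 tensor with one dimension to a file.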
