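torch.nested

Introduction

Warning

The PyTorch API for nested tensors is in the prototype stage and will change in the near future.

NestedTensor allows the user to pack a list of Tensors into a single, efficient data structure.

The only constraint on the input Tensors is that their number of dimensions must match.

This enables a more efficient metadata representation and access to purpose-built kernels.

One application of NestedTensors is to express sequential data in various domains. While the conventional approach is to pad variable-length sequences, NestedTensor lets users bypass padding. The API for calling operations on a nested tensor is no different from that of a regular torch.Tensor, which should allow seamless integration with existing models; the main difference is how the inputs are constructed.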
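
To make the padding comparison concrete, the following minimal sketch stores the same three variable-length sequences both ways. The sequence lengths and the 8-dimensional feature size are illustrative assumptions, and `torch.nn.utils.rnn.pad_sequence` stands in here for the conventional padding approach.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Three variable-length sequences; each step is an 8-dim feature vector.
seqs = [torch.randn(n, 8) for n in (5, 2, 7)]

# Conventional approach: pad every sequence to the longest length (7).
padded = pad_sequence(seqs, batch_first=True)

# NestedTensor approach: pack the sequences without padding.
nt = torch.nested.nested_tensor(seqs)

# Count the elements each representation has to hold.
padded_elems = padded.numel()
nested_elems = sum(t.numel() for t in nt.unbind())
print(padded_elems, nested_elems)
```

The padded batch carries entries for the padding positions as well, while the nested tensor only holds the real data.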

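As this is a prototype feature, the set of supported operations is still limited. However, we welcome issues, feature requests and contributions. More information on contributing can be found in this Readme.

Construction

Construction is straightforward and involves passing a list of Tensors to the torch.nested.nested_tensor constructor.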

>>> a, b = torch.arange(3), torch.arange(5) + 3
>>> a
tensor([0, 1, 2])
>>> b
tensor([3, 4, 5, 6, 7])
>>> nt = torch.nested.nested_tensor([a, b])
>>> nt
nested_tensor([
  tensor([0, 1, 2]),
  tensor([3, 4, 5, 6, 7])
])

The data type, the device, and whether gradients are required can be chosen via the usual keyword arguments.

>>> nt = torch.nested.nested_tensor([a, b], dtype=torch.float32, device="cuda", requires_grad=True)
>>> nt
nested_tensor([
  tensor([0., 1., 2.], device='cuda:0', requires_grad=True),
  tensor([3., 4., 5., 6., 7.], device='cuda:0', requires_grad=True)
], device='cuda:0', requires_grad=True)

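In the vein of torch.as_tensor, torch.nested.as_nested_tensor can be used to preserve the autograd history of the tensors passed to the constructor. For more information, refer to the section on Nested tensor constructor and conversion functions.

To form a valid NestedTensor, all the passed Tensors need to match in their number of dimensions, but none of the other attributes need to match.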

>>> a = torch.randn(3, 50, 70) # image 1
>>> b = torch.randn(3, 128, 64) # image 2
>>> nt = torch.nested.nested_tensor([a, b], dtype=torch.float32)
>>> nt.dim()
4

If the number of dimensions of one of the Tensors does not match, the constructor raises an error.

>>> a = torch.randn(50, 128) # text 1
>>> b = torch.randn(3, 128, 64) # image 2
>>> nt = torch.nested.nested_tensor([a, b], dtype=torch.float32)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: All Tensors given to nested_tensor must have the same dimension. Found dimension 3 for Tensor at index 1 and dimension 2 for Tensor at index 0.

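Note that the passed Tensors are copied into a contiguous piece of memory. The resulting NestedTensor allocates new memory to store them and does not keep a reference to the originals.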
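
A short sketch of what this copy behaviour means in practice: mutating a source tensor after construction should leave the nested tensor untouched. The tensor sizes are arbitrary choices for illustration.

```python
import torch

a = torch.zeros(3)
b = torch.zeros(5)
nt = torch.nested.nested_tensor([a, b])

# Mutate the source tensor in place after the nested tensor was built.
a.add_(1.0)

# The nested tensor holds its own copy, so its first entry is still zeros.
first = nt.unbind()[0]
print(a, first)
```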

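At the moment we only support one level of nesting, i.e. a simple, flat list of Tensors. In the future we may add support for multiple levels of nesting, such as a list that consists entirely of lists of Tensors. Note that for this extension it is important to maintain an even level of nesting across entries so that the resulting NestedTensor has a well-defined dimension. If you need this feature, please open a feature request so that we can track it and plan accordingly.

size

Even though a NestedTensor does not support .size() (or .shape), it does support .size(i) if dimension i is regular.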

>>> a = torch.randn(50, 128) # text 1
>>> b = torch.randn(32, 128) # text 2
>>> nt = torch.nested.nested_tensor([a, b], dtype=torch.float32)
>>> nt.size(0)
2
>>> nt.size(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: Given dimension 1 is irregular and does not have a size.
>>> nt.size(2)
128
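
Building on the behaviour shown above, one way to probe which dimensions are regular is to call `.size(i)` for each dimension and catch the error. `regular_sizes` is a hypothetical helper written for this illustration, not part of the torch.nested API, and it assumes irregular dimensions raise `RuntimeError` as in the example above.

```python
import torch

def regular_sizes(nt):
    """Hypothetical helper: size of each regular dim, None for irregular ones."""
    sizes = []
    for i in range(nt.dim()):
        try:
            sizes.append(nt.size(i))
        except RuntimeError:
            # Irregular dimensions have no single size.
            sizes.append(None)
    return sizes

nt = torch.nested.nested_tensor([torch.randn(50, 128), torch.randn(32, 128)])
sizes = regular_sizes(nt)
print(sizes)
```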

If all dimensions are regular, the NestedTensor is intended to be semantically indistinguishable from a regular torch.Tensor.

>>> a = torch.randn(20, 128) # text 1
>>> nt = torch.nested.nested_tensor([a, a], dtype=torch.float32)
>>> nt.size(0)
2
>>> nt.size(1)
20
>>> nt.size(2)
128
>>> torch.stack(nt.unbind()).size()
torch.Size([2, 20, 128])
>>> torch.stack([a, a]).size()
torch.Size([2, 20, 128])
>>> torch.equal(torch.stack(nt.unbind()), torch.stack([a, a]))
True

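In the future we might make it easier to detect this condition and convert seamlessly.

Please open a feature request if you need this (or any other related feature).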
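
Until such a conversion exists, the check can be done by hand. The sketch below uses a hypothetical helper, `to_dense_if_regular`, which is not part of torch.nested; it relies only on `unbind()` and `torch.stack`, as in the example above.

```python
import torch

def to_dense_if_regular(nt):
    """Hypothetical helper: stack into a dense tensor if all constituents
    share one shape, otherwise return None."""
    parts = nt.unbind()
    if all(p.shape == parts[0].shape for p in parts):
        return torch.stack(parts)
    return None

a = torch.randn(20, 128)
regular = torch.nested.nested_tensor([a, a])
irregular = torch.nested.nested_tensor([a, torch.randn(7, 128)])

dense = to_dense_if_regular(regular)        # dense tensor of shape (2, 20, 128)
no_dense = to_dense_if_regular(irregular)   # None: dimension 1 is irregular
```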

unbind

unbind allows you to retrieve a view of the constituents.

>>> import torch
>>> a = torch.randn(2, 3)
>>> b = torch.randn(3, 4)
>>> nt = torch.nested.nested_tensor([a, b], dtype=torch.float32)
>>> nt
nested_tensor([
  tensor([[ 1.2286, -1.2343, -1.4842],
          [-0.7827,  0.6745,  0.0658]]),
  tensor([[-1.1247, -0.4078, -1.0633,  0.8083],
          [-0.2871, -0.2980,  0.5559,  1.9885],
          [ 0.4074,  2.4855,  0.0733,  0.8285]])
])
>>> nt.unbind()
(tensor([[ 1.2286, -1.2343, -1.4842],
        [-0.7827,  0.6745,  0.0658]]), tensor([[-1.1247, -0.4078, -1.0633,  0.8083],
        [-0.2871, -0.2980,  0.5559,  1.9885],
        [ 0.4074,  2.4855,  0.0733,  0.8285]]))
>>> nt.unbind()[0] is not a
True
>>> nt.unbind()[0].mul_(3)
tensor([[ 3.6858, -3.7030, -4.4525],
        [-2.3481,  2.0236,  0.1975]])
>>> nt
nested_tensor([
  tensor([[ 3.6858, -3.7030, -4.4525],
          [-2.3481,  2.0236,  0.1975]]),
  tensor([[-1.1247, -0.4078, -1.0633,  0.8083],
          [-0.2871, -0.2980,  0.5559,  1.9885],
          [ 0.4074,  2.4855,  0.0733,  0.8285]])
])

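Note that nt.unbind()[0] is not a copy, but rather a slice of the underlying memory, which represents the first entry or constituent of the NestedTensor.

Nested tensor constructor and conversion functions

The following functions are related to nested tensors:

torch.nested.nested_tensor(tensor_list, *, dtype=None, layout=None, device=None, requires_grad=False, pin_memory=False)

Constructs a nested tensor with no autograd history (also known as a "leaf tensor", see Autograd mechanics) from tensor_list, a list of tensors.

Parameters

tensor_list (List[array_like]) – a list of tensors, or anything that can be passed to torch.tensor, where each element of the list has the same dimensionality.

Keyword Arguments

dtype (torch.dtype, optional) – the desired type of the returned nested tensor. Default: if None, same torch.dtype as the leftmost tensor in the list.
layout (torch.layout, optional) – the desired layout of the returned nested tensor. Only strided and jagged layouts are supported. Default: if None, the strided layout.
device (torch.device, optional) – the desired device of the returned nested tensor. Default: if None, same torch.device as the leftmost tensor in the list.
requires_grad (bool, optional) – whether autograd should record operations on the returned nested tensor. Default: False.
pin_memory (bool, optional) – if set, the returned nested tensor is allocated in pinned memory. Works only for CPU tensors. Default: False.

Return type

Tensor

Example: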

>>> a = torch.arange(3, dtype=torch.float, requires_grad=True)
>>> b = torch.arange(5, dtype=torch.float, requires_grad=True)
>>> nt = torch.nested.nested_tensor([a, b], requires_grad=True)
>>> nt.is_leaf
True

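torch.nested.as_nested_tensor(ts, dtype=None, device=None, layout=None)

Constructs a nested tensor preserving autograd history from a tensor or a list / tuple of tensors.

If a nested tensor is passed, it is returned directly unless the device / dtype / layout differ. Note that converting the device / dtype results in a copy, while converting the layout is not currently supported by this function.

If a non-nested tensor is passed, it is treated as a batch of constituents of consistent size. A copy is incurred if the passed device / dtype differ from those of the input, or if the input is non-contiguous. Otherwise, the input's storage is used directly.

If a list of tensors is provided, the tensors in the list are always copied during construction of the nested tensor.

Parameters

ts (Tensor or List[Tensor] or Tuple[Tensor]) – a tensor to treat as a nested tensor, or a list / tuple of tensors with the same ndim.

Keyword Arguments

dtype (torch.dtype, optional) – the desired type of the returned nested tensor. Default: if None, same torch.dtype as the leftmost tensor in the list.
device (torch.device, optional) – the desired device of the returned nested tensor. Default: if None, same torch.device as the leftmost tensor in the list.
layout (torch.layout, optional) – the desired layout of the returned nested tensor. Only strided and jagged layouts are supported. Default: if None, the strided layout.

Return type

Tensor

Example: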

>>> a = torch.arange(3, dtype=torch.float, requires_grad=True)
>>> b = torch.arange(5, dtype=torch.float, requires_grad=True)
>>> nt = torch.nested.as_nested_tensor([a, b])
>>> nt.is_leaf
False
>>> fake_grad = torch.nested.nested_tensor([torch.ones_like(a), torch.zeros_like(b)])
>>> nt.backward(fake_grad)
>>> a.grad
tensor([1., 1., 1.])
>>> b.grad
tensor([0., 0., 0., 0., 0.])
>>> c = torch.randn(3, 5, requires_grad=True)
>>> nt2 = torch.nested.as_nested_tensor(c)

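torch.nested.to_padded_tensor(input, padding, output_size=None, out=None) → Tensor

Returns a new (non-nested) Tensor by padding the input nested tensor. The leading entries are filled with the nested data, while the trailing entries are padded.

Warning

to_padded_tensor() always copies the underlying data, since nested and non-nested tensors differ in memory layout.

Parameters

padding (float) – the padding value for the trailing entries.

Keyword Arguments

output_size (Tuple[int]) – the size of the output tensor. If given, it must be large enough to contain all nested data; otherwise, it is inferred by taking the maximum size of each nested sub-tensor along each dimension.
out (Tensor, optional) – the output tensor.

Example: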

>>> nt = torch.nested.nested_tensor([torch.randn((2, 5)), torch.randn((3, 4))])
>>> nt
nested_tensor([
  tensor([[ 1.6862, -1.1282,  1.1031,  0.0464, -1.3276],
          [-1.9967, -1.0054,  1.8972,  0.9174, -1.4995]]),
  tensor([[-1.8546, -0.7194, -0.2918, -0.1846],
          [ 0.2773,  0.8793, -0.5183, -0.6447],
          [ 1.8009,  1.8468, -0.9832, -1.5272]])
])
>>> pt_infer = torch.nested.to_padded_tensor(nt, 0.0)
>>> pt_infer
tensor([[[ 1.6862, -1.1282,  1.1031,  0.0464, -1.3276],
         [-1.9967, -1.0054,  1.8972,  0.9174, -1.4995],
         [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000]],
        [[-1.8546, -0.7194, -0.2918, -0.1846,  0.0000],
         [ 0.2773,  0.8793, -0.5183, -0.6447,  0.0000],
         [ 1.8009,  1.8468, -0.9832, -1.5272,  0.0000]]])
>>> pt_large = torch.nested.to_padded_tensor(nt, 1.0, (2, 4, 6))
>>> pt_large
tensor([[[ 1.6862, -1.1282,  1.1031,  0.0464, -1.3276,  1.0000],
         [-1.9967, -1.0054,  1.8972,  0.9174, -1.4995,  1.0000],
         [ 1.0000,  1.0000,  1.0000,  1.0000,  1.0000,  1.0000],
         [ 1.0000,  1.0000,  1.0000,  1.0000,  1.0000,  1.0000]],
        [[-1.8546, -0.7194, -0.2918, -0.1846,  1.0000,  1.0000],
         [ 0.2773,  0.8793, -0.5183, -0.6447,  1.0000,  1.0000],
         [ 1.8009,  1.8468, -0.9832, -1.5272,  1.0000,  1.0000],
         [ 1.0000,  1.0000,  1.0000,  1.0000,  1.0000,  1.0000]]])
>>> pt_small = torch.nested.to_padded_tensor(nt, 2.0, (2, 2, 2))
RuntimeError: Value in output_size is less than NestedTensor padded size. Truncation is not supported.
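
Code that consumes the padded result often also needs to know which entries are real data. The following sketch derives such a mask from the constituent lengths; it is one possible approach under the assumption that only the first dimension of each constituent varies, and the names `lengths` and `mask` are introduced here for illustration.

```python
import torch

nt = torch.nested.nested_tensor([torch.ones(2, 3), torch.ones(4, 3)])

# Pad to the largest constituent; the result has shape (2, 4, 3).
padded = torch.nested.to_padded_tensor(nt, 0.0)

# True length of each constituent along its first dimension.
lengths = torch.tensor([t.size(0) for t in nt.unbind()])

# mask[i, j] is True where row j of batch entry i holds real data.
mask = torch.arange(padded.size(1))[None, :] < lengths[:, None]
print(mask)
```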

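Supported operations

In this section, we summarize the operations that are currently supported on NestedTensor and any constraints they have.

PyTorch operation	Constraints
`torch.matmul()`	Supports matrix multiplication between two (>= 3d) nested tensors where the last two dimensions are matrix dimensions and the leading (batch) dimensions have the same size (i.e. no broadcasting support for batch dimensions yet).
`torch.bmm()`	Supports batch matrix multiplication of two 3-d nested tensors.
`torch.nn.Linear()`	Supports a 3-d nested input and a dense 2-d weight matrix.
`torch.nn.functional.softmax()`	Supports softmax along all dims except `dim=0`.
`torch.nn.Dropout()`	Behavior is the same as on regular tensors.
`torch.Tensor.masked_fill()`	Behavior is the same as on regular tensors.
`torch.relu()`	Behavior is the same as on regular tensors.
`torch.gelu()`	Behavior is the same as on regular tensors.
`torch.silu()`	Behavior is the same as on regular tensors.
`torch.abs()`	Behavior is the same as on regular tensors.
`torch.sgn()`	Behavior is the same as on regular tensors.
`torch.logical_not()`	Behavior is the same as on regular tensors.
`torch.neg()`	Behavior is the same as on regular tensors.
`torch.sub()`	Supports elementwise subtraction of two nested tensors.
`torch.add()`	Supports elementwise addition of two nested tensors. Supports addition of a scalar to a nested tensor.
`torch.mul()`	Supports elementwise multiplication of two nested tensors. Supports multiplication of a nested tensor by a scalar.
`torch.select()`	Supports selecting along all dimensions.
`torch.clone()`	Behavior is the same as on regular tensors.
`torch.detach()`	Behavior is the same as on regular tensors.
`torch.unbind()`	Supports unbinding along `dim=0` only.
`torch.reshape()`	Supports reshaping with the size of `dim=0` preserved (i.e. the number of nested tensors cannot be changed). Unlike regular tensors, a size of `-1` here means that the existing size is inherited. In particular, the only valid size for an irregular dimension is `-1`. Size inference is not implemented yet, so for new dimensions the size cannot be `-1`.
`torch.Tensor.reshape_as()`	Similar constraints as for `reshape`.
`torch.transpose()`	Supports transposing all dims except `dim=0`.
`torch.Tensor.view()`	Rules for the new shape are similar to those of `reshape`.
`torch.empty_like()`	Behavior is analogous to that of regular tensors; returns a new empty nested tensor (i.e. with uninitialized values) matching the nested structure of the input.
`torch.randn_like()`	Behavior is analogous to that of regular tensors; returns a new nested tensor with values randomly initialized from a standard normal distribution, matching the nested structure of the input.
`torch.zeros_like()`	Behavior is analogous to that of regular tensors; returns a new nested tensor with all zero values, matching the nested structure of the input.
`torch.nn.LayerNorm()`	The `normalized_shape` argument is restricted to not extend into the irregular dimensions of the NestedTensor.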
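
As a rough illustration of a few rows in the table, the sketch below applies elementwise operations and softmax to a nested tensor and compares against the per-constituent results, then exercises the `-1` rule for `reshape` and the `dim=0` restriction for `transpose`. The shapes are arbitrary choices, and the exact set of supported operations may differ between PyTorch versions.

```python
import torch
import torch.nn.functional as F

a, b = torch.randn(2, 6), torch.randn(3, 6)
nt = torch.nested.nested_tensor([a, b])

# Elementwise ops and softmax over a regular dimension act per constituent.
out = F.softmax(torch.relu(torch.add(nt, nt)), dim=-1)
expected = [F.softmax(torch.relu(t + t), dim=-1) for t in (a, b)]
matches = all(torch.allclose(o, e) for o, e in zip(out.unbind(), expected))

# reshape: dim 0 is preserved and the irregular dimension is passed as -1,
# meaning "inherit the existing size"; the last dim of 6 is split into 2 x 3.
nt_reshaped = nt.reshape(2, -1, 2, 3)
reshaped_shapes = [tuple(t.shape) for t in nt_reshaped.unbind()]

# transpose: dimensions other than dim 0 may be swapped.
nt_transposed = nt_reshaped.transpose(1, 2)
transposed_shapes = [tuple(t.shape) for t in nt_transposed.unbind()]

print(matches, reshaped_shapes, transposed_shapes)
```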
