torch.library¶

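torch.library is a collection of APIs for extending PyTorch's core library of operators. It contains utilities for testing custom operators, creating new custom operators, and extending operators defined with PyTorch's C++ operator registration APIs (e.g. aten operators).

For a detailed guide on using these APIs effectively, see the PyTorch Custom Operators Landing Page.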

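Testing custom ops¶

Use torch.library.opcheck() to test custom ops for incorrect usage of the Python torch.library and/or C++ TORCH_LIBRARY APIs. Also, if your operator supports training, use torch.autograd.gradcheck() to test that the gradients are mathematically correct.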

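torch.library.opcheck(op, args, kwargs=None, *, test_utils=('test_schema', 'test_autograd_registration', 'test_faketensor', 'test_aot_dispatch_dynamic'), raise_exception=True)¶

Given an operator and some sample arguments, tests if the operator is registered correctly.

That is, when you use the torch.library/TORCH_LIBRARY APIs to create a custom op, you specify metadata (e.g. mutability info) about the custom op, and these APIs require that the functions you pass them satisfy certain properties (e.g. no data pointer access in the fake/meta/abstract kernel). opcheck tests these metadata and properties.

Concretely, we test the following:

test_schema: If the schema matches the implementation of the operator. For example: if the schema specifies a Tensor is mutated, then we check the implementation mutates the Tensor. If the schema specifies that we return a new Tensor, then we check that the implementation returns a new Tensor (instead of an existing one or a view of an existing one).
test_autograd_registration: If the operator supports training (autograd): we check that its autograd formula is registered via torch.library.register_autograd or a manual registration to one or more DispatchKey::Autograd keys. Any other DispatchKey-based registrations may lead to undefined behavior.
test_faketensor: If the operator has a FakeTensor kernel (and if it is correct). The FakeTensor kernel is necessary (but not sufficient) for the operator to work with PyTorch compilation APIs (torch.compile/export/FX). We check that a FakeTensor kernel (also sometimes known as a meta kernel) was registered for the operator and that it is correct. This test takes the result of running the operator on real tensors and the result of running the operator on FakeTensors and checks that they have the same Tensor metadata (sizes/strides/dtype/device/etc).
test_aot_dispatch_dynamic: If the operator has correct behavior with PyTorch compilation APIs (torch.compile/export/FX). This checks that the outputs (and gradients, if applicable) are the same under eager-mode PyTorch and torch.compile. This test is a superset of test_faketensor and is an e2e test; other things it tests are that the operator supports functionalization and that the backward pass (if it exists) also supports FakeTensor and functionalization.

For best results, call opcheck multiple times with a representative set of inputs. If your operator supports autograd, use opcheck with inputs with requires_grad = True; if your operator supports multiple devices (e.g. CPU and CUDA), use opcheck with inputs on all supported devices.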

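Parameters

op (Union[OpOverload, OpOverloadPacket, CustomOpDef]) – The operator. Must either be a function decorated with torch.library.custom_op() or an OpOverload/OpOverloadPacket found in torch.ops.* (e.g. torch.ops.aten.sin, torch.ops.mylib.foo)
args (Tuple[Any, ...]) – The args to the operator
kwargs (Optional[Dict[str, Any]]) – The kwargs to the operator
test_utils (Union[str, Sequence[str]]) – Tests that we should run. Default: all of them. Example: ("test_schema", "test_faketensor")
raise_exception (bool) – If we should raise an exception on the first error. If False, we will return a dict with information on whether each test passed or not.

Return type

Dict[str, str]

Warning

opcheck and torch.autograd.gradcheck() test different things; opcheck tests if your usage of torch.library APIs is correct while torch.autograd.gradcheck() tests if your autograd formula is mathematically correct. Use both to test custom ops that support gradient computation.

Example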

>>> import torch
>>> from torch import Tensor
>>>
>>> @torch.library.custom_op("mylib::numpy_mul", mutates_args=())
>>> def numpy_mul(x: Tensor, y: float) -> Tensor:
>>>     x_np = x.numpy(force=True)
>>>     z_np = x_np * y
>>>     return torch.from_numpy(z_np).to(x.device)
>>>
>>> @numpy_mul.register_fake
>>> def _(x, y):
>>>     return torch.empty_like(x)
>>>
>>> def setup_context(ctx, inputs, output):
>>>     # inputs is the tuple (x, y); only y is needed for backward
>>>     _, y = inputs
>>>     ctx.y = y
>>>
>>> def backward(ctx, grad):
>>>     # one gradient per input: d(x * y)/dx = y, and no gradient for the float y
>>>     return grad * ctx.y, None
>>>
>>> numpy_mul.register_autograd(backward, setup_context=setup_context)
>>>
>>> sample_inputs = [
>>>     (torch.randn(3), 3.14),
>>>     (torch.randn(2, 3, device='cuda'), 2.718),
>>>     (torch.randn(1, 10, requires_grad=True), 1.234),
>>>     (torch.randn(64, 64, device='cuda', requires_grad=True), 90.18),
>>> ]
>>>
>>> for args in sample_inputs:
>>>     torch.library.opcheck(numpy_mul, args)

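Creating new custom ops in Python¶

Use torch.library.custom_op() to create new custom ops.

torch.library.custom_op(name, fn=None, /, *, mutates_args, device_types=None, schema=None)¶

Wraps a function into a custom operator.

Reasons why you may want to create a custom op include:
- Wrapping a third-party library or custom kernel to work with PyTorch subsystems like Autograd.
- Preventing torch.compile/export/FX tracing from peeking inside your function.

This API is used as a decorator around a function (see the examples). The provided function must have type hints; these are needed to interface with PyTorch's various subsystems.

Parameters

name (str) – A name for the custom op that looks like "{namespace}::{name}", e.g. "mylib::my_linear". The name is used as the op's stable identifier in PyTorch subsystems (e.g. torch.export, FX graphs). To avoid name collisions, use your project name as the namespace; e.g. all custom ops in pytorch/fbgemm use "fbgemm" as the namespace.
mutates_args (Iterable[str] or "unknown") – The names of args that the function mutates. This MUST be accurate; otherwise, the behavior is undefined. If "unknown", it pessimistically assumes that all inputs to the operator are being mutated.
device_types (None | str | Sequence[str]) – The device type(s) the function is valid for. If no device type is provided, then the function is used as the default implementation for all device types. Examples: "cpu", "cuda". When registering a device-specific implementation for an operator that accepts no Tensors, we require the operator to have a "device: torch.device" argument.
schema (None | str) – A schema string for the operator. If None (recommended) we will infer a schema for the operator from its type annotations. We recommend letting us infer a schema unless you have a specific reason not to. Example: "(Tensor x, int y) -> (Tensor, Tensor)".

Return type

Callable

Note

We recommend not passing in a schema arg and instead letting us infer it from the type annotations. It is error-prone to write your own schema. You may wish to provide your own schema if our interpretation of the type annotation is not what you want. For more information on how to write a schema string, see the "func" section of the ATen native functions README (aten/src/ATen/native/README.md in the PyTorch repository).

Examples: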

>>> import torch
>>> from torch import Tensor
>>> from torch.library import custom_op
>>> import numpy as np
>>>
>>> @custom_op("mylib::numpy_sin", mutates_args=())
>>> def numpy_sin(x: Tensor) -> Tensor:
>>>     x_np = x.cpu().numpy()
>>>     y_np = np.sin(x_np)
>>>     return torch.from_numpy(y_np).to(device=x.device)
>>>
>>> x = torch.randn(3)
>>> y = numpy_sin(x)
>>> assert torch.allclose(y, x.sin())
>>>
>>> # Example of a custom op that only works for one device type.
>>> @custom_op("mylib::numpy_sin_cpu", mutates_args=(), device_types="cpu")
>>> def numpy_sin_cpu(x: Tensor) -> Tensor:
>>>     x_np = x.numpy()
>>>     y_np = np.sin(x_np)
>>>     return torch.from_numpy(y_np)
>>>
>>> x = torch.randn(3)
>>> y = numpy_sin_cpu(x)
>>> assert torch.allclose(y, x.sin())
>>>
>>> # Example of a custom op that mutates an input
>>> @custom_op("mylib::numpy_sin_inplace", mutates_args={"x"}, device_types="cpu")
>>> def numpy_sin_inplace(x: Tensor) -> None:
>>>     x_np = x.numpy()
>>>     np.sin(x_np, out=x_np)
>>>
>>> x = torch.randn(3)
>>> expected = x.sin()
>>> numpy_sin_inplace(x)
>>> assert torch.allclose(x, expected)
>>>
>>> # Example of a factory function
>>> @torch.library.custom_op("mylib::bar", mutates_args={}, device_types="cpu")
>>> def bar(device: torch.device) -> Tensor:
>>>     return torch.ones(3)
>>>
>>> bar("cpu")

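Extending custom ops (created from Python or C++)¶

Use the register.* methods, such as torch.library.register_kernel() and torch.library.register_fake(), to add implementations for any operator (they may have been created using torch.library.custom_op() or via PyTorch's C++ operator registration APIs).

torch.library.register_kernel(op, device_types, func=None, /, *, lib=None)¶

Register an implementation for a device type for this operator.

Some valid device_types are: "cpu", "cuda", "xla", "mps", "ipu", "xpu". This API may be used as a decorator.

Parameters

func (Callable) – The function to register as the implementation for the given device types.
device_types (None | str | Sequence[str]) – The device_types to register an impl to. If None, we will register to all device types; only use this option if your implementation is truly device-type-agnostic.

Examples: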

>>> import torch
>>> from torch import Tensor
>>> from torch.library import custom_op
>>> import numpy as np
>>>
>>> # Create a custom op that works on cpu
>>> @custom_op("mylib::numpy_sin", mutates_args=(), device_types="cpu")
>>> def numpy_sin(x: Tensor) -> Tensor:
>>>     x_np = x.numpy()
>>>     y_np = np.sin(x_np)
>>>     return torch.from_numpy(y_np)
>>>
>>> # Add implementations for the cuda device
>>> @torch.library.register_kernel("mylib::numpy_sin", "cuda")
>>> def _(x):
>>>     x_np = x.cpu().numpy()
>>>     y_np = np.sin(x_np)
>>>     return torch.from_numpy(y_np).to(device=x.device)
>>>
>>> x_cpu = torch.randn(3)
>>> x_cuda = x_cpu.cuda()
>>> assert torch.allclose(numpy_sin(x_cpu), x_cpu.sin())
>>> assert torch.allclose(numpy_sin(x_cuda), x_cuda.sin())

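torch.library.register_autograd(op, backward, /, *, setup_context=None, lib=None)¶

Register a backward formula for this custom op.

In order for an operator to work with autograd, you need to register a backward formula:
1. You must tell us how to compute gradients during the backward pass by providing us a "backward" function.
2. If you need any values from the forward to compute gradients, you can use setup_context to save values for backward.

backward runs during the backward pass. It accepts (ctx, *grads):
- grads is one or more gradients. The number of gradients matches the number of outputs of the operator.
The ctx object is the same ctx object used by torch.autograd.Function. The semantics of backward_fn are the same as torch.autograd.Function.backward().

setup_context(ctx, inputs, output) runs during the forward pass. Save quantities needed for backward onto the ctx object via either torch.autograd.function.FunctionCtx.save_for_backward() or by assigning them as attributes of ctx. If your custom op has kwarg-only arguments, we expect the signature of setup_context to be: setup_context(ctx, inputs, keyword_only_inputs, output).

Both setup_context_fn and backward_fn must be traceable. That is, they may not directly access torch.Tensor.data_ptr() and they must not depend on or mutate global state. If you need a non-traceable backward, you can make it a separate custom_op that you call inside backward_fn.

Examples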

>>> import torch
>>> import numpy as np
>>> from torch import Tensor
>>>
>>> @torch.library.custom_op("mylib::numpy_sin", mutates_args=())
>>> def numpy_sin(x: Tensor) -> Tensor:
>>>     x_np = x.cpu().numpy()
>>>     y_np = np.sin(x_np)
>>>     return torch.from_numpy(y_np).to(device=x.device)
>>>
>>> def setup_context(ctx, inputs, output) -> None:
>>>     x, = inputs
>>>     ctx.save_for_backward(x)
>>>
>>> def backward(ctx, grad):
>>>     x, = ctx.saved_tensors
>>>     return grad * x.cos()
>>>
>>> torch.library.register_autograd(
...     "mylib::numpy_sin", backward, setup_context=setup_context
... )
>>>
>>> x = torch.randn(3, requires_grad=True)
>>> y = numpy_sin(x)
>>> (grad_x,) = torch.autograd.grad(y, x, torch.ones_like(y))
>>> assert torch.allclose(grad_x, x.cos())
>>>
>>> # Example with a keyword-only arg
>>> @torch.library.custom_op("mylib::numpy_mul", mutates_args=())
>>> def numpy_mul(x: Tensor, *, val: float) -> Tensor:
>>>     x_np = x.cpu().numpy()
>>>     y_np = x_np * val
>>>     return torch.from_numpy(y_np).to(device=x.device)
>>>
>>> def setup_context(ctx, inputs, keyword_only_inputs, output) -> None:
>>>     ctx.val = keyword_only_inputs["val"]
>>>
>>> def backward(ctx, grad):
>>>     return grad * ctx.val
>>>
>>> torch.library.register_autograd(
...     "mylib::numpy_mul", backward, setup_context=setup_context
... )
>>>
>>> x = torch.randn(3, requires_grad=True)
>>> y = numpy_mul(x, val=3.14)
>>> (grad_x,) = torch.autograd.grad(y, x, torch.ones_like(y))
>>> assert torch.allclose(grad_x, torch.full_like(x, 3.14))

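torch.library.register_fake(op, func=None, /, *, lib=None, _stacklevel=1)¶

Register a FakeTensor implementation ("fake impl") for this operator.

Also sometimes known as a "meta kernel" or "abstract impl".

A "FakeTensor implementation" specifies the behavior of this operator on Tensors that carry no data ("FakeTensor"). Given some input Tensors with certain properties (sizes/strides/storage_offset/device), it specifies what the properties of the output Tensors are.

The FakeTensor implementation has the same signature as the operator. It is run for both FakeTensors and meta tensors. To write a FakeTensor implementation, assume that all Tensor inputs to the operator are regular CPU/CUDA/Meta tensors, but they do not have storage, and you are trying to return regular CPU/CUDA/Meta tensor(s) as output. The FakeTensor implementation must consist of only PyTorch operations (and may not directly access the storage or data of any input or intermediate Tensors).

This API may be used as a decorator (see examples).

For a detailed guide on custom ops, see https://pytorch.org/tutorials/advanced/custom_ops_landing_page.html

Examples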

>>> import torch
>>> import numpy as np
>>> from torch import Tensor
>>>
>>> # Example 1: an operator without data-dependent output shape
>>> @torch.library.custom_op("mylib::custom_linear", mutates_args=())
>>> def custom_linear(x: Tensor, weight: Tensor, bias: Tensor) -> Tensor:
>>>     raise NotImplementedError("Implementation goes here")
>>>
>>> @torch.library.register_fake("mylib::custom_linear")
>>> def _(x, weight, bias):
>>>     assert x.dim() == 2
>>>     assert weight.dim() == 2
>>>     assert bias.dim() == 1
>>>     assert x.shape[1] == weight.shape[1]
>>>     assert weight.shape[0] == bias.shape[0]
>>>     assert x.device == weight.device
>>>
>>>     return (x @ weight.t()) + bias
>>>
>>> with torch._subclasses.fake_tensor.FakeTensorMode():
>>>     x = torch.randn(2, 3)
>>>     w = torch.randn(3, 3)
>>>     b = torch.randn(3)
>>>     y = torch.ops.mylib.custom_linear(x, w, b)
>>>
>>> assert y.shape == (2, 3)
>>>
>>> # Example 2: an operator with data-dependent output shape
>>> @torch.library.custom_op("mylib::custom_nonzero", mutates_args=())
>>> def custom_nonzero(x: Tensor) -> Tensor:
>>>     x_np = x.numpy(force=True)
>>>     res = np.stack(np.nonzero(x_np), axis=1)
>>>     return torch.tensor(res, device=x.device)
>>>
>>> @torch.library.register_fake("mylib::custom_nonzero")
>>> def _(x):
>>>     # Number of nonzero-elements is data-dependent.
>>>     # Since we cannot peek at the data in a fake impl,
>>>     # we use the ctx object to construct a new symint that
>>>     # represents the data-dependent size.
>>>     ctx = torch.library.get_ctx()
>>>     nnz = ctx.new_dynamic_size()
>>>     shape = [nnz, x.dim()]
>>>     result = x.new_empty(shape, dtype=torch.int64)
>>>     return result
>>>
>>> from torch.fx.experimental.proxy_tensor import make_fx
>>>
>>> x = torch.tensor([0, 1, 2, 3, 4, 0])
>>> trace = make_fx(torch.ops.mylib.custom_nonzero, tracing_mode="symbolic")(x)
>>> trace.print_readable()
>>>
>>> assert torch.allclose(trace(x), torch.ops.mylib.custom_nonzero(x))

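torch.library.register_vmap(op, func=None, /, *, lib=None)¶

Register a vmap implementation to support torch.vmap() for this custom op.

This API may be used as a decorator (see examples).

In order for an operator to work with torch.vmap(), you may need to register a vmap implementation with the following signature: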

vmap_func(info, in_dims: Tuple[Optional[int]], *args, **kwargs),

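where *args and **kwargs are the arguments and kwargs for op. We do not support kwarg-only Tensor args.

It specifies how to compute the batched version of op given inputs with an additional dimension (specified by in_dims).

For each arg in args, in_dims has a corresponding Optional[int]. It is None if the arg is not a Tensor or if the arg is not being vmapped over; otherwise, it is an integer specifying what dimension of the Tensor is being vmapped over.

info is a collection of additional metadata that may be helpful: info.batch_size specifies the size of the dimension being vmapped over, while info.randomness is the randomness option that was passed to torch.vmap().

The return of the function func is a tuple of (output, out_dims). Similar to in_dims, out_dims should be of the same structure as output and contain one out_dim per output that specifies if the output has the vmapped dimension and what index it is in.

Examples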

>>> import torch
>>> import numpy as np
>>> from torch import Tensor
>>> from typing import Tuple
>>>
>>> def to_numpy(tensor):
>>>     return tensor.cpu().numpy()
>>>
>>> lib = torch.library.Library("mylib", "FRAGMENT")
>>> @torch.library.custom_op("mylib::numpy_cube", mutates_args=())
>>> def numpy_cube(x: Tensor) -> Tuple[Tensor, Tensor]:
>>>     x_np = to_numpy(x)
>>>     dx = torch.tensor(3 * x_np ** 2, device=x.device)
>>>     return torch.tensor(x_np ** 3, device=x.device), dx
>>>
>>> def numpy_cube_vmap(info, in_dims, x):
>>>     result = numpy_cube(x)
>>>     return result, (in_dims[0], in_dims[0])
>>>
>>> torch.library.register_vmap(numpy_cube, numpy_cube_vmap)
>>>
>>> x = torch.randn(3)
>>> torch.vmap(numpy_cube)(x)
>>>
>>> @torch.library.custom_op("mylib::numpy_mul", mutates_args=())
>>> def numpy_mul(x: Tensor, y: Tensor) -> Tensor:
>>>     return torch.tensor(to_numpy(x) * to_numpy(y), device=x.device)
>>>
>>> @torch.library.register_vmap("mylib::numpy_mul")
>>> def numpy_mul_vmap(info, in_dims, x, y):
>>>     x_bdim, y_bdim = in_dims
>>>     x = x.movedim(x_bdim, -1) if x_bdim is not None else x.unsqueeze(-1)
>>>     y = y.movedim(y_bdim, -1) if y_bdim is not None else y.unsqueeze(-1)
>>>     result = x * y
>>>     result = result.movedim(-1, 0)
>>>     return result, 0
>>>
>>>
>>> x = torch.randn(3)
>>> y = torch.randn(3)
>>> torch.vmap(numpy_mul)(x, y)

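Note

The vmap function should aim to preserve the semantics of the entire custom operator. That is, grad(vmap(op)) should be replaceable with grad(map(op)).

If your custom operator has any custom behavior in the backward pass, keep this in mind.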

torch.library.impl_abstract(qualname, func=None, *, lib=None, _stacklevel=1)¶

This API was renamed to torch.library.register_fake() in PyTorch 2.4. Please use that instead.

torch.library.get_ctx()¶

get_ctx() returns the current AbstractImplCtx object.

Calling get_ctx() is only valid inside of a fake impl (see torch.library.register_fake() for more usage details).

Return type: FakeImplCtx

torch.library.register_torch_dispatch(op, torch_dispatch_class, func=None, /, *, lib=None)¶

Registers a torch_dispatch rule for the given operator and torch_dispatch_class.

This allows for open registration to specify the behavior between the operator and the torch_dispatch_class without needing to modify the torch_dispatch_class or the operator directly.

The torch_dispatch_class is either a Tensor subclass with __torch_dispatch__ or a TorchDispatchMode.

If it is a Tensor subclass, we expect func to have the following signature: (cls, func: OpOverload, types: Tuple[type, ...], args, kwargs) -> Any

If it is a TorchDispatchMode, we expect func to have the following signature: (mode, func: OpOverload, types: Tuple[type, ...], args, kwargs) -> Any

args and kwargs will have been normalized the same way they are in __torch_dispatch__ (see the __torch_dispatch__ calling convention).

Examples

>>> import torch
>>>
>>> @torch.library.custom_op("mylib::foo", mutates_args={})
>>> def foo(x: torch.Tensor) -> torch.Tensor:
>>>     return x.clone()
>>>
>>> class MyMode(torch.utils._python_dispatch.TorchDispatchMode):
>>>     def __torch_dispatch__(self, func, types, args=(), kwargs=None):
>>>         return func(*args, **kwargs)
>>>
>>> @torch.library.register_torch_dispatch("mylib::foo", MyMode)
>>> def _(mode, func, types, args, kwargs):
>>>     x, = args
>>>     return x + 1
>>>
>>> x = torch.randn(3)
>>> y = foo(x)
>>> assert torch.allclose(y, x)
>>>
>>> with MyMode():
>>>     y = foo(x)
>>> assert torch.allclose(y, x + 1)

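torch.library.infer_schema(prototype_function, /, *, mutates_args, op_name=None)¶

Parses the schema of a given function with type hints. The schema is inferred from the function's type hints, and can be used to define a new operator.

We make the following assumptions:

None of the outputs alias any of the inputs or each other.
String type annotations "device, dtype, Tensor, types" without library specification are assumed to be torch.*. Similarly, string type annotations "Optional, List, Sequence, Union" without library specification are assumed to be typing.*.
Only the args listed in mutates_args are being mutated. If mutates_args is "unknown", it assumes that all inputs to the operator are being mutated.

Callers (e.g. the custom ops API) are responsible for checking these assumptions.

Parameters

prototype_function (Callable) – The function from which to infer a schema from its type annotations.
op_name (Optional[str]) – The name of the operator in the schema. If name is None, then the name is not included in the inferred schema. Note that the input schema to torch.library.Library.define requires an operator name.
mutates_args ("unknown" | Iterable[str]) – The mutated arguments of the function.

Returns

The inferred schema.

Return type

str

Example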

>>> import torch
>>> from torch.library import infer_schema
>>>
>>> def foo_impl(x: torch.Tensor) -> torch.Tensor:
>>>     return x.sin()
>>>
>>> infer_schema(foo_impl, op_name="foo", mutates_args={})
foo(Tensor x) -> Tensor
>>>
>>> infer_schema(foo_impl, mutates_args={})
(Tensor x) -> Tensor

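class torch._library.custom_ops.CustomOpDef(namespace, name, schema, fn)¶

CustomOpDef is a wrapper around a function that turns it into a custom op.

It has various methods for registering additional behavior for this custom op.

You should not instantiate CustomOpDef directly; instead, use the torch.library.custom_op() API.

set_kernel_enabled(device_type, enabled=True)¶

Disable or re-enable an already registered kernel for this custom op.

If the kernel is already disabled/enabled, this is a no-op.

Note

If a kernel is first disabled and then registered, it is disabled until enabled again.

Parameters

device_type (str) – The device type to disable/enable the kernel for.
enabled (bool) – Whether to enable (True) or disable (False) the kernel.

Example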

>>> import torch
>>> from torch import Tensor
>>> from torch.library import custom_op
>>>
>>> inp = torch.randn(1)
>>>
>>> # define custom op `f`.
>>> @custom_op("mylib::f", mutates_args=())
>>> def f(x: Tensor) -> Tensor:
>>>     return torch.zeros(1)
>>>
>>> print(f(inp))  # tensor([0.]), default kernel
>>>
>>> @f.register_kernel("cpu")
>>> def _(x):
>>>     return torch.ones(1)
>>>
>>> print(f(inp))  # tensor([1.]), CPU kernel
>>>
>>> # temporarily disable the CPU kernel
>>> with f.set_kernel_enabled("cpu", enabled=False):
>>>     print(f(inp))  # tensor([0.]) with CPU kernel disabled

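Low-level APIs¶

The following APIs are direct bindings to PyTorch's C++ low-level operator registration APIs.

Warning

The low-level operator registration APIs and the PyTorch Dispatcher are a complicated PyTorch concept. We recommend you use the higher-level APIs above (that do not require a torch.library.Library object) when possible. This blog post (http://blog.ezyang.com/2020/09/lets-talk-about-the-pytorch-dispatcher/) is a good starting point to learn about the PyTorch Dispatcher.

A tutorial that walks you through some examples on how to use this API is available on Google Colab.

class torch.library.Library(ns, kind, dispatch_key='')¶

A class to create libraries that can be used to register new operators or override operators in existing libraries from Python. A user can optionally pass in a dispatch keyname if they only want to register kernels corresponding to one specific dispatch key.

To create a library to override operators in an existing library (with name ns), set the kind to "IMPL". To create a new library (with name ns) to register new operators, set the kind to "DEF". To create a fragment of a possibly existing library to register operators (and bypass the limitation that there is only one library for a given namespace), set the kind to "FRAGMENT".

Parameters

ns – library name
kind – "DEF", "IMPL" (default: "IMPL"), "FRAGMENT"
dispatch_key – PyTorch dispatch key (default: "")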

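define(schema, alias_analysis='', *, tags=())¶

Defines a new operator and its semantics in the ns namespace.

Parameters

schema – function schema to define a new operator.
alias_analysis (optional) – Indicates if the aliasing properties of the operator arguments can be inferred from the schema (default behavior) or not ("CONSERVATIVE").
tags (Tag | Sequence[Tag]) – one or more torch.Tag to apply to this operator. Tagging an operator changes the operator's behavior under various PyTorch subsystems; read the docs for the torch.Tag carefully before applying it.

Returns

name of the operator as inferred from the schema.

Example: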

>>> from torch.library import Library
>>>
>>> my_lib = Library("mylib", "DEF")
>>> my_lib.define("sum(Tensor self) -> Tensor")

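fallback(fn, dispatch_key='', *, with_keyset=False)¶

Registers the function implementation as the fallback for the given key.

This function only works for a library with global namespace ("_").

Parameters

fn – function used as fallback for the given dispatch key, or fallthrough_kernel() to register a fallthrough.
dispatch_key – dispatch key that the input function should be registered for. By default, it uses the dispatch key that the library was created with.
with_keyset – flag controlling if the current dispatcher call keyset should be passed as the first argument to fn when calling. This should be used to create the appropriate keyset for redispatch calls.

Example: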

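>>> from torch.library import Library
>>>
>>> my_lib = Library("_", "IMPL")
>>> def fallback_kernel(op, *args, **kwargs):
>>>     # Handle all autocast ops generically
>>>     # ...
>>>     raise NotImplementedError("generic handling elided in this example")
>>> my_lib.fallback(fallback_kernel, "Autocast")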

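impl(op_name, fn, dispatch_key='', *, with_keyset=False)¶

Registers the function implementation for an operator defined in the library.

Parameters

op_name – operator name (along with the overload) or OpOverload object.
fn – function that is the operator implementation for the input dispatch key, or fallthrough_kernel() to register a fallthrough.
dispatch_key – dispatch key that the input function should be registered for. By default, it uses the dispatch key that the library was created with.
with_keyset – flag controlling if the current dispatcher call keyset should be passed as the first argument to fn when calling. This should be used to create the appropriate keyset for redispatch calls.

Example: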

>>> from torch.library import Library
>>>
>>> my_lib = Library("aten", "IMPL")
>>> def div_cpu(self, other):
>>>     return self * (1 / other)
>>> my_lib.impl("div.Tensor", div_cpu, "CPU")

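torch.library.fallthrough_kernel()¶

A dummy function to pass to Library.impl in order to register a fallthrough.

torch.library.define(qualname, schema, *, lib=None, tags=())¶

torch.library.define(lib, schema, alias_analysis='')

Defines a new operator.

In PyTorch, defining an op (short for "operator") is a two-step process:
- we need to define the op (by providing an operator name and schema)
- we need to implement behavior for how the operator interacts with various PyTorch subsystems, like CPU/CUDA Tensors, Autograd, etc.

This entrypoint defines the custom operator (the first step); you must then perform the second step by calling various impl_* APIs, like torch.library.impl() or torch.library.register_fake().

Parameters

qualname (str) – The qualified name for the operator. Should be a string that looks like "namespace::name", e.g. "aten::sin". Operators in PyTorch need a namespace to avoid name collisions; a given operator may only be created once. If you are writing a Python library, we recommend the namespace to be the name of your top-level module.
schema (str) – The schema of the operator. E.g. "(Tensor x) -> Tensor" for an op that accepts one Tensor and returns one Tensor. It does not contain the operator name (that is passed in qualname).
lib (Optional[Library]) – If provided, the lifetime of this operator will be tied to the lifetime of the Library object.
tags (Tag | Sequence[Tag]) – one or more torch.Tag to apply to this operator. Tagging an operator changes the operator's behavior under various PyTorch subsystems; read the docs for the torch.Tag carefully before applying it.

Example: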

>>> import torch
>>> import numpy as np
>>>
>>> # Define the operator
>>> torch.library.define("mylib::sin", "(Tensor x) -> Tensor")
>>>
>>> # Add implementations for the operator
>>> @torch.library.impl("mylib::sin", "cpu")
>>> def f(x):
>>>     return torch.from_numpy(np.sin(x.numpy()))
>>>
>>> # Call the new operator from torch.ops.
>>> x = torch.randn(3)
>>> y = torch.ops.mylib.sin(x)
>>> assert torch.allclose(y, x.sin())

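torch.library.impl(qualname, types, func=None, *, lib=None)¶

torch.library.impl(lib, name, dispatch_key='')

Register an implementation for a device type for this operator.

You may pass "default" for types to register this implementation as the default implementation for ALL device types. Only use this if the implementation truly supports all device types; for example, this is true if it is a composition of built-in PyTorch operators.

Some valid types are: "cpu", "cuda", "xla", "mps", "ipu", "xpu".

Parameters

qualname (str) – Should be a string that looks like "namespace::operator_name".
types (str | Sequence[str]) – The device types to register an impl to.
lib (Optional[Library]) – If provided, the lifetime of this registration will be tied to the lifetime of the Library object.

Examples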

>>> import torch
>>> import numpy as np
>>>
>>> # Define the operator
>>> torch.library.define("mylib::mysin", "(Tensor x) -> Tensor")
>>>
>>> # Add implementations for the cpu device
>>> @torch.library.impl("mylib::mysin", "cpu")
>>> def f(x):
>>>     return torch.from_numpy(np.sin(x.numpy()))
>>>
>>> x = torch.randn(3)
>>> y = torch.ops.mylib.mysin(x)
>>> assert torch.allclose(y, x.sin())
