torch.nn.functional¶
Convolution functions¶
conv1d: Applies a 1D convolution over an input signal composed of several input planes.
conv2d: Applies a 2D convolution over an input image composed of several input planes.
conv3d: Applies a 3D convolution over an input image composed of several input planes.
conv_transpose1d: Applies a 1D transposed convolution operator over an input signal composed of several input planes, sometimes also called "deconvolution".
conv_transpose2d: Applies a 2D transposed convolution operator over an input image composed of several input planes, sometimes also called "deconvolution".
conv_transpose3d: Applies a 3D transposed convolution operator over an input image composed of several input planes, sometimes also called "deconvolution".
unfold: Extracts sliding local blocks from a batched input tensor.
fold: Combines an array of sliding local blocks into a large containing tensor.
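For example, a minimal sketch of calling the functional convolution directly with an explicit weight tensor (shapes here are illustrative):

>>> import torch
>>> import torch.nn.functional as F
>>> x = torch.randn(1, 3, 32, 32)     # (batch, in_channels, height, width)
>>> weight = torch.randn(8, 3, 3, 3)  # (out_channels, in_channels, kH, kW)
>>> F.conv2d(x, weight, padding=1).shape
torch.Size([1, 8, 32, 32])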
Pooling functions¶
avg_pool1d: Applies a 1D average pooling over an input signal composed of several input planes.
avg_pool2d: Applies a 2D average-pooling operation in kH × kW regions by step size sH × sW steps.
avg_pool3d: Applies a 3D average-pooling operation in kT × kH × kW regions by step size sT × sH × sW steps.
max_pool1d: Applies a 1D max pooling over an input signal composed of several input planes.
max_pool2d: Applies a 2D max pooling over an input signal composed of several input planes.
max_pool3d: Applies a 3D max pooling over an input signal composed of several input planes.
max_unpool1d: Computes a partial inverse of max_pool1d().
max_unpool2d: Computes a partial inverse of max_pool2d().
max_unpool3d: Computes a partial inverse of max_pool3d().
lp_pool1d: Applies a 1D power-average pooling over an input signal composed of several input planes.
lp_pool2d: Applies a 2D power-average pooling over an input signal composed of several input planes.
lp_pool3d: Applies a 3D power-average pooling over an input signal composed of several input planes.
adaptive_max_pool1d: Applies a 1D adaptive max pooling over an input signal composed of several input planes.
adaptive_max_pool2d: Applies a 2D adaptive max pooling over an input signal composed of several input planes.
adaptive_max_pool3d: Applies a 3D adaptive max pooling over an input signal composed of several input planes.
adaptive_avg_pool1d: Applies a 1D adaptive average pooling over an input signal composed of several input planes.
adaptive_avg_pool2d: Applies a 2D adaptive average pooling over an input signal composed of several input planes.
adaptive_avg_pool3d: Applies a 3D adaptive average pooling over an input signal composed of several input planes.
fractional_max_pool2d: Applies 2D fractional max pooling over an input signal composed of several input planes.
fractional_max_pool3d: Applies 3D fractional max pooling over an input signal composed of several input planes.
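A minimal sketch of the two most common variants, a fixed kernel versus a fixed output size (shapes are illustrative):

>>> import torch
>>> import torch.nn.functional as F
>>> x = torch.randn(1, 8, 32, 32)
>>> F.max_pool2d(x, kernel_size=2).shape
torch.Size([1, 8, 16, 16])
>>> F.adaptive_avg_pool2d(x, output_size=(1, 1)).shape  # global average pooling
torch.Size([1, 8, 1, 1])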
Attention Mechanisms¶
The torch.nn.attention.bias module contains attention biases designed to be used with scaled_dot_product_attention.
scaled_dot_product_attention: Computes scaled dot product attention on query, key and value tensors, using an optional attention mask if passed, and applying dropout if a probability greater than 0.0 is specified.
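A minimal sketch (batch, head, sequence and head-dimension sizes are illustrative; is_causal requests a causal mask):

>>> import torch
>>> import torch.nn.functional as F
>>> q = torch.randn(2, 4, 16, 64)   # (batch, heads, seq_len, head_dim)
>>> k = torch.randn(2, 4, 16, 64)
>>> v = torch.randn(2, 4, 16, 64)
>>> F.scaled_dot_product_attention(q, k, v, is_causal=True).shape
torch.Size([2, 4, 16, 64])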
Non-linear activation functions¶
threshold: Applies a threshold to each element of the input Tensor.
threshold_: In-place version of threshold().
relu: Applies the rectified linear unit function element-wise.
relu_: In-place version of relu().
hardtanh: Applies the HardTanh function element-wise.
hardtanh_: In-place version of hardtanh().
hardswish: Applies the hardswish function, element-wise.
relu6: Applies the element-wise function ReLU6(x) = min(max(0, x), 6).
elu: Applies the Exponential Linear Unit (ELU) function element-wise.
elu_: In-place version of elu().
selu: Applies element-wise, SELU(x) = scale × (max(0, x) + min(0, α × (exp(x) − 1))), with α ≈ 1.6733 and scale ≈ 1.0507.
celu: Applies element-wise, CELU(x) = max(0, x) + min(0, α × (exp(x / α) − 1)).
leaky_relu: Applies element-wise, LeakyReLU(x) = max(0, x) + negative_slope × min(0, x).
leaky_relu_: In-place version of leaky_relu().
prelu: Applies element-wise the function PReLU(x) = max(0, x) + weight × min(0, x), where weight is a learnable parameter.
rrelu: Randomized leaky ReLU.
rrelu_: In-place version of rrelu().
glu: The gated linear unit.
gelu: When the approximate argument is 'none', it applies element-wise the function GELU(x) = x × Φ(x), where Φ(x) is the cumulative distribution function of the standard Gaussian distribution.
logsigmoid: Applies element-wise LogSigmoid(x) = log(1 / (1 + exp(−x))).
hardshrink: Applies the hard shrinkage function element-wise.
tanhshrink: Applies element-wise, Tanhshrink(x) = x − tanh(x).
softsign: Applies element-wise, the function SoftSign(x) = x / (1 + |x|).
softplus: Applies element-wise, the function Softplus(x) = (1/β) × log(1 + exp(β × x)).
softmin: Applies a softmin function.
softmax: Applies a softmax function.
softshrink: Applies the soft shrinkage function element-wise.
gumbel_softmax: Samples from the Gumbel-Softmax distribution and optionally discretizes.
log_softmax: Applies a softmax followed by a logarithm.
tanh: Applies element-wise, Tanh(x) = (exp(x) − exp(−x)) / (exp(x) + exp(−x)).
sigmoid: Applies the element-wise function Sigmoid(x) = 1 / (1 + exp(−x)).
hardsigmoid: Applies the Hardsigmoid function element-wise.
silu: Applies the Sigmoid Linear Unit (SiLU) function, element-wise.
mish: Applies the Mish function, element-wise.
batch_norm: Applies Batch Normalization for each channel across a batch of data.
group_norm: Applies Group Normalization for the last certain number of dimensions.
instance_norm: Applies Instance Normalization independently for each channel in every data sample within a batch.
layer_norm: Applies Layer Normalization for the last certain number of dimensions.
local_response_norm: Applies local response normalization over an input signal.
rms_norm: Applies Root Mean Square Layer Normalization.
normalize: Performs normalization of inputs over the specified dimension.
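A minimal sketch combining a few of these functionals; the input is random, so only shapes and invariants are checked:

>>> import torch
>>> import torch.nn.functional as F
>>> x = torch.randn(4, 10)
>>> y = F.relu(x)                  # elementwise max(0, x)
>>> p = F.softmax(x, dim=-1)       # each row of p sums to 1
>>> z = F.layer_norm(x, (10,))     # normalizes over the last dimension
>>> bool(y.min() >= 0)
True
>>> torch.allclose(p.sum(dim=-1), torch.ones(4))
True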
Linear functions¶
linear: Applies a linear transformation to the incoming data: y = xA^T + b.
bilinear: Applies a bilinear transformation to the incoming data: y = x1^T A x2 + b.
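A minimal sketch; note that the weight is laid out as (out_features, in_features), matching the y = xA^T + b convention:

>>> import torch
>>> import torch.nn.functional as F
>>> x = torch.randn(4, 16)
>>> A = torch.randn(8, 16)   # (out_features, in_features)
>>> b = torch.randn(8)
>>> F.linear(x, A, b).shape
torch.Size([4, 8])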
Dropout functions¶
dropout: During training, randomly zeroes some elements of the input tensor with probability p.
alpha_dropout: Applies alpha dropout to the input.
feature_alpha_dropout: Randomly masks out entire channels (a channel is a feature map).
dropout1d: Randomly zeroes out entire channels (a channel is a 1D feature map).
dropout2d: Randomly zeroes out entire channels (a channel is a 2D feature map).
dropout3d: Randomly zeroes out entire channels (a channel is a 3D feature map).
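Unlike the nn.Dropout modules, the functional forms do not track a module's train/eval state, so training must be passed explicitly (e.g., training=self.training inside a module). A minimal sketch:

>>> import torch
>>> import torch.nn.functional as F
>>> x = torch.ones(2, 3, 8, 8)
>>> y = F.dropout2d(x, p=0.5, training=True)  # zeroes whole channels; survivors scaled by 1/(1 - p)
>>> y.shape
torch.Size([2, 3, 8, 8])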
Sparse functions¶
embedding: Generates a simple lookup table that looks up embeddings in a fixed dictionary and size.
embedding_bag: Computes sums, means or maxes of bags of embeddings.
one_hot: Takes a LongTensor with index values of shape (*) and returns a tensor of shape (*, num_classes) that is zero everywhere except where the last-dimension index matches the corresponding input value, in which case it is 1.
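A minimal sketch of both lookups:

>>> import torch
>>> import torch.nn.functional as F
>>> table = torch.randn(10, 3)                 # 10 embedding vectors of dimension 3
>>> F.embedding(torch.tensor([1, 4]), table).shape
torch.Size([2, 3])
>>> F.one_hot(torch.tensor([0, 2, 1]), num_classes=4)
tensor([[1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0]])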
Distance functions¶
pairwise_distance: See torch.nn.PairwiseDistance for details.
cosine_similarity: Returns cosine similarity between x1 and x2, computed along dim.
pdist: Computes the p-norm distance between every pair of row vectors in the input.
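A minimal sketch comparing two batches of vectors row by row:

>>> import torch
>>> import torch.nn.functional as F
>>> a = torch.randn(5, 128)
>>> b = torch.randn(5, 128)
>>> F.cosine_similarity(a, b, dim=1).shape   # one similarity in [-1, 1] per row
torch.Size([5])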
Loss functions¶
binary_cross_entropy: Measures Binary Cross Entropy between the target and input probabilities.
binary_cross_entropy_with_logits: Calculates Binary Cross Entropy between target and input logits.
poisson_nll_loss: Poisson negative log likelihood loss.
cosine_embedding_loss: See CosineEmbeddingLoss for details.
cross_entropy: Computes the cross entropy loss between input logits and target.
ctc_loss: Applies the Connectionist Temporal Classification loss.
gaussian_nll_loss: Gaussian negative log likelihood loss.
hinge_embedding_loss: See HingeEmbeddingLoss for details.
kl_div: Computes the KL Divergence loss.
l1_loss: Function that takes the mean element-wise absolute value difference.
mse_loss: Measures the element-wise mean squared error.
margin_ranking_loss: See MarginRankingLoss for details.
multilabel_margin_loss: See MultiLabelMarginLoss for details.
multilabel_soft_margin_loss: See MultiLabelSoftMarginLoss for details.
multi_margin_loss: See MultiMarginLoss for details.
nll_loss: Computes the negative log likelihood loss.
huber_loss: Computes the Huber loss.
smooth_l1_loss: Computes the Smooth L1 loss.
soft_margin_loss: See SoftMarginLoss for details.
triplet_margin_loss: Computes the triplet loss between given input tensors and a margin greater than 0.
triplet_margin_with_distance_loss: Computes the triplet margin loss for input tensors using a custom distance function.
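A minimal sketch of the most common classification loss; cross_entropy expects raw logits (it applies log_softmax internally) and integer class targets:

>>> import torch
>>> import torch.nn.functional as F
>>> logits = torch.randn(4, 10)            # unnormalized scores for 10 classes
>>> target = torch.tensor([1, 0, 3, 9])    # one class index per sample
>>> F.cross_entropy(logits, target).shape  # scalar loss, averaged over the batch
torch.Size([])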
Vision functions¶
pixel_shuffle: Rearranges elements in a tensor of shape (*, C × r², H, W) to a tensor of shape (*, C, H × r, W × r), where r is the upscale_factor.
pixel_unshuffle: Reverses the PixelShuffle operation.
pad: Pads tensor.
interpolate: Down/up samples the input.
upsample: Upsamples the input (deprecated in favor of interpolate).
upsample_nearest: Upsamples the input, using nearest neighbours' pixel values (deprecated in favor of interpolate).
upsample_bilinear: Upsamples the input, using bilinear upsampling (deprecated in favor of interpolate).
grid_sample: Computes grid sample.
affine_grid: Generates 2D or 3D flow field (sampling grid), given a batch of affine matrices theta.
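A minimal sketch of resizing with interpolate and of the affine_grid/grid_sample pair, here with an identity transform so sampling reproduces the input size:

>>> import torch
>>> import torch.nn.functional as F
>>> img = torch.randn(1, 3, 64, 64)
>>> F.interpolate(img, scale_factor=2, mode='bilinear', align_corners=False).shape
torch.Size([1, 3, 128, 128])
>>> theta = torch.eye(2, 3).unsqueeze(0)   # (N, 2, 3) identity affine matrix
>>> grid = F.affine_grid(theta, size=(1, 3, 64, 64), align_corners=False)
>>> F.grid_sample(img, grid, align_corners=False).shape
torch.Size([1, 3, 64, 64])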