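Probability distributions - torch.distributions

The distributions package contains parameterizable probability distributions and sampling functions. This allows the construction of stochastic computation graphs and stochastic gradient estimators for optimization. This package generally follows the design of the TensorFlow Distributions package.

It is not possible to directly backpropagate through random samples. However, there are two main methods for creating surrogate functions that can be backpropagated through. These are the score function estimator/likelihood ratio estimator/REINFORCE and the pathwise derivative estimator. REINFORCE is commonly seen as the basis for policy gradient methods in reinforcement learning, and the pathwise derivative estimator is commonly seen in the reparameterization trick in variational autoencoders. Whilst the score function only requires the value of samples $f(x)$, the pathwise derivative requires the derivative $f'(x)$. The next sections discuss these two in a reinforcement learning example. For more details see Gradient Estimation Using Stochastic Computation Graphs.

Score function

When the probability density function is differentiable with respect to its parameters, we only need sample() and log_prob() to implement REINFORCE: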

\Delta\theta = \alpha r \frac{\partial\log p(a|\pi^\theta(s))}{\partial\theta}

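where $\theta$ are the parameters, $\alpha$ is the learning rate, $r$ is the reward and $p(a|\pi^\theta(s))$ is the probability of taking action $a$ in state $s$ given policy $\pi^\theta$.

In practice we would sample an action from the output of a network, apply this action in an environment, and then use log_prob to construct an equivalent loss function. Note that we use a negative because optimizers use gradient descent, whilst the rule above assumes gradient ascent. With a categorical policy, the code for implementing REINFORCE would be as follows: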

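from torch.distributions import Categorical

# policy_network, state and env are placeholders for your own model and environment
probs = policy_network(state)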
# Note that this is equivalent to what used to be called multinomial
m = Categorical(probs)
action = m.sample()
next_state, reward = env.step(action)
loss = -m.log_prob(action) * reward
loss.backward()
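
The snippet above leaves policy_network, state and env undefined. As a minimal runnable sketch, assuming a hypothetical one-layer softmax policy and a made-up reward in place of a real environment, the same surrogate loss can be exercised end to end:

```python
import torch
from torch.distributions import Categorical

torch.manual_seed(0)

# Hypothetical stand-ins: a tiny softmax policy over 3 actions and a random state.
policy_network = torch.nn.Sequential(
    torch.nn.Linear(4, 3),
    torch.nn.Softmax(dim=-1),
)
state = torch.randn(4)

probs = policy_network(state)        # action probabilities, shape (3,)
m = Categorical(probs)
action = m.sample()                  # integer action index; carries no gradient
# Made-up reward (an assumption for illustration): action 0 is "good".
reward = 1.0 if action.item() == 0 else 0.0

loss = -m.log_prob(action) * reward  # surrogate loss for gradient descent
loss.backward()

grad = policy_network[0].weight.grad  # gradient reached the policy weights
```

The gradient flows through log_prob(action) only; the sampled action itself is treated as a constant.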

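Pathwise derivative

The other way to implement these stochastic/policy gradients would be to use the reparameterization trick from the rsample() method, where the parameterized random variable can be constructed via a parameterized deterministic function of a parameter-free random variable. The reparameterized sample therefore becomes differentiable. The code for implementing the pathwise derivative would be as follows: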

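from torch.distributions import Normal

# policy_network, state and env are placeholders for your own model and environment
params = policy_network(state)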
m = Normal(*params)
# Any distribution with .has_rsample == True could work based on the application
action = m.rsample()
next_state, reward = env.step(action)  # Assuming that reward is differentiable
loss = -reward
loss.backward()
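
As a minimal runnable sketch of the pathwise estimator, assuming a hypothetical differentiable reward (a quadratic centred at 2.0) instead of an environment, the gradient can be checked against its closed form:

```python
import torch
from torch.distributions import Normal

torch.manual_seed(0)

loc = torch.tensor([0.5], requires_grad=True)
scale = torch.tensor([1.0], requires_grad=True)
m = Normal(loc, scale)

action = m.rsample()            # loc + scale * eps, so it stays on the autograd graph
reward = -(action - 2.0) ** 2   # hypothetical differentiable reward
loss = -reward.sum()
loss.backward()

# d(loss)/d(loc) = 2 * (action - 2) because d(action)/d(loc) = 1
expected_loc_grad = 2 * (action.detach() - 2.0)
```

By contrast, m.sample() returns a tensor that is detached from loc and scale, so no gradient would reach the parameters.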

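Distribution

class torch.distributions.distribution.Distribution(batch_shape=torch.Size([]), event_shape=torch.Size([]), validate_args=None)[source]

Bases: object

Distribution is the abstract base class for probability distributions.

property arg_constraints: Dict[str, Constraint]: Returns a dictionary from argument names to Constraint objects that should be satisfied by each argument of this distribution. Args that are not tensors need not appear in this dict.

property batch_shape: Size: Returns the shape over which parameters are batched.

cdf(value)[source]

Returns the cumulative density/mass function evaluated at value.

Parameters: value (Tensor) –
Return type: Tensor

entropy()[source]

Returns the entropy of the distribution, batched over batch_shape.

Returns: Tensor of shape batch_shape.
Return type: Tensor

enumerate_support(expand=True)[source]

Returns a tensor containing all values supported by a discrete distribution. The result will enumerate over dimension 0, so the shape of the result will be (cardinality,) + batch_shape + event_shape (where event_shape = () for univariate distributions).

Note that this enumerates over all batched tensors in lock-step [[0, 0], [1, 1], ...]. With expand=False, enumeration happens along dim 0, but with the remaining batch dimensions being singleton dimensions, [[0], [1], ...].

To iterate over the full Cartesian product use itertools.product(m.enumerate_support()).

Parameters: expand (bool) – whether to expand the support over the batch dims to match the distribution's batch_shape.
Returns: Tensor iterating over dimension 0.
Return type: Tensor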
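
The two layouts returned by enumerate_support can be seen on a small batch. This is a sketch using Bernoulli, whose support is {0, 1}; the probabilities are arbitrary illustrative values:

```python
import torch
from torch.distributions import Bernoulli

m = Bernoulli(torch.tensor([0.2, 0.7]))       # batch_shape = (2,)

full = m.enumerate_support()                  # values repeated in lock-step per batch element
compact = m.enumerate_support(expand=False)   # batch dimension kept as a singleton
```

Here full has shape (2, 2) and compact has shape (2, 1); the first dimension indexes the support values in both cases.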

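property event_shape: Size: Returns the shape of a single sample (without batching).

expand(batch_shape, _instance=None)[source]

Returns a new distribution instance (or populates an existing instance provided by a derived class) with batch dimensions expanded to batch_shape. This method calls expand on the distribution's parameters. As such, this does not allocate new memory for the expanded distribution instance. Additionally, this does not repeat any args checking or parameter broadcasting in __init__.py, when an instance is first created.

Parameters

batch_shape (torch.Size) – the desired expanded size.
_instance – new instance provided by subclasses that need to override .expand.

Returns

New distribution instance with batch dimensions expanded to batch_shape.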
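
A short sketch of expand on a scalar Normal, with an arbitrary target shape chosen for illustration:

```python
import torch
from torch.distributions import Normal

base = Normal(torch.tensor(0.0), torch.tensor(1.0))   # batch_shape = ()
wide = base.expand(torch.Size([3, 2]))                # batch_shape = (3, 2)

draw = wide.sample()   # one independent draw per batch element
```

The original instance is left unchanged; only the returned instance reports the expanded batch_shape.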

icdf(value)[source]

Returns the inverse cumulative density/mass function evaluated at value.

Parameters: value (Tensor) –
Return type: Tensor
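
For distributions that implement both methods, icdf undoes cdf. A minimal sketch with a standard Normal and arbitrary evaluation points:

```python
import torch
from torch.distributions import Normal

n = Normal(torch.tensor(0.0), torch.tensor(1.0))
x = torch.tensor([-1.0, 0.0, 0.5])

p = n.cdf(x)          # probabilities in (0, 1)
x_back = n.icdf(p)    # maps the probabilities back to the original points
```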

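log_prob(value)[source]

Returns the log of the probability density/mass function evaluated at value.

Parameters: value (Tensor) –
Return type: Tensor

property mean: Tensor: Returns the mean of the distribution.

property mode: Tensor: Returns the mode of the distribution.

perplexity()[source]

Returns the perplexity of the distribution, batched over batch_shape.

Returns: Tensor of shape batch_shape.
Return type: Tensor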
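
Perplexity is the exponential of the entropy. As a small sketch, a uniform Categorical over four classes has entropy log(4) and therefore perplexity 4:

```python
import math
import torch
from torch.distributions import Categorical

m = Categorical(probs=torch.full((4,), 0.25))

entropy = m.entropy()         # log(4) for a uniform distribution over 4 classes
perplexity = m.perplexity()   # exp(entropy)
```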

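rsample(sample_shape=torch.Size([]))[source]

Generates a sample_shape shaped reparameterized sample or sample_shape shaped batch of reparameterized samples if the distribution parameters are batched.

Return type: Tensor

sample(sample_shape=torch.Size([]))[source]

Generates a sample_shape shaped sample or sample_shape shaped batch of samples if the distribution parameters are batched.

Return type: Tensor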
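
The shape of a draw is sample_shape + batch_shape + event_shape. A sketch with a batched MultivariateNormal (the sizes 5, 3 and 2 are arbitrary):

```python
import torch
from torch.distributions import MultivariateNormal

# Three 2-dimensional Gaussians: batch_shape = (3,), event_shape = (2,)
m = MultivariateNormal(torch.zeros(3, 2), covariance_matrix=torch.eye(2))

single = m.sample()                 # shape (3, 2)
draws = m.sample(torch.Size([5]))   # shape (5, 3, 2)
```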

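sample_n(n)[source]

Generates n samples or n batches of samples if the distribution parameters are batched.

Return type: Tensor

static set_default_validate_args(value)[source]

Sets whether validation is enabled or disabled.

The default behavior mimics Python's assert statement: validation is on by default, but is disabled if Python is run in optimized mode (via python -O). Validation may be expensive, so you may want to disable it once a model is working.

Parameters: value (bool) – Whether to enable validation.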
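
A sketch of what the switch changes, using a Normal with a deliberately invalid negative scale. The final call restores validation so later code is unaffected:

```python
import torch
from torch.distributions import Distribution, Normal

Distribution.set_default_validate_args(True)
try:
    Normal(0.0, -1.0)          # scale must be positive
    raised = False
except ValueError:
    raised = True

Distribution.set_default_validate_args(False)
unchecked = Normal(0.0, -1.0)  # constructed without complaint, but not a valid distribution

Distribution.set_default_validate_args(True)   # restore the checked default
```

Validation can also be controlled per instance through the validate_args constructor argument.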

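property stddev: Tensor: Returns the standard deviation of the distribution.

property support: Optional[Any]: Returns a Constraint object representing this distribution's support.

property variance: Tensor: Returns the variance of the distribution.

ExponentialFamily

class torch.distributions.exp_family.ExponentialFamily(batch_shape=torch.Size([]), event_shape=torch.Size([]), validate_args=None)[source]

Bases: Distribution

ExponentialFamily is the abstract base class for probability distributions belonging to an exponential family, whose probability mass/density function has the form defined below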

p_{F}(x; \theta) = \exp(\langle t(x), \theta\rangle - F(\theta) + k(x))

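where $\theta$ denotes the natural parameters, $t(x)$ denotes the sufficient statistic, $F(\theta)$ is the log normalizer function for a given family and $k(x)$ is the carrier measure.

Note

This class is an intermediary between the Distribution class and distributions which belong to an exponential family, mainly to check the correctness of the .entropy() and analytic KL divergence methods. We use this class to compute the entropy and KL divergence using the AD framework and Bregman divergences (courtesy of: Frank Nielsen and Richard Nock, Entropies and Cross-entropies of Exponential Families).

entropy()[source]: Method to compute the entropy using Bregman divergence of the log normalizer.

Bernoulli

class torch.distributions.bernoulli.Bernoulli(probs=None, logits=None, validate_args=None)[source]

Bases: ExponentialFamily

Creates a Bernoulli distribution parameterized by probs or logits (but not both).

Samples are binary (0 or 1). They take the value 1 with probability p and 0 with probability 1 - p.

Example: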

>>> m = Bernoulli(torch.tensor([0.3]))
>>> m.sample()  # 30% chance 1; 70% chance 0
tensor([ 0.])

Parameters

probs (Number, Tensor) – the probability of sampling 1
logits (Number, Tensor) – the log-odds of sampling 1
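
The two parameterizations describe the same distribution when logits = log(p / (1 - p)). A minimal sketch, with p = 0.3 chosen for illustration, that also shows the constructor rejecting both arguments at once:

```python
import math
import torch
from torch.distributions import Bernoulli

p = 0.3
by_probs = Bernoulli(probs=torch.tensor([p]))
by_logits = Bernoulli(logits=torch.tensor([math.log(p / (1 - p))]))

log_p_one = by_probs.log_prob(torch.tensor([1.0]))   # log(0.3)

try:
    Bernoulli(probs=torch.tensor([p]), logits=torch.tensor([0.0]))
    both_rejected = False
except ValueError:
    both_rejected = True
```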

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}

entropy()[source]

enumerate_support(expand=True)[source]

expand(batch_shape, _instance=None)[source]

has_enumerate_support = True

log_prob(value)[source]

property logits

property mean

property mode

property param_shape

property probs

sample(sample_shape=torch.Size([]))[source]

support = Boolean()

property variance

Beta

class torch.distributions.beta.Beta(concentration1, concentration0, validate_args=None)[source]

Bases: ExponentialFamily

Beta distribution parameterized by concentration1 and concentration0.

Example:

>>> m = Beta(torch.tensor([0.5]), torch.tensor([0.5]))
>>> m.sample()  # Beta distributed with concentration concentration1 and concentration0
tensor([ 0.1046])

Parameters

concentration1 (float or Tensor) – 1st concentration parameter of the distribution (often referred to as alpha)
concentration0 (float or Tensor) – 2nd concentration parameter of the distribution (often referred to as beta)

arg_constraints = {'concentration0': GreaterThan(lower_bound=0.0), 'concentration1': GreaterThan(lower_bound=0.0)}

property concentration0

property concentration1

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

log_prob(value)[source]

property mean

property mode

rsample(sample_shape=())[source]

Return type: Tensor

support = Interval(lower_bound=0.0, upper_bound=1.0)

property variance

Binomial

class torch.distributions.binomial.Binomial(total_count=1, probs=None, logits=None, validate_args=None)[source]

Bases: Distribution

Creates a Binomial distribution parameterized by total_count and either probs or logits (but not both). total_count must be broadcastable with probs/logits.

Example:

>>> m = Binomial(100, torch.tensor([0 , .2, .8, 1]))
>>> x = m.sample()
tensor([   0.,   22.,   71.,  100.])

>>> m = Binomial(torch.tensor([[5.], [10.]]), torch.tensor([0.5, 0.8]))
>>> x = m.sample()
tensor([[ 4.,  5.],
        [ 7.,  6.]])

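Parameters

total_count (int or Tensor) – number of Bernoulli trials
probs (Tensor) – event probabilities
logits (Tensor) – event log-odds

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0), 'total_count': IntegerGreaterThan(lower_bound=0)}

entropy()[source]

enumerate_support(expand=True)[source]

expand(batch_shape, _instance=None)[source]

has_enumerate_support = True

log_prob(value)[source]

property logits

property mean

property mode

property param_shape

property probs

sample(sample_shape=torch.Size([]))[source]

property support

property variance

Categorical

class torch.distributions.categorical.Categorical(probs=None, logits=None, validate_args=None)[source]

Bases: Distribution

Creates a categorical distribution parameterized by either probs or logits (but not both).

Note

It is equivalent to the distribution that torch.multinomial() samples from.

Samples are integers from $\{0, \ldots, K-1\}$ where K is probs.size(-1).

If probs is 1-dimensional with length K, each element is the relative probability of sampling the class at that index.

If probs is N-dimensional, the first N-1 dimensions are treated as a batch of relative probability vectors.

Note

The probs argument must be non-negative, finite and have a non-zero sum, and it will be normalized to sum to 1 along the last dimension. probs will return this normalized value. The logits argument will be interpreted as unnormalized log probabilities and can therefore be any real number. It will likewise be normalized so that the resulting probabilities sum to 1 along the last dimension. logits will return this normalized value.

See also: torch.multinomial()

Example: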

>>> m = Categorical(torch.tensor([ 0.25, 0.25, 0.25, 0.25 ]))
>>> m.sample()  # equal probability of 0, 1, 2, 3
tensor(3)

Parameters

probs (Tensor) – event probabilities
logits (Tensor) – event log probabilities (unnormalized)
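
The normalization described in the note above can be observed directly. In this sketch the weights [1, 1, 2] are arbitrary and are passed once as unnormalized probs and once as unnormalized logits:

```python
import math
import torch
from torch.distributions import Categorical

by_probs = Categorical(probs=torch.tensor([1.0, 1.0, 2.0]))            # relative weights
by_logits = Categorical(logits=torch.tensor([0.0, 0.0, math.log(2.0)]))

normalized = by_probs.probs              # weights rescaled to sum to 1
log_normalized = by_logits.logits        # log-probabilities after normalization
```

Both instances end up with the probabilities [0.25, 0.25, 0.5].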

arg_constraints = {'logits': IndependentConstraint(Real(), 1), 'probs': Simplex()}

entropy()[source]

enumerate_support(expand=True)[source]

expand(batch_shape, _instance=None)[source]

has_enumerate_support = True

log_prob(value)[source]

property logits

property mean

property mode

property param_shape

property probs

sample(sample_shape=torch.Size([]))[source]

property support

property variance

Cauchy

class torch.distributions.cauchy.Cauchy(loc, scale, validate_args=None)[source]

Bases: Distribution

Samples from a Cauchy (Lorentz) distribution. The distribution of the ratio of independent normally distributed random variables with means 0 follows a Cauchy distribution.

Example:

>>> m = Cauchy(torch.tensor([0.0]), torch.tensor([1.0]))
>>> m.sample()  # sample from a Cauchy distribution with loc=0 and scale=1
tensor([ 2.3214])

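Parameters

loc (float or Tensor) – mode or median of the distribution.
scale (float or Tensor) – half width at half maximum.

arg_constraints = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}

cdf(value)[source]

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

icdf(value)[source]

log_prob(value)[source]

property mean

property mode

rsample(sample_shape=torch.Size([]))[source]

Return type: Tensor

support = Real()

property variance

Chi2

class torch.distributions.chi2.Chi2(df, validate_args=None)[source]

Bases: Gamma

Creates a Chi-squared distribution parameterized by shape parameter df. This is exactly equivalent to Gamma(alpha=0.5*df, beta=0.5), where alpha and beta correspond to the Gamma constructor's concentration and rate arguments.

Example: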

>>> m = Chi2(torch.tensor([1.0]))
>>> m.sample()  # Chi2 distributed with shape df=1
tensor([ 0.1046])

Parameters: df (float or Tensor) – shape parameter of the distribution
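
The equivalence with Gamma can be checked numerically. A sketch with df = 3 and a few arbitrary evaluation points, using the Gamma constructor's concentration and rate names:

```python
import torch
from torch.distributions import Chi2, Gamma

df = torch.tensor([3.0])
chi2 = Chi2(df)
gamma = Gamma(concentration=0.5 * df, rate=torch.tensor([0.5]))

x = torch.tensor([0.5, 1.0, 4.0])
chi2_lp = chi2.log_prob(x)
gamma_lp = gamma.log_prob(x)   # matches chi2_lp term by term
```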

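arg_constraints = {'df': GreaterThan(lower_bound=0.0)}

property df

expand(batch_shape, _instance=None)[source]

ContinuousBernoulli

class torch.distributions.continuous_bernoulli.ContinuousBernoulli(probs=None, logits=None, lims=(0.499, 0.501), validate_args=None)[source]

Bases: ExponentialFamily

Creates a continuous Bernoulli distribution parameterized by probs or logits (but not both).

The distribution is supported in [0, 1] and parameterized by 'probs' (in (0,1)) or 'logits' (real-valued). Note that, unlike the Bernoulli, 'probs' does not correspond to a probability and 'logits' does not correspond to log-odds, but the same names are used due to the similarity with the Bernoulli. See [1] for more details.

Example: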

>>> m = ContinuousBernoulli(torch.tensor([0.3]))
>>> m.sample()
tensor([ 0.2538])

Parameters

probs (Number, Tensor) – (0,1) valued parameters
logits (Number, Tensor) – real valued parameters whose sigmoid matches 'probs'

[1] The continuous Bernoulli: fixing a pervasive error in variational autoencoders, Loaiza-Ganem G and Cunningham JP, NeurIPS 2019. https://arxiv.org/abs/1907.06845

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}

cdf(value)[source]

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

icdf(value)[source]

log_prob(value)[source]

property logits

property mean

property param_shape

property probs

rsample(sample_shape=torch.Size([]))[source]

Return type: Tensor

sample(sample_shape=torch.Size([]))[source]

property stddev

support = Interval(lower_bound=0.0, upper_bound=1.0)

property variance

Dirichlet

class torch.distributions.dirichlet.Dirichlet(concentration, validate_args=None)[source]

Bases: ExponentialFamily

Creates a Dirichlet distribution parameterized by concentration.

Example:

>>> m = Dirichlet(torch.tensor([0.5, 0.5]))
>>> m.sample()  # Dirichlet distributed with concentration [0.5, 0.5]
tensor([ 0.1046,  0.8954])

Parameters: concentration (Tensor) – concentration parameter of the distribution (often referred to as alpha)

arg_constraints = {'concentration': IndependentConstraint(GreaterThan(lower_bound=0.0), 1)}

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

log_prob(value)[source]

property mean

property mode

rsample(sample_shape=())[source]

Return type: Tensor

support = Simplex()

property variance

Exponential

class torch.distributions.exponential.Exponential(rate, validate_args=None)[source]

Bases: ExponentialFamily

Creates an Exponential distribution parameterized by rate.

Example:

>>> m = Exponential(torch.tensor([1.0]))
>>> m.sample()  # Exponential distributed with rate=1
tensor([ 0.1046])

Parameters: rate (float or Tensor) – rate = 1 / scale of the distribution

arg_constraints = {'rate': GreaterThan(lower_bound=0.0)}

cdf(value)[source]

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

icdf(value)[source]

log_prob(value)[source]

property mean

property mode

rsample(sample_shape=torch.Size([]))[source]

Return type: Tensor

property stddev

support = GreaterThanEq(lower_bound=0.0)

property variance

FisherSnedecor

class torch.distributions.fishersnedecor.FisherSnedecor(df1, df2, validate_args=None)[source]

Bases: Distribution

Creates a Fisher-Snedecor distribution parameterized by df1 and df2.

Example:

>>> m = FisherSnedecor(torch.tensor([1.0]), torch.tensor([2.0]))
>>> m.sample()  # Fisher-Snedecor-distributed with df1=1 and df2=2
tensor([ 0.2453])

Parameters

df1 (float or Tensor) – degrees of freedom parameter 1
df2 (float or Tensor) – degrees of freedom parameter 2

arg_constraints = {'df1': GreaterThan(lower_bound=0.0), 'df2': GreaterThan(lower_bound=0.0)}

expand(batch_shape, _instance=None)[source]

has_rsample = True

log_prob(value)[source]

property mean

property mode

rsample(sample_shape=torch.Size([]))[source]

Return type: Tensor

support = GreaterThan(lower_bound=0.0)

property variance

Gamma

class torch.distributions.gamma.Gamma(concentration, rate, validate_args=None)[source]

Bases: ExponentialFamily

Creates a Gamma distribution parameterized by shape concentration and rate.

Example:

>>> m = Gamma(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # Gamma distributed with concentration=1 and rate=1
tensor([ 0.1046])

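Parameters

concentration (float or Tensor) – shape parameter of the distribution (often referred to as alpha)
rate (float or Tensor) – rate = 1 / scale of the distribution (often referred to as beta)

arg_constraints = {'concentration': GreaterThan(lower_bound=0.0), 'rate': GreaterThan(lower_bound=0.0)}

cdf(value)[source]

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

log_prob(value)[source]

property mean

property mode

rsample(sample_shape=torch.Size([]))[source]

Return type: Tensor

support = GreaterThanEq(lower_bound=0.0)

property variance

Geometric

class torch.distributions.geometric.Geometric(probs=None, logits=None, validate_args=None)[source]

Bases: Distribution

Creates a Geometric distribution parameterized by probs, where probs is the probability of success of Bernoulli trials.

P(X=k) = (1-p)^{k} p, k = 0, 1, ...

Note

For torch.distributions.geometric.Geometric(), the $(k+1)$-th trial is the first success, hence it draws samples in $\{0, 1, \ldots\}$, whereas for torch.Tensor.geometric_() the k-th trial is the first success, hence it draws samples in $\{1, 2, \ldots\}$.

Example: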

>>> m = Geometric(torch.tensor([0.3]))
>>> m.sample()  # underlying Bernoulli has 30% chance 1; 70% chance 0
tensor([ 2.])

Parameters

probs (Number, Tensor) – the probability of sampling 1. Must be in range (0, 1]
logits (Number, Tensor) – the log-odds of sampling 1.
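
The pmf P(X=k) = (1-p)^k p counts failures before the first success, so k = 0 is a valid outcome. A sketch comparing log_prob with the formula, with p = 0.3 chosen for illustration:

```python
import math
import torch
from torch.distributions import Geometric

p = 0.3
m = Geometric(probs=torch.tensor(p))

k = torch.tensor([0.0, 1.0, 4.0])              # failures before the first success
from_api = m.log_prob(k)
manual = k * math.log(1 - p) + math.log(p)     # log of (1 - p)^k * p
```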

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}

entropy()[source]

expand(batch_shape, _instance=None)[source]

log_prob(value)[source]

property logits

property mean

property mode

property probs

sample(sample_shape=torch.Size([]))[source]

support = IntegerGreaterThan(lower_bound=0)

property variance

Gumbel

class torch.distributions.gumbel.Gumbel(loc, scale, validate_args=None)[source]

Bases: TransformedDistribution

Samples from a Gumbel distribution.

Example:

>>> m = Gumbel(torch.tensor([1.0]), torch.tensor([2.0]))
>>> m.sample()  # sample from Gumbel distribution with loc=1, scale=2
tensor([ 1.0124])

Parameters

loc (float or Tensor) – location parameter of the distribution
scale (float or Tensor) – scale parameter of the distribution

arg_constraints: Dict[str, Constraint] = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}

entropy()[source]

expand(batch_shape, _instance=None)[source]

log_prob(value)[source]

property mean

property mode

property stddev

support = Real()

property variance

HalfCauchy

class torch.distributions.half_cauchy.HalfCauchy(scale, validate_args=None)[source]

Bases: TransformedDistribution

Creates a half-Cauchy distribution parameterized by scale where:

X ~ Cauchy(0, scale)
Y = |X| ~ HalfCauchy(scale)

Example:

>>> m = HalfCauchy(torch.tensor([1.0]))
>>> m.sample()  # half-cauchy distributed with scale=1
tensor([ 2.3214])

Parameters: scale (float or Tensor) – scale of the full Cauchy distribution

arg_constraints: Dict[str, Constraint] = {'scale': GreaterThan(lower_bound=0.0)}

cdf(value)[source]

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

icdf(prob)[source]

log_prob(value)[source]

property mean

property mode

property scale

support = GreaterThanEq(lower_bound=0.0)

property variance

HalfNormal

class torch.distributions.half_normal.HalfNormal(scale, validate_args=None)[source]

Bases: TransformedDistribution

Creates a half-normal distribution parameterized by scale where:

X ~ Normal(0, scale)
Y = |X| ~ HalfNormal(scale)

Example:

>>> m = HalfNormal(torch.tensor([1.0]))
>>> m.sample()  # half-normal distributed with scale=1
tensor([ 0.1046])

Parameters: scale (float or Tensor) – scale of the full Normal distribution
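
Because Y = |X| folds both tails of Normal(0, scale) onto the non-negative axis, the half-normal density at y >= 0 is twice the normal density. A sketch with an arbitrary scale of 1.5:

```python
import math
import torch
from torch.distributions import HalfNormal, Normal

scale = torch.tensor(1.5)
y = torch.tensor([0.2, 1.0, 3.0])

half = HalfNormal(scale).log_prob(y)
full = Normal(torch.tensor(0.0), scale).log_prob(y)
folded = full + math.log(2.0)   # doubling the density adds log(2) in log space
```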

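arg_constraints: Dict[str, Constraint] = {'scale': GreaterThan(lower_bound=0.0)}

cdf(value)[source]

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

icdf(prob)[source]

log_prob(value)[source]

property mean

property mode

property scale

support = GreaterThanEq(lower_bound=0.0)

property variance

Independent

class torch.distributions.independent.Independent(base_distribution, reinterpreted_batch_ndims, validate_args=None)[source]

Bases: Distribution

Reinterprets some of the batch dims of a distribution as event dims.

This is mainly useful for changing the shape of the result of log_prob(). For example, to create a diagonal Normal distribution with the same shape as a Multivariate Normal distribution (so they are interchangeable), you can:

>>> import torch
>>> from torch.distributions.independent import Independent
>>> from torch.distributions.multivariate_normal import MultivariateNormal
>>> from torch.distributions.normal import Normal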
>>> loc = torch.zeros(3)
>>> scale = torch.ones(3)
>>> mvn = MultivariateNormal(loc, scale_tril=torch.diag(scale))
>>> [mvn.batch_shape, mvn.event_shape]
[torch.Size([]), torch.Size([3])]
>>> normal = Normal(loc, scale)
>>> [normal.batch_shape, normal.event_shape]
[torch.Size([3]), torch.Size([])]
>>> diagn = Independent(normal, 1)
>>> [diagn.batch_shape, diagn.event_shape]
[torch.Size([]), torch.Size([3])]

Parameters

base_distribution (torch.distributions.distribution.Distribution) – a base distribution
reinterpreted_batch_ndims (int) – the number of batch dims to reinterpret as event dims
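
The effect on log_prob() can be made concrete. In this sketch (the evaluation point x is arbitrary), the plain Normal returns one log-density per dimension, while the Independent wrapper sums them into a single joint value that agrees with the equivalent MultivariateNormal:

```python
import torch
from torch.distributions import Independent, MultivariateNormal, Normal

loc, scale = torch.zeros(3), torch.ones(3)
x = torch.tensor([0.1, -0.2, 0.3])

normal = Normal(loc, scale)
diagn = Independent(normal, 1)
mvn = MultivariateNormal(loc, scale_tril=torch.diag(scale))

per_dim = normal.log_prob(x)   # shape (3,): one value per batch element
joint = diagn.log_prob(x)      # shape (): summed over the reinterpreted dim
```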

arg_constraints: Dict[str, Constraint] = {}

entropy()[source]

enumerate_support(expand=True)[source]

expand(batch_shape, _instance=None)[source]

property has_enumerate_support

property has_rsample

log_prob(value)[source]

property mean

property mode

rsample(sample_shape=torch.Size([]))[source]

Return type: Tensor

sample(sample_shape=torch.Size([]))[source]

property support

property variance

InverseGamma

class torch.distributions.inverse_gamma.InverseGamma(concentration, rate, validate_args=None)[source]

Bases: TransformedDistribution

Creates an inverse gamma distribution parameterized by concentration and rate where:

X ~ Gamma(concentration, rate)
Y = 1 / X ~ InverseGamma(concentration, rate)

Example:

>>> m = InverseGamma(torch.tensor([2.0]), torch.tensor([3.0]))
>>> m.sample()
tensor([ 1.2953])

Parameters

concentration (float or Tensor) – shape parameter of the distribution (often referred to as alpha)
rate (float or Tensor) – rate = 1 / scale of the distribution (often referred to as beta)

arg_constraints: Dict[str, Constraint] = {'concentration': GreaterThan(lower_bound=0.0), 'rate': GreaterThan(lower_bound=0.0)}

property concentration

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

property mean

property mode

property rate

support = GreaterThan(lower_bound=0.0)

property variance

Kumaraswamy

class torch.distributions.kumaraswamy.Kumaraswamy(concentration1, concentration0, validate_args=None)[source]

Bases: TransformedDistribution

Samples from a Kumaraswamy distribution.

Example:

>>> m = Kumaraswamy(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # sample from a Kumaraswamy distribution with concentration alpha=1 and beta=1
tensor([ 0.1729])

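Parameters

concentration1 (float or Tensor) – 1st concentration parameter of the distribution (often referred to as alpha)
concentration0 (float or Tensor) – 2nd concentration parameter of the distribution (often referred to as beta)

arg_constraints: Dict[str, Constraint] = {'concentration0': GreaterThan(lower_bound=0.0), 'concentration1': GreaterThan(lower_bound=0.0)}

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

property mean

property mode

support = Interval(lower_bound=0.0, upper_bound=1.0)

property variance

LKJCholesky

class torch.distributions.lkj_cholesky.LKJCholesky(dim, concentration=1.0, validate_args=None)[source]

Bases: Distribution

LKJ distribution for lower Cholesky factors of correlation matrices. The distribution is controlled by the concentration parameter $\eta$ to make the probability of the correlation matrix $M$ generated from a Cholesky factor proportional to $\det(M)^{\eta - 1}$. Because of that, when concentration == 1, we have a uniform distribution over Cholesky factors of correlation matrices:

L ~ LKJCholesky(dim, concentration)
X = L @ L' ~ LKJCorr(dim, concentration)

Note that this distribution samples the Cholesky factor of correlation matrices and not the correlation matrices themselves, and thereby differs slightly from the derivations in [1] for the LKJCorr distribution. For sampling, this uses the Onion method from [1] Section 3.

Example: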

>>> l = LKJCholesky(3, 0.5)
>>> l.sample()  # l @ l.T is a sample of a correlation 3x3 matrix
tensor([[ 1.0000,  0.0000,  0.0000],
        [ 0.3516,  0.9361,  0.0000],
        [-0.1899,  0.4748,  0.8593]])

Parameters

dim (int) – dimension of the matrices
concentration (float or Tensor) – concentration/shape parameter of the distribution (often referred to as eta)
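
A sample is a lower-triangular factor L, and L @ L.T should be a valid correlation matrix (symmetric, unit diagonal). A short sketch that checks this for one draw, with dim = 3 and concentration = 0.5 as in the example above:

```python
import torch
from torch.distributions import LKJCholesky

torch.manual_seed(0)

L = LKJCholesky(3, concentration=0.5).sample()   # lower Cholesky factor, shape (3, 3)
corr = L @ L.T                                   # implied correlation matrix

unit_diagonal = torch.diagonal(corr)             # each entry should be close to 1
```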

References

[1] Generating random correlation matrices based on vines and extended onion method (2009), Daniel Lewandowski, Dorota Kurowicka, Harry Joe. Journal of Multivariate Analysis. 100. 10.1016/j.jmva.2009.04.008

arg_constraints = {'concentration': GreaterThan(lower_bound=0.0)}

expand(batch_shape, _instance=None)[source]

log_prob(value)[source]

sample(sample_shape=torch.Size([]))[source]

support = CorrCholesky()

Laplace

class torch.distributions.laplace.Laplace(loc, scale, validate_args=None)[source]

Bases: Distribution

Creates a Laplace distribution parameterized by loc and scale.

Example:

>>> m = Laplace(torch.tensor([0.0]), torch.tensor([1.0]))
>>> m.sample()  # Laplace distributed with loc=0, scale=1
tensor([ 0.1046])

Parameters

loc (float or Tensor) – mean of the distribution
scale (float or Tensor) – scale of the distribution

arg_constraints = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}

cdf(value)[source]

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

icdf(value)[source]

log_prob(value)[source]

property mean

property mode

rsample(sample_shape=torch.Size([]))[source]

Return type: Tensor

property stddev

support = Real()

property variance

LogNormal

class torch.distributions.log_normal.LogNormal(loc, scale, validate_args=None)[source]

Bases: TransformedDistribution

Creates a log-normal distribution parameterized by loc and scale where:

X ~ Normal(loc, scale)
Y = exp(X) ~ LogNormal(loc, scale)

Example:

>>> m = LogNormal(torch.tensor([0.0]), torch.tensor([1.0]))
>>> m.sample()  # log-normal distributed with mean=0 and stddev=1
tensor([ 0.1046])

Parameters

loc (float or Tensor) – mean of log of distribution
scale (float or Tensor) – standard deviation of log of the distribution
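
Since Y = exp(X), the change-of-variables rule gives log p_Y(y) = log p_X(log y) - log y. A sketch checking this relation at a few arbitrary points:

```python
import torch
from torch.distributions import LogNormal, Normal

loc, scale = torch.tensor(0.0), torch.tensor(1.0)
y = torch.tensor([0.5, 1.0, 2.0])

direct = LogNormal(loc, scale).log_prob(y)
# Density of X at log(y), corrected by the Jacobian term log|dy/dx| = log(y).
via_normal = Normal(loc, scale).log_prob(y.log()) - y.log()
```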

arg_constraints: Dict[str, Constraint] = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

property loc

property mean

property mode

property scale

support = GreaterThan(lower_bound=0.0)

property variance

LowRankMultivariateNormal

class torch.distributions.lowrank_multivariate_normal.LowRankMultivariateNormal(loc, cov_factor, cov_diag, validate_args=None)[source]

Bases: Distribution

Creates a multivariate normal distribution with covariance matrix having a low-rank form parameterized by cov_factor and cov_diag:

covariance_matrix = cov_factor @ cov_factor.T + cov_diag

Example:

>>> m = LowRankMultivariateNormal(torch.zeros(2), torch.tensor([[1.], [0.]]), torch.ones(2))
>>> m.sample()  # normally distributed with mean=`[0,0]`, cov_factor=`[[1],[0]]`, cov_diag=`[1,1]`
tensor([-0.2102, -0.5429])

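Parameters

loc (Tensor) – mean of the distribution with shape batch_shape + event_shape
cov_factor (Tensor) – factor part of low-rank form of covariance matrix with shape batch_shape + event_shape + (rank,)
cov_diag (Tensor) – diagonal part of low-rank form of covariance matrix with shape batch_shape + event_shape

Note

The computation of the determinant and inverse of the covariance matrix is avoided when cov_factor.shape[1] << cov_factor.shape[0], thanks to the Woodbury matrix identity and the matrix determinant lemma. Thanks to these formulas, we just need to compute the determinant and inverse of the small size "capacitance" matrix: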

capacitance = I + cov_factor.T @ inv(cov_diag) @ cov_factor
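
In the covariance formula, cov_diag is a vector that is placed on the diagonal. A sketch with arbitrary small inputs that rebuilds the dense covariance and compares the result with a full MultivariateNormal:

```python
import torch
from torch.distributions import LowRankMultivariateNormal, MultivariateNormal

loc = torch.zeros(2)
cov_factor = torch.tensor([[1.0], [0.5]])   # shape (event, rank) = (2, 1)
cov_diag = torch.tensor([1.0, 2.0])         # shape (event,) = (2,)

low_rank = LowRankMultivariateNormal(loc, cov_factor, cov_diag)
dense_cov = cov_factor @ cov_factor.T + torch.diag(cov_diag)
dense = MultivariateNormal(loc, covariance_matrix=dense_cov)

x = torch.tensor([0.3, -0.7])
lp_low_rank = low_rank.log_prob(x)
lp_dense = dense.log_prob(x)
```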

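arg_constraints = {'cov_diag': IndependentConstraint(GreaterThan(lower_bound=0.0), 1), 'cov_factor': IndependentConstraint(Real(), 2), 'loc': IndependentConstraint(Real(), 1)}

property covariance_matrix

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

log_prob(value)[source]

property mean

property mode

property precision_matrix

rsample(sample_shape=torch.Size([]))[source]

Return type: Tensor

property scale_tril

support = IndependentConstraint(Real(), 1)

property variance

MixtureSameFamily

class torch.distributions.mixture_same_family.MixtureSameFamily(mixture_distribution, component_distribution, validate_args=None)[source]

Bases: Distribution

The MixtureSameFamily distribution implements a (batch of) mixture distribution where all components are from different parameterizations of the same distribution type. It is parameterized by a Categorical "selecting distribution" (over k components) and a component distribution, i.e., a Distribution with a rightmost batch shape (equal to [k]) which indexes each (batch of) component.

Examples:

>>> import torch
>>> import torch.distributions as D
>>> from torch.distributions import MixtureSameFamily
>>> # Construct Gaussian Mixture Model in 1D consisting of 5 equally
>>> # weighted normal distributions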
>>> mix = D.Categorical(torch.ones(5,))
>>> comp = D.Normal(torch.randn(5,), torch.rand(5,))
>>> gmm = MixtureSameFamily(mix, comp)

>>> # Construct Gaussian Mixture Model in 2D consisting of 5 equally
>>> # weighted bivariate normal distributions
>>> mix = D.Categorical(torch.ones(5,))
>>> comp = D.Independent(D.Normal(
...          torch.randn(5,2), torch.rand(5,2)), 1)
>>> gmm = MixtureSameFamily(mix, comp)

>>> # Construct a batch of 3 Gaussian Mixture Models in 2D each
>>> # consisting of 5 random weighted bivariate normal distributions
>>> mix = D.Categorical(torch.rand(3,5))
>>> comp = D.Independent(D.Normal(
...         torch.randn(3,5,2), torch.rand(3,5,2)), 1)
>>> gmm = MixtureSameFamily(mix, comp)

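Parameters

mixture_distribution – torch.distributions.Categorical-like instance. Manages the probability of selecting a component. The number of categories must match the rightmost batch dimension of the component_distribution. Must have either scalar batch_shape or batch_shape matching component_distribution.batch_shape[:-1]
component_distribution – torch.distributions.Distribution-like instance. The rightmost batch dimension indexes the components.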
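
The mixture density is the weighted sum of the component densities, so its log can be written as a logsumexp over components. A sketch with two equally weighted 1D Gaussians (the means -1 and 3 are arbitrary):

```python
import torch
import torch.distributions as D

mix = D.Categorical(probs=torch.tensor([0.5, 0.5]))
comp = D.Normal(torch.tensor([-1.0, 3.0]), torch.tensor([1.0, 1.0]))
gmm = D.MixtureSameFamily(mix, comp)

x = torch.tensor(0.7)
from_api = gmm.log_prob(x)
# log sum_k w_k * N(x; mu_k, sigma_k), computed by hand for comparison
manual = torch.logsumexp(mix.probs.log() + comp.log_prob(x), dim=-1)

mixture_mean = gmm.mean   # weighted average of the component means
```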

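arg_constraints: Dict[str, Constraint] = {}

cdf(x)[source]

property component_distribution

expand(batch_shape, _instance=None)[source]

has_rsample = False

log_prob(x)[source]

property mean

property mixture_distribution

sample(sample_shape=torch.Size([]))[source]

property support

property variance

Multinomial

class torch.distributions.multinomial.Multinomial(total_count=1, probs=None, logits=None, validate_args=None)[source]

Bases: Distribution

Creates a Multinomial distribution parameterized by total_count and either probs or logits (but not both). The innermost dimension of probs indexes over categories. All other dimensions index over batches.

Note that total_count need not be specified if only log_prob() is called (see example below).

Note

The probs argument must be non-negative, finite and have a non-zero sum, and it will be normalized to sum to 1 along the last dimension. probs will return this normalized value. The logits argument will be interpreted as unnormalized log probabilities and can therefore be any real number. It will likewise be normalized so that the resulting probabilities sum to 1 along the last dimension. logits will return this normalized value.

sample() requires a single shared total_count for all parameters and samples.
log_prob() allows different total_count for each parameter and sample.

Example: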

>>> m = Multinomial(100, torch.tensor([ 1., 1., 1., 1.]))
>>> x = m.sample()  # equal probability of 0, 1, 2, 3
tensor([ 21.,  24.,  30.,  25.])

>>> Multinomial(probs=torch.tensor([1., 1., 1., 1.])).log_prob(x)
tensor([-4.1338])

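Parameters

total_count (int) – number of trials
probs (Tensor) – event probabilities
logits (Tensor) – event log probabilities (unnormalized)

arg_constraints = {'logits': IndependentConstraint(Real(), 1), 'probs': Simplex()}

entropy()[source]

expand(batch_shape, _instance=None)[source]

log_prob(value)[source]

property logits

property mean

property param_shape

property probs

sample(sample_shape=torch.Size([]))[source]

property support

total_count: int

property variance

MultivariateNormal

class torch.distributions.multivariate_normal.MultivariateNormal(loc, covariance_matrix=None, precision_matrix=None, scale_tril=None, validate_args=None)[source]

Bases: Distribution

Creates a multivariate normal (also called Gaussian) distribution parameterized by a mean vector and a covariance matrix.

The multivariate normal distribution can be parameterized either in terms of a positive definite covariance matrix $\mathbf{\Sigma}$ or a positive definite precision matrix $\mathbf{\Sigma}^{-1}$ or a lower-triangular matrix $\mathbf{L}$ with positive-valued diagonal entries, such that $\mathbf{\Sigma} = \mathbf{L}\mathbf{L}^\top$. This triangular matrix can be obtained via e.g. Cholesky decomposition of the covariance.

Example: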

>>> m = MultivariateNormal(torch.zeros(2), torch.eye(2))
>>> m.sample()  # normally distributed with mean=`[0,0]` and covariance_matrix=`I`
tensor([-0.2102, -0.5429])

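Parameters

loc (Tensor) – mean of the distribution
covariance_matrix (Tensor) – positive-definite covariance matrix
precision_matrix (Tensor) – positive-definite precision matrix
scale_tril (Tensor) – lower-triangular factor of covariance, with positive-valued diagonal

Note

Only one of covariance_matrix or precision_matrix or scale_tril can be specified.

Using scale_tril will be more efficient: all computations internally are based on scale_tril. If covariance_matrix or precision_matrix is passed instead, it is only used to compute the corresponding lower triangular matrices using a Cholesky decomposition.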
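
The three parameterizations describe the same distribution when they are consistent with each other. A sketch with an arbitrary 2x2 covariance, its hand-computed inverse, and its Cholesky factor:

```python
import torch
from torch.distributions import MultivariateNormal

loc = torch.zeros(2)
cov = torch.tensor([[2.0, 0.5], [0.5, 1.0]])
# Inverse of cov written out by hand: det(cov) = 1.75
prec = torch.tensor([[1.0, -0.5], [-0.5, 2.0]]) / 1.75
scale_tril = torch.linalg.cholesky(cov)   # lower-triangular L with cov = L @ L.T

by_cov = MultivariateNormal(loc, covariance_matrix=cov)
by_prec = MultivariateNormal(loc, precision_matrix=prec)
by_tril = MultivariateNormal(loc, scale_tril=scale_tril)

x = torch.tensor([0.3, -0.7])
lp_cov, lp_prec, lp_tril = by_cov.log_prob(x), by_prec.log_prob(x), by_tril.log_prob(x)
```

When the factor is already available, passing scale_tril skips the internal Cholesky decomposition.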

arg_constraints = {'covariance_matrix': PositiveDefinite(), 'loc': IndependentConstraint(Real(), 1), 'precision_matrix': PositiveDefinite(), 'scale_tril': LowerCholesky()}

property covariance_matrix

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

log_prob(value)[source]

property mean

property mode

property precision_matrix

rsample(sample_shape=torch.Size([]))[source]

Return type: Tensor

property scale_tril

support = IndependentConstraint(Real(), 1)

property variance

NegativeBinomial

class torch.distributions.negative_binomial.NegativeBinomial(total_count, probs=None, logits=None, validate_args=None)[source]

Bases: Distribution

Creates a Negative Binomial distribution, i.e. the distribution of the number of successful independent and identical Bernoulli trials before total_count failures are achieved. The probability of success of each Bernoulli trial is probs.

Parameters

total_count (float or Tensor) – non-negative number of negative Bernoulli trials to stop, although the distribution is still valid for real valued count
probs (Tensor) – event probabilities of success in the half open interval [0, 1)
logits (Tensor) – event log-odds for probabilities of success

arg_constraints = {'logits': Real(), 'probs': HalfOpenInterval(lower_bound=0.0, upper_bound=1.0), 'total_count': GreaterThanEq(lower_bound=0)}

expand(batch_shape, _instance=None)[source]

log_prob(value)[source]

property logits

property mean

property mode

property param_shape

property probs

sample(sample_shape=torch.Size([]))[source]

support = IntegerGreaterThan(lower_bound=0)

property variance

Normal

class torch.distributions.normal.Normal(loc, scale, validate_args=None)[source]

Bases: ExponentialFamily

Creates a normal (also called Gaussian) distribution parameterized by loc and scale.

Example:

>>> m = Normal(torch.tensor([0.0]), torch.tensor([1.0]))
>>> m.sample()  # normally distributed with loc=0 and scale=1
tensor([ 0.1046])

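Parameters

loc (float or Tensor) – mean of the distribution (often referred to as mu)
scale (float or Tensor) – standard deviation of the distribution (often referred to as sigma)

arg_constraints = {'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}

cdf(value)[source]

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

icdf(value)[source]

log_prob(value)[source]

property mean

property mode

rsample(sample_shape=torch.Size([]))[source]

Return type: Tensor

sample(sample_shape=torch.Size([]))[source]

property stddev

support = Real()

property variance

OneHotCategorical

class torch.distributions.one_hot_categorical.OneHotCategorical(probs=None, logits=None, validate_args=None)[source]

Bases: Distribution

Creates a one-hot categorical distribution parameterized by probs or logits.

Samples are one-hot coded vectors of size probs.size(-1).

Note

The probs argument must be non-negative, finite and have a non-zero sum, and it will be normalized to sum to 1 along the last dimension. probs will return this normalized value. The logits argument will be interpreted as unnormalized log probabilities and can therefore be any real number. It will likewise be normalized so that the resulting probabilities sum to 1 along the last dimension. logits will return this normalized value.

See also: torch.distributions.Categorical() for specifications of probs and logits.

Example: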

>>> m = OneHotCategorical(torch.tensor([ 0.25, 0.25, 0.25, 0.25 ]))
>>> m.sample()  # equal probability of 0, 1, 2, 3
tensor([ 0.,  0.,  0.,  1.])

Parameters

probs (Tensor) – event probabilities
logits (Tensor) – event log probabilities (unnormalized)

arg_constraints = {'logits': IndependentConstraint(Real(), 1), 'probs': Simplex()}

entropy()[source]

enumerate_support(expand=True)[source]

expand(batch_shape, _instance=None)[source]

has_enumerate_support = True

log_prob(value)[source]

property logits

property mean

property mode

property param_shape

property probs

sample(sample_shape=torch.Size([]))[source]

support = OneHot()

property variance

Pareto

class torch.distributions.pareto.Pareto(scale, alpha, validate_args=None)[source]

Bases: TransformedDistribution

Samples from a Pareto Type 1 distribution.

Example:

>>> m = Pareto(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # sample from a Pareto distribution with scale=1 and alpha=1
tensor([ 1.5623])

Parameters

scale (float or Tensor) – scale parameter of the distribution
alpha (float or Tensor) – shape parameter of the distribution

arg_constraints: Dict[str, Constraint] = {'alpha': GreaterThan(lower_bound=0.0), 'scale': GreaterThan(lower_bound=0.0)}

entropy()[source]

expand(batch_shape, _instance=None)[source]

property mean

property mode

property support

property variance

Poisson

class torch.distributions.poisson.Poisson(rate, validate_args=None)[source]

Bases: ExponentialFamily

Creates a Poisson distribution parameterized by rate, the rate parameter.

Samples are nonnegative integers, with a pmf given by

\mathrm{rate}^k \frac{e^{-\mathrm{rate}}}{k!}

Example:

>>> m = Poisson(torch.tensor([4.0]))
>>> m.sample()
tensor([ 3.])

Parameters: rate (Number, Tensor) – the rate parameter
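
The pmf above can be checked against log_prob. A sketch with rate = 4 and k = 3, both arbitrary:

```python
import math
import torch
from torch.distributions import Poisson

rate = 4.0
m = Poisson(torch.tensor(rate))

k = 3
pmf_from_api = m.log_prob(torch.tensor(float(k))).exp().item()
pmf_manual = rate ** k * math.exp(-rate) / math.factorial(k)   # rate^k e^{-rate} / k!
```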

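arg_constraints = {'rate': GreaterThanEq(lower_bound=0.0)}

expand(batch_shape, _instance=None)[source]

log_prob(value)[source]

property mean

property mode

sample(sample_shape=torch.Size([]))[source]

support = IntegerGreaterThan(lower_bound=0)

property variance

RelaxedBernoulli

class torch.distributions.relaxed_bernoulli.RelaxedBernoulli(temperature, probs=None, logits=None, validate_args=None)[source]

Bases: TransformedDistribution

Creates a RelaxedBernoulli distribution, parametrized by temperature, and either probs or logits (but not both). This is a relaxed version of the Bernoulli distribution, so the values are in (0, 1), and it has reparametrizable samples.

Example: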

>>> m = RelaxedBernoulli(torch.tensor([2.2]),
...                      torch.tensor([0.1, 0.2, 0.3, 0.99]))
>>> m.sample()
tensor([ 0.2951,  0.3442,  0.8918,  0.9021])

Parameters

temperature (Tensor) – relaxation temperature
probs (Number, Tensor) – the probability of sampling 1
logits (Number, Tensor) – the log-odds of sampling 1
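
Because the samples are reparameterized, gradients can flow from a draw back to probs. A minimal sketch with an arbitrary temperature of 0.5, using the sum of the sample as a stand-in objective:

```python
import torch
from torch.distributions import RelaxedBernoulli

torch.manual_seed(0)

probs = torch.tensor([0.1, 0.5, 0.9], requires_grad=True)
m = RelaxedBernoulli(temperature=torch.tensor(0.5), probs=probs)

y = m.rsample()        # continuous relaxation of a 0/1 draw
y.sum().backward()     # stand-in objective; gradients reach probs
```

Lower temperatures push samples towards 0 and 1; higher temperatures spread them across the interval.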

arg_constraints: Dict[str, Constraint] = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}

expand(batch_shape, _instance=None)[source]

has_rsample = True

property logits

property probs

support = Interval(lower_bound=0.0, upper_bound=1.0)

property temperature

LogitRelaxedBernoulli

class torch.distributions.relaxed_bernoulli.LogitRelaxedBernoulli(temperature, probs=None, logits=None, validate_args=None)[source]

Bases: Distribution

Creates a LogitRelaxedBernoulli distribution parameterized by probs or logits (but not both), which is the logit of a RelaxedBernoulli distribution.

Samples are logits of values in (0, 1). See [1] for more details.

Parameters

temperature (Tensor) – relaxation temperature
probs (Number, Tensor) – the probability of sampling 1
logits (Number, Tensor) – the log-odds of sampling 1

[1] The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables (Maddison et al., 2017)

[2] Categorical Reparametrization with Gumbel-Softmax (Jang et al., 2017)

arg_constraints = {'logits': Real(), 'probs': Interval(lower_bound=0.0, upper_bound=1.0)}

expand(batch_shape, _instance=None)[source]

log_prob(value)[source]

property logits

property param_shape

property probs

rsample(sample_shape=torch.Size([]))[source]

Return type: Tensor

support = Real()

RelaxedOneHotCategorical

class torch.distributions.relaxed_categorical.RelaxedOneHotCategorical(temperature, probs=None, logits=None, validate_args=None)[source]

Bases: TransformedDistribution

Creates a RelaxedOneHotCategorical distribution parametrized by temperature, and either probs or logits. This is a relaxed version of the OneHotCategorical distribution, so its samples are on the simplex, and are reparametrizable.

Example:

>>> m = RelaxedOneHotCategorical(torch.tensor([2.2]),
...                              torch.tensor([0.1, 0.2, 0.3, 0.4]))
>>> m.sample()
tensor([ 0.1294,  0.2324,  0.3859,  0.2523])

Parameters

temperature (Tensor) – relaxation temperature
probs (Tensor) – event probabilities
logits (Tensor) – unnormalized log probability for each event

arg_constraints: Dict[str, Constraint] = {'logits': IndependentConstraint(Real(), 1), 'probs': Simplex()}

expand(batch_shape, _instance=None)[source]

has_rsample = True

property logits

property probs

support = Simplex()

property temperature

StudentT

class torch.distributions.studentT.StudentT(df, loc=0.0, scale=1.0, validate_args=None)[source]

Bases: Distribution

Creates a Student's t-distribution parameterized by degree of freedom df, mean loc and scale scale.

Example:

>>> m = StudentT(torch.tensor([2.0]))
>>> m.sample()  # Student's t-distributed with degrees of freedom=2
tensor([ 0.1046])

Parameters

df (float or Tensor) – degrees of freedom
loc (float or Tensor) – mean of the distribution
scale (float or Tensor) – scale of the distribution

arg_constraints = {'df': GreaterThan(lower_bound=0.0), 'loc': Real(), 'scale': GreaterThan(lower_bound=0.0)}

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

log_prob(value)[source]

property mean

property mode

rsample(sample_shape=torch.Size([]))[source]

Return type: Tensor

support = Real()

property variance

TransformedDistribution

class torch.distributions.transformed_distribution.TransformedDistribution(base_distribution, transforms, validate_args=None)[source]

Bases: Distribution

Extension of the Distribution class, which applies a sequence of Transforms to a base distribution. Let f be the composition of transforms applied:

X ~ BaseDistribution
Y = f(X) ~ TransformedDistribution(BaseDistribution, f)
log p(Y) = log p(X) + log |det (dX/dY)|

请注意，的.event_shapeTransformedDistribution是其 base 分布及其转换的最大形状，因为 Transforms 可以引入事件之间的关联。

使用TransformedDistribution将：

# Building a Logistic Distribution
# X ~ Uniform(0, 1)
# f = a + b * logit(X)
# Y ~ f(X) ~ Logistic(a, b)
base_distribution = Uniform(0, 1)
transforms = [SigmoidTransform().inv, AffineTransform(loc=a, scale=b)]
logistic = TransformedDistribution(base_distribution, transforms)
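
The snippet above leaves a and b undefined. A self-contained sketch, assuming the illustrative values a = 1.0 and b = 2.0, can compare log_prob against the closed-form logistic log-density to show the change-of-variables formula at work:

```python
import math

import torch
from torch.distributions import TransformedDistribution, Uniform
from torch.distributions.transforms import AffineTransform, SigmoidTransform

a, b = 1.0, 2.0  # illustrative location and scale (assumed values)

base_distribution = Uniform(0.0, 1.0)
transforms = [SigmoidTransform().inv, AffineTransform(loc=a, scale=b)]
logistic = TransformedDistribution(base_distribution, transforms)

y = torch.tensor([-1.0, 0.5, 3.0])
log_p = logistic.log_prob(y)

# Closed-form logistic log-density: log( exp(-z) / (b * (1 + exp(-z))^2) )
z = (y - a) / b
expected = -z - math.log(b) - 2 * torch.log1p(torch.exp(-z))

print(log_p, expected)
```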

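arg_constraints: Dict[str, Constraint] = {}

cdf(value)[source]: Computes the cumulative distribution function by inverting the transform(s) and computing the score of the base distribution.

expand(batch_shape, _instance=None)[source]

property has_rsample

icdf(value)[source]: Computes the inverse cumulative distribution function using the transform(s) and computing the score of the base distribution.

log_prob(value)[source]: Scores the sample by inverting the transform(s) and computing the score using the score of the base distribution and the log abs det jacobian.

rsample(sample_shape=torch.Size([]))[source]

Generates a sample_shape shaped reparameterized sample, or a sample_shape shaped batch of reparameterized samples if the distribution parameters are batched. Samples first from the base distribution and then applies transform() for every transform in the list.

Return type: Tensor

sample(sample_shape=torch.Size([]))[source]: Generates a sample_shape shaped sample, or a sample_shape shaped batch of samples if the distribution parameters are batched. Samples first from the base distribution and then applies transform() for every transform in the list.

property support

Uniform

class torch.distributions.uniform.Uniform(low, high, validate_args=None)[source]

Bases: Distribution

Generates uniformly distributed random samples from the half-open interval [low, high).

Example: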

>>> m = Uniform(torch.tensor([0.0]), torch.tensor([5.0]))
>>> m.sample()  # uniformly distributed in the range [0.0, 5.0)
tensor([ 2.3418])

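Parameters

low (float or Tensor) – lower range (inclusive).
high (float or Tensor) – upper range (exclusive).

arg_constraints = {'high': Dependent(), 'low': Dependent()}

cdf(value)[source]

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

icdf(value)[source]

log_prob(value)[source]

property mean

property mode

rsample(sample_shape=torch.Size([]))[source]

Return type: Tensor

property stddev

property support

property variance

VonMises

class torch.distributions.von_mises.VonMises(loc, concentration, validate_args=None)[source]

Bases: Distribution

A circular von Mises distribution.

This implementation uses polar coordinates. The loc and value args can be any real number (to facilitate unconstrained optimization), but are interpreted as angles modulo 2 pi.

Example: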

>>> m = VonMises(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # von Mises distributed with loc=1 and concentration=1
tensor([1.9777])

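Parameters

loc (torch.Tensor) – an angle in radians.
concentration (torch.Tensor) – concentration parameter

arg_constraints = {'concentration': GreaterThan(lower_bound=0.0), 'loc': Real()}

expand(batch_shape)[source]

has_rsample = False

log_prob(value)[source]

property mean: The provided mean is the circular one.

property mode

sample(sample_shape=torch.Size([]))[source]

The sampling algorithm for the von Mises distribution is based on the following paper: D.J. Best and N.I. Fisher, "Efficient simulation of the von Mises distribution." Applied Statistics (1979): 152-157.

Sampling is always done in double precision internally to avoid a hang in _rejection_sample() for small values of the concentration, which starts to happen for single precision around 1e-4 (see issue #88443).

support = Real()

property variance: The provided variance is the circular one.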
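
Since values are interpreted as angles modulo 2 pi, shifting an input by a full turn should leave its log-density unchanged. A small sketch of that property follows; the loc, concentration, and test angles are arbitrary illustrative choices.

```python
import math

import torch
from torch.distributions import VonMises

# Illustrative parameters: mean direction 1 rad, concentration 2.
m = VonMises(torch.tensor(1.0), torch.tensor(2.0))

theta = torch.tensor([-2.0, 0.0, 1.0])
lp = m.log_prob(theta)

# Adding a full turn gives the same angle, hence the same density
# (up to floating-point rounding).
lp_shifted = m.log_prob(theta + 2 * math.pi)

print(lp, lp_shifted)
```

The density peaks at theta = loc, so the last entry of lp is the largest of the three.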

Weibull

class torch.distributions.weibull.Weibull(scale, concentration, validate_args=None)[source]

Bases: TransformedDistribution

Samples from a two-parameter Weibull distribution.

Example

>>> m = Weibull(torch.tensor([1.0]), torch.tensor([1.0]))
>>> m.sample()  # sample from a Weibull distribution with scale=1, concentration=1
tensor([ 0.4784])

Parameters

scale (float or Tensor) – Scale parameter of the distribution (lambda).
concentration (float or Tensor) – Concentration parameter of the distribution (k/shape).

arg_constraints: Dict[str, Constraint] = {'concentration': GreaterThan(lower_bound=0.0), 'scale': GreaterThan(lower_bound=0.0)}

entropy()[source]

expand(batch_shape, _instance=None)[source]

property mean

property mode

support = GreaterThan(lower_bound=0.0)

property variance

Wishart

class torch.distributions.wishart.Wishart(df, covariance_matrix=None, precision_matrix=None, scale_tril=None, validate_args=None)[source]

Bases: ExponentialFamily

Creates a Wishart distribution parameterized by a symmetric positive definite matrix $\Sigma$, or its Cholesky decomposition $\mathbf{\Sigma} = \mathbf{L}\mathbf{L}^\top$.

Example

>>> m = Wishart(torch.Tensor([2]), covariance_matrix=torch.eye(2))
>>> m.sample()  # Wishart distributed with mean=`df * I` and
>>>             # variance(x_ij)=`df` for i != j and variance(x_ij)=`2 * df` for i == j

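Parameters

df (float or Tensor) – real-valued parameter larger than the (dimension of the square matrix) - 1
covariance_matrix (Tensor) – positive-definite covariance matrix
precision_matrix (Tensor) – positive-definite precision matrix
scale_tril (Tensor) – lower-triangular factor of covariance, with positive-valued diagonal

Note

Only one of covariance_matrix, precision_matrix, or scale_tril can be specified. Using scale_tril is more efficient: all computations internally are based on scale_tril. If covariance_matrix or precision_matrix is passed instead, it is only used to compute the corresponding lower triangular matrices using a Cholesky decomposition. torch.distributions.LKJCholesky is a restricted Wishart distribution. [1]

References

[1] Wang, Z., Wu, Y. and Chu, H., 2018. On equivalence of the LKJ distribution and the restricted Wishart distribution.
[2] Sawyer, S., 2007. Wishart Distributions and Inverse-Wishart Sampling.
[3] Anderson, T. W., 2003. An Introduction to Multivariate Statistical Analysis (3rd ed.).
[4] Odell, P. L. & Feiveson, A. H., 1966. A Numerical Procedure to Generate a Sample Covariance Matrix. JASA, 61(313):199-203.
[5] Ku, Y.-C. & Bloomfield, P., 2010. Generating Random Wishart Matrices with Fractional Degrees of Freedom in OX.

arg_constraints = {'covariance_matrix': PositiveDefinite(), 'df': GreaterThan(lower_bound=0), 'precision_matrix': PositiveDefinite(), 'scale_tril': LowerCholesky()}

property covariance_matrix

entropy()[source]

expand(batch_shape, _instance=None)[source]

has_rsample = True

log_prob(value)[source]

property mean

property mode

property precision_matrix

rsample(sample_shape=torch.Size([]), max_try_correction=None)[source]

Warning

In some cases, the sampling algorithm based on the Bartlett decomposition may return singular matrix samples. Several attempts to correct singular samples are performed by default, but it may still end up returning singular matrix samples. Singular samples may return -inf values in .log_prob(). In those cases, the user should validate the samples and either fix the value of df or adjust the max_try_correction argument of .rsample accordingly.

Return type: Tensor

property scale_tril

support = PositiveDefinite()

property variance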
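
As a rough illustration of the sampling behaviour described above, the sketch below draws a batch of 2x2 Wishart matrices and compares the empirical mean with the analytic mean df * covariance_matrix. The df value and sample count are arbitrary assumptions, and the empirical mean only approximates the analytic one.

```python
import torch
from torch.distributions import Wishart

torch.manual_seed(0)

df = torch.tensor(4.0)  # illustrative; must exceed (matrix dimension - 1)
m = Wishart(df, covariance_matrix=torch.eye(2))

samples = m.sample((1000,))        # shape (1000, 2, 2)
empirical_mean = samples.mean(0)   # Monte Carlo estimate of the mean

print(empirical_mean)
print(m.mean)                      # analytic mean: df * covariance_matrix
```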

KL Divergence

torch.distributions.kl.kl_divergence(p, q)[source]

Compute the Kullback-Leibler divergence $KL(p \| q)$ between two distributions.

KL(p \| q) = \int p(x) \log\frac {p(x)} {q(x)} \,dx

Parameters

p (Distribution) – A Distribution object.
q (Distribution) – A Distribution object.

Returns

A batch of KL divergences of shape batch_shape.

Return type

Tensor

Raises

NotImplementedError – If the distribution types have not been registered via register_kl().

KL divergence is currently implemented for the following distribution pairs:

Bernoulli and Bernoulli
Bernoulli and Poisson
Beta and Beta
Beta and ContinuousBernoulli
Beta and Exponential
Beta and Gamma
Beta and Normal
Beta and Pareto
Beta and Uniform
Binomial and Binomial
Categorical and Categorical
Cauchy and Cauchy
ContinuousBernoulli and ContinuousBernoulli
ContinuousBernoulli and Exponential
ContinuousBernoulli and Normal
ContinuousBernoulli and Pareto
ContinuousBernoulli and Uniform
Dirichlet and Dirichlet
Exponential and Beta
Exponential and ContinuousBernoulli
Exponential and Exponential
Exponential and Gamma
Exponential and Gumbel
Exponential and Normal
Exponential and Pareto
Exponential and Uniform
ExponentialFamily and ExponentialFamily
Gamma and Beta
Gamma and ContinuousBernoulli
Gamma and Exponential
Gamma and Gamma
Gamma and Gumbel
Gamma and Normal
Gamma and Pareto
Gamma and Uniform
Geometric and Geometric
Gumbel and Beta
Gumbel and ContinuousBernoulli
Gumbel and Exponential
Gumbel and Gamma
Gumbel and Gumbel
Gumbel and Normal
Gumbel and Pareto
Gumbel and Uniform
HalfNormal and HalfNormal
Independent and Independent
Laplace and Beta
Laplace and ContinuousBernoulli
Laplace and Exponential
Laplace and Gamma
Laplace and Laplace
Laplace and Normal
Laplace and Pareto
Laplace and Uniform
LowRankMultivariateNormal and LowRankMultivariateNormal
LowRankMultivariateNormal and MultivariateNormal
MultivariateNormal and LowRankMultivariateNormal
MultivariateNormal and MultivariateNormal
Normal and Beta
Normal and ContinuousBernoulli
Normal and Exponential
Normal and Gamma
Normal and Gumbel
Normal and Laplace
Normal and Normal
Normal and Pareto
Normal and Uniform
OneHotCategorical and OneHotCategorical
Pareto and Beta
Pareto and ContinuousBernoulli
Pareto and Exponential
Pareto and Gamma
Pareto and Normal
Pareto and Pareto
Pareto and Uniform
Poisson and Bernoulli
Poisson and Binomial
Poisson and Poisson
TransformedDistribution and TransformedDistribution
Uniform and Beta
Uniform and ContinuousBernoulli
Uniform and Exponential
Uniform and Gamma
Uniform and Gumbel
Uniform and Normal
Uniform and Pareto
Uniform and Uniform
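
As a quick sanity check of kl_divergence() on one registered pair, the sketch below compares the Normal/Normal result with the textbook closed form. The parameter values are arbitrary, and the result has one entry per batch element.

```python
import torch
from torch.distributions import Normal, kl_divergence

# Two batches of univariate normals (illustrative parameters).
p = Normal(torch.tensor([0.0, 1.0]), torch.tensor([1.0, 2.0]))
q = Normal(torch.tensor([0.5, 0.0]), torch.tensor([1.5, 1.0]))

kl = kl_divergence(p, q)  # shape matches batch_shape, here (2,)

# Closed form: 0.5 * (s_p^2/s_q^2 + (m_p - m_q)^2/s_q^2 - 1 - log(s_p^2/s_q^2))
var_ratio = (p.scale / q.scale) ** 2
mean_term = ((p.loc - q.loc) / q.scale) ** 2
closed_form = 0.5 * (var_ratio + mean_term - 1 - torch.log(var_ratio))

print(kl, closed_form)
```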

torch.distributions.kl.register_kl(type_p, type_q)[source]

Decorator to register a pairwise function with kl_divergence(). Usage:

@register_kl(Normal, Normal)
def kl_normal_normal(p, q):
    # insert implementation here

Lookup returns the most specific (type, type) match ordered by subclass. If the match is ambiguous, a RuntimeWarning is raised. For example, to resolve the ambiguous situation:

@register_kl(BaseP, DerivedQ)
def kl_version1(p, q): ...
@register_kl(DerivedP, BaseQ)
def kl_version2(p, q): ...

you should register a third, most-specific implementation, e.g.:

register_kl(DerivedP, DerivedQ)(kl_version1)  # Break the tie.

Parameters

type_p (type) – A subclass of Distribution.
type_q (type) – A subclass of Distribution.
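
A minimal end-to-end sketch of register_kl() is given below. UnitNormal is a hypothetical subclass invented for this illustration (a Normal with scale fixed to 1); it is not part of PyTorch. Because (UnitNormal, UnitNormal) is more specific than the built-in (Normal, Normal) entry, lookup dispatches to the function registered here.

```python
import torch
from torch.distributions import Normal, kl_divergence, register_kl


class UnitNormal(Normal):
    """Hypothetical subclass: a Normal whose scale is fixed to 1."""

    def __init__(self, loc):
        super().__init__(loc, torch.ones_like(loc))


@register_kl(UnitNormal, UnitNormal)
def _kl_unit_unit(p, q):
    # With unit variances the KL reduces to half the squared mean difference.
    return 0.5 * (p.loc - q.loc) ** 2


p = UnitNormal(torch.tensor([0.0, 1.0]))
q = UnitNormal(torch.tensor([2.0, 1.0]))
kl = kl_divergence(p, q)

# Reference value from the generic Normal/Normal implementation.
reference = kl_divergence(Normal(p.loc, p.scale), Normal(q.loc, q.scale))
print(kl, reference)
```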

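Transforms

class torch.distributions.transforms.AbsTransform(cache_size=0)[source]: Transform via the mapping $y = |x|$.

class torch.distributions.transforms.AffineTransform(loc, scale, event_dim=0, cache_size=0)[source]

Transform via the pointwise affine mapping $y = \text{loc} + \text{scale} \times x$.

Parameters

loc (Tensor or float) – Location parameter.
scale (Tensor or float) – Scale parameter.
event_dim (int) – Optional size of event_shape. This should be zero for univariate random variables, 1 for distributions over vectors, 2 for distributions over matrices, etc.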
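
A short sketch of the three operations a transform typically exposes (forward call, inverse, and log-Jacobian), using AffineTransform with the illustrative values loc = 1 and scale = 3:

```python
import math

import torch
from torch.distributions.transforms import AffineTransform

t = AffineTransform(loc=1.0, scale=3.0)

x = torch.tensor([-1.0, 0.0, 2.0])
y = t(x)            # forward: 1 + 3 * x
x_back = t.inv(y)   # inverse: (y - 1) / 3

# For an affine map the log |dy/dx| is log|scale| at every point.
log_det = t.log_abs_det_jacobian(x, y)

print(y, x_back, log_det)
```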

class torch.distributions.transforms.CatTransform(tseq, dim=0, lengths=None, cache_size=0)[source]

Transform functor that applies a sequence of transforms tseq component-wise to each submatrix at dim, of length lengths[dim], in a way compatible with torch.cat().

Example:

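import torch
from torch.distributions.transforms import CatTransform, ExpTransform, identity_transform

# torch.arange(1.0, 11.0) yields 1..10; it replaces the deprecated torch.range(1, 10)
x0 = torch.cat([torch.arange(1.0, 11.0), torch.arange(1.0, 11.0)], dim=0)
x = torch.cat([x0, x0], dim=0)
t0 = CatTransform([ExpTransform(), identity_transform], dim=0, lengths=[10, 10])
t = CatTransform([t0, t0], dim=0, lengths=[20, 20])
y = t(x)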

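class torch.distributions.transforms.ComposeTransform(parts, cache_size=0)[source]

Composes multiple transforms in a chain. The transforms being composed are responsible for caching.

Parameters

parts (list of Transform) – A list of transforms to compose.
cache_size (int) – Size of cache. If zero, no caching is done. If one, the latest single value is cached. Only 0 and 1 are supported.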
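
To illustrate how a chain accumulates its log-Jacobian, the sketch below composes an affine map with an exponential so that y = exp(2x); the log-determinant should then be log(2) + 2x. The particular chain is an illustrative choice.

```python
import math

import torch
from torch.distributions.transforms import (
    AffineTransform,
    ComposeTransform,
    ExpTransform,
)

# Applied left to right: x -> 2x -> exp(2x).
t = ComposeTransform([AffineTransform(loc=0.0, scale=2.0), ExpTransform()])

x = torch.tensor([0.0, 0.5, 1.0])
y = t(x)

# The chain rule gives dy/dx = 2 * exp(2x), so log|dy/dx| = log(2) + 2x.
log_det = t.log_abs_det_jacobian(x, y)
expected = math.log(2.0) + 2 * x

print(y, log_det)
```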

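class torch.distributions.transforms.CorrCholeskyTransform(cache_size=0)[source]

Transforms an unconstrained real vector $x$ with length $D*(D-1)/2$ into the Cholesky factor of a D-dimensional correlation matrix. This Cholesky factor is a lower triangular matrix with positive diagonals and unit Euclidean norm for each row. The transform is processed as follows:

First, we convert x into a lower triangular matrix in row order.

For each row $X_i$ of the lower triangular part, we apply a signed version of class StickBreakingTransform to transform $X_i$ into a unit Euclidean length vector using the following steps:
- Scale into the interval $(-1, 1)$ domain: $r_i = \tanh(X_i)$.
- Transform into an unsigned domain: $z_i = r_i^2$.
- Apply $s_i = StickBreakingTransform(z_i)$.
- Transform back into the signed domain: $y_i = sign(r_i) * \sqrt{s_i}$.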
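
The properties stated above (lower triangular, unit-norm rows) can be checked numerically. This is a small sketch for D = 3 with an arbitrary input vector; the product L L^T should then have a unit diagonal, as a correlation matrix requires.

```python
import torch
from torch.distributions.transforms import CorrCholeskyTransform

D = 3
x = torch.tensor([0.3, -1.2, 0.8])  # length D * (D - 1) / 2 = 3

L = CorrCholeskyTransform()(x)      # D x D Cholesky factor
corr = L @ L.T                      # implied correlation matrix

print(L)
print(corr)
```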

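class torch.distributions.transforms.CumulativeDistributionTransform(distribution, cache_size=0)[source]

Transform via the cumulative distribution function of a probability distribution.

Parameters: distribution (Distribution) – Distribution whose cumulative distribution function to use for the transformation.

Example: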

# Construct a Gaussian copula from a multivariate normal.
base_dist = MultivariateNormal(
    loc=torch.zeros(2),
    scale_tril=LKJCholesky(2).sample(),
)
transform = CumulativeDistributionTransform(Normal(0, 1))
copula = TransformedDistribution(base_dist, [transform])

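class torch.distributions.transforms.ExpTransform(cache_size=0)[source]: Transform via the mapping $y = \exp(x)$.

class torch.distributions.transforms.IndependentTransform(base_transform, reinterpreted_batch_ndims, cache_size=0)[source]

Wrapper around another transform to treat reinterpreted_batch_ndims-many extra of the rightmost dimensions as dependent. This has no effect on the forward or backward transforms, but does sum out reinterpreted_batch_ndims-many of the rightmost dimensions in log_abs_det_jacobian().

Parameters

base_transform (Transform) – A base transform.
reinterpreted_batch_ndims (int) – The number of extra rightmost dimensions to treat as dependent.

class torch.distributions.transforms.LowerCholeskyTransform(cache_size=0)[source]

Transform from unconstrained matrices to lower-triangular matrices with nonnegative diagonal entries.

This is useful for parameterizing positive definite matrices in terms of their Cholesky factorization.

class torch.distributions.transforms.PositiveDefiniteTransform(cache_size=0)[source]: Transform from unconstrained matrices to positive-definite matrices.

class torch.distributions.transforms.PowerTransform(exponent, cache_size=0)[source]: Transform via the mapping $y = x^{\text{exponent}}$.

class torch.distributions.transforms.ReshapeTransform(in_shape, out_shape, cache_size=0)[source]

Unit Jacobian transform to reshape the rightmost part of a tensor.

Note that in_shape and out_shape must have the same number of elements, just as for torch.Tensor.reshape().

Parameters

in_shape (torch.Size) – The input event shape.
out_shape (torch.Size) – The output event shape.

class torch.distributions.transforms.SigmoidTransform(cache_size=0)[source]: Transform via the mapping $y = \frac{1}{1 + \exp(-x)}$ and $x = \text{logit}(y)$.

class torch.distributions.transforms.SoftplusTransform(cache_size=0)[source]: Transform via the mapping $\text{Softplus}(x) = \log(1 + \exp(x))$. The implementation reverts to the linear function when $x > 20$.

class torch.distributions.transforms.TanhTransform(cache_size=0)[source]

Transform via the mapping $y = \tanh(x)$.

It is equivalent to `ComposeTransform([AffineTransform(0., 2.), SigmoidTransform(), AffineTransform(-1., 2.)])`. However, this might not be numerically stable, so it is recommended to use TanhTransform instead.

Note that one should use cache_size=1 when it comes to NaN/Inf values.

class torch.distributions.transforms.SoftmaxTransform(cache_size=0)[source]

Transform from unconstrained space to the simplex via $y = \exp(x)$ then normalizing.

This is not bijective and cannot be used for HMC. However, it acts mostly coordinate-wise (except for the final normalization), and thus is appropriate for coordinate-wise optimization algorithms.

class torch.distributions.transforms.StackTransform(tseq, dim=0, cache_size=0)[source]

Transform functor that applies a sequence of transforms tseq component-wise to each submatrix at dim in a way compatible with torch.stack().

Example: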

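import torch
from torch.distributions.transforms import ExpTransform, StackTransform, identity_transform

# torch.arange(1.0, 11.0) yields 1..10; it replaces the deprecated torch.range(1, 10)
x = torch.stack([torch.arange(1.0, 11.0), torch.arange(1.0, 11.0)], dim=1)
t = StackTransform([ExpTransform(), identity_transform], dim=1)
y = t(x)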

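class torch.distributions.transforms.StickBreakingTransform(cache_size=0)[source]

Transform from unconstrained space to the simplex of one additional dimension via a stick-breaking process.

This transform arises as an iterated sigmoid transform in a stick-breaking construction of the Dirichlet distribution: the first logit is transformed via sigmoid to the first probability and the probability of everything else, and then the process recurses.

This is bijective and appropriate for use in HMC; however, it mixes coordinates together and is less appropriate for optimization.

class torch.distributions.transforms.Transform(cache_size=0)[source]

Abstract class for invertible transformations with computable log det jacobians. They are primarily used in torch.distributions.TransformedDistribution.

Caching is useful for transforms whose inverses are either expensive or numerically unstable. Note that care must be taken with memoized values since the autograd graph may be reversed. For example, the following works with or without caching: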

y = t(x)
t.log_abs_det_jacobian(x, y).backward()  # x will receive gradients.

However, the following will error when caching due to dependency reversal:

y = t(x)
z = t.inv(y)
grad(z.sum(), [y])  # error because z is x

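Derived classes should implement one or both of _call() or _inverse(). Derived classes that set bijective=True should also implement log_abs_det_jacobian().

Parameters

cache_size (int) – Size of cache. If zero, no caching is done. If one, the latest single value is cached. Only 0 and 1 are supported.

Variables

domain (Constraint) – The constraint representing valid inputs to this transform.
codomain (Constraint) – The constraint representing valid outputs to this transform, which are inputs to the inverse transform.
bijective (bool) – Whether this transform is bijective. A transform t is bijective iff t.inv(t(x)) == x and t(t.inv(y)) == y for every x in the domain and y in the codomain. Transforms that are not bijective should at least maintain the weaker pseudoinverse properties t(t.inv(t(x))) == t(x) and t.inv(t(t.inv(y))) == t.inv(y).
sign (int or Tensor) – For bijective univariate transforms, this should be +1 or -1 depending on whether the transform is monotone increasing or decreasing.

property inv: Returns the inverse Transform of this transform. This should satisfy t.inv.inv is t.

property sign: Returns the sign of the determinant of the Jacobian, if applicable. In general this only makes sense for bijective transforms.

log_abs_det_jacobian(x, y)[source]: Computes the log det jacobian log |dy/dx| given input and output.

forward_shape(shape)[source]: Infers the shape of the forward computation, given the input shape. Defaults to preserving shape.

inverse_shape(shape)[source]: Infers the shape of the inverse computation, given the output shape. Defaults to preserving shape.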
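
The contract for derived classes can be sketched with a toy bijection. CubeTransform below is a hypothetical class written only for this illustration (it is not part of PyTorch); it implements _call(), _inverse(), and log_abs_det_jacobian() for the map y = x^3.

```python
import torch
from torch.distributions import constraints
from torch.distributions.transforms import Transform


class CubeTransform(Transform):
    """Hypothetical bijection y = x**3 on the real line."""

    domain = constraints.real
    codomain = constraints.real
    bijective = True
    sign = +1  # monotone increasing

    def _call(self, x):
        return x ** 3

    def _inverse(self, y):
        # Real cube root that also handles negative inputs.
        return y.sign() * y.abs() ** (1.0 / 3.0)

    def log_abs_det_jacobian(self, x, y):
        # dy/dx = 3 * x^2 (the log diverges at x = 0).
        return torch.log(3.0 * x ** 2)


t = CubeTransform()
x = torch.tensor([-2.0, 0.5, 1.5])
y = t(x)
x_back = t.inv(y)
log_det = t.log_abs_det_jacobian(x, y)
print(y, x_back, log_det)
```

The default cache_size=0 is used here, so each inverse call recomputes the cube root rather than returning a memoized input.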

Constraints

The following constraints are implemented:

constraints.boolean
constraints.cat
constraints.corr_cholesky
constraints.dependent
constraints.greater_than(lower_bound)
constraints.greater_than_eq(lower_bound)
constraints.independent(constraint, reinterpreted_batch_ndims)
constraints.integer_interval(lower_bound, upper_bound)
constraints.interval(lower_bound, upper_bound)
constraints.less_than(upper_bound)
constraints.lower_cholesky
constraints.lower_triangular
constraints.multinomial
constraints.nonnegative
constraints.nonnegative_integer
constraints.one_hot
constraints.positive_integer
constraints.positive
constraints.positive_semidefinite
constraints.positive_definite
constraints.real_vector
constraints.real
constraints.simplex
constraints.symmetric
constraints.stack
constraints.square
constraints.unit_interval

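class torch.distributions.constraints.Constraint[source]

Abstract base class for constraints.

A constraint object represents a region over which a variable is valid, e.g. within which a variable can be optimized.

Variables

is_discrete (bool) – Whether the constrained space is discrete. Defaults to False.
event_dim (int) – Number of rightmost dimensions that together define an event. The check() method will remove this many dimensions when computing validity.

check(value)[source]: Returns a byte tensor of sample_shape + batch_shape indicating whether each event in value satisfies this constraint.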
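
A brief sketch of check() on a few of the constraints listed above, with arbitrary input values. Note how the simplex constraint, whose event_dim is 1, returns one result per row rather than per element.

```python
import torch
from torch.distributions import constraints

# Element-wise constraints return one boolean per element.
ok_positive = constraints.positive.check(torch.tensor([1.0, -1.0, 0.0]))
ok_interval = constraints.interval(0.0, 1.0).check(torch.tensor([0.5, 1.5]))

# The simplex constraint treats the last dimension as a single event,
# so a (2, 2) input yields a result of shape (2,).
ok_simplex = constraints.simplex.check(
    torch.tensor([[0.25, 0.75], [0.5, 0.6]])
)

print(ok_positive, ok_interval, ok_simplex)
```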

torch.distributions.constraints.cat: alias of _Cat

torch.distributions.constraints.dependent_property: alias of _DependentProperty

torch.distributions.constraints.greater_than: alias of _GreaterThan

torch.distributions.constraints.greater_than_eq: alias of _GreaterThanEq

torch.distributions.constraints.independent: alias of _IndependentConstraint

torch.distributions.constraints.integer_interval: alias of _IntegerInterval

torch.distributions.constraints.interval: alias of _Interval

torch.distributions.constraints.half_open_interval: alias of _HalfOpenInterval

torch.distributions.constraints.is_dependent(constraint)[source]

Checks if constraint is a _Dependent object.

Parameters: constraint – A Constraint object.
Returns: True if constraint can be refined to the type _Dependent, False otherwise.
Return type: bool

Examples

>>> import torch
>>> from torch.distributions import Bernoulli
>>> from torch.distributions.constraints import is_dependent

>>> dist = Bernoulli(probs = torch.tensor([0.6], requires_grad=True))
>>> constraint1 = dist.arg_constraints["probs"]
>>> constraint2 = dist.arg_constraints["logits"]

>>> for constraint in [constraint1, constraint2]:
>>>     if is_dependent(constraint):
>>>         continue

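torch.distributions.constraints.less_than: alias of _LessThan

torch.distributions.constraints.multinomial: alias of _Multinomial

torch.distributions.constraints.stack: alias of _Stack

Constraint Registry

PyTorch provides two global ConstraintRegistry objects that link Constraint objects to Transform objects. These objects both input constraints and return transforms, but they have different guarantees on bijectivity.

biject_to(constraint) looks up a bijective Transform from constraints.real to the given constraint. The returned transform is guaranteed to have .bijective = True and should implement .log_abs_det_jacobian().
transform_to(constraint) looks up a not-necessarily bijective Transform from constraints.real to the given constraint. The returned transform is not guaranteed to implement .log_abs_det_jacobian().

The transform_to() registry is useful for performing unconstrained optimization on constrained parameters of probability distributions, which are indicated by each distribution's .arg_constraints dict. These transforms often overparameterize a space in order to avoid rotation; they are thus more suitable for coordinate-wise optimization algorithms like Adam: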

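import torch
from torch.distributions import Normal, transform_to

data = torch.randn(100)  # placeholder observations, assumed for illustration
loc = torch.zeros(100, requires_grad=True)
unconstrained = torch.zeros(100, requires_grad=True)
scale = transform_to(Normal.arg_constraints['scale'])(unconstrained)
loss = -Normal(loc, scale).log_prob(data).sum()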

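The biject_to() registry is useful for Hamiltonian Monte Carlo, where samples from a probability distribution with constrained .support are propagated in an unconstrained space, and algorithms are typically rotation invariant: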

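import torch
from torch.distributions import Exponential, biject_to

rate = torch.ones(100)  # placeholder rate parameter, assumed for illustration
dist = Exponential(rate)
unconstrained = torch.zeros(100, requires_grad=True)
sample = biject_to(dist.support)(unconstrained)
potential_energy = -dist.log_prob(sample).sum()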

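Note

An example where transform_to and biject_to differ is constraints.simplex: transform_to(constraints.simplex) returns a SoftmaxTransform that simply exponentiates and normalizes its inputs; this is a cheap and mostly coordinate-wise operation appropriate for algorithms like SVI. In contrast, biject_to(constraints.simplex) returns a StickBreakingTransform that bijects its input down to a one-fewer-dimensional space; this is a more expensive, less numerically stable transform, but it is needed for algorithms like HMC.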
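
The difference described in this note can be observed directly. In the sketch below, both registries are queried for constraints.simplex and applied to zero vectors (an arbitrary choice); the softmax version maps 3 inputs to 3 probabilities, while the stick-breaking version maps 2 inputs to 3 probabilities.

```python
import torch
from torch.distributions import biject_to, constraints, transform_to
from torch.distributions.transforms import (
    SoftmaxTransform,
    StickBreakingTransform,
)

t_soft = transform_to(constraints.simplex)
t_stick = biject_to(constraints.simplex)

p_soft = t_soft(torch.zeros(3))    # same dimension in and out
p_stick = t_stick(torch.zeros(2))  # output has one more dimension

print(type(t_soft).__name__, p_soft)
print(type(t_stick).__name__, p_stick)
```

For zero inputs both transforms happen to land on the uniform point of the simplex.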

The biject_to and transform_to objects can be extended by user-defined constraints and transforms using their .register() method, either as a function on singleton constraints:

transform_to.register(my_constraint, my_transform)

or as a decorator on parameterized constraints:

@transform_to.register(MyConstraintClass)
def my_factory(constraint):
    assert isinstance(constraint, MyConstraintClass)
    return MyTransform(constraint.param1, constraint.param2)

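You can create your own registry by creating a new ConstraintRegistry object.

class torch.distributions.constraint_registry.ConstraintRegistry[source]

Registry to link constraints to transforms.

register(constraint, factory=None)[source]

Registers a Constraint subclass in this registry. Usage: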

@my_registry.register(MyConstraintClass)
def construct_transform(constraint):
    assert isinstance(constraint, MyConstraintClass)
    return MyTransform(constraint.arg_constraints)

Parameters

constraint (subclass of Constraint) – A subclass of Constraint, or a singleton object of the desired class.
factory (Callable) – A callable that inputs a constraint object and returns a Transform object.
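
A minimal sketch of a private registry follows. The name my_registry and the choice of an exp-then-shift transform for greater_than constraints are assumptions made for illustration; the factory receives the constraint instance and reads its lower_bound.

```python
import torch
from torch.distributions import constraints
from torch.distributions.constraint_registry import ConstraintRegistry
from torch.distributions.transforms import (
    AffineTransform,
    ComposeTransform,
    ExpTransform,
)

my_registry = ConstraintRegistry()  # hypothetical user-owned registry


@my_registry.register(constraints.greater_than)
def _to_greater_than(constraint):
    # Map the reals to (lower_bound, inf) via y = exp(x) + lower_bound.
    return ComposeTransform(
        [ExpTransform(), AffineTransform(constraint.lower_bound, 1)]
    )


t = my_registry(constraints.greater_than(2.0))
y = t(torch.tensor([-3.0, 0.0, 3.0]))
print(y)
```

Registering into a separate ConstraintRegistry leaves the global biject_to and transform_to registries untouched.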
