torchaudio.prototype.pipelines

The pipelines subpackage provides APIs for accessing models with pretrained weights, along with related utilities.

RNN-T Streaming/Non-Streaming ASR

Pretrained Models

EMFORMER_RNNT_BASE_MUSTC

Pre-trained Emformer-RNNT-based ASR pipeline, with weights trained on the MuST-C dataset, capable of both streaming and non-streaming inference.

EMFORMER_RNNT_BASE_TEDLIUM3

Pre-trained Emformer-RNNT-based ASR pipeline, with weights trained on the TED-LIUM Release 3 dataset, capable of both streaming and non-streaming inference.
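As a rough illustration of non-streaming inference, the sketch below assumes these bundles expose the same interface as torchaudio.pipelines.RNNTBundle (get_feature_extractor, get_decoder, get_token_processor). The helper name transcribe and its beam_width default are hypothetical and not part of torchaudio.

```python
def transcribe(bundle, waveform, beam_width=10):
    """Non-streaming recognition with an RNNTBundle-style pipeline.

    `transcribe` is a hypothetical helper. `bundle` is assumed to follow
    the RNNTBundle interface, and `waveform` is assumed to be a 1-D mono
    signal sampled at `bundle.sample_rate`.
    """
    # Build the three pipeline stages from the bundle.
    feature_extractor = bundle.get_feature_extractor()
    decoder = bundle.get_decoder()
    token_processor = bundle.get_token_processor()

    # Waveform -> features and their length.
    features, length = feature_extractor(waveform)

    # Beam search returns hypotheses, best first; the first element of
    # each hypothesis holds the predicted tokens.
    hypotheses = decoder(features, length, beam_width)

    # Convert the tokens of the top hypothesis to text.
    return token_processor(hypotheses[0][0])
```

With torchaudio installed, the expected usage would be a call such as `transcribe(EMFORMER_RNNT_BASE_TEDLIUM3, waveform)` made inside `torch.no_grad()`. The pretrained weights are downloaded the first time the stages are built, so a real application would build them once and reuse them across calls.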

HiFiGAN Vocoder

Interface

HiFiGANVocoderBundle defines a HiFiGAN vocoder pipeline that transforms mel spectrograms into waveforms.

HiFiGANVocoderBundle

Data class that bundles the information needed to use a pretrained HiFiGANVocoder.

Pretrained Models

HIFIGAN_VOCODER_V3_LJSPEECH

HiFiGAN Vocoder pipeline, trained on The LJ Speech Dataset [Ito and Johnson, 2017].
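A minimal sketch of how the vocoder bundle might be used to resynthesize audio, assuming the bundle provides get_mel_transform() and get_vocoder() as described in the HiFiGANVocoderBundle interface. The helper name resynthesize is hypothetical.

```python
def resynthesize(bundle, waveform):
    """Waveform -> mel spectrogram -> waveform via a HiFiGAN bundle.

    `resynthesize` is a hypothetical helper. `waveform` is assumed to be
    sampled at `bundle.sample_rate`, for example a (channel, time)
    tensor as returned by torchaudio.load.
    """
    # Mel transform matching the features the vocoder was trained on.
    mel_transform = bundle.get_mel_transform()
    # Pretrained vocoder model.
    vocoder = bundle.get_vocoder()

    # Compute the mel spectrogram, then map it back to audio.
    mel_spec = mel_transform(waveform)
    return vocoder(mel_spec)
```

With torchaudio installed, `resynthesize(HIFIGAN_VOCODER_V3_LJSPEECH, waveform)` would be the expected call, ideally wrapped in `torch.no_grad()`. In a text-to-speech setting the mel spectrogram would instead come from an acoustic model, and only the vocoder step would be needed.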

VGGish

Interface

VGGishBundle

VGGish [Hershey et al., 2017] inference pipeline ported from torchvggish and tensorflow-models.

VGGishBundle.VGGish

Implementation of VGGish model [Hershey et al., 2017].

VGGishBundle.VGGishInputProcessor

Converts raw waveforms to batches of examples to use as inputs to VGGish.

Pretrained Models

VGGISH

Pre-trained VGGish [Hershey et al., 2017] inference pipeline ported from torchvggish and tensorflow-models.
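The two VGGishBundle components are meant to be chained: the input processor turns a raw waveform into a batch of examples, and the model maps that batch to embeddings. The sketch below assumes the bundle exposes get_input_processor() and get_model(). The helper name embed is hypothetical.

```python
def embed(bundle, waveform):
    """Compute VGGish embeddings for a waveform.

    `embed` is a hypothetical helper. `waveform` is assumed to be a 1-D
    mono signal already resampled to `bundle.sample_rate`.
    """
    # Converts the raw waveform into a batch of fixed-size examples.
    input_processor = bundle.get_input_processor()
    # Pretrained VGGish model.
    model = bundle.get_model()

    # One embedding is produced per example in the batch.
    examples = input_processor(waveform)
    return model(examples)
```

With torchaudio installed, `embed(VGGISH, waveform)` would be the expected call. Audio recorded at another rate can be converted first with `torchaudio.functional.resample(waveform, orig_rate, VGGISH.sample_rate)`.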
