TorchX

TorchX is an SDK for quickly building and deploying ML applications from R&D to production. It offers various builtin components that encode MLOps best practices and make advanced features like distributed training and hyperparameter optimization accessible to all. Users can get started with TorchX with no added setup cost since it supports popular ML schedulers and pipeline orchestrators that are already widely adopted and deployed in production.

No two production environments are the same. To accommodate diverse use cases, TorchX’s core APIs allow extensive customization at well-defined extension points, so that even the most unique applications can be served without customizing the entire vertical stack.

GETTING STARTED? First learn the basic concepts and follow the quickstart guide.

[Diagram: TorchX overview (torchx_index_diag.png)]

In 1-2-3

01 DEFINE OR CHOOSE Start by writing a component – a Python function that returns an AppDef object for your application. Or choose one of the builtin components.

02 RUN AS A JOB Once you’ve defined or chosen a component, you can run it by submitting it as a job to one of the supported Schedulers. TorchX supports several popular ones, such as Kubernetes and SLURM, out of the box.

03 CONVERT TO PIPELINE In production, components are often run as a workflow (aka pipeline). TorchX components can be converted to pipeline stages by passing them through the torchx.pipelines adapter. Pipelines lists the pipeline orchestrators supported out of the box.

Documentation

Components Library

Runtime Library

Application (Runtime)

Works With

Schedulers

Pipelines

Reference

Experimental

Experimental Features
