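Custom Components¶
This guide shows how to build a simple app and a custom component spec, then launch it via two different schedulers.
See the Quickstart Guide for installation and basic usage.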
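Hello World¶
Let's start by writing a simple "Hello World" Python app. This is just a normal Python program and can contain anything you'd like.
Note
This example uses the Jupyter Notebook %%writefile magic to create local files for demonstration purposes. In normal usage these would be standalone files.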
[1]:
%%writefile my_app.py
import sys
import argparse


def main(user: str) -> None:
    print(f"Hello, {user}!")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Hello world app"
    )
    parser.add_argument(
        "--user",
        type=str,
        help="the person to greet",
        required=True,
    )
    args = parser.parse_args(sys.argv[1:])
    main(args.user)
Overwriting my_app.py
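Before wrapping the app in a component, it can help to check the argument handling in-process. The following is a minimal sketch, not part of the original app: it repeats the same `main` and parser, and the helper names `build_parser` and `run` are hypothetical additions for illustration.

```python
import argparse
import contextlib
import io
from typing import Sequence


def main(user: str) -> None:
    print(f"Hello, {user}!")


def build_parser() -> argparse.ArgumentParser:
    # Same parser as in my_app.py, wrapped in a function (hypothetical helper).
    parser = argparse.ArgumentParser(description="Hello world app")
    parser.add_argument(
        "--user",
        type=str,
        help="the person to greet",
        required=True,
    )
    return parser


def run(argv: Sequence[str]) -> str:
    """Parse argv, call main, and return whatever main printed."""
    args = build_parser().parse_args(list(argv))
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        main(args.user)
    return buffer.getvalue()


print(run(["--user", "your name"]), end="")
```

Because `--user` is marked `required=True`, calling `run([])` makes argparse exit with a usage error instead of greeting anyone.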
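Now that we have an app, we can write the component file for it. A component is a function that lets us reuse and share our app in a user-friendly way.
We can use this component from the torchx command-line interface (CLI) or programmatically as part of a pipeline.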
[2]:
%%writefile my_component.py
import torchx.specs as specs


def greet(user: str, image: str = "my_app:latest") -> specs.AppDef:
    return specs.AppDef(
        name="hello_world",
        roles=[
            specs.Role(
                name="greeter",
                image=image,
                entrypoint="python",
                args=[
                    "-m", "my_app",
                    "--user", user,
                ],
            )
        ],
    )
Overwriting my_component.py
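To make the role definition more concrete, the sketch below shows the command that the `entrypoint` and `args` fields add up to. It deliberately avoids importing torchx: `RoleSketch` is a hypothetical stand-in that only mirrors the four fields used in `greet`, and `command_line` is an illustrative helper, not a TorchX API.

```python
import shlex
from dataclasses import dataclass, field
from typing import List


@dataclass
class RoleSketch:
    """Hypothetical stand-in for the specs.Role fields used in this guide."""

    name: str
    image: str
    entrypoint: str
    args: List[str] = field(default_factory=list)


def command_line(role: RoleSketch) -> str:
    """Join the entrypoint and its args into one shell-quoted command string."""
    return shlex.join([role.entrypoint, *role.args])


greeter = RoleSketch(
    name="greeter",
    image="my_app:latest",
    entrypoint="python",
    args=["-m", "my_app", "--user", "your name"],
)

print(command_line(greeter))
# prints: python -m my_app --user 'your name'
```

The `image` field does not appear in the command itself; it names the environment in which that command is expected to run.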
We can execute our component via torchx run. The local_cwd scheduler executes the component relative to the current directory.
[3]:
%%sh
torchx run --scheduler local_cwd my_component.py:greet --user "your name"
torchx 2023-04-04 01:08:09 INFO loaded configs from /home/ubuntu/rsync/torchx/docs/source/.torchxconfig
torchx 2023-04-04 01:08:09 INFO Tracker configurations: {}
torchx 2023-04-04 01:08:09 INFO Log directory not set in scheduler cfg. Creating a temporary log dir that will be deleted on exit. To preserve log directory set the `log_dir` cfg option
torchx 2023-04-04 01:08:09 INFO Log directory is: /tmp/torchx_aa14fi1s
torchx 2023-04-04 01:08:09 INFO Waiting for the app to finish...
greeter/0 Hello, your name!
torchx 2023-04-04 01:08:10 INFO Job finished: SUCCEEDED
local_cwd://torchx/hello_world-qtk1nrbcghqcsc
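The last line of the output is the handle of the launched app. Judging only from the handles printed in this guide, it appears to follow a `scheduler://session/app_id` layout. The sketch below splits a handle on that assumption; `HandleParts` and `split_handle` are hypothetical names, and TorchX may provide its own parser that should be preferred in real code.

```python
from typing import NamedTuple


class HandleParts(NamedTuple):
    scheduler: str
    session: str
    app_id: str


def split_handle(handle: str) -> HandleParts:
    """Split 'scheduler://session/app_id' into its three parts.

    The layout is an assumption inferred from the handles shown in this guide.
    """
    scheduler, sep, rest = handle.partition("://")
    session, slash, app_id = rest.partition("/")
    if not (sep and slash and scheduler and session and app_id):
        raise ValueError(f"unexpected handle format: {handle!r}")
    return HandleParts(scheduler, session, app_id)


parts = split_handle("local_cwd://torchx/hello_world-qtk1nrbcghqcsc")
print(parts.scheduler, parts.session, parts.app_id)
```

Plain string partitioning is used here rather than `urllib.parse`, because scheduler names such as `local_cwd` contain an underscore, which is not a valid character in a URL scheme.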
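If we want to run in other environments, we can build a Docker container so that our component can run in Docker-enabled environments such as Kubernetes, or via the local Docker scheduler.
Note
This requires Docker to be installed and won't work in environments such as Google Colab. If you have not done so already, follow the install instructions at: https://docs.docker.com/get-docker/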
[4]:
%%writefile Dockerfile.custom
FROM ghcr.io/pytorch/torchx:0.1.0rc1
ADD my_app.py .
Overwriting Dockerfile.custom
Once the Dockerfile is in place, we can build our Docker image.
[5]:
%%sh
docker build -t my_app:latest -f Dockerfile.custom .
Step 1/2 : FROM ghcr.io/pytorch/torchx:0.1.0rc1
---> 3dbec59e8049
Step 2/2 : ADD my_app.py .
---> Using cache
---> a3515c76f647
Successfully built a3515c76f647
Successfully tagged my_app:latest
We can then launch it with the local_docker scheduler.
[6]:
%%sh
torchx run --scheduler local_docker my_component.py:greet --image "my_app:latest" --user "your name"
torchx 2023-04-04 01:08:11 INFO loaded configs from /home/ubuntu/rsync/torchx/docs/source/.torchxconfig
torchx 2023-04-04 01:08:11 INFO Tracker configurations: {}
torchx 2023-04-04 01:08:11 INFO Checking for changes in workspace `file:///home/ubuntu/rsync/torchx/docs/source`...
torchx 2023-04-04 01:08:11 INFO To disable workspaces pass: --workspace="" from CLI or workspace=None programmatically.
torchx 2023-04-04 01:08:11 INFO Workspace `file:///home/ubuntu/rsync/torchx/docs/source` resolved to filesystem path `/home/ubuntu/rsync/torchx/docs/source`
torchx 2023-04-04 01:08:12 WARNING failed to pull image my_app:latest, falling back to local: 404 Client Error for http+docker://localhost/v1.41/images/create?tag=latest&fromImage=my_app: Not Found ("pull access denied for my_app, repository does not exist or may require 'docker login': denied: requested access to the resource is denied")
torchx 2023-04-04 01:08:12 INFO Building workspace docker image (this may take a while)...
torchx 2023-04-04 01:08:13 INFO Built new image `sha256:f196aed28eab30f8d6ad39f2fcd6d11d7795b37e2e9378fcb401c85fd99041f0` based on original image `my_app:latest` and changes in workspace `file:///home/ubuntu/rsync/torchx/docs/source` for role[0]=greeter.
torchx 2023-04-04 01:08:13 INFO Waiting for the app to finish...
greeter/0 Hello, your name!
torchx 2023-04-04 01:08:14 INFO Job finished: SUCCEEDED
local_docker://torchx/hello_world-w7g3mtpg2vwlq
If you have a Kubernetes cluster, you can use the Kubernetes scheduler to launch this on the cluster instead.
$ docker push my_app:latest
$ torchx run --scheduler kubernetes my_component.py:greet --image "my_app:latest" --user "your name"
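The launch commands in this guide differ only in the scheduler name. As a rough illustration, the sketch below assembles the same `torchx run` argument list for different schedulers using only the flags shown above. The helper `torchx_run_argv` is a hypothetical name introduced here, not part of TorchX, and the sketch only builds and prints the commands without executing them.

```python
import shlex
from typing import Dict, List


def torchx_run_argv(
    scheduler: str,
    component: str,
    component_args: Dict[str, str],
) -> List[str]:
    """Build the argv for `torchx run` with the given scheduler and component."""
    argv = ["torchx", "run", "--scheduler", scheduler, component]
    # Component arguments follow the component name, one --flag per entry.
    for name, value in component_args.items():
        argv += [f"--{name}", value]
    return argv


greet_args = {"image": "my_app:latest", "user": "your name"}

for scheduler in ("local_docker", "kubernetes"):
    argv = torchx_run_argv(scheduler, "my_component.py:greet", greet_args)
    print(shlex.join(argv))
```

To actually launch, the resulting list could be passed to `subprocess.run(argv, check=True)`, assuming the `torchx` executable is on the PATH and the chosen scheduler is reachable.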
Builtins¶
TorchX also provides a number of builtin components with prebuilt Docker images. You can discover them via:
[7]:
%%sh
torchx builtins
Found 11 builtin components:
1. dist.ddp
2. dist.spmd
3. metrics.tensorboard
4. serve.torchserve
5. utils.binary
6. utils.booth
7. utils.copy
8. utils.echo
9. utils.python
10. utils.sh
11. utils.touch
You can use these builtins from the CLI, from a pipeline, or programmatically, just like any other component.
[8]:
%%sh
torchx run utils.echo --msg "Hello :)"
torchx 2023-04-04 01:08:16 INFO loaded configs from /home/ubuntu/rsync/torchx/docs/source/.torchxconfig
torchx 2023-04-04 01:08:16 INFO Tracker configurations: {}
torchx 2023-04-04 01:08:16 INFO Checking for changes in workspace `file:///home/ubuntu/rsync/torchx/docs/source`...
torchx 2023-04-04 01:08:16 INFO To disable workspaces pass: --workspace="" from CLI or workspace=None programmatically.
torchx 2023-04-04 01:08:16 INFO Workspace `file:///home/ubuntu/rsync/torchx/docs/source` resolved to filesystem path `/home/ubuntu/rsync/torchx/docs/source`
torchx 2023-04-04 01:08:17 INFO Building workspace docker image (this may take a while)...
torchx 2023-04-04 01:08:17 INFO Built new image `sha256:f196aed28eab30f8d6ad39f2fcd6d11d7795b37e2e9378fcb401c85fd99041f0` based on original image `ghcr.io/pytorch/torchx:0.5.0` and changes in workspace `file:///home/ubuntu/rsync/torchx/docs/source` for role[0]=echo.
torchx 2023-04-04 01:08:17 INFO Waiting for the app to finish...
torchx 2023-04-04 01:08:17 INFO Job finished: SUCCEEDED
echo/0 Hello :)
local_docker://torchx/echo-dlpgsqs33bh1hc