Train
Training machine learning models often requires a custom training loop and custom code. As such, we don't provide an out-of-the-box training loop app. We do, however, have examples that show how you can construct your training app, as well as generic components you can use to run it.
Note
Follow the Prerequisites of running examples before you run the examples.
Check out the code for the Trainer App Example. You can try it out by running a single trainer example on your desktop:
python torchx/examples/apps/lightning_classy_vision/train.py
TorchX simplifies application execution by providing simple-to-use APIs that standardize how applications run in local or remote environments. It does this by introducing the concept of a Component.
Each user application should be accompanied by a corresponding component. Check out the single node trainer code: Trainer Component
Try it out yourself:
torchx run -s local_cwd ./torchx/examples/apps/lightning_classy_vision/component.py:trainer
The command above executes a single trainer on your desktop. If you have Docker installed, you can run the same single trainer with the following command:
torchx run -s local_docker ./torchx/examples/apps/lightning_classy_vision/component.py:trainer
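The two commands above differ only in the scheduler passed to -s. As a minimal sketch, assuming you want to script these runs from Python, the helper below assembles the same command line for either scheduler. The name build_torchx_run_cmd is hypothetical and not part of TorchX; the sketch only formats the arguments shown above and does not call any TorchX Python API.

```python
import shlex
from typing import List

# Component path copied from the commands above.
TRAINER_COMPONENT = "./torchx/examples/apps/lightning_classy_vision/component.py:trainer"


def build_torchx_run_cmd(scheduler: str, component: str = TRAINER_COMPONENT) -> List[str]:
    """Return the argument list for `torchx run -s <scheduler> <component>`.

    Hypothetical helper for illustration only; it is not part of TorchX.
    """
    return ["torchx", "run", "-s", scheduler, component]


if __name__ == "__main__":
    for scheduler in ("local_cwd", "local_docker"):
        # Print the command instead of running it, so the sketch has no side effects.
        print(shlex.join(build_torchx_run_cmd(scheduler)))
```

With TorchX installed and the prerequisites met, the returned list could be passed to subprocess.run(cmd, check=True) to launch the trainer.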
You can learn more about authoring your own components: torchx.components
TorchX also simplifies the execution of distributed jobs. You can learn more about them: torchx.components.dist