Skip to main content
Two pages take you from a fresh machine to a running RL training job:
  • Installation — Docker (recommended), pip, or building from source on NVIDIA / AMD.
  • Quick Startdocker pull to a GRPO run on Qwen3-4B in under an hour, on a single 8-GPU node.
After the loop is running, Models covers the family-specific recipes and the User Guide walks through concepts, data, monitoring, and customization.