Example launchers
The canonical LoRA recipes live underexamples/lora/ in
the miles repo:
examples/lora/run-qwen2.5-0.5B-megatron-lora.sh— small dense model, single GPU.examples/lora/run-qwen3-4B-megatron-lora.sh— Qwen3-4B, RL with LoRA.examples/lora/run-gpt-oss-20B-megatron-moe-lora.sh— MoE example.
Key flags
| Flag | Purpose |
|---|---|
--lora-rank | LoRA rank. Typical values: 8, 16, 32, 64. |
--lora-alpha | LoRA alpha. Usually 2 x rank. |
--lora-dropout | Dropout on the LoRA path. Set to 0.0 for RL training. |
--lora-type | LoRA variant: lora (merged QKV / gated-MLP) or canonical_lora (split Q / K / V). Default lora. |
--target-modules | Which linear layers receive adapters. Required when --lora-rank > 0. Accepts all-linear or a comma-separated list (HF names like q_proj,k_proj,v_proj,o_proj or Megatron names like linear_qkv,linear_proj). |
--exclude-modules | Comma-separated names to subtract from --target-modules. |
--lora-adapter-path | Path to a pre-trained adapter to resume from. |
--lora-sync-from-tensor | Sync adapter weights to SGLang via in-memory tensors instead of a file round-trip. |
MoE
For MoE models, attach LoRA to the FFN expert projections and switch the SGLang LoRA backend to triton:Current LoRA backend does not support LoRA on MoE layers; skipping MoE layer,
which means the expert adapters get silently dropped at inference time. The
GPT-OSS-20B example launcher sets --sglang-lora-backend triton for this
reason.
Compatibility and limitations
- Training backend: Megatron only. The FSDP backend does not have a LoRA path yet.
- Rollout topology: colocate only. Distributed / PD-disaggregated rollout
raises
NotImplementedErrorat weight-sync time when LoRA is enabled. - Algorithms: orthogonal to the advantage estimator; the GRPO recipes in
examples/lora/carry straight over to PPO and any other algorithm that drivestrain.py. - Low-precision training: the LoRA branch follows the surrounding precision, so block-wise FP8, MXFP8, and INT4 QAT recipes are compatible. See Low Precision RL and INT4 QAT.
- Target modules:
--target-modulesis required whenever--lora-rank > 0. There is no auto-detection; the launcher asserts at startup. - Single adapter per run: only one set of
--lora-*arguments is honored per training job. Training multiple LoRA adapters in parallel within a singletrain.pyrun is not implemented today — run separate jobs if you need multiple adapters.
Internals
The bridge between Megatron’s LoRA path and SGLang adapter loading is in:miles/backends/megatron_utils/lora_utils.py— argument parsing helpers, LoRA detection (is_lora_enabled,is_lora_model), and HF ↔ Megatron module-name conversion for both theloraandcanonical_loravariants.miles/backends/megatron_utils/bridge_lora_helpers.py— the Megatron-Bridge PEFT hook that wraps the model with LoRA layers before training.miles/backends/megatron_utils/checkpoint.py— adapter-aware save and load.miles/backends/megatron_utils/update_weight/update_weight_from_tensor.py— colocate-mode weight sync from the trainer’s LoRA tensors into the SGLang rollout engine. Disaggregate-mode weight sync is not supported yet.

