LoRA Training and Serving

Miles supports LoRA adapters for both SFT and RL recipes. Adapters trained by miles load directly into SGLang for rollout, so there is no separate merge or conversion step in the training-serving loop.

Example launchers

The canonical LoRA recipes live under examples/lora/ in the miles repo:

examples/lora/run-qwen2.5-0.5B-megatron-lora.sh — small dense model, single GPU.
examples/lora/run-qwen3-4B-megatron-lora.sh — Qwen3-4B, RL with LoRA.
examples/lora/run-gpt-oss-20B-megatron-moe-lora.sh — MoE example.

Key flags

Flag	Purpose
`--lora-rank`	LoRA rank. Typical values: 8, 16, 32, 64.
`--lora-alpha`	LoRA alpha. Usually 2 x rank.
`--lora-dropout`	Dropout on the LoRA path. Set to `0.0` for RL training.
`--lora-type`	LoRA variant: `lora` (merged QKV / gated-MLP) or `canonical_lora` (split Q / K / V). Default `lora`.
`--target-modules`	Which linear layers receive adapters. Required when `--lora-rank > 0`. Accepts `all-linear` or a comma-separated list (HF names like `q_proj,k_proj,v_proj,o_proj` or Megatron names like `linear_qkv,linear_proj`).
`--exclude-modules`	Comma-separated names to subtract from `--target-modules`.
`--lora-adapter-path`	Path to a pre-trained adapter to resume from.
`--lora-sync-from-tensor`	Sync adapter weights to SGLang via in-memory tensors instead of a file round-trip.

Two existing arguments are easy to miss when configuring LoRA:

--megatron-to-hf-mode bridge is required. The LoRA path goes through Megatron-Bridge’s PEFT integration; the default raw converter does not understand LoRA layers.
--colocate is required. Distributed (PD-disaggregated) rollout with LoRA is not supported today.

MoE

For MoE models, attach LoRA to the FFN expert projections and switch the SGLang LoRA backend to triton:

LORA_ARGS=(
   --lora-rank 32
   --lora-alpha 32
   --lora-dropout 0.0
   --target-modules "gate_proj,up_proj,down_proj"
   --sglang-lora-backend triton  # required for MoE LoRA
   --megatron-to-hf-mode bridge
)

The default SGLang LoRA backend skips MoE layers and logs Current LoRA backend does not support LoRA on MoE layers; skipping MoE layer, which means the expert adapters get silently dropped at inference time. The GPT-OSS-20B example launcher sets --sglang-lora-backend triton for this reason.

Compatibility and limitations

Training backend: Megatron only. The FSDP backend does not have a LoRA path yet.
Rollout topology: colocate only. Distributed / PD-disaggregated rollout raises NotImplementedError at weight-sync time when LoRA is enabled.
Algorithms: orthogonal to the advantage estimator; the GRPO recipes in examples/lora/ carry straight over to PPO and any other algorithm that drives train.py.
Low-precision training: the LoRA branch follows the surrounding precision, so block-wise FP8, MXFP8, and INT4 QAT recipes are compatible. See Low Precision RL and INT4 QAT.
Target modules: --target-modules is required whenever --lora-rank > 0. There is no auto-detection; the launcher asserts at startup.
Single adapter per run: only one set of --lora-* arguments is honored per training job. Training multiple LoRA adapters in parallel within a single train.py run is not implemented today — run separate jobs if you need multiple adapters.

Internals

The bridge between Megatron’s LoRA path and SGLang adapter loading is in:

miles/backends/megatron_utils/lora_utils.py — argument parsing helpers, LoRA detection (is_lora_enabled, is_lora_model), and HF ↔ Megatron module-name conversion for both the lora and canonical_lora variants.
miles/backends/megatron_utils/bridge_lora_helpers.py — the Megatron-Bridge PEFT hook that wraps the model with LoRA layers before training.
miles/backends/megatron_utils/checkpoint.py — adapter-aware save and load.
miles/backends/megatron_utils/update_weight/update_weight_from_tensor.py — colocate-mode weight sync from the trainer’s LoRA tensors into the SGLang rollout engine. Disaggregate-mode weight sync is not supported yet.

A worked tutorial covering checkpoint conversion, SGLang adapter loading, and LoRA-specific evaluation will land here in a future doc pass.

​Example launchers

​Key flags

​MoE

​Compatibility and limitations

​Internals

Example launchers

Key flags

MoE

Compatibility and limitations

Internals