# LoRA Training Guide

Complete walkthrough for training character LoRA models in the Yggdrasil ecosystem. This guide covers the Ehyra Z-Image pipeline using ai-toolkit on an RTX 3060 12GB.
## Prerequisites

Before starting, ensure you have:

- GPU: NVIDIA RTX 3060 12GB or better (VRAM is the bottleneck)
- Python: 3.10+ with venv
- ai-toolkit: cloned and installed (`~/comfy/ai-toolkit/`)
- Training images: 20-40 high-quality reference images
- Disk space: at least 10GB free for checkpoints and output
:::warning Resources
Training uses ~8-9 GB VRAM on a 12GB card. Close other GPU processes (ComfyUI, games) before starting.
:::
## Step 1: Prepare Training Data

### Image Requirements

- Count: 20-40 images (more = better consistency, less flexibility)
- Resolution: 512×512 minimum, 768×768 preferred
- Variety: different poses, expressions, lighting, angles
- Quality: clean, high-resolution, no watermarks or text overlays
- Format: PNG or JPG
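As a quick pre-flight check, a sketch like the following can flag obvious dataset problems before training starts. `validate_dataset` and its thresholds are illustrative helpers, not part of ai-toolkit; checking actual image resolution would additionally require Pillow.

```python
from pathlib import Path

ALLOWED_EXTS = {".png", ".jpg", ".jpeg"}   # formats accepted per the list above
MIN_COUNT, MAX_COUNT = 20, 40              # recommended dataset size

def validate_dataset(filenames):
    """Check a list of image filenames against the guide's requirements.

    Returns a list of human-readable problems (empty list = dataset looks OK).
    """
    problems = []
    bad_ext = [f for f in filenames if Path(f).suffix.lower() not in ALLOWED_EXTS]
    if bad_ext:
        problems.append(f"unsupported formats: {bad_ext}")
    usable = len(filenames) - len(bad_ext)
    if usable < MIN_COUNT:
        problems.append(f"only {usable} usable image(s) (want {MIN_COUNT}-{MAX_COUNT})")
    elif usable > MAX_COUNT:
        problems.append(f"{usable} images may reduce flexibility (want {MIN_COUNT}-{MAX_COUNT})")
    return problems
```

Run it on a directory listing before captioning; an empty return value means the counts and formats are in range.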
### Directory Structure

```
~/comfy/ai-toolkit/
├── input/
│   └── ehyra_zimage_lora_v1/
│       ├── img_001.png
│       ├── img_002.png
│       └── ... (20-40 images)
├── output/
│   └── ehyra_zimage_lora_v1/
│       └── (checkpoints saved here)
└── config/
    └── ehyra_zimage_lora_v1.yaml
```
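The skeleton above can be created in one step with `pathlib`. This is a minimal sketch; `make_layout` is an illustrative helper, and the LoRA name is the example from this guide.

```python
from pathlib import Path

def make_layout(root, lora_name):
    """Create the input/output/config skeleton shown above under `root`.

    Returns the path where the training YAML should be written.
    """
    root = Path(root)
    for sub in (f"input/{lora_name}", f"output/{lora_name}", "config"):
        (root / sub).mkdir(parents=True, exist_ok=True)  # idempotent
    return root / "config" / f"{lora_name}.yaml"

cfg_path = make_layout(Path.home() / "comfy" / "ai-toolkit", "ehyra_zimage_lora_v1")
```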
:::tip Image Selection
Focus on images that clearly show the character's defining features. Remove images with heavy occlusion or unrelated backgrounds.
:::
### Captioning

Each image needs a `.txt` caption file with the same base filename. For DiT.2 (Z-Image Base):

```
img_001.txt → "anime girl with long white hair, cyan glowing eyes, circuit neon armor, dynamic pose, cyberpunk city background..."
```

Important: DiT.2 captions need descriptions of 30+ characters covering visual features, NOT artist names.
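A missing caption file is an easy mistake to make with 40 images in a folder. The sketch below flags unpaired images; `missing_captions` is an illustrative helper assuming the `img_001.png` + `img_001.txt` pairing described above.

```python
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg"}

def missing_captions(files):
    """Given the filenames in a dataset folder, return images lacking a .txt caption."""
    names = {Path(f).name for f in files}
    return sorted(
        f for f in names
        if Path(f).suffix.lower() in IMAGE_EXTS
        and Path(f).with_suffix(".txt").name not in names  # caption shares the base name
    )
```

Run it against `os.listdir()` of your input folder; an empty result means every image has a caption.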
## Step 2: Configure Training YAML

Create your training config at `config/your_lora_name.yaml`:
```yaml
# Ehyra Z-Image LoRA Training Config
task: text-to-image-lora
base_model: "Z-Image-Base/DiT.2"
output_dir: "output/ehyra_zimage_lora_v1"
save_dtype: bf16

# Network
network:
  type: lora
  rank: 32
  alpha: 16
  dropout: 0.1

# Training
train:
  batch_size: 1
  steps: 3000
  learning_rate: 1e-4
  lr_scheduler: cosine
  warmup_steps: 100
  gradient_checkpointing: true
  mixed_precision: bf16

# Data
data:
  dataset_type: image_folder
  dataset_img_dir: "input/ehyra_zimage_lora_v1"
  caption_extension: .txt
  resolution: [768, 768]

# Saving
save:
  save_every: 500
  save_format: safetensors

# Resume from checkpoint (uncomment to resume)
# resume: "output/ehyra_zimage_lora_v1/ehyra_zimage_lora_v1_2000.safetensors"
```
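A typo'd section name in the YAML is usually only reported once training starts. As a quick sanity check, the sketch below verifies that the required top-level sections are present; `check_config` is an illustrative helper using a deliberately naive line scan, and a real check would parse the file with PyYAML.

```python
def check_config(text):
    """Return required top-level YAML sections missing from `text` (empty = OK)."""
    required = {"task", "base_model", "output_dir", "network", "train", "data", "save"}
    present = {
        line.split(":")[0].strip()
        for line in text.splitlines()
        # top-level keys are unindented, non-comment lines containing a colon
        if line and not line[0].isspace()
        and not line.lstrip().startswith("#") and ":" in line
    }
    return sorted(required - present)
```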
### Key Parameters Explained

| Parameter | Recommended | Notes |
|---|---|---|
| `rank` | 16-64 | Higher = more detail, slower training |
| `alpha` | rank/2 | Controls learning magnitude |
| `steps` | 2000-4000 | More steps ≠ better (watch for overfitting) |
| `learning_rate` | 1e-4 to 5e-5 | Lower for fine adjustments |
| `batch_size` | 1-2 | 1 for 12GB VRAM, 2 for 24GB+ |
| `resolution` | 768×768 | Higher quality but more VRAM |
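The `alpha = rank/2` recommendation is easiest to understand via the common LoRA formulation, where the low-rank update is multiplied by `alpha / rank` before being added to the base weights. A one-line sketch:

```python
def lora_scale(rank, alpha):
    """Effective multiplier applied to the low-rank update (common LoRA convention: alpha / rank)."""
    return alpha / rank

# The config above (rank=32, alpha=16) gives a 0.5 scale,
# matching the alpha = rank/2 recommendation in the table.
print(lora_scale(32, 16))  # → 0.5
```

Because the update is divided by `rank`, raising `rank` alone does not make the LoRA "stronger"; it adds capacity, while `alpha` controls magnitude.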
## Step 3: Run Training

```bash
# Activate the environment
cd ~/comfy/ai-toolkit
source venv/bin/activate

# Start training
python run_lora.py --config config/ehyra_zimage_lora_v1.yaml
```
### Monitoring

- Loss progression: should decrease steadily; normal range is 0.2-0.8
- VRAM usage: ~8-9 GB on RTX 3060
- Time estimate: ~2-3 hours for 3000 steps on RTX 3060
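Raw per-step losses are noisy, so the downward trend is easier to see through an exponential moving average. This is a generic smoothing sketch, not an ai-toolkit feature; apply it to losses scraped from the training log.

```python
def smoothed_losses(losses, beta=0.9):
    """Exponential moving average of raw step losses, for eyeballing the trend.

    The smoothed curve should drift downward and stay roughly in the
    0.2-0.8 band mentioned above.
    """
    out, ema = [], None
    for x in losses:
        ema = x if ema is None else beta * ema + (1 - beta) * x
        out.append(ema)
    return out
```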
### Resuming Training

If training is interrupted, resume from the last checkpoint:

```yaml
# Add to your config YAML
resume: "output/ehyra_zimage_lora_v1/ehyra_zimage_lora_v1_2000.safetensors"
```
:::realm Resuming
Always use the `.safetensors` checkpoint path. Training will continue from exactly where it left off, preserving the learning-rate schedule.
:::
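With `save_every: 500` the output folder accumulates several checkpoints, so picking the most recent one for the `resume:` line can be scripted. `latest_checkpoint` is an illustrative helper assuming the `<name>_<step>.safetensors` naming shown above.

```python
import re
from pathlib import Path

def latest_checkpoint(filenames):
    """Return the highest-step .safetensors checkpoint, or None if none match."""
    best, best_step = None, -1
    for f in filenames:
        m = re.search(r"_(\d+)\.safetensors$", Path(f).name)  # trailing step number
        if m and int(m.group(1)) > best_step:
            best, best_step = f, int(m.group(1))
    return best
```

Point it at `os.listdir("output/ehyra_zimage_lora_v1")` and paste the result into the `resume:` field.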
## Step 4: Evaluate and Iterate

### Testing Your LoRA

1. Copy the final `.safetensors` file to your ComfyUI models directory
2. Use a testing workflow in ComfyUI with the LoRA loader node
3. Test with various prompts at different LoRA strengths (0.6-1.0)
4. Check for: character consistency, artifact reduction, style flexibility
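For step 3, it helps to test the same prompt at evenly spaced strengths rather than ad-hoc values. A small sketch for generating that sweep (an illustrative helper, not a ComfyUI API):

```python
def strength_sweep(start=0.6, stop=1.0, step=0.1):
    """LoRA strengths to test, covering the 0.6-1.0 range recommended above."""
    n = round((stop - start) / step)
    # round to 2 decimals to avoid float artifacts like 0.7000000000000001
    return [round(start + i * step, 2) for i in range(n + 1)]

print(strength_sweep())  # → [0.6, 0.7, 0.8, 0.9, 1.0]
```

Render the same seed and prompt at each strength, then compare the grid for where character consistency peaks before artifacts appear.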
### Common Issues
| Issue | Likely Cause | Fix |
|---|---|---|
| Burned/overexposed output | Too many steps | Reduce to 2000-2500 |
| Character not recognizable | Too few steps or low rank | Increase steps or rank |
| Artifacts in details | Training data quality | Clean up dataset |
| LoRA too rigid | Overfitting | Reduce steps, increase dropout |
| VRAM overflow | Resolution too high | Drop to 512×512 or reduce batch |
## Pipeline Architecture

```
┌─────────────┐      ┌──────────────┐      ┌─────────────────┐
│  Reference  │─────▶│  Captioning  │─────▶│  Training YAML  │
│   Images    │      │ (.txt files) │      │  Configuration  │
│   (20-40)   │      │              │      │                 │
└─────────────┘      └──────────────┘      └────────┬────────┘
                                                    │
                                                    ▼
┌─────────────┐      ┌──────────────┐      ┌─────────────────┐
│    Final    │◀─────│  Evaluation  │◀─────│   ai-toolkit    │
│    LoRA     │      │  (ComfyUI)   │      │  Training Loop  │
│(.safetensors)│     │              │      │  (3000 steps)   │
└─────────────┘      └──────────────┘      └─────────────────┘
```
:::neon Pro Tip
Keep your best checkpoints! Save every 500 steps so you can pick the optimal one — the last step isn't always the best.
:::