
LoRA Training Guide

Complete walkthrough for training character LoRA models in the Yggdrasil ecosystem. This guide covers the Ehyra Z-Image pipeline using ai-toolkit on RTX 3060 12GB.

Prerequisites


Before starting, ensure you have:

  • GPU: NVIDIA RTX 3060 12GB or better (VRAM is the bottleneck)
  • Python: 3.10+ with venv
  • ai-toolkit: Cloned and installed (~/comfy/ai-toolkit/)
  • Training images: 20-40 high quality reference images
  • Disk space: At least 10GB free for checkpoints and output

:::warning Resources Training uses ~8-9 GB VRAM on a 12GB card. Close other GPU processes (ComfyUI, games) before starting. :::

Step 1: Prepare Training Data


Image Requirements

  • Count: 20-40 images (more = better consistency, less flexibility)
  • Resolution: 512×512 minimum, 768×768 preferred
  • Variety: Different poses, expressions, lighting, angles
  • Quality: Clean, high-resolution, no watermarks or text overlays
  • Format: PNG or JPG
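Before captioning, it helps to sanity-check resolutions in bulk. The sketch below (an illustration, not part of ai-toolkit) reads width and height directly from the PNG IHDR header, so it needs no imaging library; the `check_image` helper and the 512 px floor mirror the minimum above.

```python
import struct

def png_dimensions(data: bytes) -> tuple[int, int]:
    """Read (width, height) from a PNG's IHDR chunk (bytes 16-24)."""
    if data[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG file")
    return struct.unpack(">II", data[16:24])

def check_image(path: str, min_side: int = 512) -> bool:
    """True if the PNG at `path` meets the minimum training resolution."""
    with open(path, "rb") as f:
        w, h = png_dimensions(f.read(24))
    return min(w, h) >= min_side
```

Run `check_image` over everything in `input/` and drop or upscale anything that fails before training.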

Directory Structure

```
~/comfy/ai-toolkit/
├── input/
│   └── ehyra_zimage_lora_v1/
│       ├── img_001.png
│       ├── img_002.png
│       └── ... (20-40 images)
├── output/
│   └── ehyra_zimage_lora_v1/
│       └── (checkpoints saved here)
└── config/
    └── ehyra_zimage_lora_v1.yaml
```

:::tip Image Selection Focus on images that clearly show the character's defining features. Remove images with heavy occlusion or unrelated backgrounds. :::

Captioning

Each image needs a .txt caption file with the same base filename. For DiT.2 (Z-Image Base):

ehyra_001.txt → "anime girl with long white hair, cyan glowing eyes, circuit neon armor, dynamic pose, cyberpunk city background..."

Important: DiT.2 captions need 30+ characters describing visual features, NOT artist names.
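A quick pre-flight check helps here: the hypothetical helper below walks the input folder and flags images whose caption is missing or shorter than the 30-character guideline. It assumes PNG inputs and the `basename.txt` caption convention.

```python
from pathlib import Path

MIN_CHARS = 30  # guideline from above: 30+ characters of visual description

def missing_or_short_captions(img_dir: str) -> list[str]:
    """Return image files whose .txt caption is absent or too short."""
    problems = []
    for img in sorted(Path(img_dir).glob("*.png")):
        cap = img.with_suffix(".txt")  # e.g. ehyra_001.png -> ehyra_001.txt
        if not cap.exists() or len(cap.read_text().strip()) < MIN_CHARS:
            problems.append(img.name)
    return problems
```

An empty return list means every image is captioned and meets the length guideline.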

Step 2: Configure Training YAML


Create your training config at config/your_lora_name.yaml:

```yaml
# Ehyra Z-Image LoRA Training Config
task: text-to-image-lora
base_model: "Z-Image-Base/DiT.2"
output_dir: "output/ehyra_zimage_lora_v1"
save_dtype: bf16

# Network
network:
  type: lora
  rank: 32
  alpha: 16
  dropout: 0.1

# Training
train:
  batch_size: 1
  steps: 3000
  learning_rate: 1e-4
  lr_scheduler: cosine
  warmup_steps: 100
  gradient_checkpointing: true
  mixed_precision: bf16

# Data
data:
  dataset_type: image_folder
  dataset_img_dir: "input/ehyra_zimage_lora_v1"
  caption_extension: .txt
  resolution: [768, 768]

# Saving
save:
  save_every: 500
  save_format: safetensors

# Resume from checkpoint (uncomment to resume)
# resume: "output/ehyra_zimage_lora_v1/ehyra_zimage_lora_v1_2000.safetensors"
```

Key Parameters Explained

| Parameter | Recommended | Notes |
|---|---|---|
| `rank` | 16-64 | Higher = more detail, slower training |
| `alpha` | rank/2 | Controls learning magnitude |
| `steps` | 2000-4000 | More steps ≠ better (watch for overfitting) |
| `learning_rate` | 1e-4 to 5e-5 | Lower for fine adjustments |
| `batch_size` | 1-2 | 1 for 12GB VRAM, 2 for 24GB+ |
| `resolution` | 768×768 | Higher quality but more VRAM |
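To see how `learning_rate`, `warmup_steps`, and `lr_scheduler: cosine` interact, here is the standard warmup-plus-cosine-decay formula as a sketch; ai-toolkit's actual scheduler may differ in details.

```python
import math

def lr_at(step: int, base_lr: float = 1e-4,
          warmup: int = 100, total: int = 3000) -> float:
    """Linear warmup to base_lr, then cosine decay toward zero."""
    if step < warmup:
        return base_lr * step / warmup          # linear ramp over warmup_steps
    progress = (step - warmup) / (total - warmup)
    return base_lr * 0.5 * (1 + math.cos(math.pi * progress))
```

The learning rate peaks at step 100 (1e-4 here) and glides to ~0 by the final step, which is why resuming mid-run should preserve the step counter (see Resuming below).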

Step 3: Run Training

```bash
# Activate the environment
cd ~/comfy/ai-toolkit
source venv/bin/activate

# Start training
python run_lora.py --config config/ehyra_zimage_lora_v1.yaml
```

Monitoring

  • Loss progression: Should decrease steadily, normal range 0.2-0.8
  • VRAM usage: ~8-9 GB on RTX 3060
  • Time estimate: ~2-3 hours for 3000 steps on RTX 3060
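Raw per-step loss is noisy, so "decreasing steadily" is easiest to judge on a smoothed curve. A minimal bias-corrected exponential moving average, the same idea most training dashboards use:

```python
def ema(losses, beta: float = 0.98):
    """Smooth a noisy loss curve with an exponential moving average,
    applying bias correction so early values aren't dragged toward zero."""
    avg, out = 0.0, []
    for t, loss in enumerate(losses, start=1):
        avg = beta * avg + (1 - beta) * loss
        out.append(avg / (1 - beta ** t))  # bias-corrected estimate
    return out
```

If the smoothed curve plateaus well inside the 0.2-0.8 band and stops improving, extra steps are mostly overfitting risk.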

Resuming Training

If training is interrupted, resume from the last checkpoint:

```yaml
# Add to your config YAML
resume: "output/ehyra_zimage_lora_v1/ehyra_zimage_lora_v1_2000.safetensors"
```

:::realm Resuming Always use the .safetensors checkpoint path. Training continues from where it left off, preserving the learning-rate schedule. :::
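To double-check which step a checkpoint represents before resuming, the filename suffix can be parsed directly. This sketch assumes the `name_step.safetensors` pattern shown above:

```python
import re

def checkpoint_step(path: str) -> int:
    """Extract the trailing step count from a checkpoint filename,
    e.g. 'ehyra_zimage_lora_v1_2000.safetensors' -> 2000."""
    m = re.search(r"_(\d+)\.safetensors$", path)
    if not m:
        raise ValueError(f"no step suffix in {path!r}")
    return int(m.group(1))
```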

Step 4: Evaluate and Iterate


Testing Your LoRA

  1. Copy the final .safetensors file to your ComfyUI models directory
  2. Use a testing workflow in ComfyUI with the LoRA loader node
  3. Test with various prompts at different LoRA strengths (0.6-1.0)
  4. Check for: character consistency, artifact reduction, style flexibility
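The strength sweep in step 3 is easy to script as a grid. A small sketch (the helper name `sweep` is illustrative) that pairs each test prompt with evenly spaced strengths across the 0.6-1.0 range:

```python
from itertools import product

def sweep(prompts, lo: float = 0.6, hi: float = 1.0, n: int = 5):
    """Build a (prompt, lora_strength) test grid over n evenly
    spaced strengths in [lo, hi]."""
    strengths = [round(lo + i * (hi - lo) / (n - 1), 2) for i in range(n)]
    return list(product(prompts, strengths))
```

Feeding each pair into your ComfyUI testing workflow makes consistency regressions at specific strengths easy to spot.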

Common Issues

| Issue | Likely Cause | Fix |
|---|---|---|
| Burned/overexposed output | Too many steps | Reduce to 2000-2500 |
| Character not recognizable | Too few steps or low rank | Increase steps or rank |
| Artifacts in details | Training data quality | Clean up dataset |
| LoRA too rigid | Overfitting | Reduce steps, increase dropout |
| VRAM overflow | Resolution too high | Drop to 512×512 or reduce batch |

Pipeline Architecture

```
┌─────────────┐     ┌──────────────┐     ┌─────────────────┐
│  Reference  │────▶│  Captioning  │────▶│  Training YAML  │
│   Images    │     │ (.txt files) │     │  Configuration  │
│   (20-40)   │     │              │     │                 │
└─────────────┘     └──────────────┘     └────────┬────────┘
                                                  │
                                                  ▼
┌──────────────┐    ┌──────────────┐     ┌─────────────────┐
│    Final     │◀───│  Evaluation  │◀────│   ai-toolkit    │
│     LoRA     │    │  (ComfyUI)   │     │  Training Loop  │
│(.safetensors)│    │              │     │  (3000 steps)   │
└──────────────┘    └──────────────┘     └─────────────────┘
```

:::neon Pro Tip Keep your best checkpoints! Save every 500 steps so you can pick the optimal one — the last step isn't always the best. :::