Stop Training Your LoRAs from Scratch... You're Making It Harder Than It Needs to Be
Hundreds of hours of testing. Hundreds of dollars in cloud compute. Conversations with the Stability AI team themselves. And the single biggest insight from all of it? Most people are overcomplicating LoRA training by ignoring what the model already knows.
The Old Way Is the Wrong Way
If you've trained a LoRA before, you probably used a rare token like `ohwx` as your instance prompt. It's what 99% of tutorials teach. The logic sounds clean... pick a meaningless word, train the model to associate it with your character, start fresh.
Except you're not starting fresh. You're starting from nothing. And nothing is a terrible foundation.
After extensive testing and direct consultation with the Stability AI team, creator Aitrepreneur discovered something that flips the entire process: use the name of a celebrity the model already recognizes.
Stable Diffusion XL already carries knowledge. It already has a rough sense of thousands of faces, styles, and compositions baked into its weights. When you train a LoRA on a rare token, you're asking it to build understanding from zero. When you train on a celebrity name the model already knows... you're giving it a running start.
In that testing, every single model trained on a real celebrity token outperformed models trained on rare tokens. Every. Single. One.
But What If the Model Doesn't Know Your Person?
This is where it gets practical.
Say you want to train a LoRA on someone the model has never seen... yourself, a friend, an actor from a newer show. You prompt their name into SDXL and get back... nothing recognizable. No resemblance. No foundation to build on.
The solution is beautifully simple. Find a celebrity who looks like your target character. Use a tool like starbyface.com to upload a photo, and it spits back the closest celebrity match. Then verify that SDXL actually recognizes that celebrity by generating a few test images. If the resemblance is there... that's your instance prompt.
You're not stealing someone's likeness. You're borrowing a starting position. The training images you provide will push the model toward your actual target. But you've given LoRA training a massive head start by saying "start here" instead of "start nowhere."
Stop Cropping Your Images
Here's the second piece of conventional wisdom that testing demolished.
Traditional advice says crop your training images to 1024×1024 squares. Sites like birme.net exist specifically for this. Sounds reasonable... give the model clean, uniform inputs.
But the data says otherwise. Uncropped, high-resolution images consistently produced better models. Better resemblance. More flexibility in outputs. More natural compositions.
The Kohya SS GUI handles variable aspect ratios through its bucketing system. That's the technical reason it works... the software sorts your images into resolution buckets and trains accordingly. The practical takeaway? Spend your energy finding sharp, high-resolution, varied images. Skip the cropping entirely.
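The bucketing idea is simple enough to sketch: each image is assigned to the candidate resolution whose aspect ratio is closest to its own, so a landscape photo trains at a wide resolution instead of being squashed into a square. A minimal sketch... the bucket list below is illustrative, and Kohya generates the real one from your min/max resolution settings:

```python
# Candidate buckets of roughly 1024x1024 total area (illustrative list;
# assumption: Kohya derives the actual buckets from your resolution settings).
BUCKETS = [(1024, 1024), (1152, 896), (896, 1152), (1216, 832),
           (832, 1216), (1344, 768), (768, 1344)]

def nearest_bucket(width, height):
    """Pick the bucket whose aspect ratio is closest to the image's."""
    ar = width / height
    return min(BUCKETS, key=lambda b: abs(b[0] / b[1] - ar))

# A 4:3 photo lands in a wide bucket rather than a cropped square.
print(nearest_bucket(4000, 3000))   # -> (1152, 896)
# A portrait phone shot lands in a tall bucket.
print(nearest_bucket(1080, 1920))   # -> (768, 1344)
```

This is why uncropped images work: the software meets the image where it is, instead of forcing every image to meet a square.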
What Actually Matters in Your Dataset
Minimum 10 images. That's the floor for a solid LoRA. But quality beats quantity every time.
What to look for:
- High resolution — Use Google's advanced image search to filter by size. Four megapixels or larger.
- Variation — Different angles, expressions, lighting, backgrounds. You're teaching the model a person, not a single photograph.
- Isolation — One subject per image. No group shots. No ambiguity about who the model should learn.
- Sharpness — No blur. No compression artifacts. Clean, crisp source material.
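Those criteria can double as an automated pre-screen before you eyeball the survivors. A minimal sketch, where `blur_score` stands in for whatever sharpness metric you compute (e.g. variance of the Laplacian via OpenCV) and the thresholds are illustrative, not tested values:

```python
def passes_checks(width, height, subjects_in_frame, blur_score,
                  min_megapixels=4.0, min_sharpness=100.0):
    """Screen a candidate training image against the dataset criteria.
    blur_score is assumed to come from an external sharpness metric."""
    megapixels = width * height / 1e6
    if megapixels < min_megapixels:
        return False          # too low resolution
    if subjects_in_frame != 1:
        return False          # group shots are ambiguous
    if blur_score < min_sharpness:
        return False          # too blurry or compressed
    return True

print(passes_checks(4000, 3000, 1, 250.0))  # sharp 12 MP solo shot -> True
print(passes_checks(1280, 720, 1, 250.0))   # under 4 MP -> False
```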
For captioning, BLIP auto-captioning through the Kohya SS GUI gives you a solid foundation. But don't trust it blindly. Review each caption manually. Fix errors. Add detail where BLIP was vague. Your captions are instructions to the model... make them accurate.
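One thing worth enforcing after your manual review pass is that every caption actually contains your instance prompt. A small helper can do that, assuming Kohya's convention of one `.txt` caption file per image (the function name and example paths here are made up for illustration):

```python
from pathlib import Path

def prepend_instance_prompt(caption_dir, instance_prompt):
    """Ensure every caption file starts with the instance prompt.
    Assumes Kohya-style .txt captions sitting next to each image."""
    for txt in Path(caption_dir).glob("*.txt"):
        caption = txt.read_text().strip()
        if not caption.startswith(instance_prompt):
            txt.write_text(f"{instance_prompt}, {caption}")

# Usage sketch (hypothetical folder and prompt):
# prepend_instance_prompt("dataset/img/10_celebrityname man", "celebrityname")
```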
The Parameters That Survived Testing
A few settings worth understanding:
Batch size of 1 for character training. Slower, yes. But more precise. Higher batch sizes (like 5) create smoother gradient transitions... better for style training, not faces.
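The smoothing effect is just gradient averaging. A toy sketch with made-up per-image "gradients" shows why larger batches produce steadier (lower-variance) update steps... which is exactly what you don't want when a single face needs to be learned precisely:

```python
import random

random.seed(0)

# Toy per-image "gradients": noisy samples around a true direction of 1.0.
grads = [1.0 + random.gauss(0, 0.5) for _ in range(100)]

def batched_updates(grads, batch_size):
    """Average gradients within each batch, as an optimizer step would."""
    return [sum(grads[i:i + batch_size]) / batch_size
            for i in range(0, len(grads), batch_size)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

v1 = variance(batched_updates(grads, 1))
v5 = variance(batched_updates(grads, 5))
print(f"update variance, batch 1: {v1:.3f}")
print(f"update variance, batch 5: {v5:.3f}")
# Averaging over 5 images smooths each step -> good for styles, blunt for faces.
```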
Regularization images matching your class prompt (man, woman, person, style) prevent overfitting and keep the model's general knowledge intact. Skip these and your model might nail your character but forget how to draw anyone else.
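In the Kohya SS layout, training and regularization images live in folders whose names encode the repeat count, instance prompt, and class prompt. A sketch of that structure... the repeat counts and names below are placeholders, not the tutorial's tested values:

```shell
# Kohya-style dataset layout: "<repeats>_<instance prompt> <class prompt>"
mkdir -p "lora_dataset/img/20_celebrityname man"   # your training images
mkdir -p "lora_dataset/reg/1_man"                  # regularization images of the class
ls lora_dataset/img lora_dataset/reg
```

The `reg` folder is what keeps the model's general idea of "man" intact while the `img` folder pulls it toward your character.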
Learning rate and epochs work together. The presets shared in the tutorial reflect hundreds of hours of optimization. If you're starting out... trust tested parameters before experimenting.
The Deeper Lesson
This entire tutorial is really about one principle: work with what already exists.
Don't train from nothing when you can train from something. Don't crop images into artificial constraints when the software handles reality just fine. Don't guess at parameters when someone has already burned the hours finding what works.
The best builders don't start from raw materials every time. They know what foundations already exist... and they build on top of them.
Training a LoRA isn't magic. It's craft. And like any craft, the difference between frustration and flow comes down to understanding your materials before you start shaping them. The model already knows things. Your images already contain information. The parameters already have tested ranges. Your job isn't to reinvent the wheel... it's to point the wheel in the right direction and let momentum do what momentum does. Start where the knowledge already lives. Build from there. 💙
--- Source: https://www.youtube.com/watch?v=N_zhQSx2Q3c