Creating my First AI Image Generation LoRA

about 2 months ago

Over the past few days I've been spending time with AI, learning a few things and playing with it. One of the things I tried was making my own image generation LoRA!

Natsuki Subaru LoRA is an adapter for Anima Preview 3 to generate Subaru from Re:Zero more accurately. You can download it here!

LoRA (short for Low-Rank Adaptation) is a technique that trains a small set of extra weights on new data while leaving the base model's weights frozen, so the model keeps what it already knows. A LoRA must be loaded alongside the model at inference time to take effect, and in image generation LoRAs are great tools for teaching the base model new characters, styles, or concepts.
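To make the "low rank" part concrete, here's a minimal sketch in plain Python. It is not how Anima or any real trainer implements it; the matrix sizes, the values, and the helper names `matmul` and `apply_lora` are all made up for illustration. It uses the common formulation where the adapted weight is the frozen weight plus a scaled product of two small matrices.

```python
# Minimal LoRA sketch in pure Python (no ML libraries).
# All sizes and values are invented for illustration only.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply_lora(W, A, B, alpha):
    """Return a new matrix W + (alpha / r) * (B @ A), where r is the adapter rank."""
    r = len(A)                # A has shape (r, in_features)
    delta = matmul(B, A)      # shape (out_features, in_features)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Frozen base weight: 4 x 4 = 16 values the base model already learned.
W = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Rank-1 adapter: B is 4 x 1 and A is 1 x 4, so only 8 values get trained.
B = [[0.5], [0.0], [0.0], [0.0]]
A = [[0.0, 1.0, 0.0, 0.0]]

# "Activating" the LoRA at inference time means using W_adapted instead of W.
W_adapted = apply_lora(W, A, B, alpha=1.0)
```

The base matrix `W` is never modified, which is why the same base model can be used with or without the adapter, and why the adapter file is so much smaller than the model.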

I trained this LoRA using this Google Colab notebook by CivitAI user CitronLegacy. I used the default settings, except for changing the number of epochs to 8 and the number of repeats to 4. The whole run took about 4 hours.

The dataset is 27 images focusing on Natsuki Subaru, made up of anime screenshots and official art I took from the Re:Zero Wiki.
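For a rough sense of how much training that is, here's a small back-of-the-envelope sketch using the numbers above (27 images, 4 repeats, 8 epochs). The batch size is an assumption since I'm not stating the notebook's default here, and real trainers may count steps differently (for example with gradient accumulation), so treat `training_steps` as a hypothetical helper.

```python
import math

def training_steps(num_images, repeats, epochs, batch_size):
    """Estimate optimizer steps: each epoch sees every image `repeats` times."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs

# Total image passes over the whole run, independent of batch size.
images_seen = 27 * 4 * 8  # 864

# Batch sizes below are assumptions, not the notebook's confirmed settings.
steps_bs1 = training_steps(27, repeats=4, epochs=8, batch_size=1)  # 864
steps_bs4 = training_steps(27, repeats=4, epochs=8, batch_size=4)  # 216
```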

Observations

The result is better than I expected but worse than I hoped. This LoRA makes generating Subaru's eyes closer to the anime and way more consistent. Without a LoRA, Subaru's eyes look more generic, so I'm happy I got at least one thing right.

Even with this LoRA, the model has a tendency to give him feminine features. I get better results when I specify the 1boy tag...

Creating this used up all of my Google Colab free plan usage. If I'm gonna create a better LoRA, I must master dataset tagging, and even then, there's a limit to the quality I can achieve with free tools. I guess paying money is unavoidable if I want to make something worth using.

Dataset

Disclaimer: I don't own any of the images I used to create the LoRA, so I won't be sharing the dataset, but here are some examples of the dataset tags I used:

subaru_rz, natsuki subaru, 1boy, solo, male focus, anime, full body, front view, flexing pose, right arm raised, fist clenched, left hand on hip, smiling expression, looking at viewer, pose anchor: flexing front with bag

tracksuit, hoodie, white and black zip-up jacket, orange trim, N logo on chest, black pants, anime clothing, casual wear, full outfit, fabric folds, visible zipper

outdoor terrace background, stone tiles, low wall, green foliage, distant town with red roofs, hillside houses, blue sky, white clouds, daytime, outdoor, scenic view, depth of field

anime style, clean lines, cel shaded, soft shading, full body shot, character focus, expressive face, cinematic lighting, pose anchor: flexing front with bag, angle: front view
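If tag groups like the ones above are meant to be combined into one caption per image, a small helper can join them and drop repeats (note that "pose anchor: flexing front with bag" shows up in two of the groups). This is only a sketch under that assumption; `build_caption` is a hypothetical helper, not part of the Colab notebook, and the groups are shortened here for readability.

```python
def build_caption(*tag_groups):
    """Join comma-separated tag groups into one caption, keeping the first
    occurrence of each tag and dropping later duplicates (case-insensitive)."""
    seen = set()
    tags = []
    for group in tag_groups:
        for tag in group.split(","):
            tag = tag.strip()
            key = tag.lower()
            if tag and key not in seen:
                seen.add(key)
                tags.append(tag)
    return ", ".join(tags)

# Shortened versions of the tag groups from this post.
character = "subaru_rz, natsuki subaru, 1boy, solo, pose anchor: flexing front with bag"
outfit = "tracksuit, hoodie, orange trim, black pants"
style = "anime style, clean lines, pose anchor: flexing front with bag"

caption = build_caption(character, outfit, style)
print(caption)
```

Keeping the order of first appearance means the character tags stay at the front of the caption.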

What do you think?

I don't know if I'll be making LoRAs for other characters or a better version of this one, but I wanted to share this one and the process behind it. Hope you enjoyed reading this as much as I enjoyed going through this journey.

