Notebook

In [ ]:

%matplotlib inline

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

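LoRA (Low-Rank Adaptation) is a technique from a Microsoft paper. It addresses a problem that grows as AI models get bigger: we want to train a model a bit further on our own data, but the original model is so large that an ordinary machine may not be able to run the training at all, and we also worry that the new data will disturb the original parameters and damage what the model already does well.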

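LoRA leaves the original parameters completely untouched and trains a separate, much smaller set of parameters, which is then merged into the original ones. Because that set is so much smaller, training is far easier than training the original model. And because the original parameters are not modified, there is little risk of ruining the original model. Better still, a LoRA does not have to be merged into the exact base model it was trained on! In principle, though, merging it into its original base model tends to give better results and is less likely to cause problems.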
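
To make "much smaller" concrete, here is a back-of-the-envelope sketch. It assumes a single 768 x 768 weight matrix and a LoRA rank of 4; both numbers are illustrative assumptions, not values taken from the paper or from any particular LoRA file.

```python
# Compare trainable parameters: full fine-tuning vs. a rank-r LoRA update.
# d_out, d_in and rank below are illustrative assumptions.

def full_params(d_out: int, d_in: int) -> int:
    """Parameters in a dense weight matrix W of shape (d_out, d_in)."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two LoRA factors: up (d_out, rank) and down (rank, d_in)."""
    return d_out * rank + rank * d_in


d_out, d_in, rank = 768, 768, 4
full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

With these assumed sizes the LoRA factors hold roughly one percent of the parameters of the matrix they adapt, which is why training them is so much cheaper.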

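So at its core a LoRA is just a set of parameters, usually stored in the safetensors format. While we are at it, some terminology: whether it is a LoRA or a complete model with its parameters (often called a checkpoint), the file is basically stored in one of two ways:

safetensors: a newer format for storing tensors safely
ckpt: short for checkpoint, a pickled PyTorch file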

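You will find complete models in both formats, while LoRAs mostly use safetensors. Either format can store everything, a complete model in particular, in a single file. As an aside, Hugging Face's diffusers package does not load either format directly, so a conversion is needed. Conversion scripts are provided in the diffusers GitHub repository.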
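
Part of what makes safetensors "safe" is its layout: an 8-byte little-endian length, a plain JSON header, then raw tensor bytes, with no pickled code to execute on load. The sketch below reads only that header using the standard library. The helper name `read_safetensors_header` and the tensor name `demo.weight` are made up for illustration, and the tiny file is built by hand so the example runs without a real model.

```python
import json
import struct
import tempfile
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


# Build a tiny file by hand: one hypothetical 2x2 float32 tensor of zeros (16 bytes).
header = {"demo.weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]}}
header_bytes = json.dumps(header).encode("utf-8")
demo_path = Path(tempfile.mkdtemp()) / "demo.safetensors"
demo_path.write_bytes(struct.pack("<Q", len(header_bytes)) + header_bytes + bytes(16))

info = read_safetensors_header(demo_path)
for name, meta in info.items():
    if name != "__metadata__":  # optional free-form metadata entry
        print(name, meta["dtype"], meta["shape"])
```

Pointing the same helper at a downloaded LoRA file should list its tensor names and shapes without reading the weights themselves.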

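The bad news is that diffusers does not officially support LoRA, at least in the version installed below. The good news is that all we have to do is merge a LoRA into the pipeline ourselves. So please go find a LoRA. We will demonstrate with Stable Diffusion 1.5 in a moment, so it is best to pick one whose base model is Stable Diffusion 1.5 (fortunately, these are the most common).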

[Note]

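Create a folder named Lora in your Google Drive and put the LoRA file you found, in .safetensors format, into it. You could drag the LoRA file from your computer into Colab's temporary storage each time, but you probably do not want to do that every time.

To follow the example exactly: we use the epi-noise-offset-v2 LoRA from civitai.com. Download the file epiNoiseoffset_v2.safetensors and put it in your Lora folder.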

1. Install the required packages

This time we also install the safetensors package, which we need to read our LoRA file.

In [ ]:

!pip install transformers
!pip install diffusers["torch"]
!pip install sentencepiece
!pip install safetensors

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.26.1-py3-none-any.whl (6.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 55.8 MB/s eta 0:00:00
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.8/dist-packages (from transformers) (1.22.4)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.8/dist-packages (from transformers) (4.64.1)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.6/7.6 MB 67.0 MB/s eta 0:00:00
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.8/dist-packages (from transformers) (23.0)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.8/dist-packages (from transformers) (2022.6.2)
Collecting huggingface-hub<1.0,>=0.11.0
  Downloading huggingface_hub-0.12.1-py3-none-any.whl (190 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 190.3/190.3 KB 15.2 MB/s eta 0:00:00
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.8/dist-packages (from transformers) (6.0)
Requirement already satisfied: requests in /usr/local/lib/python3.8/dist-packages (from transformers) (2.25.1)
Requirement already satisfied: filelock in /usr/local/lib/python3.8/dist-packages (from transformers) (3.9.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.8/dist-packages (from huggingface-hub<1.0,>=0.11.0->transformers) (4.5.0)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.8/dist-packages (from requests->transformers) (2.10)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.8/dist-packages (from requests->transformers) (1.26.14)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.8/dist-packages (from requests->transformers) (2022.12.7)
Requirement already satisfied: chardet<5,>=3.0.2 in /usr/local/lib/python3.8/dist-packages (from requests->transformers) (4.0.0)
Installing collected packages: tokenizers, huggingface-hub, transformers
Successfully installed huggingface-hub-0.12.1 tokenizers-0.13.2 transformers-4.26.1
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting diffusers[torch]
  Downloading diffusers-0.14.0-py3-none-any.whl (737 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 737.4/737.4 KB 2.6 MB/s eta 0:00:00
Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.8/dist-packages (from diffusers[torch]) (6.0.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.8/dist-packages (from diffusers[torch]) (1.22.4)
Requirement already satisfied: huggingface-hub>=0.10.0 in /usr/local/lib/python3.8/dist-packages (from diffusers[torch]) (0.12.1)
Requirement already satisfied: filelock in /usr/local/lib/python3.8/dist-packages (from diffusers[torch]) (3.9.0)
Requirement already satisfied: Pillow in /usr/local/lib/python3.8/dist-packages (from diffusers[torch]) (8.4.0)
Requirement already satisfied: requests in /usr/local/lib/python3.8/dist-packages (from diffusers[torch]) (2.25.1)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.8/dist-packages (from diffusers[torch]) (2022.6.2)
Requirement already satisfied: torch>=1.4 in /usr/local/lib/python3.8/dist-packages (from diffusers[torch]) (1.13.1+cu116)
Collecting accelerate>=0.11.0
  Downloading accelerate-0.16.0-py3-none-any.whl (199 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 199.7/199.7 KB 17.3 MB/s eta 0:00:00
Requirement already satisfied: pyyaml in /usr/local/lib/python3.8/dist-packages (from accelerate>=0.11.0->diffusers[torch]) (6.0)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.8/dist-packages (from accelerate>=0.11.0->diffusers[torch]) (23.0)
Requirement already satisfied: psutil in /usr/local/lib/python3.8/dist-packages (from accelerate>=0.11.0->diffusers[torch]) (5.4.8)
Requirement already satisfied: tqdm>=4.42.1 in /usr/local/lib/python3.8/dist-packages (from huggingface-hub>=0.10.0->diffusers[torch]) (4.64.1)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.8/dist-packages (from huggingface-hub>=0.10.0->diffusers[torch]) (4.5.0)
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.8/dist-packages (from importlib-metadata->diffusers[torch]) (3.15.0)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.8/dist-packages (from requests->diffusers[torch]) (2022.12.7)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.8/dist-packages (from requests->diffusers[torch]) (1.26.14)
Requirement already satisfied: chardet<5,>=3.0.2 in /usr/local/lib/python3.8/dist-packages (from requests->diffusers[torch]) (4.0.0)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.8/dist-packages (from requests->diffusers[torch]) (2.10)
Installing collected packages: accelerate, diffusers
Successfully installed accelerate-0.16.0 diffusers-0.14.0
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting sentencepiece
  Downloading sentencepiece-0.1.97-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 24.0 MB/s eta 0:00:00
Installing collected packages: sentencepiece
Successfully installed sentencepiece-0.1.97
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting safetensors
  Downloading safetensors-0.3.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 28.2 MB/s eta 0:00:00
Installing collected packages: safetensors
Successfully installed safetensors-0.3.0

2. Load Stable Diffusion 1.5

This time we use Stable Diffusion 1.5 with the DDIM sampler (scheduler), as recommended by the LoRA's author.

In [ ]:

import torch
from safetensors.torch import load_file
from diffusers import StableDiffusionPipeline
from diffusers import DDIMScheduler

In [ ]:

model_id = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionPipeline.from_pretrained(model_id,torch_dtype=torch.float32)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

Fetching 15 files:   0%|          | 0/15 [00:00<?, ?it/s]

/usr/local/lib/python3.8/dist-packages/transformers/models/clip/feature_extraction_clip.py:28: FutureWarning: The class CLIPFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use CLIPImageProcessor instead.
  warnings.warn(

3. Connect to your Google Drive and load the LoRA

Running the cell below asks you to authenticate with your Google account and then mounts your Google Drive.

In [ ]:

from google.colab import drive

drive.mount('/content/drive/')

Mounted at /content/drive/

Your Lora folder is now at:

/content/drive/MyDrive/Lora/

In the cell below, replace epiNoiseoffset_v2.safetensors with the file name of the LoRA you want to use.

In [ ]:

lora_folder = "/content/drive/MyDrive/Lora/"
model_path = lora_folder + "epiNoiseoffset_v2.safetensors"

state_dict = load_file(model_path)

LORA_PREFIX_UNET = 'lora_unet'
LORA_PREFIX_TEXT_ENCODER = 'lora_te'

alpha = 0.75

visited = []
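
Before merging, it can help to see what the file contains. The sketch below groups key names by target model and by role. The first example key comes from the comment in the merge code; the `lora_unet_...` key is an illustrative assumption that follows the same naming pattern, and `summarize_lora_keys` is a hypothetical helper, not part of any library.

```python
from collections import Counter


def summarize_lora_keys(keys):
    """Count LoRA keys by target model and role (lora_up / lora_down / alpha)."""
    counts = Counter()
    for key in keys:
        target = "text_encoder" if key.startswith("lora_te") else "unet"
        # "<module path>.lora_down.weight" -> "lora_down"; "<module path>.alpha" -> "alpha"
        role = key.split(".")[1]
        counts[(target, role)] += 1
    return counts


example_keys = [
    "lora_te_text_model_encoder_layers_0_self_attn_k_proj.lora_down.weight",
    "lora_te_text_model_encoder_layers_0_self_attn_k_proj.lora_up.weight",
    "lora_te_text_model_encoder_layers_0_self_attn_k_proj.alpha",
    "lora_unet_down_blocks_0_attentions_0_proj_in.lora_down.weight",
    "lora_unet_down_blocks_0_attentions_0_proj_in.lora_up.weight",
    "lora_unet_down_blocks_0_attentions_0_proj_in.alpha",
]
summary = summarize_lora_keys(example_keys)
for (target, role), n in sorted(summary.items()):
    print(f"{target:13s} {role:10s} {n}")
```

In the notebook, `summarize_lora_keys(state_dict)` would give the same breakdown for the real file, since iterating a dict yields its keys.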

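4. Merge the LoRA into the original model

Now we merge the LoRA into our model. The code below looks long and a little intimidating, but all it does is look up the LoRA weights and add them to the corresponding weights of the base model.

This code is copied verbatim from the Lora for Diffusers GitHub repository. As an aside, that repository's README is short but clear and informative, and it also explains how to convert a model stored as .safetensors or .ckpt into the format diffusers uses.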
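
Stripped of the bookkeeping, the update applied to each layer is W + alpha * (up @ down). The toy sketch below shows that arithmetic with plain Python lists; the 2 x 2 weight, the rank-1 factors and alpha = 0.5 are made-up values for illustration only, and `merge_lora` is a hypothetical helper rather than the code used in this notebook.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, up, down, alpha):
    """Return weight + alpha * (up @ down) without modifying the inputs."""
    delta = matmul(up, down)
    return [
        [w + alpha * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy values: a 2x2 base weight and rank-1 LoRA factors.
weight = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]    # shape (2, 1)
down = [[3.0, 4.0]]    # shape (1, 2)
merged = merge_lora(weight, up, down, alpha=0.5)
print(merged)  # [[2.5, 2.0], [3.0, 5.0]]
```

Setting alpha to 0 leaves the base weight unchanged, and larger values apply the LoRA more strongly.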

In [ ]:

# directly update weight in diffusers model
for key in state_dict:
    
    # it is suggested to print out the key, it usually will be something like below
    # "lora_te_text_model_encoder_layers_0_self_attn_k_proj.lora_down.weight"
    
    # as we have set the alpha beforehand, so just skip
    if '.alpha' in key or key in visited:
        continue
        
    if 'text' in key:
        layer_infos = key.split('.')[0].split(LORA_PREFIX_TEXT_ENCODER+'_')[-1].split('_')
        curr_layer = pipe.text_encoder
    else:
        layer_infos = key.split('.')[0].split(LORA_PREFIX_UNET+'_')[-1].split('_')
        curr_layer = pipe.unet

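    # find the target layer: the key joins module names with '_', but a module
    # name may itself contain '_', so failed lookups are retried with a longer name
    temp_name = layer_infos.pop(0)
    while len(layer_infos) > -1:  # always true; the loop exits via `break`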
        try:
            curr_layer = curr_layer.__getattr__(temp_name)
            if len(layer_infos) > 0:
                temp_name = layer_infos.pop(0)
            elif len(layer_infos) == 0:
                break
        except Exception:
            if len(temp_name) > 0:
                temp_name += '_'+layer_infos.pop(0)
            else:
                temp_name = layer_infos.pop(0)
    
    # org_forward(x) + lora_up(lora_down(x)) * multiplier
    pair_keys = []
    if 'lora_down' in key:
        pair_keys.append(key.replace('lora_down', 'lora_up'))
        pair_keys.append(key)
    else:
        pair_keys.append(key)
        pair_keys.append(key.replace('lora_up', 'lora_down'))
    
    # update weight
    if len(state_dict[pair_keys[0]].shape) == 4:
        weight_up = state_dict[pair_keys[0]].squeeze(3).squeeze(2).to(torch.float32)
        weight_down = state_dict[pair_keys[1]].squeeze(3).squeeze(2).to(torch.float32)
        curr_layer.weight.data += alpha * torch.mm(weight_up, weight_down).unsqueeze(2).unsqueeze(3)
    else:
        weight_up = state_dict[pair_keys[0]].to(torch.float32)
        weight_down = state_dict[pair_keys[1]].to(torch.float32)
        curr_layer.weight.data += alpha * torch.mm(weight_up, weight_down)
        
    # update visited list
    visited.extend(pair_keys)
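
The "find the target layer" loop is the least obvious part of the cell above. Here is a minimal standalone sketch of the same idea on a toy object tree: try an attribute name, and if it does not exist, glue the next piece on with '_' and try again. `resolve_module` and the toy tree are hypothetical and only mimic the shape of a real model.

```python
from types import SimpleNamespace


def resolve_module(root, flat_name):
    """Walk `root` following a '_'-joined name, merging pieces when a lookup fails."""
    parts = flat_name.split("_")
    current = root
    name = parts.pop(0)
    while True:
        if hasattr(current, name):
            current = getattr(current, name)
            if not parts:
                return current
            name = parts.pop(0)
        else:
            if not parts:
                raise AttributeError(f"cannot resolve {flat_name!r}")
            # The attribute name itself contains '_': extend it and retry.
            name += "_" + parts.pop(0)


# Toy object tree standing in for a model (hypothetical names).
toy = SimpleNamespace(
    text_model=SimpleNamespace(
        encoder=SimpleNamespace(self_attn=SimpleNamespace(k_proj="k_proj layer"))
    )
)
found = resolve_module(toy, "text_model_encoder_self_attn_k_proj")
print(found)  # k_proj layer
```

Unlike the copied loop, this version raises a clear error when a name cannot be resolved instead of failing on an empty list.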

In [ ]:

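# the merge above ran in float32; only now switch to half precision and move to the GPU
pipe = pipe.to(torch.float16).to("cuda")
# replace the safety checker with a pass-through that never flags an image
pipe.safety_checker = lambda images, clip_input: (images, False)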

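5. Preparing for generation

Here we draw a set of random seeds so that results can be controlled and reproduced. We also write a combine_imgs function so that the 4 images we generate can be shown together in one picture.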

In [ ]:

seeds = np.random.randint(0, 100000, 4)
seeds = [int(i) for i in seeds]

In [ ]:

seeds

Out[ ]:

[81647, 53384, 22009, 8159]

In [ ]:

from PIL import Image

In [ ]:

def combine_imgs(images):
    width, height = images[0].size
    new_img = Image.new('RGB', (width, height))

    w = int(width/2)
    h = int(height/2)

    new_img.paste(images[0].resize((w,h)), (0, 0))
    new_img.paste(images[1].resize((w,h)), (w, 0))
    new_img.paste(images[2].resize((w,h)), (0, h))
    new_img.paste(images[3].resize((w,h)), (w, h))
    return new_img
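
The four hard-coded paste positions follow a simple row-by-row pattern. As a sketch of how the layout could be generalized, the hypothetical helper below computes the top-left corner for any number of images and columns; it only does the arithmetic and does not touch PIL.

```python
def grid_positions(n_images, cols, cell_w, cell_h):
    """Top-left paste coordinates for n_images laid out row by row."""
    return [((i % cols) * cell_w, (i // cols) * cell_h) for i in range(n_images)]


# A 480x640 canvas split into a 2x2 grid of 240x320 cells, as in combine_imgs.
positions = grid_positions(4, cols=2, cell_w=240, cell_h=320)
print(positions)  # [(0, 0), (240, 0), (0, 320), (240, 320)]
```

These are the same four offsets that combine_imgs passes to paste when the generated images are 480 wide and 640 high.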

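6. Let's start generating!

First, since we are using the epiNoiseoffset_v2 LoRA, we usually set its "strength". The original author generally suggests a value, and you can also experiment:

<lora:epiNoiseoffset_v2:1>

This tag usually goes at the end of the prompt, but we put it first here for the sake of the introduction. Note that the tag is a convention of Stable Diffusion web UIs; the pipeline here most likely treats it as ordinary prompt text, so the strength that actually takes effect is the alpha set in step 3.

Next, a few further controls for generation:

height, width: the height and width of the image
num_inference_steps: how many denoising steps to run
guidance_scale: how closely the model should follow our prompt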
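
As a rough picture of what guidance_scale does: at every denoising step the model makes one noise prediction without the prompt (or with the negative prompt) and one with it, and classifier-free guidance extrapolates from the first toward the second. The sketch below applies that formula to two made-up two-element "predictions"; real predictions are large tensors, and `apply_guidance` is a hypothetical name.

```python
def apply_guidance(noise_uncond, noise_text, guidance_scale):
    """Classifier-free guidance: uncond + scale * (text - uncond), element by element."""
    return [u + guidance_scale * (t - u) for u, t in zip(noise_uncond, noise_text)]


# Made-up predictions, for illustration only.
noise_uncond = [0.0, 1.0]
noise_text = [1.0, 3.0]
for scale in (1.0, 5.0):
    print(scale, apply_guidance(noise_uncond, noise_text, scale))
```

With a scale of 1 the result equals the prompt-conditioned prediction; larger values push further in the direction the prompt suggests, usually at some cost in variety.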

In [ ]:

prompt = "<lora:epiNoiseoffset_v2:1> product shot of ultra realistic juicy cheeseburger against a dark background, two tone lighting, advertisement, octane, unreal"
negative_prompt = "text, error, cropped, duplicate, morbid, mutilated, out of frame, username, watermark, signature"

generator = [torch.Generator().manual_seed(i) for i in seeds]

images = []
num_of_imgs = 4

for i in range(num_of_imgs):
    img = pipe(prompt, negative_prompt=negative_prompt, generator=generator[i],
               height=640, width=480,
               num_inference_steps=30,
               guidance_scale=5
               ).images[0]
    images.append(img)
    
combine_imgs(images)

  0%|          | 0/30 [00:00<?, ?it/s]

  0%|          | 0/30 [00:00<?, ?it/s]

  0%|          | 0/30 [00:00<?, ?it/s]

  0%|          | 0/30 [00:00<?, ?it/s]

Out[ ]:

Pick the one you like best; of course, you can also keep refining it.

In [ ]:

images[0]

Out[ ]:
