!git clone https://github.com/BlinkDL/RWKV-LM.git
# The 3B model needs more VRAM than the free Colab GPU offers, so it stays commented out:
#!wget https://huggingface.co/BlinkDL/rwkv-4-pile-3b/resolve/main/RWKV-4-Pile-3B-20220915-1207.pth -O ./RWKV-LM/RWKV-v4/500.pth
!wget https://huggingface.co/BlinkDL/rwkv-4-pile-1b5/resolve/main/RWKV-4-Pile-1B5-20220903-8040.pth -O ./RWKV-LM/RWKV-v4/500.pth
Cloning into 'RWKV-LM'...
remote: Enumerating objects: 1092, done.
remote: Counting objects: 100% (211/211), done.
remote: Compressing objects: 100% (81/81), done.
remote: Total 1092 (delta 132), reused 185 (delta 130), pack-reused 881
Receiving objects: 100% (1092/1092), 5.97 MiB | 30.87 MiB/s, done.
Resolving deltas: 100% (663/663), done.
--2022-09-21 06:57:37--  https://huggingface.co/BlinkDL/rwkv-4-pile-1b5/resolve/main/RWKV-4-Pile-1B5-20220903-8040.pth
Resolving huggingface.co (huggingface.co)... 54.173.5.192, 44.195.102.200, 52.5.62.33, ...
Connecting to huggingface.co (huggingface.co)|54.173.5.192|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://cdn-lfs.huggingface.co/repos/d6/95/d69583b06567422d104d5413e7926ae97bcf0d541619db6e61fe10133d91582d/4e215be3b4f86dc2f145835b47a2c432306c373cbf625375b7721bb474512bad?response-content-disposition=attachment%3B%20filename%3D%22RWKV-4-Pile-1B5-20220903-8040.pth%22 [following]
--2022-09-21 06:57:37--  https://cdn-lfs.huggingface.co/repos/d6/95/d69583b06567422d104d5413e7926ae97bcf0d541619db6e61fe10133d91582d/4e215be3b4f86dc2f145835b47a2c432306c373cbf625375b7721bb474512bad?response-content-disposition=attachment%3B%20filename%3D%22RWKV-4-Pile-1B5-20220903-8040.pth%22
Resolving cdn-lfs.huggingface.co (cdn-lfs.huggingface.co)... 13.226.52.13, 13.226.52.128, 13.226.52.14, ...
Connecting to cdn-lfs.huggingface.co (cdn-lfs.huggingface.co)|13.226.52.13|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3030279587 (2.8G) [application/zip]
Saving to: ‘./RWKV-LM/RWKV-v4/500.pth’

./RWKV-LM/RWKV-v4/5 100%[===================>]   2.82G  81.0MB/s    in 35s

2022-09-21 06:58:12 (82.9 MB/s) - ‘./RWKV-LM/RWKV-v4/500.pth’ saved [3030279587/3030279587]
%cd ./RWKV-LM/RWKV-v4/
/content/RWKV-LM/RWKV-v4
!pip install transformers
!pip install ninja
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.22.1-py3-none-any.whl (4.9 MB) |████████████████████████████████| 4.9 MB 4.0 MB/s
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.7/dist-packages (from transformers) (6.0)
Collecting huggingface-hub<1.0,>=0.9.0
  Downloading huggingface_hub-0.9.1-py3-none-any.whl (120 kB) |████████████████████████████████| 120 kB 66.4 MB/s
Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from transformers) (2.23.0)
Requirement already satisfied: filelock in /usr/local/lib/python3.7/dist-packages (from transformers) (3.8.0)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.7/dist-packages (from transformers) (21.3)
Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.7/dist-packages (from transformers) (4.12.0)
Collecting tokenizers!=0.11.3,<0.13,>=0.11.1
  Downloading tokenizers-0.12.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.6 MB) |████████████████████████████████| 6.6 MB 48.2 MB/s
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.7/dist-packages (from transformers) (1.21.6)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.7/dist-packages (from transformers) (4.64.1)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.7/dist-packages (from transformers) (2022.6.2)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.7/dist-packages (from huggingface-hub<1.0,>=0.9.0->transformers) (4.1.1)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.7/dist-packages (from packaging>=20.0->transformers) (3.0.9)
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata->transformers) (3.8.1)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->transformers) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests->transformers) (2022.6.15)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->transformers) (1.24.3)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->transformers) (3.0.4)
Installing collected packages: tokenizers, huggingface-hub, transformers
Successfully installed huggingface-hub-0.9.1 tokenizers-0.12.1 transformers-4.22.1
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting ninja
  Downloading ninja-1.10.2.3-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) |████████████████████████████████| 108 kB 4.0 MB/s
Installing collected packages: ninja
Successfully installed ninja-1.10.2.3
########################################################################################################
# The RWKV Language Model - https://github.com/BlinkDL/RWKV-LM
########################################################################################################
import numpy as np
import math, os
import time
import types
import copy
import torch
from torch.nn import functional as F
from src.utils import TOKENIZER, Dataset
torch.backends.cudnn.benchmark = True
torch.backends.cudnn.allow_tf32 = True
torch.backends.cuda.matmul.allow_tf32 = True
np.set_printoptions(precision=4, suppress=True, linewidth=200)
########################################################################################################
# Step 1: set model
#
# Set TOKEN_MODE to 'char' or 'bpe' if the model is trained by 'train.py' from scratch.
#
# Set TOKEN_MODE to 'pile' if you want to test pre-trained pile models.
########################################################################################################
TOKEN_MODE = 'pile'  # char / bpe / pile

n_layer = 6
n_embd = 512
ctx_len = 1024

if TOKEN_MODE == 'char':
    MODEL_NAME = 'trained-500'  # your trained model
    WORD_NAME = 'vocab'         # the .json vocab (generated by train.py)
    # set UNKNOWN_CHAR to the rarest token in your vocab.json;
    # all unknown tokens in your prompt will be denoted by it
    UNKNOWN_CHAR = ' '  # here we just set it to ' ' for simplicity
elif TOKEN_MODE == 'bpe':
    MODEL_NAME = 'trained-500'  # your trained model
    WORD_NAME = ['model-vocab.json', 'model-merges.txt']  # [vocab, merge] for your BPE model
    UNKNOWN_CHAR = None
elif TOKEN_MODE == 'pile':
    WORD_NAME = ['20B_tokenizer.json', '20B_tokenizer.json']
    UNKNOWN_CHAR = None

    # ---> you can set MODEL_NAME to your fine-tuned model <---
    MODEL_NAME = '500'

    # for the 3B model:
    # n_layer = 32
    # n_embd = 2560
    # ctx_len = 1024

    # for the 1B5 model:
    n_layer = 24
    n_embd = 2048
    ctx_len = 1024
os.environ['RWKV_FLOAT_MODE'] = 'bf16' # 'bf16' / 'fp16' / 'fp32'
os.environ['RWKV_RUN_DEVICE'] = 'cuda' # 'cpu' (already very fast) or 'cuda'
model_type = 'RWKV' # 'RWKV' or 'RWKV-ffnPre'
########################################################################################################
# Step 2: set prompt & sampling stuffs
########################################################################################################
# context = 'A'
# context = "\nIn the"
# context = '\nSugar:'
NUM_TRIALS = 5
LENGTH_PER_TRIAL = 3330
DEBUG_DEBUG = False # set to True to print the raw model output for a single step instead of generating
########################################################################################################
print(f'Loading {MODEL_NAME}...')
from src.model_run import RWKV_RNN
model = RWKV_RNN(MODEL_NAME, os.environ['RWKV_RUN_DEVICE'], model_type, n_layer, n_embd, ctx_len)
tokenizer = TOKENIZER(WORD_NAME, UNKNOWN_CHAR=UNKNOWN_CHAR)
Loading 500...
RWKV_HEAD_QK_DIM 0
Using /root/.cache/torch_extensions/py37_cu113 as PyTorch extensions root...
Creating extension directory /root/.cache/torch_extensions/py37_cu113/wkv...
Detected CUDA files, patching ldflags
Emitting ninja build file /root/.cache/torch_extensions/py37_cu113/wkv/build.ninja...
Building extension module wkv...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module wkv...
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
Moving 0 files to the new cache system
0it [00:00, ?it/s]
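With the 1B5 settings above (n_layer=24, n_embd=2048, and the 20B tokenizer's 50277-token vocab), a back-of-the-envelope parameter count lands near 1.5 billion. The layout assumed below (four n_embd x n_embd time-mix matrices plus a 4x-wide channel-mix FFN per block, plus embedding and output head) is an approximation, not a count read from the checkpoint:

```python
def approx_rwkv_params(n_layer: int, n_embd: int, vocab: int = 50277) -> int:
    """Rough RWKV-4 parameter estimate (assumed layout, not exact):
    per block ~4*e^2 for the time-mix matrices plus ~9*e^2 for the
    4x-wide channel-mix FFN, plus embedding and output head."""
    per_block = 13 * n_embd * n_embd
    emb_and_head = 2 * vocab * n_embd
    return n_layer * per_block + emb_and_head

# 1B5 settings from above:
print(approx_rwkv_params(24, 2048) / 1e9)  # roughly 1.5 (billion parameters)
```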
!nvidia-smi
Wed Sep 21 06:59:29 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   50C    P0    28W /  70W |   9282MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
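The ~9 GiB shown in use is consistent with the download above: the checkpoint is 3,030,279,587 bytes, i.e. roughly 1.5e9 parameters at 2 bytes each, with the rest going to the CUDA context, buffers, and activations. The arithmetic, as a quick sanity check:

```python
def weight_bytes(n_params: float, bytes_per_param: int) -> float:
    """Size of the weights alone; runtime overhead (CUDA context,
    activations, temporary buffers) comes on top of this."""
    return n_params * bytes_per_param

# 1.5e9 parameters at 2 bytes each (bf16) = 3 GB of weights,
# matching the ~2.8G checkpoint file downloaded earlier.
print(weight_bytes(1.5e9, 2) / 1e9)  # → 3.0
```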
########################################################################################################
context = "You are an AI running a house.\ngiven the following commands: volumeUp(int amount),volumeDown(int amount),setVolume(int percent),setLights([r,g,b]),playSong(string url)\nand the given instruction 'Please make the room romantic'\nList the commands, and the parameters they should have, that should be done to fullfil the command\nGive the commands in the format [command(parameter)]\n\nTask: list the commands and a reasonable value for the parameter\nResponse:"
TEMPERATURE = 0.9
top_p = 0.8
top_p_newline = 0.9 # only used when TOKEN_MODE == 'char'
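The `tokenizer.sample_logits` call further down applies these settings via temperature scaling and nucleus (top-p) filtering. A minimal NumPy sketch of the idea; `top_p_sample` is a hypothetical stand-in written for illustration, not the RWKV helper itself:

```python
import numpy as np

def top_p_sample(logits, temperature=0.9, top_p=0.8, rng=None):
    """Nucleus sampling: keep the smallest set of highest-probability
    tokens whose cumulative mass reaches top_p, then sample from it."""
    rng = np.random.default_rng(0) if rng is None else rng
    logits = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(logits - logits.max())   # stable softmax
    probs /= probs.sum()
    order = np.argsort(-probs)              # tokens by descending probability
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, top_p)) + 1  # smallest prefix covering top_p
    keep = order[:cutoff]
    p = probs[keep] / probs[keep].sum()     # renormalize over the kept set
    return int(rng.choice(keep, p=p))

# One dominant logit with a tight top_p collapses to greedy decoding:
print(top_p_sample([10.0, 0.0, 0.0, 0.0], top_p=0.5))  # → 0
```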
if tokenizer.charMode:
    context = tokenizer.refine_context(context)
    ctx = [tokenizer.stoi.get(s, tokenizer.UNKNOWN_CHAR) for s in context]
else:
    ctx = tokenizer.tokenizer.encode(context)
src_len = len(ctx)
src_ctx = ctx.copy()

print('\nYour prompt has ' + str(src_len) + ' tokens.')
print('\n--> Currently the first run takes a while if your prompt is long, as we are using RNN to process the prompt. Use GPT to build the hidden state for better speed. <--\n')

for TRIAL in range(1 if DEBUG_DEBUG else NUM_TRIALS):
    t_begin = time.time_ns()
    print(('-' * 30) + context, end='')
    ctx = src_ctx.copy()
    model.clear()
    if TRIAL == 0:
        init_state = types.SimpleNamespace()
        for i in range(src_len):
            x = ctx[:i+1]
            if i == src_len - 1:
                init_state.out = model.run(x)
            else:
                model.run(x)
        model.save(init_state)
    else:
        model.load(init_state)

    for i in range(src_len, src_len + (1 if DEBUG_DEBUG else LENGTH_PER_TRIAL)):
        x = ctx[:i+1]
        x = x[-ctx_len:]
        if i == src_len:
            out = copy.deepcopy(init_state.out)
        else:
            out = model.run(x)
        if DEBUG_DEBUG:
            print('model', np.array(x), '==>', np.array(out), np.max(out), np.min(out))
        if TOKEN_MODE == 'pile':
            out[0] = -999999999  # disable <|endoftext|>

        char = tokenizer.sample_logits(out, x, ctx_len, temperature=TEMPERATURE,
                                       top_p_usual=top_p, top_p_newline=top_p_newline)
        char = char.item()
        if tokenizer.charMode:
            print(tokenizer.itos[int(char)], end='', flush=True)
        else:
            print(tokenizer.tokenizer.decode(int(char)), end='', flush=True)
        ctx += [char]

    t_end = time.time_ns()
    print("\n----------", round((t_end - t_begin) / (10 ** 9), 2), end='s ')
Your prompt has 110 tokens.

--> Currently the first run takes a while if your prompt is long, as we are using RNN to process the prompt. Use GPT to build the hidden state for better speed. <--

------------------------------You are an AI running a house.
given the following commands: volumeUp(int amount),volumeDown(int amount),setVolume(int percent),setLights([r,g,b]),playSong(string url)
and the given instruction 'Please make the room romantic'
List the commands, and the parameters they should have, that should be done to fullfil the command
Give the commands in the format [command(parameter)]

Task: list the commands and a reasonable value for the parameter
Response: answers: - answer 1: - answer 2: - answer 3: - answer 4: - answer 5: The first two commands (room.increase volume and command.setVolume) work. But the others do not. The command "volumeUp(int percent)" seems to do nothing. I have tried to change the command by "volumeUp(1)" but the result is still the same. What command should I use to set the volume to full? I have an idea about what happens in the first command, but I don't know how it works. This is the full code of the game: #include <iostream> #include <vector> #include <list> #include <string> #include <
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-7-ce677f0ef442> in <module>
     41                 out = copy.deepcopy(init_state.out)
     42             else:
---> 43                 out = model.run(x)
     44             if DEBUG_DEBUG:
     45                 print('model', np.array(x), '==>', np.array(

/content/RWKV-LM/RWKV-v4/src/model_run.py in run(self, ctx)
    365                 else:
    366                     x = x + self.SA(self.LN(x, w.blocks[i].ln1), w.blocks[i].att, f'att.{i}')
--> 367                 x = x + self.FF(self.LN(x, w.blocks[i].ln2), w.blocks[i].ffn, f'ffn.{i}')
    368
    369             x = self.LN(x, w.ln_out)

/content/RWKV-LM/RWKV-v4/src/model_run.py in LN(self, xx, w)
    301
    302     def LN(self, xx, w):
--> 303         return F.layer_norm(xx, (self.n_embd,), weight=w.weight, bias=w.bias)
    304
    305     def FF(self, xx, w, name):

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in layer_norm(input, normalized_shape, weight, bias, eps)
   2501         layer_norm, (input, weight, bias), input, normalized_shape, weight=weight, bias=bias, eps=eps
   2502     )
-> 2503     return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
   2504
   2505

KeyboardInterrupt:
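The `TRIAL == 0` branch in the generation cell runs the prompt through the RNN once, snapshots the hidden state with `model.save(init_state)`, and reloads it for every later trial instead of reprocessing the prompt. The idea in miniature; `MiniRNN` is a made-up toy, not the RWKV API:

```python
class MiniRNN:
    """Toy stand-in for a recurrent model: one scalar hidden state."""
    def __init__(self):
        self.h = 0.0

    def run(self, token):
        self.h = 0.5 * self.h + token  # stand-in for the real recurrence
        return self.h

toy = MiniRNN()
prompt = [1, 2, 3]
for t in prompt:
    out = toy.run(t)

saved_h = toy.h            # analogue of model.save(init_state)
print(saved_h)             # → 4.25

for trial in range(2):
    toy.h = saved_h        # analogue of model.load(init_state)
    # ...generate continuation tokens from here, prompt cost paid once...
```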