license: apache-2.0
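RWKV-4-World in Hugging Face format. The tokenizer of the new World release differs substantially from the one used by the earlier Raven and Pile versions, so a new HF adaptation is required.
ringrwkv is compatible with both the native rwkv library and the transformers RWKV implementation. It also adds the configuration and code for the World version (supporting the full 1.5B, 3B and 7B series) and fixes a subtle issue in the original HF RWKV when forwarding RWKVOutput, mainly by introducing and making explicit last_hidden_state.
RingRWKV Git repository (open source): https://github.com/StarRing2022/RingRWKV
The following lightweight usage code is a convenient starting point: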
import torch
from ringrwkv.configuration_rwkv_world import RwkvConfig
from ringrwkv.rwkv_tokenizer import TRIE_TOKENIZER
from ringrwkv.modehf_world import RwkvForCausalLM
model = RwkvForCausalLM.from_pretrained("StarRing2022/RWKV-4-World-1.5B")  # or download this model to a local folder and pass its path
tokenizer = TRIE_TOKENIZER('./ringrwkv/rwkv_vocab_v20230424.txt')
text = "你叫什么名字?"  # "What is your name?"
question = f'Question: {text.strip()}\n\nAnswer:'
input_ids = tokenizer.encode(question)
input_ids = torch.tensor(input_ids).unsqueeze(0)
out = model.generate(input_ids,max_new_tokens=40)
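# Drop every token with id 0 before decoding. Building a new list avoids
# removing items from a list while iterating over it, which can skip elements.
outlist = [i for i in out[0].tolist() if i != 0]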
answer = tokenizer.decode(outlist)
print(answer)
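
The prompt formatting and the removal of token id 0 in the example above can be tried without loading the model. The following is a minimal sketch of those two steps; the helper names `build_prompt` and `strip_token` are hypothetical and are not part of ringrwkv, and the token ids in the usage lines are made up for illustration.

```python
from typing import List


def build_prompt(text: str) -> str:
    # Same template as the usage example: "Question: ...\n\nAnswer:"
    return f"Question: {text.strip()}\n\nAnswer:"


def strip_token(token_ids: List[int], unwanted_id: int = 0) -> List[int]:
    # Return a new list without `unwanted_id`. Filtering into a new list
    # avoids mutating a list during iteration, which can skip adjacent matches.
    return [t for t in token_ids if t != unwanted_id]


# Illustrative ids only; real ids come from tokenizer.encode / model.generate.
sample_ids = [0, 0, 5, 0, 7]
cleaned_ids = strip_token(sample_ids)
prompt = build_prompt("  hello  ")
```

In the usage example, `strip_token(out[0].tolist())` would take the place of the id-0 removal step before `tokenizer.decode`.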