IDEA-CCNL/Ziya-LLaMA-13B-v1

After merging the delta model with the original LLaMA model, the model generates garbled output. Is the base model llama-13b or llama-13b-chat?

#33

by jamestang2190 - opened Sep 21, 2023

Sep 21, 2023

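Also, the apply_delta script contains:
if "embed_tokens" in name or "lm_head.weight" in name or "self_attn.rotary_emb.inv_freq" in name:
    continue
In other words, these three parameters are taken from the delta model as-is. Is that the intended behavior, or do these three also need to be merged?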
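
For readers following along, here is a minimal sketch of the behavior described in the post above: every tensor is merged as base + delta, except those whose names match the three patterns, which keep the delta model's values unchanged. This is not the repository's apply_delta script. The function names, the tensor names, and the use of plain Python lists in place of real tensors are all assumptions made for illustration.

```python
# Hypothetical sketch, not the repository's apply_delta script.
# Weights are plain dicts mapping a parameter name to a list of floats.

# The three name patterns quoted in the post above.
SKIP_PATTERNS = (
    "embed_tokens",
    "lm_head.weight",
    "self_attn.rotary_emb.inv_freq",
)


def is_skipped(name):
    """Return True if the parameter name matches one of the skip patterns."""
    return any(pattern in name for pattern in SKIP_PATTERNS)


def apply_delta(base, delta):
    """Merge base + delta, keeping the delta's values for skipped names."""
    merged = {}
    for name, delta_values in delta.items():
        if is_skipped(name):
            # Skipped: the value from the delta model is used as-is.
            merged[name] = list(delta_values)
            continue
        base_values = base[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Toy weights with made-up names and values.
base = {
    "model.embed_tokens.weight": [0.1, 0.2],
    "model.layers.0.mlp.up_proj.weight": [1.0, 2.0],
    "lm_head.weight": [0.3, 0.4],
}
delta = {
    "model.embed_tokens.weight": [0.5, 0.6],
    "model.layers.0.mlp.up_proj.weight": [0.5, 0.5],
    "lm_head.weight": [0.8, 0.9],
}

merged = apply_delta(base, delta)
print(merged["model.layers.0.mlp.up_proj.weight"])
print(merged["model.embed_tokens.weight"])
```

In this sketch the up_proj entry is the sum of the base and delta values, while the embed_tokens and lm_head entries are copied from the delta unchanged.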

Dec 21, 2023

I'm running into the same problem. How did you solve it?

pskun

Fengshenbang-LM org Dec 21, 2023

I'm running into the same problem. How did you solve it?

https://huggingface.co/Qianguo/ziya-13B-v1.1-full-weight
You can use the weights that our developers have already merged.

Jan 16

What I mainly want to know is how the garbled-output problem was solved. Other people have surely run into it by now as well.

Jan 21

Solved. You do have to merge with the LLaMA 1 weights, which indicates that the Ziya model was trained from LLaMA 1.
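
A small worked example may help show why the choice of base matters. Assuming the delta was computed as fine-tuned weights minus base weights (the usual meaning of a delta release, though this thread does not spell it out), adding it to a different base cannot recover the fine-tuned weights. All numbers and variable names below are invented for illustration.

```python
# Toy illustration with invented numbers; plain lists stand in for tensors.

def subtract(a, b):
    """Element-wise a - b."""
    return [x - y for x, y in zip(a, b)]


def add(a, b):
    """Element-wise a + b."""
    return [x + y for x, y in zip(a, b)]


finetuned = [1.25, -0.5]            # weights after fine-tuning
base_used_for_delta = [1.0, -1.0]   # base the delta was computed against
other_base = [2.0, 0.0]             # a different base checkpoint

# Assumption: delta = finetuned - base.
delta = subtract(finetuned, base_used_for_delta)

recovered = add(base_used_for_delta, delta)   # merge with the matching base
mismatched = add(other_base, delta)           # merge with the wrong base

print(recovered == finetuned)
print(mismatched == finetuned)
```

Merging with the matching base returns the fine-tuned values exactly, while merging with the other base yields unrelated values, which is consistent with the garbled output reported above.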
