---
license: cc-by-nc-4.0
datasets:
- yuyijiong/LongData-Corpus
- yuyijiong/Long-Instruction-Chinese
language:
- zh
pipeline_tag: text-generation
---
# Version differences

| Model | Base model | Position interpolation | Training method | Training data |
|:-------------------------:|:-----------:|:------------:|:--:|:--:|
| LongAlpaca-7b-16k-chinese | atom-7b | 8k->16k PI | Instruction fine-tuning | Multi-document QA, paper summarization, and paper QA data up to 16k in length |
| LongAlpaca-7b-32k-chinese | atom-7b | 8k->32k PI | Instruction fine-tuning | Multi-document QA, paper summarization, paper QA, and ShareGPT data up to 32k in length |
| LongAlpaca-7b-32k-chinese-v2 | CausalLM-7b | 8k->32k YaRN | Continued pre-training + instruction fine-tuning | 32k-length Chinese pre-training data, plus multi-document multi-turn QA, multi-task multi-turn paper QA, ShareGPT, and Chinese-English translation data up to 32k in length |

## Usage

```python
import os

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

os.environ["CUDA_VISIBLE_DEVICES"] = "0"

model_path = "yuyijiong/LongAlpaca-7b-32k-chinese"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Load in 8-bit and let `device_map="auto"` place the weights on the available devices.
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             device_map="auto",
                                             load_in_8bit=True,
                                             trust_remote_code=True).eval()

question = "中国的首都是什么?"  # "What is the capital of China?"

# Prompts follow the ChatML-style format used during fine-tuning.
input_text = "<|im_start|>user\n" + question + "<|im_end|>\n" + "<|im_start|>assistant\n"
input_ids = tokenizer(input_text, return_tensors='pt').input_ids.to(model.device)

with torch.no_grad():
    with torch.autocast('cuda'):
        output = model.generate(input_ids=input_ids,
                                max_new_tokens=512,  # adjust as needed
                                do_sample=True,
                                temperature=0.85,
                                top_k=None,
                                top_p=0.9,
                                use_cache=True,
                                eos_token_id=[tokenizer.convert_tokens_to_ids('<|im_end|>'),
                                              tokenizer.convert_tokens_to_ids('<|endoftext|>')])

reply = tokenizer.decode(output[0], skip_special_tokens=False)
# Keep only the text generated after the assistant marker.
reply_return = reply.split('<|im_start|>assistant\n')[-1]
print('模型回答:', reply_return)  # "Model answer:"
```
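
Since the model is fine-tuned on multi-document QA with contexts up to 32k tokens, a long input is simply packed into a single user turn of the same prompt format. The sketch below is only an illustration of one way to do that: the `<|im_start|>` / `<|im_end|>` markers come from the usage example above, while the document separators, numbering, instruction wording, and the `documents` / `question` variables are assumptions, not a format prescribed by this model card.

```python
# A minimal sketch of building a long multi-document QA prompt.
# The chat markers are taken from the usage example above; how the documents
# and the question are concatenated is an assumption, not a documented format.
documents = [
    "Full text of the first document (typically Chinese) ...",
    "Full text of the second document ...",
]
question = "Based on the documents above, answer ..."

# Number the documents and append the question as one user turn.
context = "\n\n".join(f"Document {i + 1}:\n{doc}" for i, doc in enumerate(documents))
user_turn = context + "\n\n" + question

input_text = "<|im_start|>user\n" + user_turn + "<|im_end|>\n" + "<|im_start|>assistant\n"
# `input_text` is then tokenized and passed to model.generate() exactly as above,
# provided the total prompt length stays within the 32k context window.
```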