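Model Description
This model is ChatGLM3-6B fine-tuned with QLoRA on the HasturOfficial/adgen dataset for advertisement generation.
Dataset
HasturOfficial/adgen is an advertisement generation dataset.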
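Fine-tuning Settings
- Fine-tuning ran on a single RTX 4090 GPU
- The model is loaded with the nf4 quantization data type, with double quantization enabled, and trained with bf16 mixed precision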
- per_device_train_batch_size = 8
- gradient_accumulation_steps = 4
- learning_rate = 1e-3
- warmup_ratio = 0.1
- lr_scheduler_type = "linear"
- lora_rank = 4
- lora_alpha = 32
- lora_dropout = 0.05
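The settings above imply a few derived quantities that are worth spelling out. The sketch below is not the training script used for this model. It only works out the effective batch size, the LoRA scaling factor (assuming the standard `lora_alpha / lora_rank` scaling), and the shape of a linear schedule with warmup. `TOTAL_STEPS` and the helper `linear_schedule_lr` are hypothetical names introduced for illustration, since the model card does not state the number of training steps.

```python
import math

# Values taken from the fine-tuning settings listed above.
PER_DEVICE_TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 4
NUM_GPUS = 1  # a single GPU, as stated above
LEARNING_RATE = 1e-3
WARMUP_RATIO = 0.1
LORA_RANK = 4
LORA_ALPHA = 32

# Number of samples that contribute to each optimizer step.
effective_batch_size = (
    PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_GPUS
)

# Assumption: standard LoRA scaling, where the low-rank update is
# multiplied by alpha / rank before being added to the frozen weights.
lora_scaling = LORA_ALPHA / LORA_RANK


def linear_schedule_lr(step, total_steps,
                       peak_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Hypothetical step count, for illustration only.
TOTAL_STEPS = 1000

print("effective batch size:", effective_batch_size)
print("LoRA scaling factor:", lora_scaling)
for step in (0, 50, 100, 550, 1000):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```

With these settings each optimizer step sees 32 samples, and the LoRA update is scaled by a factor of 8 under the assumed scaling rule.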
Usage
from transformers import AutoTokenizer, AutoModel
model_name_or_path = "snowfly/glm3-QLoRA-adgen"
# trust_remote_code=True is required because the repo ships custom model code
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model_name_or_path, trust_remote_code=True)
model = AutoModel.from_pretrained(pretrained_model_name_or_path=model_name_or_path, trust_remote_code=True, device='cuda')
# Switch to evaluation mode for inference
model = model.eval()
# Product attributes as key#value pairs, separated by '*'
input_text = '类型#裙*版型#显瘦*风格#文艺*风格#简约*图案#印花*图案#撞色*裙下摆#压褶*裙长#连衣裙*裙领型#圆领'
# Generate the advertisement text from the attributes
response, history = model.chat(tokenizer=tokenizer, query=input_text, history=[])
print(response)
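The `input_text` above appears to encode product attributes as `key#value` pairs joined by `*`. This format is inferred from the example string, not from a documented specification. A minimal sketch for building and parsing such a prompt is shown below. `build_adgen_prompt` and `parse_adgen_prompt` are hypothetical helpers and are not part of the model or of `transformers`. Pairs are kept in a list because a key such as 风格 can appear more than once.

```python
def build_adgen_prompt(attributes):
    """Join (key, value) pairs as 'key#value' items separated by '*'."""
    return "*".join(f"{key}#{value}" for key, value in attributes)


def parse_adgen_prompt(prompt):
    """Split a prompt back into a list of (key, value) pairs."""
    return [tuple(item.split("#", 1)) for item in prompt.split("*")]


# The same attributes as in the usage example above.
attributes = [
    ("类型", "裙"),
    ("版型", "显瘦"),
    ("风格", "文艺"),
    ("风格", "简约"),
    ("图案", "印花"),
    ("图案", "撞色"),
    ("裙下摆", "压褶"),
    ("裙长", "连衣裙"),
    ("裙领型", "圆领"),
]

prompt = build_adgen_prompt(attributes)
print(prompt)
```

The resulting string can be passed as the `query` argument of `model.chat` in the usage example.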