modelscope
/

llama3-8b-agent-instruct-v2

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

llama3-8b-agent-instruct-v2 / README.md

tastelikefeet's picture

Update README.md

ce33651 verified 3 months ago

|

history blame contribute delete

No virus

2.93 kB

	---
	frameworks:
	- Pytorch
	license: apache-2.0
	tasks:
	- text-generation

	#model-type:
	##如 gpt、phi、llama、chatglm、baichuan 等
	#- gpt

	#domain:
	##如 nlp、cv、audio、multi-modal
	#- nlp

	#language:
	##语言代码列表 https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa
	#- cn

	#metrics:
	##如 CIDEr、Blue、ROUGE 等
	#- CIDEr

	#tags:
	##各种自定义，包括 pretrained、fine-tuned、instruction-tuned、RL-tuned 等训练方法和其他
	#- pretrained

	#tools:
	##如 vllm、fastchat、llamacpp、AdaSeq 等
	#- vllm
	---
	Fine-tuning the llama3-8b-instruct model using the [msagent-pro](https://modelscope.cn/datasets/iic/MSAgent-Pro/summary) dataset and the loss_scale technique with [swift](https://github.com/modelscope/swift), the script is as follows:
	```bash
	NPROC_PER_NODE=8 \
	CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
	MASTER_PORT=29500 \
	swift sft \
	--model_type llama3-8b-instruct \
	--learning_rate 2e-5 \
	--sft_type lora \
	--dataset msagent-pro \
	--gradient_checkpointing true \
	--gradient_accumulation_steps 8 \
	--deepspeed default-zero3 \
	--lora_target_modules ALL \
	--use_loss_scale true \
	--save_strategy epoch \
	--batch_size 1 \
	--num_train_epochs 2 \
	--max_length 4096 \
	--preprocess_num_proc 4 \
	--use_loss_scale true \
	--loss_scale_config_path agent-flan \
	--ddp_backend nccl \
	```

	Comparison with the Original Model on the ToolBench Evaluation Set

	\| Model \| ToolBench (in-domain) \| \| \| \| \| ToolBench (out-of-domain) \| \| \| \|
	\|-------------------------\|----------------------------------------------\|-------\|-------\|-------\|-------\|--------------------------------------------\|-------\|-------\|-------\|
	\| \| Plan.EM \| Act.EM\| HalluRate (lower is better) \| Avg.F1 \| R-L \| Plan.EM \| Act.EM\| HalluRate (lower is better) \| Avg.F1 \| R-L \|
	\| llama3-8b-instruct \| 74.22 \| 36.17 \| 15.68 \| 20.0 \| 12.14 \| 69.47 \| 34.21 \| 14.72 \| 20.25 \| 14.07 \|
	\| llama3-8b-agent-instruct-v2 \| 85.15 \| 58.1 \| 1.57 \| 52.10 \| 26.02 \| 85.79 \| 59.43 \| 2.56 \| 52.19 \| 31.43 \|

	For detailed explanations of the evaluation metrics, please refer to [document](https://github.com/modelscope/eval-scope/tree/main/llmuses/third_party/toolbench_static)

	Deploy this model:
	```shell
	USE_HF=True swift deploy \
	--model_id_or_path modelscope/llama3-8b-agent-instruct-v2 \
	--model_type llama3-8b-instruct \
	--infer_backend vllm \
	--tools_prompt toolbench
	```