	---
	license: cc-by-sa-4.0
	inference: false
	datasets:
	- PengQu/langchain-MRKL-finetune
	- fnlp/moss-003-sft-data
	---


	NOTE: This "delta model" cannot be used directly.
	Users have to apply it on top of the original LLaMA weights to get actual vicuna-13b-finetuned-langchain-MRKL weights.
	See https://github.com/rinnakk/vicuna-13b-delta-finetuned-langchain-MRKL#model-weights for instructions.

	# vicuna-13b-finetuned-langchain-MRKL

	## Model details

	Model type:
	vicuna-13b-finetuned-langchain-MRKL is an open-source chatbot trained by fine-tuning vicuna-13b on 15 examples in the langchain-MRKL format.

	Model Usage:

	To obtain the correct model, please run apply_delta.py first (https://github.com/rinnakk/vicuna-13b-delta-finetuned-langchain-MRKL/blob/main/model/apply_delta.py). See the instructions at https://github.com/rinnakk/vicuna-13b-delta-finetuned-langchain-MRKL#model-weights
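
To make "apply it on top of the original LLaMA weights" concrete, here is a minimal sketch of the arithmetic a delta release usually implies. It assumes the target weights are the element-wise sum of base and delta, as is common for Vicuna-style deltas. The function name and the plain-list "tensors" are hypothetical stand-ins; the repository's apply_delta.py remains the tool to actually use.

```python
def apply_delta(base_weights, delta_weights):
    """Return target weights, ASSUMING target = base + delta for every parameter.

    Both arguments map parameter names to flat lists of floats. Real checkpoints
    hold tensors, so this only illustrates the idea, not the actual script.
    """
    if base_weights.keys() != delta_weights.keys():
        raise KeyError("base and delta must contain the same parameter names")

    target = {}
    for name, base in base_weights.items():
        delta = delta_weights[name]
        if len(base) != len(delta):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Element-wise addition recovers the fine-tuned value of each weight.
        target[name] = [b + d for b, d in zip(base, delta)]
    return target


# Toy usage with a single two-element "parameter".
merged = apply_delta({"w": [1.0, 2.0]}, {"w": [0.5, -1.0]})
print(merged)
```

The delta alone carries only the difference from the base model, which is why loading it directly does not give a usable model.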

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM

	# Load the merged weights produced by apply_delta.py (not the delta itself).
	tokenizer = AutoTokenizer.from_pretrained("path/to/vicuna-13b-finetuned-langchain-MRKL")
	model = AutoModelForCausalLM.from_pretrained("path/to/vicuna-13b-finetuned-langchain-MRKL")
	model.cuda()  # move the model to the GPU

	prompt = """Answer the following questions as best you can. You have access to the following tools:

	Search: useful for when you need to answer questions about current events
	Calculator: useful for when you need to answer questions about math

	Use the following format:

	Question: the input question you must answer
	Thought: you should always think about what to do
	Action: the action to take, should be one of [Search, Calculator]
	Action Input: the input to the action
	Observation: the result of the action
	... (this Thought/Action/Action Input/Observation can repeat N times)
	Thought: I now know the final answer
	Final Answer: the final answer to the original input question

	Begin!

	Question: The current age of the President of the United States multiplied by 0.5.
	Thought:"""

	# Tokenize the prompt, sample a continuation, and print the prompt plus the generated text.
	input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")
	tokens = model.generate(input_ids, min_length=5, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.9)
	print(tokenizer.decode(tokens[0], skip_special_tokens=True))
	```
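
The prompt above follows a fixed template in which only the tool list, the tool names, and the question change. A small helper that builds the same text from a tool dictionary might look like the sketch below. The function and constant names are hypothetical and are not part of langchain or of this repository.

```python
PREFIX = "Answer the following questions as best you can. You have access to the following tools:"

FORMAT_INSTRUCTIONS = """Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question"""

SUFFIX = """Begin!

Question: {question}
Thought:"""


def build_mrkl_prompt(tools, question):
    """Build an MRKL-style prompt from a {tool name: description} dict."""
    tool_lines = "\n".join(f"{name}: {desc}" for name, desc in tools.items())
    tool_names = ", ".join(tools)
    # The four sections are separated by blank lines, as in the example prompt.
    return "\n\n".join([
        PREFIX,
        tool_lines,
        FORMAT_INSTRUCTIONS.format(tool_names=tool_names),
        SUFFIX.format(question=question),
    ])


tools = {
    "Search": "useful for when you need to answer questions about current events",
    "Calculator": "useful for when you need to answer questions about math",
}
prompt = build_mrkl_prompt(
    tools,
    "The current age of the President of the United States multiplied by 0.5.",
)
print(prompt)
```

Ending the prompt with `Thought:` matters because the model was fine-tuned to continue from exactly that point.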

	Output (the tokens generated after "Thought:"):<br>
	```sh
	I need to find the current age of the President and then multiply it by 0.5
	Action: Search
	Action Input: Who is the President of the United States?
	```
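
Because the output sticks to the `Action:` / `Action Input:` layout, a caller can extract the tool call with a simple pattern. The parser below is a minimal sketch with a hypothetical function name; langchain ships its own output parsing, which this does not reproduce.

```python
import re

# "Action: <tool>" on one line, followed by "Action Input: <input>" on the next.
ACTION_RE = re.compile(r"Action\s*:\s*(.*?)\s*\n\s*Action Input\s*:\s*(.*)")


def parse_mrkl_output(text):
    """Return ("final", answer) or ("action", tool_name, tool_input)."""
    if "Final Answer:" in text:
        return ("final", text.split("Final Answer:", 1)[1].strip())
    match = ACTION_RE.search(text)
    if match is None:
        raise ValueError(f"could not find an action in: {text!r}")
    return ("action", match.group(1).strip(), match.group(2).strip())


generated = (
    "I need to find the current age of the President and then multiply it by 0.5\n"
    "Action: Search\n"
    "Action Input: Who is the President of the United States?"
)
print(parse_mrkl_output(generated))
```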

	If you have launched an HTTP server with the model and installed langchain (https://github.com/hwchase17/langchain), you can edit demo.py to set your server's IP address and port and your SERPAPI_API_KEY, then run it (https://github.com/rinnakk/vicuna-13b-delta-finetuned-langchain-MRKL/blob/main/demo.py).<br>
	You can also try this in a Jupyter Notebook: https://github.com/rinnakk/vicuna-13b-delta-finetuned-langchain-MRKL/blob/main/demo.ipynb
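
For readers who want to see what an agent driver does with the model, the sketch below shows the Thought/Action/Observation loop in plain Python. It is not the code in demo.py: `run_agent` and `scripted_llm` are hypothetical names, the "model" is a scripted stub, and cutting the completion at `Observation:` is an assumption about how a driver keeps the model from inventing tool results.

```python
import re


def run_agent(llm, tools, prompt, max_steps=5):
    """Alternate between model calls and tool calls until a final answer appears."""
    scratchpad = prompt
    for _ in range(max_steps):
        # Keep only the text before any observation the model made up itself.
        completion = llm(scratchpad).split("\nObservation:", 1)[0]
        if "Final Answer:" in completion:
            return completion.split("Final Answer:", 1)[1].strip()
        match = re.search(r"Action\s*:\s*(.*?)\s*\nAction Input\s*:\s*(.*)", completion)
        if match is None:
            raise ValueError(f"could not parse model output: {completion!r}")
        tool_name, tool_input = match.group(1).strip(), match.group(2).strip()
        # Run the real tool and feed its result back as the observation.
        observation = tools[tool_name](tool_input)
        scratchpad += f"{completion}\nObservation: {observation}\nThought:"
    raise RuntimeError("no final answer within max_steps")


def scripted_llm(replies):
    """Stand-in for the model: returns canned completions in order."""
    remaining = iter(replies)
    return lambda _prompt: next(remaining)


llm = scripted_llm([
    " I need to look this up\nAction: Search\nAction Input: example query\nObservation: invented by the model",
    " I now know the final answer\nFinal Answer: done",
])
tools = {"Search": lambda query: f"stub result for {query!r}"}
print(run_agent(llm, tools, "Question: example question\nThought:"))
```

In a real setup, `llm` would send the scratchpad to the HTTP server hosting the model and `tools` would wrap services such as a search API.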

	Where to send questions or comments about the model:

	https://github.com/rinnakk/vicuna-13b-delta-finetuned-langchain-MRKL/issues

	## Training dataset
	Trained for only one epoch on mixed data (sharegpt + 32*my.json + moss-003-sft-data).
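
One possible reading of `32*my.json` is that the small langchain-MRKL example file is repeated 32 times so it is not drowned out by the much larger datasets. The sketch below illustrates that reading only. The function name, the repetition interpretation, and the shuffling step are all assumptions, not the author's documented pipeline.

```python
import random


def build_training_mix(sharegpt, my_examples, moss, repeat=32, seed=0):
    """Concatenate the datasets, repeating the small example set `repeat` times.

    ASSUMPTION: "32*my.json" means plain repetition. Shuffling is also assumed.
    """
    mixed = list(sharegpt) + list(my_examples) * repeat + list(moss)
    random.Random(seed).shuffle(mixed)  # fixed seed keeps the order reproducible
    return mixed


# Toy usage with placeholder records instead of real conversations.
mix = build_training_mix(["s1", "s2"], ["m1"], ["o1"])
print(len(mix), mix.count("m1"))
```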

	## Evaluation
	- demo for langchain-MRKL: https://github.com/rinnakk/vicuna-13b-delta-finetuned-langchain-MRKL/blob/main/demo.ipynb
	- No evaluation set, because we don't think we improved the model's ability. We only made the model follow the langchain-MRKL format strictly.
	- We just want to show vicuna-13b's strong ability to think and act.

	## Major Improvements
	- Supports langchain-MRKL (agent="zero-shot-react-description").
	- Very fast because of the strict format (it doesn't generate redundant tokens).

	## Author
	Qu Peng (https://huggingface.co/PengQu)