dlite-v1-124m / README.md

jacobrenn

Update README.md

761fbc8 over 1 year ago

preview code

raw

history blame

No virus

3.96 kB

	---
	license: apache-2.0
	datasets:
	- tatsu-lab/alpaca
	language:
	- en
	library_name: transformers
	---


	# Model Card for Model ID

	<!-- Provide a quick summary of what the model is/does. -->

	AI Squared's `dlite-v1-124m` ([blog post](https://medium.com/ai-squared/introducing-dlite-a-lightweight-chatgpt-like-model-based-on-dolly-deaa49402a1f)) is a large language
	model which is derived from OpenAI's smallest [GPT-2](https://huggingface.co/gpt2) model and fine-tuned on a single T4 GPU on a corpus of 50k records
	([Stanford Alpaca](https://crfm.stanford.edu/2023/03/13/alpaca.html)) to help it exhibit chat-based capabilities.

	While `dlite-v1-124m` is not a state-of-the-art model, we believe that the level of interactivity that can be achieved on such a small model that is trained so cheaply
	is important to showcase, as it continues to demonstrate that creating powerful AI capabilities may be much more accessible than previously thought.


	### Model Description

	<!-- Provide a longer summary of what this model is. -->

	- Developed by: AI Squared, Inc.
	- Shared by: AI Squared, Inc.
	- Model type: Large Language Model
	- Language(s) (NLP): EN
	- License: Apache v2.0
	- Finetuned from model: GPT-2


	## Bias, Risks, and Limitations

	<!-- This section is meant to convey both technical and sociotechnical limitations. -->

	`dlite-v1-124m` is not a state-of-the-art language model. `dlite-v1-124m` is an experimental technology and is not designed for use in any
	environment other than for research purposes. Furthermore, the model can sometimes exhibit undesired behaviors. Some of these behaviors include,
	but are not limited to: factual inaccuracies, biases, offensive responses, toxicity, and hallucinations.
	Just as with any other LLM, we advise users of this technology to exercise good judgment when applying this technology.


	## Usage

	The code below shows how to use `dlite-v1-124m` in the way which it was trained. While the model can be used "out of the box" using the
	`transformers` library, using the function defined below to create a response from the model will achieve better results.

	### Load Model and Tokenizer from this Repository Using the `transformers` Package

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import numpy as np

	model_id = 'aisquared/dlite-v1-124m'

	tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side = 'left')
	model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code = True, device_map = 'auto')
	```


	### Create the Prompt Format and Other Variables

	```python
	PROMPT_FORMAT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

	### Instruction:
	{instruction}

	### Response:
	"""

	END_KEY = '### End'
	RESPONSE_KEY = '### Response:\n'
	```


	### Create a Function to Retrieve a Response

	```python
	def create_response(
	instruction,
	model,
	tokenizer,
	do_sample = True,
	max_new_tokens = 256,
	top_p = 0.92,
	top_k = 0,
	**kwargs
	):
	"""
	Create a response from the model by using a formatted prompt
	"""
	ids = tokenizer(PROMPT_FORMAT.format(instruction = instruction), return_tensors = 'pt').input_ids

	response_id = tokenizer.encode(RESPONSE_KEY)[0]
	end_id = tokenizer.encode(END_KEY)[0]

	tokens = model.generate(
	ids,
	pad_token_id = tokenizer.pad_token_id,
	eos_token_id = end_id,
	do_sample = do_sample,
	max_new_tokens = max_new_tokens,
	top_p = top_p,
	top_k = top_k,
	**kwargs
	)[0].cpu()

	res_pos = np.where(tokens == response_id)[0]

	if len(res_pos) == 0:
	return None

	res_pos = res_pos[0]
	end_pos = np.where(tokens == end_id)[0]
	if len(end_pos) > 0:
	end_pos = end_pos[0]
	else:
	end_pos = None

	return tokenizer.decode(tokens[res_pos + 1 : end_pos]).strip()
	```