OpenAssistant/falcon-7b-sft-mix-2000

andreaskoepf committed on Jun 6, 2023

Commit d0945e8 • 1 Parent(s): e94cc49

Update README.md

Files changed (1)

README.md +70 -0

@@ -1,11 +1,81 @@
 ---
 license: apache-2.0
 ---
 - base model: [tiiuae/falcon-7b](https://huggingface.co/tiiuae/falcon-7b)
 - [sampling report](https://open-assistant.github.io/oasst-model-eval/?f=https%3A%2F%2Fraw.githubusercontent.com%2FOpen-Assistant%2Foasst-model-eval%2Fmain%2Fsampling_reports%2Fchat-gpt%2F2023-04-11_gpt-3.5-turbo_lottery.json%0Ahttps%3A%2F%2Fraw.githubusercontent.com%2FOpen-Assistant%2Foasst-model-eval%2Fmain%2Fsampling_reports%2Foasst-sft%2F2023-06-05_OpenAssistant_falcon-7b-sft-mix-2000_sampling_noprefix2.json)
 - wandb: https://wandb.ai/open-assistant/public-sft/runs/tlevhltw
 - checkpoint: 2000 steps (~2.9 epochs)
 Model:
 ```
 falcon-7b:

 ---
 license: apache-2.0
+language:
+- en
+- de
+- es
+- fr
+tags:
+- sft
+pipeline_tag: text-generation
+widget:
+- text: >-
+    <|prompter|>What is a meme, and what's the history behind this
+    word?<|endoftext|><|assistant|>
+- text: <|prompter|>What's the Earth total population<|endoftext|><|assistant|>
+- text: >-
+    <|prompter|>Write a story about future of AI
+    development<|endoftext|><|assistant|>
+datasets:
+- OpenAssistant/oasst1
 ---
+# Open-Assistant Falcon 7B SFT MIX Model
 - base model: [tiiuae/falcon-7b](https://huggingface.co/tiiuae/falcon-7b)
 - [sampling report](https://open-assistant.github.io/oasst-model-eval/?f=https%3A%2F%2Fraw.githubusercontent.com%2FOpen-Assistant%2Foasst-model-eval%2Fmain%2Fsampling_reports%2Fchat-gpt%2F2023-04-11_gpt-3.5-turbo_lottery.json%0Ahttps%3A%2F%2Fraw.githubusercontent.com%2FOpen-Assistant%2Foasst-model-eval%2Fmain%2Fsampling_reports%2Foasst-sft%2F2023-06-05_OpenAssistant_falcon-7b-sft-mix-2000_sampling_noprefix2.json)
 - wandb: https://wandb.ai/open-assistant/public-sft/runs/tlevhltw
 - checkpoint: 2000 steps (~2.9 epochs)
+## Prompting
+Two special tokens are used to mark the beginning of user and assistant turns:
+`<|prompter|>` and `<|assistant|>`. Each turn ends with a `<|endoftext|>` token.
+Input prompt example:
+```
+<|prompter|>What is a meme, and what's the history behind this word?<|endoftext|><|assistant|>
+```
+The input ends with the `<|assistant|>` token to signal that the model should
+start generating the assistant reply.
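The prompt format described above can be assembled programmatically. The following is a minimal sketch, not part of the commit: `build_prompt` and its `(role, text)` argument shape are hypothetical names chosen for illustration, and the multi-turn case assumes earlier turns are simply concatenated in order, which the card implies ("each turn ends with a `<|endoftext|>` token") but does not show explicitly.

```python
# Hypothetical helper for illustration; the function name and the
# (role, text) tuple format are assumptions, not part of the model card.
PROMPTER = "<|prompter|>"
ASSISTANT = "<|assistant|>"
END_OF_TURN = "<|endoftext|>"


def build_prompt(turns):
    """Join (role, text) turns into a single prompt string.

    Each turn is prefixed with its role token and closed with
    <|endoftext|>; a trailing <|assistant|> token asks the model
    to generate the next assistant reply.
    """
    role_tokens = {"prompter": PROMPTER, "assistant": ASSISTANT}
    parts = []
    for role, text in turns:
        if role not in role_tokens:
            raise ValueError(f"unknown role: {role!r}")
        parts.append(f"{role_tokens[role]}{text}{END_OF_TURN}")
    # The prompt ends with the bare assistant token (no <|endoftext|>).
    parts.append(ASSISTANT)
    return "".join(parts)


single_turn = build_prompt(
    [("prompter", "What is a meme, and what's the history behind this word?")]
)
print(single_turn)
```

For the single-turn call, this yields the same string as the input prompt example above. The resulting string can be passed as the input text to a text-generation pipeline.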
+## Sample Code
+```python
+from transformers import AutoTokenizer
+import transformers
+import torch
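+# Hub repository ID of the fine-tuned checkpoint
+model = "OpenAssistant/falcon-7b-sft-mix-2000"
+tokenizer = AutoTokenizer.from_pretrained(model)
+# Build a text-generation pipeline that loads the weights in bfloat16
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model,
+    tokenizer=tokenizer,
+    torch_dtype=torch.bfloat16,
+    trust_remote_code=True,  # allow loading model code shipped in the repository
+    device_map="auto",  # place the model on the available device(s) automatically
+)
+# Prompt format: user turn, <|endoftext|>, then <|assistant|> to request a reply
+input_text = "<|prompter|>What is a meme, and what's the history behind this word?<|endoftext|><|assistant|>"
+sequences = pipeline(
+    input_text,
+    max_length=500,  # upper bound on prompt plus generated tokens
+    do_sample=True,
+    return_full_text=False,  # return only the newly generated text
+    top_k=10,  # sample from the 10 most likely tokens at each step
+    num_return_sequences=1,
+    eos_token_id=tokenizer.eos_token_id,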
+)
+for seq in sequences:
+    print(f"Result: {seq['generated_text']}")
+```
+## Configuration Details
 Model:
 ```
 falcon-7b: