Text Generation
Transformers
PyTorch
English
llama
text-generation-inference
Inference Endpoints
winglian commited on
Commit
ae1d1c4
1 Parent(s): 218fb47

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +5 -0
README.md CHANGED
@@ -13,6 +13,11 @@ pipeline_tag: text-generation
13
 
14
  Wizard Mega is a Llama 13B model fine-tuned on the [ShareGPT](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered), [WizardLM](https://huggingface.co/datasets/ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered), and [Wizard-Vicuna](https://huggingface.co/datasets/ehartford/wizard_vicuna_70k_unfiltered) datasets. These particular datasets have all been filtered to remove responses where the model responds with "As an AI language model...", etc or when the model refuses to respond.
15
 
 
 
 
 
 
16
  ## Release (Epoch Two)
17
 
18
  The Wizard Mega 13B SFT model is being released after two epochs as the eval loss increased during the 3rd (final planned epoch). Because of this, we have preliminarily decided to use the epoch 2 checkpoint as the final release candidate. https://wandb.ai/wing-lian/vicuna-13b/runs/5uebgm49
 
13
 
14
  Wizard Mega is a Llama 13B model fine-tuned on the [ShareGPT](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered), [WizardLM](https://huggingface.co/datasets/ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered), and [Wizard-Vicuna](https://huggingface.co/datasets/ehartford/wizard_vicuna_70k_unfiltered) datasets. These particular datasets have all been filtered to remove responses where the model responds with "As an AI language model...", etc or when the model refuses to respond.
15
 
16
+ # Demo
17
+
18
+ Try out the model in HF Spaces. The demo uses a quantized GGML version of the model to quickly return predictions on smaller GPUs (and even CPUs). Quantized GGML may have some minimal loss of model quality.
19
+ - https://huggingface.co/spaces/openaccess-ai-collective/ggml-ui
20
+
21
  ## Release (Epoch Two)
22
 
23
  The Wizard Mega 13B SFT model is being released after two epochs as the eval loss increased during the 3rd (final planned epoch). Because of this, we have preliminarily decided to use the epoch 2 checkpoint as the final release candidate. https://wandb.ai/wing-lian/vicuna-13b/runs/5uebgm49