
	---
	library_name: adapter-transformers
	license: mit
	datasets:
	- squad
	- tiiuae/falcon-refinedweb
	- adversarial_qa
	- avnishkr/trimpixel
	language:
	- en
	pipeline_tag: question-answering
	tags:
	- QLoRA
	- Adapters
	- llms
	- Transformers
	- Fine-Tuning
	- PEFT
	- SFTTrainer
	- Open-Source
	- LoRA
	- Attention
	- code
	- Falcon-7b
	---


	# 🚀 Falcon-QAMaster

	Falcon-QAMaster is a chatbot-like model for question answering. It was built by fine-tuning [Falcon-7B](https://huggingface.co/tiiuae/falcon-7b) on the [SQuAD](https://huggingface.co/datasets/squad), [Adversarial_qa](https://huggingface.co/datasets/adversarial_qa), and Trimpixel (self-made) datasets. This repo includes only the QLoRA adapters produced by fine-tuning with 🤗's [peft](https://github.com/huggingface/peft) package.

	## Model Summary

	- Model Type: Causal decoder-only
	- Language(s): English
	- Base Model: Falcon-7B (License: Apache 2.0)
	- Dataset: [SQuAD](https://huggingface.co/datasets/squad) (License: cc-by-4.0), [Adversarial_qa](https://huggingface.co/datasets/adversarial_qa) (License: cc-by-sa-4.0), [Falcon-RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) (odc-by), Trimpixel (Self-Made)
	- License(s): Apache 2.0 inherited from "Base Model" and "Dataset"


	## Why use Falcon-7B?

	* It outperforms comparable open-source models (e.g., [MPT-7B](https://huggingface.co/mosaicml/mpt-7b), [StableLM](https://github.com/Stability-AI/StableLM), [RedPajama](https://huggingface.co/togethercomputer/RedPajama-INCITE-Base-7B-v0.1) etc.), thanks to being trained on 1,500B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. See the [OpenLLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
	* It features an architecture optimized for inference, with FlashAttention ([Dao et al., 2022](https://arxiv.org/abs/2205.14135)) and multiquery attention ([Shazeer et al., 2019](https://arxiv.org/abs/1911.02150)).
	* It is made available under a permissive Apache 2.0 license allowing for commercial use, without any royalties or restrictions.

	⚠️ This version is fine-tuned specifically for question answering. If you are looking for a version better suited to following generic instructions in a chat format, we recommend taking a look at [Falcon-7B-Instruct](https://huggingface.co/tiiuae/falcon-7b-instruct).

	🔥 Looking for an even more powerful model? [Falcon-40B](https://huggingface.co/tiiuae/falcon-40b) is Falcon-7B's big brother!


	## Model Details

	The model was fine-tuned in 4-bit precision using 🤗 `peft` adapters, `transformers`, and `bitsandbytes`. Training used Low-Rank Adaptation ([LoRA](https://arxiv.org/pdf/2106.09685.pdf)), specifically the [QLoRA](https://arxiv.org/abs/2305.14314) variant. The run took approximately 12 hours on a workstation with a single NVIDIA T4 GPU and 25 GB of available memory. See attached [Colab Notebook] used to train the model.
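
To make the low-rank idea concrete, here is a minimal sketch of the parameter arithmetic behind LoRA. The matrix shape and the rank below are illustrative assumptions, not values read from this model's adapter config; the helper name `lora_param_counts` is hypothetical.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    LoRA keeps the d_out x d_in weight W frozen and learns an update B @ A,
    where B has shape (d_out, rank) and A has shape (rank, d_in).
    """
    full = d_out * d_in            # parameters updated by full fine-tuning
    lora = rank * (d_out + d_in)   # parameters in the two low-rank factors
    return full, lora


# Assumed, illustrative values: a square 4544 x 4544 projection and rank 16.
full, lora = lora_param_counts(d_out=4544, d_in=4544, rank=16)
print(f"full fine-tuning: {full:,} parameters")
print(f"LoRA factors:     {lora:,} parameters ({lora / full:.2%} of full)")
```

QLoRA additionally stores the frozen base weights in 4-bit precision, so only the small adapter factors are trained in higher precision. This is what makes a single-GPU run like the one described above feasible.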

	### Model Date

	July 13, 2023


	This is the open-source Falcon-7B large language model fine-tuned on the SQuAD, Adversarial_qa, and Trimpixel datasets for question answering.
	The QLoRA technique was used to fine-tune the model on a consumer-grade GPU.
	SFTTrainer was also used.

	## Datasets

	1. Dataset used: SQuAD
	   - Dataset size: 87,599
	   - Training steps: 350

	2. Dataset used: Adversarial_qa
	   - Dataset size: 30,000
	   - Training steps: 400

	3. Dataset used: Trimpixel
	   - Dataset size: 1,757
	   - Training steps: 400




	## Training procedure


	The following `bitsandbytes` quantization config was used during training:
	- load_in_8bit: False
	- load_in_4bit: True
	- llm_int8_threshold: 6.0
	- llm_int8_skip_modules: None
	- llm_int8_enable_fp32_cpu_offload: False
	- llm_int8_has_fp16_weight: False
	- bnb_4bit_quant_type: nf4
	- bnb_4bit_use_double_quant: False
	- bnb_4bit_compute_dtype: float16
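
The settings above can be reproduced when loading the adapters for inference. The following is a sketch rather than the author's exact code: it assumes `torch`, `transformers`, `peft`, and `bitsandbytes` are installed, that a CUDA GPU is available, and that the adapter repo id is `avnishkr/falcon-QAMaster`. The helper name `load_model_with_adapters` is hypothetical.

```python
# Values copied from the quantization config listed above.
BNB_SETTINGS = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": False,
    "bnb_4bit_compute_dtype": "float16",
}


def load_model_with_adapters(
    base_id="tiiuae/falcon-7b",
    adapter_id="avnishkr/falcon-QAMaster",  # assumed repo id for these adapters
):
    """Load the 4-bit base model and attach the QLoRA adapters (needs a GPU)."""
    # Heavy imports are kept inside the function so the settings above can be
    # inspected without the libraries installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    settings = dict(BNB_SETTINGS)
    # Convert the dtype name into the actual torch dtype object.
    settings["bnb_4bit_compute_dtype"] = getattr(torch, settings["bnb_4bit_compute_dtype"])
    quant_config = BitsAndBytesConfig(**settings)

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=quant_config,
        device_map="auto",
        trust_remote_code=True,  # older Falcon checkpoints ship custom model code
    )
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return model, tokenizer
```

Calling `model, tokenizer = load_model_with_adapters()` downloads the base weights and the adapters; generation then works through the usual `model.generate(...)` call. The prompt format used during training is not documented here, so any prompt template is an additional assumption.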

	### Framework versions

	- PEFT 0.4.0.dev0