---
library_name: transformers
tags:
- PEFT
- mistral
- sft
- TensorBoard
- Safetensors
- trl
- generated_from_trainer
- 4-bit precision
license: mit
datasets:
- yahma/alpaca-cleaned
language:
- en
pipeline_tag: question-answering
---

# Model Card for Model ID

This model is fine-tuned for document question answering. It was trained on the [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) dataset.

## Model Details

### Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch at the end of this card):
- gradient_accumulation_steps: 1
- warmup_steps: 5
- max_steps: 20
- learning_rate: 2e-4
- fp16: not torch.cuda.is_bf16_supported()
- bf16: torch.cuda.is_bf16_supported()
- logging_steps: 1
- optim: adamw_8bit
- weight_decay: 0.01
- lr_scheduler_type: linear
- seed: 3407

### Framework versions

- PEFT 0.7.1
- Transformers 4.36.0
- Pytorch 2.0.0
- Datasets 2.16.1
- Tokenizers 0.15.0
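### Training configuration sketch

As a minimal sketch, the hyperparameters above map onto a `transformers.TrainingArguments` object as follows. The `output_dir` value is an illustrative assumption (the card does not state it); all other values are taken verbatim from the list above.

```python
import torch
from transformers import TrainingArguments

# Training configuration matching the "Training hyperparameters" section.
training_args = TrainingArguments(
    output_dir="outputs",  # assumption: output directory is not stated on the card
    gradient_accumulation_steps=1,
    warmup_steps=5,
    max_steps=20,
    learning_rate=2e-4,
    # Prefer bf16 on GPUs that support it; otherwise fall back to fp16.
    fp16=not torch.cuda.is_bf16_supported(),
    bf16=torch.cuda.is_bf16_supported(),
    logging_steps=1,
    optim="adamw_8bit",  # bitsandbytes 8-bit AdamW
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=3407,
)
```

These arguments would typically be passed to a TRL `SFTTrainer` together with the PEFT-wrapped base model and the `yahma/alpaca-cleaned` dataset, consistent with the `sft`, `trl`, and `PEFT` tags above.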