---
library_name: transformers
tags:
- PEFT
- mistral
- sft
- TensorBoard
- Safetensors
- trl
- generated_from_trainer
- 4-bit precision
license: mit
datasets:
- yahma/alpaca-cleaned
language:
- en
pipeline_tag: question-answering
---
# Model Card for Model ID
This model is fine-tuned for document question answering from [TheBloke/zephyr-7B-beta-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ), trained on the [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) dataset.
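A minimal inference sketch follows. It assumes the PEFT adapter in this repository sits on top of the GPTQ base checkpoint linked above; the repository id below is a placeholder, and loading GPTQ weights additionally requires the `optimum` and `auto-gptq` packages.

```python
# Minimal inference sketch. ASSUMPTIONS: "your-username/this-adapter" is a
# placeholder for this repository's id, and the adapter was trained on top of
# TheBloke/zephyr-7B-beta-GPTQ (loading GPTQ weights needs optimum + auto-gptq).
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "TheBloke/zephyr-7B-beta-GPTQ"
adapter_id = "your-username/this-adapter"  # placeholder: replace with this repo's id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)  # attach the fine-tuned adapter

# Alpaca-style prompt, matching the yahma/alpaca-cleaned training format.
prompt = (
    "### Instruction:\nAnswer the question using the document.\n\n"
    "### Input:\n<document text here>\n\nWhat is the invoice total?\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```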
## Model Details

### Training hyperparameters
The following hyperparameters were used during training:
- gradient_accumulation_steps: 1
- warmup_steps: 5
- max_steps: 20
- learning_rate: 2e-4
- fp16: not torch.cuda.is_bf16_supported()
- bf16: torch.cuda.is_bf16_supported()
- logging_steps: 1
- optim: adamw_8bit
- weight_decay: 0.01
- lr_scheduler_type: linear
- seed: 3407
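Taken together, these settings amount to a short trl `SFTTrainer` run. The sketch below is a reconstruction under stated assumptions: the base checkpoint, LoRA configuration, per-device batch size, and the Alpaca prompt formatting are illustrative choices not recorded in this card, and it assumes a trl version in which `SFTTrainer` accepts a plain `TrainingArguments` and a model id string.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# ASSUMPTION: the base checkpoint is not recorded in this card; this is a placeholder.
base_id = "HuggingFaceH4/zephyr-7b-beta"

dataset = load_dataset("yahma/alpaca-cleaned", split="train")

def format_alpaca(batch):
    # Build Alpaca-style prompts from the instruction/input/output columns.
    texts = []
    for instruction, inp, output in zip(batch["instruction"], batch["input"], batch["output"]):
        text = f"### Instruction:\n{instruction}\n\n"
        if inp:
            text += f"### Input:\n{inp}\n\n"
        text += f"### Response:\n{output}"
        texts.append(text)
    return texts

# ASSUMPTION: illustrative LoRA settings; the actual adapter config is not listed above.
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

args = TrainingArguments(
    output_dir="outputs",
    per_device_train_batch_size=2,  # ASSUMPTION: not listed above
    gradient_accumulation_steps=1,
    warmup_steps=5,
    max_steps=20,
    learning_rate=2e-4,
    fp16=not torch.cuda.is_bf16_supported(),
    bf16=torch.cuda.is_bf16_supported(),
    logging_steps=1,
    optim="adamw_8bit",
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=3407,
    report_to="tensorboard",  # matches the TensorBoard tag on this card
)

trainer = SFTTrainer(
    model=base_id,                 # SFTTrainer also accepts a model id string
    train_dataset=dataset,
    formatting_func=format_alpaca,
    peft_config=peft_config,       # trains a LoRA adapter instead of full weights
    args=args,
)
trainer.train()
```

With `max_steps=20` this is only a brief run; the `fp16`/`bf16` pair simply picks bfloat16 on GPUs that support it and falls back to float16 otherwise.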