---
library_name: peft
base_model: mistralai/Mistral-7B-v0.1
license: mit
language:
- en
metrics:
- perplexity
- bertscore
---
# Model Card
Mistral-7B-v0.1 fine-tuned with QLoRA for a story generation task.
### Model Description
We fine-tune the model on the "Hierarchical Neural Story Generation" dataset to generate stories.
The input to the model is structured as follows:
```
### Instruction: Below is a story idea. Write a short story based on this context.
### Input: [story idea here]
### Response:
```
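The template above can be assembled with a small helper. This is a sketch of ours, not code from the repository; the function name `build_prompt` is our own:

```python
def build_prompt(story_idea: str) -> str:
    """Assemble the instruction-style prompt shown above."""
    return (
        "### Instruction: Below is a story idea. "
        "Write a short story based on this context.\n"
        f"### Input: {story_idea}\n"
        "### Response:\n"
    )


# Example usage:
print(build_prompt("A lighthouse keeper finds a message in a bottle."))
```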
- **Developed by:** Abdelrahman ’Boda’ Sadallah, Anastasiia Demidova, Daria Kotova
- **Model type:** Causal LM
- **Language(s) (NLP):** English
- **Finetuned from model [optional]:** mistralai/Mistral-7B-v0.1
### Model Sources
- **Repository:** https://github.com/BodaSadalla98/llm-optimized-fintuning
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
This model is the result of our AI project. If you intend to use it, please refer to the repository above.
### Recommendations
<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
To improve story generation, you can experiment with the generation parameters: `temperature`, `top_p`/`top_k`, `repetition_penalty`, etc.
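For illustration, a sampling configuration might look like the following. These values are placeholders to tune for your own use case, not the settings used in the project:

```python
# Hypothetical sampling settings -- illustrative only, not the project's values.
generation_kwargs = {
    "do_sample": True,           # enable sampling instead of greedy decoding
    "temperature": 0.8,          # lower -> more deterministic text
    "top_p": 0.95,               # nucleus sampling cutoff
    "top_k": 50,                 # consider only the 50 most likely tokens
    "repetition_penalty": 1.15,  # discourage repeated phrases
    "max_new_tokens": 512,       # length budget for the generated story
}

# With a transformers model, these would be passed as:
#   model.generate(**inputs, **generation_kwargs)
```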
## Training Details
### Training Data
<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
Github for the dataset: https://github.com/kevalnagda/StoryGeneration
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
### Testing Data, Factors & Metrics
We evaluate on the test split of the same dataset.
#### Metrics
<!-- These are the evaluation metrics being used, ideally with a description of why. -->
We use perplexity and BERTScore.
### Results
- Perplexity: 8.8647
- BERTScore: 80.76
## Training procedure
The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
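The list above corresponds to a `BitsAndBytesConfig` roughly like the following (a sketch, assuming `transformers` with bitsandbytes support is installed):

```python
import torch
from transformers import BitsAndBytesConfig

# Mirrors the quantization settings listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
    llm_int8_threshold=6.0,
)
```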
### Framework versions
- PEFT 0.6.0.dev0