Abe13/juni-Mistral-7B-OpenOrca

PEFT

Abe13 committed on Nov 1, 2023

Commit c53020b

1 Parent(s): c50889b

Upload model

Files changed (1)

README.md +200 -129

@@ -1,148 +1,219 @@
 ---
-license: apache-2.0
 base_model: Open-Orca/Mistral-7B-OpenOrca
-tags:
-- generated_from_trainer
-model-index:
-- name: juni-Mistral-7B-OpenOrca
-  results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# juni-Mistral-7B-OpenOrca
-This model is a fine-tuned version of [Open-Orca/Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 3.0758
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
 ## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 0.0001
-- train_batch_size: 2
-- eval_batch_size: 2
-- seed: 42
-- gradient_accumulation_steps: 8
-- total_train_batch_size: 16
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- num_epochs: 10
-### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 1.7594        | 0.11  | 1    | 3.4155          |
-| 1.7761        | 0.22  | 2    | 3.3643          |
-| 1.6344        | 0.32  | 3    | 3.3129          |
-| 1.8145        | 0.43  | 4    | 3.2624          |
-| 1.7308        | 0.54  | 5    | 3.2462          |
-| 1.6688        | 0.65  | 6    | 3.2282          |
-| 1.8082        | 0.76  | 7    | 3.2052          |
-| 1.5884        | 0.86  | 8    | 3.1957          |
-| 1.6247        | 0.97  | 9    | 3.1926          |
-| 1.7539        | 1.08  | 10   | 3.1759          |
-| 1.6578        | 1.19  | 11   | 3.1674          |
-| 1.661         | 1.3   | 12   | 3.1829          |
-| 1.5935        | 1.41  | 13   | 3.1785          |
-| 1.5209        | 1.51  | 14   | 3.1687          |
-| 1.6052        | 1.62  | 15   | 3.1504          |
-| 1.495         | 1.73  | 16   | 3.1539          |
-| 1.5238        | 1.84  | 17   | 3.1357          |
-| 1.5698        | 1.95  | 18   | 3.1196          |
-| 1.3628        | 2.05  | 19   | 3.1099          |
-| 1.5966        | 2.16  | 20   | 3.1170          |
-| 1.5713        | 2.27  | 21   | 3.1327          |
-| 1.5321        | 2.38  | 22   | 3.1060          |
-| 1.5511        | 2.49  | 23   | 3.1153          |
-| 1.5605        | 2.59  | 24   | 3.0925          |
-| 1.515         | 2.7   | 25   | 3.1066          |
-| 1.4646        | 2.81  | 26   | 3.1005          |
-| 1.3957        | 2.92  | 27   | 3.1305          |
-| 1.4377        | 3.03  | 28   | 3.1143          |
-| 1.4452        | 3.14  | 29   | 3.1472          |
-| 1.4925        | 3.24  | 30   | 3.1050          |
-| 1.4749        | 3.35  | 31   | 3.1264          |
-| 1.5017        | 3.46  | 32   | 3.1107          |
-| 1.5082        | 3.57  | 33   | 3.1000          |
-| 1.4657        | 3.68  | 34   | 3.1220          |
-| 1.2359        | 3.78  | 35   | 3.1199          |
-| 1.4095        | 3.89  | 36   | 3.0966          |
-| 1.5437        | 4.0   | 37   | 3.0847          |
-| 1.339         | 4.11  | 38   | 3.1319          |
-| 1.3762        | 4.22  | 39   | 3.0917          |
-| 1.3964        | 4.32  | 40   | 3.0947          |
-| 1.4472        | 4.43  | 41   | 3.1034          |
-| 1.3863        | 4.54  | 42   | 3.1100          |
-| 1.434         | 4.65  | 43   | 3.1018          |
-| 1.5171        | 4.76  | 44   | 3.0831          |
-| 1.215         | 4.86  | 45   | 3.0755          |
-| 1.4791        | 4.97  | 46   | 3.0790          |
-| 1.3341        | 5.08  | 47   | 3.0816          |
-| 1.3899        | 5.19  | 48   | 3.0909          |
-| 1.3621        | 5.3   | 49   | 3.0668          |
-| 1.4034        | 5.41  | 50   | 3.0818          |
-| 1.3541        | 5.51  | 51   | 3.0512          |
-| 1.2916        | 5.62  | 52   | 3.0861          |
-| 1.3359        | 5.73  | 53   | 3.0695          |
-| 1.3962        | 5.84  | 54   | 3.0544          |
-| 1.3537        | 5.95  | 55   | 3.0808          |
-| 1.2551        | 6.05  | 56   | 3.0733          |
-| 1.4321        | 6.16  | 57   | 3.0481          |
-| 1.3511        | 6.27  | 58   | 3.0660          |
-| 1.4584        | 6.38  | 59   | 3.0385          |
-| 1.1897        | 6.49  | 60   | 3.0632          |
-| 1.3157        | 6.59  | 61   | 3.0724          |
-| 1.2269        | 6.7   | 62   | 3.0747          |
-| 1.4017        | 6.81  | 63   | 3.0593          |
-| 1.357         | 6.92  | 64   | 3.0655          |
-| 1.4048        | 7.03  | 65   | 3.0649          |
-| 1.308         | 7.14  | 66   | 3.0707          |
-| 1.2297        | 7.24  | 67   | 3.0561          |
-| 1.2186        | 7.35  | 68   | 3.0729          |
-| 1.2583        | 7.46  | 69   | 3.0800          |
-| 1.4283        | 7.57  | 70   | 3.0698          |
-| 1.224         | 7.68  | 71   | 3.0787          |
-| 1.2403        | 7.78  | 72   | 3.0669          |
-| 1.2677        | 7.89  | 73   | 3.0615          |
-| 1.3997        | 8.0   | 74   | 3.0658          |
-| 1.2593        | 8.11  | 75   | 3.0714          |
-| 1.1997        | 8.22  | 76   | 3.0752          |
-| 1.2961        | 8.32  | 77   | 3.0662          |
-| 1.3297        | 8.43  | 78   | 3.0637          |
-| 1.2994        | 8.54  | 79   | 3.0660          |
-| 1.3623        | 8.65  | 80   | 3.0626          |
-| 1.1564        | 8.76  | 81   | 3.0658          |
-| 1.3229        | 8.86  | 82   | 3.0674          |
-| 1.1027        | 8.97  | 83   | 3.0688          |
-| 1.3022        | 9.08  | 84   | 3.0699          |
-| 1.2523        | 9.19  | 85   | 3.0684          |
-| 1.198         | 9.3   | 86   | 3.0687          |
-| 0.9721        | 9.41  | 87   | 3.0730          |
-| 1.2124        | 9.51  | 88   | 3.0756          |
-| 1.3073        | 9.62  | 89   | 3.0761          |
-| 1.2945        | 9.73  | 90   | 3.0758          |
 ### Framework versions
-- Transformers 4.34.1
-- Pytorch 2.0.1+cu118
-- Datasets 2.14.6
-- Tokenizers 0.14.1

 ---
+library_name: peft
 base_model: Open-Orca/Mistral-7B-OpenOrca
 ---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Data Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
 ## Training procedure
+The following `bitsandbytes` quantization config was used during training:
+- quant_method: bitsandbytes
+- load_in_8bit: False
+- load_in_4bit: True
+- llm_int8_threshold: 6.0
+- llm_int8_skip_modules: None
+- llm_int8_enable_fp32_cpu_offload: False
+- llm_int8_has_fp16_weight: False
+- bnb_4bit_quant_type: nf4
+- bnb_4bit_use_double_quant: False
+- bnb_4bit_compute_dtype: float16
 ### Framework versions
+- PEFT 0.6.0.dev0
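
The new README leaves "How to Get Started with the Model" unfilled but does list the `bitsandbytes` settings used during training. Below is a minimal sketch of how those settings might be used to load the adapter for inference. It assumes the adapter is published under the repo id shown in the page header (`Abe13/juni-Mistral-7B-OpenOrca`) and that `torch`, `transformers`, `peft`, and `bitsandbytes` are installed. The helper names `bnb_kwargs` and `load_adapter` are hypothetical, and `quant_method` is omitted because it describes the backend rather than a tunable option. This is not the author's documented procedure.

```python
# Values copied from the quantization list in the README diff above.
QUANT_CONFIG = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": False,
    "bnb_4bit_compute_dtype": "float16",
}


def bnb_kwargs(torch_module):
    """Return a copy of QUANT_CONFIG with the dtype name resolved to a dtype object."""
    kwargs = dict(QUANT_CONFIG)
    kwargs["bnb_4bit_compute_dtype"] = getattr(
        torch_module, kwargs["bnb_4bit_compute_dtype"]
    )
    return kwargs


def load_adapter(
    adapter_id="Abe13/juni-Mistral-7B-OpenOrca",  # assumed repo id, taken from the page header
    base_id="Open-Orca/Mistral-7B-OpenOrca",      # base_model from the README front matter
):
    """Load the 4-bit quantized base model and attach the PEFT adapter (hypothetical helper)."""
    # Heavy imports are kept local so the config above can be inspected without them.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(**bnb_kwargs(torch))
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=quant_config,
        device_map="auto",
    )
    model = PeftModel.from_pretrained(base_model, adapter_id)
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return model, tokenizer
```

Calling `model, tokenizer = load_adapter()` would download both the base model and the adapter weights, so it needs a GPU with enough memory for a 4-bit 7B model. Keeping the settings in a plain dictionary makes it easy to compare them against the list in the model card.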