Model Card for Meta-Llama-3.1-8B-openhermes-2.5
This model is a fine-tuned version of Meta-Llama-3.1-8B on the OpenHermes-2.5 dataset.
Model Details
Model Description
This is a fine-tuned version of the Meta-Llama-3.1-8B model, trained on the OpenHermes-2.5 dataset. It is designed for instruction following and general language tasks.
- Developed by: artificialguybr
- Model type: Causal Language Model
- Language(s): English
- License: apache-2.0
- Finetuned from model: meta-llama/Meta-Llama-3.1-8B
Uses
This model can be used for various natural language processing tasks, particularly those involving instruction following and general language understanding.
Direct Use
The model can be used for tasks such as text generation, question answering, and other language-related applications.
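As a starting point for direct use, the sketch below wraps the Transformers `pipeline` API in a small helper. The repository id is an assumption pieced together from the developer name and model name on this card, and the prompt is passed as plain text because the card does not document a chat template. Verify both against the published checkpoint before relying on them.

```python
# Assumed Hub repository id (developer + model name from this card); verify before use.
MODEL_ID = "artificialguybr/Meta-Llama-3.1-8B-openhermes-2.5"


def generate_text(prompt: str, max_new_tokens: int = 128, model_id: str = MODEL_ID) -> str:
    """Generate a greedy completion and return the prompt plus its continuation."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model=model_id,
        torch_dtype=torch.bfloat16,  # BF16 weights; an 8B model needs roughly 16 GB
        device_map="auto",           # requires the `accelerate` package
    )
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=False)
    return outputs[0]["generated_text"]
```

Calling `generate_text("Explain gradient checkpointing in two sentences.")` downloads the weights on first use, so it needs network access and a GPU with enough memory. The helper name and its defaults are illustrative, not part of any published API.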
Out-of-Scope Use
The model should not be used for generating harmful or biased content. Users should be aware of potential biases in the training data.
Training Details
Training Data
The model was fine-tuned on the teknium/OpenHermes-2.5 dataset.
Training Procedure
Training Hyperparameters
- Training regime: BF16 mixed precision
- Optimizer: AdamW
- Learning rate: Started at approximately 2.49e-6 (0.00000249316296439037), decaying over training
- Batch size: Not specified (gradient accumulation steps: 8)
- Training steps: 13,368
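- Evaluation strategy: Steps (eval_steps = 0.16666666666666666, i.e. 1/6; the Transformers Trainer treats a value below 1 as a fraction of total training steps, so evaluation ran roughly every 2,228 steps)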
- Gradient checkpointing: Enabled
- Weight decay: 0
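Two of the settings above are easier to read once converted into concrete numbers. The sketch below turns the fractional evaluation setting into a step interval and shows how the effective batch size would follow from the gradient accumulation steps. The micro-batch size is not given on this card, so it stays a required argument, and rounding the interval up is an assumption about how the trainer resolves the ratio.

```python
import math

TOTAL_STEPS = 13_368                    # training steps, from this card
EVAL_STEPS_RATIO = 0.16666666666666666  # evaluation setting, from this card
GRAD_ACCUM_STEPS = 8                    # from this card
NUM_GPUS = 1                            # from the hardware section of this card


def eval_interval(total_steps: int, ratio: float) -> int:
    """Steps between evaluations when the setting is a fraction of total steps.

    Rounding up is an assumption about how the ratio is resolved.
    """
    return math.ceil(total_steps * ratio)


def effective_batch_size(micro_batch_size: int,
                         grad_accum_steps: int = GRAD_ACCUM_STEPS,
                         num_gpus: int = NUM_GPUS) -> int:
    """Sequences contributing to one optimizer update."""
    return micro_batch_size * grad_accum_steps * num_gpus


print(eval_interval(TOTAL_STEPS, EVAL_STEPS_RATIO))
```

With a hypothetical micro-batch size of 2, `effective_batch_size(2)` gives 16. The real value depends on the per-device batch size used in the training run, which this card does not report.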
Hardware and Software
- Hardware: NVIDIA A100-SXM4-80GB (1 GPU)
- Software Framework: 🤗 Transformers, Axolotl
Evaluation
Metrics
- Loss: 0.6727465987205505 (evaluation)
- Perplexity: Not provided
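Although perplexity is not reported, it can be estimated from the evaluation loss. The sketch below assumes the loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated on this card.

```python
import math

EVAL_LOSS = 0.6727465987205505  # evaluation loss, from this card


def perplexity(mean_cross_entropy: float) -> float:
    """Perplexity is the exponential of the mean per-token cross-entropy (in nats)."""
    return math.exp(mean_cross_entropy)


print(f"Estimated perplexity: {perplexity(EVAL_LOSS):.2f}")
```

Under that assumption the estimate comes out at about 1.96 on the evaluation split. It is not comparable to perplexities measured on other datasets or with other tokenizers.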
Results
- Evaluation runtime: 2,676.4173 seconds
- Samples per second: 18.711
- Steps per second: 18.711
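The runtime and throughput figures above also give a rough size for the evaluation split, which the card does not state. The sketch below multiplies the two. Because the reported throughput is rounded, treat the result as an estimate rather than an exact count.

```python
EVAL_RUNTIME_S = 2676.4173   # evaluation runtime in seconds, from this card
SAMPLES_PER_SECOND = 18.711  # evaluation throughput, from this card


def estimated_eval_samples(runtime_s: float, samples_per_second: float) -> int:
    """Approximate number of evaluated samples: runtime multiplied by throughput."""
    return round(runtime_s * samples_per_second)


print(estimated_eval_samples(EVAL_RUNTIME_S, SAMPLES_PER_SECOND))
print(f"{EVAL_RUNTIME_S / 60:.1f} minutes")
```

This works out to roughly 50,000 evaluation samples processed in about 45 minutes.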
Model Architecture
- Model Type: LlamaForCausalLM
- Hidden size: 4,096
- Intermediate size: 14,336
- Number of attention heads: Not specified
- Number of layers: Not specified
- Activation function: SiLU
- Vocabulary size: 128,256
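The dimensions listed above are enough to sanity-check the "8B" in the model name once the missing values are filled in. The sketch below counts parameters for a Llama-style decoder. The layer count (32), attention head count (32), and key/value head count (8) are assumptions taken from the base Meta-Llama-3.1-8B configuration, not from this card. The sketch also assumes untied input and output embeddings and no bias terms.

```python
HIDDEN_SIZE = 4096         # from this card
INTERMEDIATE_SIZE = 14336  # from this card
VOCAB_SIZE = 128256        # from this card

# Assumptions from the base Meta-Llama-3.1-8B config (not stated on this card).
NUM_LAYERS = 32
NUM_ATTENTION_HEADS = 32
NUM_KV_HEADS = 8


def llama_parameter_count(hidden: int, intermediate: int, vocab: int,
                          layers: int, heads: int, kv_heads: int) -> int:
    """Count weights in a Llama-style decoder (no biases, untied embeddings)."""
    head_dim = hidden // heads
    kv_dim = head_dim * kv_heads                         # grouped-query attention
    attention = 2 * hidden * hidden + 2 * hidden * kv_dim  # q, o and k, v projections
    mlp = 3 * hidden * intermediate                      # gate, up, and down projections
    norms = 2 * hidden                                   # two RMSNorm weights per layer
    per_layer = attention + mlp + norms
    embeddings = 2 * vocab * hidden                      # input embeddings + output head
    final_norm = hidden
    return layers * per_layer + embeddings + final_norm


total = llama_parameter_count(HIDDEN_SIZE, INTERMEDIATE_SIZE, VOCAB_SIZE,
                              NUM_LAYERS, NUM_ATTENTION_HEADS, NUM_KV_HEADS)
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the count lands at about 8.03 billion parameters, consistent with the model name. If the checkpoint's `config.json` lists different values, substitute them in the call.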
Limitations and Biases
More information is needed about specific limitations and biases of this model.