artificialguybr/QWEN-2-1.5B-Synthia-II-Redmond



Qwen2-1.5B Fine-tuned on Synthia v1.5-II

A special thanks to Redmond.ai for sponsoring the GPU resources for this fine-tuning process.

This model is a fine-tuned version of Qwen/Qwen2-1.5B, trained on the Synthia v1.5-II dataset, which contains roughly 20.7k instruction-following examples.

Model Description

Qwen2-1.5B is part of the Qwen2 series of large language models. The Qwen2 series reports improvements over its predecessors in:

Language understanding and generation
Structured data processing
Support for multiple languages
Long context handling

This fine-tuned version enhances the base model's instruction-following capabilities through training on the Synthia v1.5-II dataset.

Model Architecture

Type: Causal Language Model
Parameters: 1.5B
Training Framework: Transformers 4.45.0.dev0

Intended Uses & Limitations

This model is intended for:

Instruction following and task completion
Text generation and completion
Conversational AI applications

The model inherits the capabilities of the base Qwen2-1.5B model, while being specifically tuned for instruction following.
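The intended uses above can be tried with a short Transformers script. The following is a minimal sketch, not an official usage recipe: the repository name comes from this card, but `build_prompt` and its plain "Instruction/Response" layout are hypothetical, since the card does not document a prompt template. The default generation settings are also an assumption.

```python
MODEL_ID = "artificialguybr/QWEN-2-1.5B-Synthia-II-Redmond"  # repository name from this card


def build_prompt(instruction: str) -> str:
    """Hypothetical plain-text prompt; the card does not document a template."""
    return f"Instruction: {instruction.strip()}\nResponse:"


def generate(instruction: str, max_new_tokens: int = 128) -> str:
    """Load the model and return only the newly generated text."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so that only the completion is decoded.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Explain gradient accumulation in two sentences.")` downloads the weights on first use and runs on CPU unless the model is moved to another device.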

Training Procedure

Training Data

The model was fine-tuned on the Synthia v1.5-II dataset containing 20.7k instruction-following examples.

Training Hyperparameters

The following hyperparameters were used during training:

Learning rate: 1e-05
Train batch size: 5
Eval batch size: 5
Seed: 42
Gradient accumulation steps: 8
Total train batch size: 40
Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
LR scheduler type: cosine
LR scheduler warmup steps: 100
Number of epochs: 3
Sequence length: 4096
Sample packing: enabled
Pad to sequence length: enabled
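Two of the values above follow from simple arithmetic, sketched here. The total train batch size is the per-device batch size times the gradient accumulation steps times the number of devices; 5 × 8 = 40 is consistent with a single device, which is an inference and not something the card states. The learning-rate function mirrors the usual linear-warmup-then-cosine-decay shape; `total_steps` is a hypothetical argument, because the card does not report the number of optimizer steps.

```python
import math

LEARNING_RATE = 1e-5   # peak learning rate from the card
PER_DEVICE_BATCH = 5   # train batch size from the card
GRAD_ACCUM_STEPS = 8   # gradient accumulation steps from the card
WARMUP_STEPS = 100     # LR scheduler warmup steps from the card


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Examples contributing to one optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def lr_at(step: int, total_steps: int,
          peak_lr: float = LEARNING_RATE, warmup: int = WARMUP_STEPS) -> float:
    """Linear warmup to peak_lr, then half-cosine decay to zero at total_steps."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With these values, `effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS)` reproduces the reported total of 40, and `lr_at` rises linearly for the first 100 steps before decaying.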

Framework Versions

Transformers 4.45.0.dev0
PyTorch 2.3.1+cu121
Datasets 2.21.0
Tokenizers 0.19.1

axolotl version: 0.4.1
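Sample packing and padding to the sequence length, both enabled in the hyperparameters above, are axolotl options. The sketch below illustrates the general idea with a simple greedy strategy; it is not axolotl's actual packing implementation, and `pack_examples`, `pad_to_length`, and `PAD_ID` are hypothetical names chosen for illustration.

```python
from typing import List

SEQUENCE_LEN = 4096  # sequence length from the card
PAD_ID = 0           # hypothetical padding token id


def pack_examples(examples: List[List[int]], seq_len: int = SEQUENCE_LEN) -> List[List[int]]:
    """Greedily concatenate tokenized examples into bins of at most seq_len tokens."""
    bins: List[List[int]] = []
    current: List[int] = []
    for tokens in examples:
        tokens = tokens[:seq_len]  # truncate anything longer than one bin
        if current and len(current) + len(tokens) > seq_len:
            bins.append(current)   # the current bin is full, so start a new one
            current = []
        current = current + tokens
    if current:
        bins.append(current)
    return bins


def pad_to_length(tokens: List[int], seq_len: int = SEQUENCE_LEN,
                  pad_id: int = PAD_ID) -> List[int]:
    """Right-pad a packed bin so every training sequence has the same length."""
    return tokens + [pad_id] * (seq_len - len(tokens))
```

Packing several short examples into one 4096-token sequence reduces the share of padding in each batch. A real trainer also has to keep attention from crossing example boundaries, which this sketch leaves out.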
