Built with Axolotl

Qwen2-1.5B Fine-tuned on Synthia v1.5-II

A special thanks to Redmond.ai for sponsoring the GPU resources for this fine-tuning process.

This model is a fine-tuned version of Qwen/Qwen2-1.5B on the Synthia v1.5-II dataset, which contains over 20.7k instruction-following examples.

Model Description

Qwen2-1.5B is part of the Qwen2 series of large language models. Relative to earlier Qwen releases, the base model brings improvements in:

  • Language understanding and generation
  • Structured data processing
  • Support for multiple languages
  • Long context handling

This fine-tuned version enhances the base model's instruction-following capabilities through training on the Synthia v1.5-II dataset.

Model Architecture

  • Type: Causal Language Model
  • Parameters: 1.5B
  • Training Framework: Transformers 4.45.0.dev0

Intended Uses & Limitations

This model is intended for:

  • Instruction following and task completion
  • Text generation and completion
  • Conversational AI applications

The model inherits the capabilities and limitations of the base Qwen2-1.5B model while being specifically tuned for instruction following.
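
For the uses listed above, a minimal inference sketch with the Transformers library might look like the following. The repository ID is the one shown on this model's page; the helper name `generate`, the `max_new_tokens` default, and the use of a plain-text prompt are assumptions, since this card does not document a prompt or chat template.

```python
MODEL_ID = "artificialguybr/QWEN-2-1.5B-Synthia-II-Redmond"


def generate(prompt, max_new_tokens=128, model_id=MODEL_ID):
    """Load the model and return its continuation of `prompt`.

    Hypothetical helper: the prompt is passed as plain text because the
    card does not specify a prompt format.
    """
    # Imported lazily so the heavy dependencies load only when needed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Explain gradient accumulation in two sentences.")` downloads the weights on first use. Reloading the model on every call keeps the sketch short; a real application would load it once and reuse it.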

Training Procedure

Training Data

The model was fine-tuned on the Synthia v1.5-II dataset containing 20.7k instruction-following examples.

Training Hyperparameters

The following hyperparameters were used during training:

  • Learning rate: 1e-05
  • Train batch size: 5
  • Eval batch size: 5
  • Seed: 42
  • Gradient accumulation steps: 8
  • Total train batch size: 40
  • Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • LR scheduler type: cosine
  • LR scheduler warmup steps: 100
  • Number of epochs: 3
  • Sequence length: 4096
  • Sample packing: enabled
  • Pad to sequence length: enabled
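
The relationships between these values can be checked with a little arithmetic. The sketch below derives the implied device count and the tokens per optimizer step, and evaluates a linear-warmup cosine schedule. The example count of 20,700 is a rounding of the dataset size quoted in this card, the step totals are only an upper bound because sample packing merges short examples, and `cosine_lr` is a hypothetical helper showing the common schedule shape rather than the exact scheduler used in training.

```python
import math

# Values taken from the hyperparameter list above.
MICRO_BATCH_SIZE = 5
GRAD_ACCUM_STEPS = 8
TOTAL_BATCH_SIZE = 40
SEQUENCE_LEN = 4096
BASE_LR = 1e-5
WARMUP_STEPS = 100
NUM_EPOCHS = 3

# Assumption: roughly 20.7k examples, rounded from the dataset description.
APPROX_EXAMPLES = 20_700

# total = micro batch * accumulation steps * devices, so 40 / (5 * 8) = 1 device.
num_devices = TOTAL_BATCH_SIZE // (MICRO_BATCH_SIZE * GRAD_ACCUM_STEPS)

# With padding to sequence length, every row holds exactly SEQUENCE_LEN tokens.
tokens_per_step = TOTAL_BATCH_SIZE * SEQUENCE_LEN

# Upper bound on optimizer steps, assuming one example per row.
# Sample packing puts several short examples in a row, so the real count is lower.
max_steps_per_epoch = math.ceil(APPROX_EXAMPLES / TOTAL_BATCH_SIZE)
max_total_steps = max_steps_per_epoch * NUM_EPOCHS


def cosine_lr(step, total_steps, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS):
    """Linear warmup to base_lr, then half-cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print("devices:", num_devices)
print("tokens per optimizer step:", tokens_per_step)
print("upper bound on total steps:", max_total_steps)
print("lr at end of warmup:", cosine_lr(WARMUP_STEPS, max_total_steps))
```

Under these assumptions the learning rate peaks at 1e-05 once the 100 warmup steps finish and decays toward zero by the final step.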

Framework Versions

  • Transformers 4.45.0.dev0
  • PyTorch 2.3.1+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1

axolotl version: 0.4.1
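
To compare a local environment against the versions listed in this card, a small check along these lines may help. The helper names `installed_version` and `version_report` are hypothetical, and the package names are assumed to match the usual pip distribution names.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed pip distribution name.
EXPECTED = {
    "transformers": "4.45.0.dev0",
    "torch": "2.3.1+cu121",
    "datasets": "2.21.0",
    "tokenizers": "0.19.1",
    "axolotl": "0.4.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_report(expected=EXPECTED, lookup=installed_version):
    """Map each package to a tuple (expected, installed, matches)."""
    report = {}
    for package, wanted in expected.items():
        found = lookup(package)
        report[package] = (wanted, found, found == wanted)
    return report


for name, (wanted, found, ok) in version_report().items():
    status = "ok" if ok else "differs"
    print(f"{name}: expected {wanted}, found {found} ({status})")
```

A mismatch is not necessarily a problem for inference; the listed versions describe the training environment.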

Model tree for artificialguybr/QWEN-2-1.5B-Synthia-II-Redmond

  • Base model: Qwen/Qwen2-1.5B
  • Finetuned (23): this model
  • Quantizations: 3 models