# tinyllama_rm_sentiment_1b

This model is a fine-tuned version of [TinyLlama/TinyLlama_v1.1](https://huggingface.co/TinyLlama/TinyLlama_v1.1) on the [trl-internal-testing/sentiment-trl-style](https://huggingface.co/datasets/trl-internal-testing/sentiment-trl-style) dataset. It achieves the following results on the evaluation set:
- Loss: 0.6514
- Accuracy: 0.625
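
For context, the reported loss corresponds to the standard pairwise (Bradley-Terry) reward-modeling objective used by TRL's reward trainers, and accuracy is the fraction of preference pairs where the chosen completion receives the higher score. A minimal sketch with hypothetical tensor names:

```python
# Sketch of the pairwise reward-modeling loss and the preference accuracy
# metric. `rewards_chosen` / `rewards_rejected` are hypothetical names for
# the scalar scores the model assigns to each side of a preference pair.
import torch
import torch.nn.functional as F

def reward_loss_and_accuracy(rewards_chosen: torch.Tensor,
                             rewards_rejected: torch.Tensor):
    # loss = -log sigmoid(r_chosen - r_rejected), averaged over the batch
    loss = -F.logsigmoid(rewards_chosen - rewards_rejected).mean()
    # accuracy = fraction of pairs where the chosen completion scores higher
    accuracy = (rewards_chosen > rewards_rejected).float().mean()
    return loss, accuracy
```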
## Model description
Trained with the following command:

```bash
python trl/examples/scripts/rm/rm.py \
    --dataset_name trl-internal-testing/sentiment-trl-style \
    --dataset_train_split train \
    --dataset_eval_split test \
    --model_name_or_path TinyLlama/TinyLlama_v1.1 \
    --chat_template simple_concat \
    --learning_rate 3e-6 \
    --per_device_train_batch_size 32 \
    --per_device_eval_batch_size 32 \
    --gradient_accumulation_steps 1 \
    --logging_steps 1 \
    --eval_strategy steps \
    --max_token_length 1024 \
    --max_prompt_token_lenth 1024 \
    --remove_unused_columns False \
    --num_train_epochs 1 \
    --eval_steps 100 \
    --output_dir models/ppo_torchtune/tinyllama/tinyllama_rm_sentiment_1b \
    --push_to_hub
```
on the "dataset-processor" branch of trl:
git clone -b "dataset-processor" https://github.com/huggingface/trl
## Intended uses & limitations
More information needed
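
As a rough illustration of use: reward models produced by TRL's scripts are single-logit sequence-classification checkpoints, so scoring text should look roughly like the sketch below. Assumptions: the checkpoint loads via `AutoModelForSequenceClassification` with `num_labels=1`, and the `simple_concat` template simply concatenates prompt and completion into one string.

```python
# Sketch: score a prompt + completion with the reward model.
# Assumes a single-logit sequence-classification checkpoint, as produced
# by TRL reward-modeling scripts.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "smohammadi/tinyllama_rm_sentiment_1b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=1)

# simple_concat: prompt and completion passed as one plain string
# (an assumption based on the template name).
text = "The movie was a joy to watch. Great acting and a moving story."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    reward = model(**inputs).logits[0, 0].item()  # scalar score
print(f"reward: {reward:.4f}")
```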
## Training and evaluation data

The model was trained and evaluated on [trl-internal-testing/sentiment-trl-style](https://huggingface.co/datasets/trl-internal-testing/sentiment-trl-style).
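
A quick way to inspect the data (a sketch assuming the standard TRL preference layout with `chosen`/`rejected` columns and the `train`/`test` splits named in the training command):

```python
# Sketch: peek at the preference pairs used for training and evaluation.
from datasets import load_dataset

ds = load_dataset("trl-internal-testing/sentiment-trl-style")
print(ds)  # expected splits: train / test (per the training command)

example = ds["train"][0]
print(example["chosen"])    # preferred completion
print(example["rejected"])  # dispreferred completion
```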
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-06
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1.0
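
For readers reproducing the run in Python rather than via the CLI, these settings map roughly onto `transformers.TrainingArguments`. A sketch only; the actual script may wrap them in its own config class:

```python
# Sketch: the hyperparameters above expressed as TrainingArguments
# (transformers >= 4.41, where `eval_strategy` replaced `evaluation_strategy`).
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="models/ppo_torchtune/tinyllama/tinyllama_rm_sentiment_1b",
    learning_rate=3e-6,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=1,
    num_train_epochs=1,
    lr_scheduler_type="linear",
    seed=42,
    logging_steps=1,
    eval_strategy="steps",
    eval_steps=100,
    remove_unused_columns=False,
    push_to_hub=True,
)
```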
### Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy |
|:-------------:|:------:|:----:|:---------------:|:--------:|
| 0.6033        | 0.6410 | 100  | 0.6514          | 0.625    |
### Framework versions
- Transformers 4.42.2
- Pytorch 2.2.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1