rishavranaut/Mistral_7B_MT


	---
	license: apache-2.0
	library_name: peft
	tags:
	- generated_from_trainer
	base_model: mistralai/Mistral-7B-v0.1
	metrics:
	- accuracy
	- precision
	- recall
	model-index:
	- name: Mistral_7B_MT
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# Mistral_7B_MT

	This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on an unspecified dataset (the Trainer did not record a dataset name).
	It achieves the following results on the evaluation set:
	- Loss: 0.8388
	- Accuracy: 0.8167
	- Precision: 0.8519
	- Recall: 0.7667
	- F1 score: 0.8070
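
The reported F1 score can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The snippet below is a minimal sketch of that check; the helper name `f1_score` is illustrative and not part of the training code. Because the inputs are already rounded to four decimals, the recomputed value may differ from the card in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are 0
    return 2 * precision * recall / (precision + recall)


# Final evaluation values reported in this card.
precision, recall = 0.8519, 0.7667
reported_f1 = 0.8070

f1 = f1_score(precision, recall)
# Inputs are rounded, so compare with a small tolerance.
print(f"recomputed F1 = {f1:.4f}, reported F1 = {reported_f1:.4f}")
print("consistent:", abs(f1 - reported_f1) < 1e-3)
```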

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 5
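
To make the `linear` scheduler setting concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card does not list a warmup setting) and 4000 total optimizer steps (the final step in the results table). The function `linear_lr` is a hypothetical helper that mirrors the shape of a linear decay schedule; it is not the Trainer's own code.

```python
BASE_LR = 1e-4      # learning_rate from the card
TOTAL_STEPS = 4000  # final step in the results table
WARMUP_STEPS = 0    # assumption: no warmup is listed in the card


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 1000, 2000, 3000, 4000):
    print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate halves by step 2000 and reaches zero at the last step.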

	### Training results

	| Training Loss | Epoch | Step | Accuracy | F1 score | Precision | Recall | Validation Loss |
	|:-------------:|:-----:|:----:|:--------:|:--------:|:---------:|:------:|:---------------:|
	| 1.687 | 0.25 | 200 | 0.6233 | 0.4378 | 0.8627 | 0.2933 | 2.0030 |
	| 0.9482 | 0.5 | 400 | 0.68 | 0.5616 | 0.8913 | 0.41 | 1.4557 |
	| 0.9232 | 0.75 | 600 | 0.72 | 0.6471 | 0.875 | 0.5133 | 0.8805 |
	| 0.7781 | 1.0 | 800 | 0.57 | 0.3246 | 0.7561 | 0.2067 | 1.4515 |
	| 0.5468 | 1.25 | 1000 | 0.7233 | 0.6483 | 0.8895 | 0.51 | 0.8474 |
	| 0.5549 | 1.5 | 1200 | 0.7767 | 0.7403 | 0.8843 | 0.6367 | 0.7168 |
	| 0.4883 | 1.75 | 1400 | 0.8 | 0.7719 | 0.8982 | 0.6767 | 0.6943 |
	| 0.4639 | 2.0 | 1600 | 0.7767 | 0.7276 | 0.9323 | 0.5967 | 0.7637 |
	| 0.3804 | 2.25 | 1800 | 0.7617 | 0.7146 | 0.8905 | 0.5967 | 0.8467 |
	| 0.3847 | 2.5 | 2000 | 0.81 | 0.7942 | 0.8661 | 0.7333 | 0.6699 |
	| 0.346 | 2.75 | 2200 | 0.7833 | 0.7575 | 0.8602 | 0.6767 | 0.8569 |
	| 0.3488 | 3.0 | 2400 | 0.815 | 0.7878 | 0.9238 | 0.6867 | 0.7824 |
	| 0.2654 | 3.25 | 2600 | 0.7683 | 0.7157 | 0.9259 | 0.5833 | 1.0799 |
	| 0.2506 | 3.5 | 2800 | 0.8033 | 0.7748 | 0.9062 | 0.6767 | 0.8567 |
	| 0.2574 | 3.75 | 3000 | 0.8083 | 0.816 | 0.7846 | 0.85 | 0.7490 |
	| 0.2137 | 4.0 | 3200 | 0.8333 | 0.8282 | 0.8546 | 0.8033 | 0.7665 |
	| 0.1335 | 4.25 | 3400 | 0.8133 | 0.8170 | 0.8013 | 0.8333 | 0.8591 |
	| 0.1486 | 4.5 | 3600 | 0.83 | 0.8118 | 0.9091 | 0.7333 | 0.9781 |
	| 0.126 | 4.75 | 3800 | 0.8217 | 0.8106 | 0.8642 | 0.7633 | 0.8723 |
	| 0.1474 | 5.0 | 4000 | 0.8167 | 0.8070 | 0.8519 | 0.7667 | 0.8388 |
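
The Epoch and Step columns give a rough idea of the run's scale even though the dataset is not described. The sketch below derives steps per epoch and an approximate training set size. It assumes a single device and no gradient accumulation; neither setting is listed in this card, so `NUM_DEVICES` and `GRAD_ACCUM_STEPS` are labeled assumptions and the example count is only an estimate.

```python
# Values read from the results table and hyperparameter list.
FIRST_EVAL_STEP = 200    # step of the first evaluation row
FIRST_EVAL_EPOCH = 0.25  # epoch of the first evaluation row
FINAL_STEP = 4000        # step of the last evaluation row
TRAIN_BATCH_SIZE = 8     # train_batch_size

# Assumptions: not stated anywhere in the card.
NUM_DEVICES = 1
GRAD_ACCUM_STEPS = 1

steps_per_epoch = round(FIRST_EVAL_STEP / FIRST_EVAL_EPOCH)
effective_batch = TRAIN_BATCH_SIZE * NUM_DEVICES * GRAD_ACCUM_STEPS
# Estimate only: the last batch of an epoch may be partial.
approx_train_examples = steps_per_epoch * effective_batch
num_epochs = FINAL_STEP / steps_per_epoch
num_evals = FINAL_STEP // FIRST_EVAL_STEP

print("steps per epoch:", steps_per_epoch)
print("approx. training examples:", approx_train_examples)
print("epochs covered:", num_epochs)
print("evaluation rows expected:", num_evals)
```

If the run used several devices or gradient accumulation, the estimated example count scales up by the same factor.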


	### Framework versions

	- PEFT 0.11.1
	- Transformers 4.44.2
	- Pytorch 2.3.0+cu121
	- Datasets 2.19.1
	- Tokenizers 0.19.1
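
When reproducing results, it can help to compare the installed packages against the versions listed above. The following is a minimal sketch using only the standard library. The distribution names in `PINNED` are assumptions about how each framework is published on PyPI (for example, PyTorch is assumed to be installed as `torch`), and `check_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed PyPI distribution name.
PINNED = {
    "peft": "0.11.1",
    "transformers": "4.44.2",
    "torch": "2.3.0+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def check_versions(pinned: dict = PINNED) -> dict:
    """Map each package to (wanted, installed or None, exact match)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in check_versions().items():
    status = "OK" if ok else "MISMATCH"
    print(f"{name}: wanted {wanted}, installed {installed} [{status}]")
```

An exact string match is strict; a different CUDA build suffix on `torch` will be reported as a mismatch even if the base version agrees.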