FemkeBakker
/

AmsterdamDocClassificationMistral200T1Epochs

Text Generation

Generated from Trainer

text-generation-inference

Inference Endpoints

Model card Files Files and versions Metrics Training metrics Community

FemkeBakker commited on Jul 12

Commit

151bdf7

•

1 Parent(s): 6e7fc51

Update README.md

Files changed (1) hide show

README.md +18 -13

README.md CHANGED Viewed

@@ -7,32 +7,32 @@ tags:
 model-index:
 - name: AmsterdamDocClassificationMistral200T1Epochs
   results: []
 ---
 # AmsterdamDocClassificationMistral200T1Epochs
-This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on the [AmsterdamDocClassification](https://huggingface.co/datasets/FemkeBakker/AmsterdamBalancedFirst200Tokens) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.7673
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
@@ -57,6 +57,7 @@ The following hyperparameters were used during training:
 | 0.5285        | 0.7952 | 492  | 0.7687          |
 | 0.7677        | 0.9939 | 615  | 0.7673          |
 ### Framework versions
@@ -64,3 +65,7 @@ The following hyperparameters were used during training:
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1

 model-index:
 - name: AmsterdamDocClassificationMistral200T1Epochs
   results: []
+license: eupl-1.1
+datasets:
+- FemkeBakker/AmsterdamBalancedFirst200Tokens
+language:
+- nl
 ---
 # AmsterdamDocClassificationMistral200T1Epochs
+As part of the Assessing Large Language Models for Document Classification project by the Municipality of Amsterdam, we fine-tune Mistral, Llama, and GEITje for document classification.
+The fine-tuning is performed using the [AmsterdamBalancedFirst200Tokens](https://huggingface.co/datasets/FemkeBakker/AmsterdamBalancedFirst200Tokens) dataset, which consists of documents truncated to the first 200 tokens.
+In our research, we evaluate the fine-tuning of these LLMs across one, two, and three epochs.
+This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) and has been fine-tuned for one epoch.
 It achieves the following results on the evaluation set:
 - Loss: 0.7673
 ## Training and evaluation data
+- The training data consists of 9900 documents and their labels formatted into conversations.
+- The evaluation data consists of 1100 documents and their labels formatted into conversations.
 ## Training procedure
+See the [GitHub](https://github.com/Amsterdam-Internships/document-classification-using-large-language-models) for specifics about the training and the code.
 ### Training hyperparameters
 The following hyperparameters were used during training:
 | 0.5285        | 0.7952 | 492  | 0.7687          |
 | 0.7677        | 0.9939 | 615  | 0.7673          |
+Training time: it took in total 44 minutes to fine-tuned the model for one epoch.
 ### Framework versions
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1
+### Acknowledgements
+This model was trained as part of [insert thesis info] in collaboration with Amsterdam Intelligence for the City of Amsterdam.