kenhktsui/llm-data-textbook-quality-classifier-v1

kenhktsui committed on Jan 22

Commit e9f5cc1 • 1 Parent(s): a2af9e6

docs: update README.md

Files changed (1)

README.md +8 -3

@@ -9,14 +9,19 @@ metrics:
 - recall
 - f1
 model-index:
-- name: llm-data-quality-classifer-compare
+- name: llm-data-textbook-quality-classifer-v1
   results: []
+datasets:
+- kenhktsui/llm-data-quality-tokenized
+language:
+- en
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# llm-data-quality-classifer-compare
+# llm-data-textbook-quality-classifer-v1
+This model can classify if a text is of textbook quality data. It can be used as a filter for data curation when training a LLM.
 This model is a fine-tuned version of [FacebookAI/xlm-roberta-base](https://huggingface.co/FacebookAI/xlm-roberta-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
@@ -187,4 +192,4 @@ The following hyperparameters were used during training:
 - Transformers 4.35.2
 - Pytorch 2.1.0+cu121
 - Datasets 2.16.1
-- Tokenizers 0.15.0
+- Tokenizers 0.15.0
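
The updated README describes the model as a filter for data curation. The sketch below shows one way such a filter could be wired up; it is not taken from the model card. The function names `filter_by_quality` and `load_classifier`, the positive label string `LABEL_1`, and the 0.5 threshold are all assumptions for illustration, since this commit does not state the model's label names or a recommended cutoff.

```python
from typing import Callable, Iterable, List, Tuple

# ASSUMPTION: the commit does not state the model's label names.
# "LABEL_1" is a placeholder; check the model's config before relying on it.
POSITIVE_LABEL = "LABEL_1"


def filter_by_quality(
    texts: Iterable[str],
    classify: Callable[[str], Tuple[str, float]],
    positive_label: str = POSITIVE_LABEL,
    threshold: float = 0.5,  # ASSUMPTION: illustrative cutoff, not from the README
) -> List[str]:
    """Keep texts that `classify` marks as positive with score >= threshold."""
    kept = []
    for text in texts:
        label, score = classify(text)
        if label == positive_label and score >= threshold:
            kept.append(text)
    return kept


def load_classifier(
    model_id: str = "kenhktsui/llm-data-textbook-quality-classifier-v1",
) -> Callable[[str], Tuple[str, float]]:
    """Wrap a Transformers text-classification pipeline as a classify() callable."""
    # Imported lazily so filter_by_quality can be used without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-classification", model=model_id)

    def classify(text: str) -> Tuple[str, float]:
        # The pipeline returns a list with one {"label", "score"} dict per input.
        result = pipe(text, truncation=True)[0]
        return result["label"], result["score"]

    return classify
```

A call such as `filter_by_quality(corpus, load_classifier())` would download the model and keep only the texts it scores as positive. Passing the classifier in as a callable keeps the filtering logic testable without network access.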