spaly99/my-setfit-model-dataset-PG-OCR-3

@@ -11,15 +11,17 @@ metrics:
 - recall
 - f1
 widget:
-- text: 'Recognized as one of the Most Energy Efficient Dealerships in North America! '
-- text: 'Nespresso VertuoLine WS Keurig 2.0 '
-- text: Threat Intelligence & Brand Reputation
-- text: 'FpeANUTOUTTER- CUPS '
 pipeline_tag: text-classification
 inference: true
-base_model: sentence-transformers/paraphrase-mpnet-base-v2
 model-index:
-- name: SetFit with sentence-transformers/paraphrase-mpnet-base-v2
   results:
   - task:
       type: text-classification
@@ -30,22 +32,22 @@ model-index:
       split: test
     metrics:
     - type: accuracy
-      value: 1.0
       name: Accuracy
     - type: precision
-      value: 1.0
       name: Precision
     - type: recall
-      value: 1.0
       name: Recall
     - type: f1
-      value: 1.0
       name: F1
 ---
-# SetFit with sentence-transformers/paraphrase-mpnet-base-v2
-This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
 The model has been trained using an efficient few-shot learning technique that involves:
@@ -56,7 +58,7 @@ The model has been trained using an efficient few-shot learning technique that i
 ### Model Description
 - **Model Type:** SetFit
-- **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
 - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
 - **Maximum Sequence Length:** 512 tokens
 - **Number of Classes:** 2 classes
@@ -71,17 +73,17 @@ The model has been trained using an efficient few-shot learning technique that i
 - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
 ### Model Labels
-| Label | Examples                                                                                                                                                                                                                                                                                                                                                                                                          |
-|:------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| True  | <ul><li>'Solve lunch first. Introducing The 12™, made with seasoned Canadian chicken breast, fresh tomato and crisp lettuce. '</li><li>'MAKE THE MOST OF _ Notional Ube Chocoleté Diy '</li><li>'ee ee Ra bere car 100% nisared } Bon ard whist riekes is wo ire. Ta fag eesti eas nen Pa asered, hoathy and hing groen with every sip of  Potand Spring? Grand 100% Natural Spring Warler you eqioy. '</li></ul> |
-| False | <ul><li>'Wykorzystywanie ograniczonych danych do wyboru treści '</li><li>'GitHub'</li><li>'Draftsmen'</li></ul>                                                                                                                                                                                                                                                                                                   |
 ## Evaluation
 ### Metrics
-| Label   | Accuracy | Precision | Recall | F1  |
-|:--------|:---------|:----------|:-------|:----|
-| **all** | 1.0      | 1.0       | 1.0    | 1.0 |
 ## Uses
@@ -101,7 +103,8 @@ from setfit import SetFitModel
 # Download from the 🤗 Hub
 model = SetFitModel.from_pretrained("setfit_model_id")
 # Run inference
-preds = model("FpeANUTOUTTER- CUPS ")
 ```
 <!--
@@ -131,38 +134,14 @@ preds = model("FpeANUTOUTTER- CUPS ")
 ## Training Details
 ### Training Set Metrics
-| Training set | Min | Median | Max |
-|:-------------|:----|:-------|:----|
-| Word count   | 1   | 7.5625 | 41  |
 | Label | Training Sample Count |
 |:------|:----------------------|
-| False | 9                     |
-| True  | 7                     |
-### Training Hyperparameters
-- batch_size: (16, 2)
-- num_epochs: (1, 16)
-- max_steps: -1
-- sampling_strategy: oversampling
-- num_iterations: 20
-- body_learning_rate: (2e-05, 1e-05)
-- head_learning_rate: 0.01
-- loss: CosineSimilarityLoss
-- distance_metric: cosine_distance
-- margin: 0.25
-- end_to_end: False
-- use_amp: False
-- warmup_proportion: 0.1
-- seed: 42
-- run_name: PG-OCR-test-3
-- eval_max_steps: -1
-- load_best_model_at_end: False
-### Training Results
-| Epoch | Step | Training Loss | Validation Loss |
-|:-----:|:----:|:-------------:|:---------------:|
-| 0.025 | 1    | 0.027         | -               |
 ### Framework Versions
 - Python: 3.11.0

 - recall
 - f1
 widget:
+- text: GMB Gambia
+- text: ' end flyout 2 '
+- text: 'Books
+    '
+- text: Persistent
+- text: Session
 pipeline_tag: text-classification
 inference: true
 model-index:
+- name: SetFit
   results:
   - task:
       type: text-classification
       split: test
     metrics:
     - type: accuracy
+      value: 0.87325
       name: Accuracy
     - type: precision
+      value: 0.8566450970632156
       name: Precision
     - type: recall
+      value: 0.8871134020618556
       name: Recall
     - type: f1
+      value: 0.8716130665991391
       name: F1
 ---
+# SetFit
+This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
 The model has been trained using an efficient few-shot learning technique that involves:
 ### Model Description
 - **Model Type:** SetFit
+<!-- - **Sentence Transformer:** [Unknown](https://huggingface.co/unknown) -->
 - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
 - **Maximum Sequence Length:** 512 tokens
 - **Number of Classes:** 2 classes
 - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
 ### Model Labels
+| Label | Examples                                                                                                                                                                                                                     |
+|:------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| True  | <ul><li>'-. Pepsi-Colacold beats any cola cold! '</li><li>"Use “Jemes! et : L lemen peeple wen't Lemon. “i720 ait? "</li><li>'Ifit happens once, it could happen again. soptacaceee tates | WOE ¥ 1800 774 5025. '</li></ul> |
+| False | <ul><li>'ps-script'</li><li>'Make your bidder browser agnostic to access high-performing cookie alternative supply'</li><li>'International Students & Scholars'</li></ul>                                                    |
 ## Evaluation
 ### Metrics
+| Label   | Accuracy | Precision | Recall | F1     |
+|:--------|:---------|:----------|:-------|:-------|
+| **all** | 0.8732   | 0.8566    | 0.8871 | 0.8716 |
 ## Uses
 # Download from the 🤗 Hub
 model = SetFitModel.from_pretrained("setfit_model_id")
 # Run inference
+preds = model("Books\n")
 ```
 <!--
 ## Training Details
 ### Training Set Metrics
+| Training set | Min | Median | Max  |
+|:-------------|:----|:-------|:-----|
+| Word count   | 1   | 8.4845 | 1060 |
 | Label | Training Sample Count |
 |:------|:----------------------|
+| False | 7940                  |
+| True  | 8060                  |
 ### Framework Versions
 - Python: 3.11.0
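
The updated `model-index` block reports precision, recall, and F1 as separate values. As a quick sanity check on the new card, the sketch below recomputes F1 as the harmonic mean of the reported precision and recall and compares it with the reported F1. The helper name `f1_from_precision_recall` and the tolerance are illustrative choices, not part of the repository, and the check assumes the card's F1 is the standard binary F1 of those two numbers.

```python
import math


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the added lines of the model-index block in this diff.
reported = {
    "accuracy": 0.87325,
    "precision": 0.8566450970632156,
    "recall": 0.8871134020618556,
    "f1": 0.8716130665991391,
}

computed_f1 = f1_from_precision_recall(reported["precision"], reported["recall"])

# The tolerance is an assumption; it only needs to absorb floating-point noise.
is_consistent = math.isclose(computed_f1, reported["f1"], rel_tol=1e-9)

print(f"computed F1 = {computed_f1:.4f}, matches reported value: {is_consistent}")
```

If the two values disagreed, that would suggest the card's metrics were averaged differently (for example, macro-averaged over both labels) rather than computed for a single positive class.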

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "./checkpoints/step_40000",
   "architectures": [
     "MPNetModel"
   ],

 {
+  "_name_or_path": "spaly99/my-setfit-model-dataset-PG-OCR-3",
   "architectures": [
     "MPNetModel"
   ],

config_setfit.json CHANGED

@@ -1,4 +1,4 @@
 {
-  "labels": null,
-  "normalize_embeddings": false
 }

 {
+  "normalize_embeddings": false,
+  "labels": null
 }