---
base_model: BAAI/bge-base-en-v1.5
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: 'Reasoning why the answer may be good: 1. **Context Grounding:** The answer correctly interprets and references the specific part of the provided document that mentions the response status column. 2. **Relevance:** The answer directly addresses the question by explaining what the percentage in the response status column indicates. 3. **Conciseness:** The answer is concise and avoids unnecessary information, focusing solely on the meaning of the percentage in the response status column. 4. **Specificity:** The answer gives a detailed explanation about what the percentage represents—successful completion of response actions. 5. **Accuracy:** The key concept of the percentage indicating the total amount of successful completion of response actions is correctly conveyed as per the document. Reasoning why the answer may be bad: There is no apparent reason for the answer to be bad; it aligns well with the document and addresses the question directly and concisely. Final result:'
- text: 'Reasoning: **Why the answer may be good:** - It accurately states that the provided information does not address the specific question. - It directs the reader to seek additional information or context. **Why the answer may be bad:** - It does not attempt to relate or infer an answer based on the document provided. - The document does provide relevant details about endpoint controls, including that they involve Device Control, Personal Firewall Control, and Full Disk Encryption Visibility, which can imply their purpose. - The response is somewhat evasive and does not leverage any context offered by the document to give an informed answer. Final Result:'
- text: 'Reasoning why the answer may be good: - The provided answer is addressing the purpose of an agent (collecting and securely forwarding logs), aligning with the context of log collection and forwarding described in the document. Reasoning why the answer may be bad: - The answer is missing specificity. It does not mention on-site collection nor does it specify the direct forwarding feature mentioned in the document. - It lacks details specified in the document about the agent being used for integrations that do not use cloud feeds, e.g., firewalls. - The answer does not mention explicitly ties into the detection and correlation engine. Final result:'
- text: '### Reasoning **Good Aspects of the Answer:** 1. **Context Grounding:** The answer correctly pulls context from the document regarding email notifications. 2. **Relevance:** The answer attempts to address the purpose of the email notifications checkbox directly. **Bad Aspects of the Answer:** 1. **Key/Value/Event Name Accuracy:** The document outlines a checkbox related to enabling or disabling email notifications, but the answer uses placeholder text "ORGANIZATION_2," which needs replacement for accuracy. 2. **Conciseness:** The answer is somewhat repetitive with the phrasing "ORGANIZATION_2 or disable," which should clearly state "enable or disable." 3. **Specificity:** The answer lacks the specific detail from the document that this is specific to stale or archived sensors and involves System Admins. ### Final Evaluation **Final Result:** `Bad` The placeholder text makes the information ambiguous and not actionable. Additionally, there is a slight redundancy and missing specific roles (System Admin) and context(stale/archived sensors) necessary for precision.'
- text: 'Reasoning why the answer may be good: - The answer provides a specific URL, which is required by the question. - It appears to be in the format expected for image URLs as hinted at in the document. Reasoning why the answer may be bad: - The provided answer does not match the precise URL given in the document. - The correct URL for the second query should be `..\/..\/_images\/hunting_http://miller.co`, while the answer contains `hunting_http://www.flores.net/`, which is not mentioned in the document. - The answer does not reflect careful cross-referencing with the provided document. Final result:'
inference: true
model-index:
- name: SetFit with BAAI/bge-base-en-v1.5
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.704225352112676
      name: Accuracy
---

# SetFit with BAAI/bge-base-en-v1.5

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
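
The two steps above can be sketched in plain Python. This is a toy illustration under stated assumptions, not SetFit's implementation: `embed` is a hypothetical stand-in for the Sentence Transformer body (a keyword indicator vector), the data and vocabulary are made up, step 1 only shows how labelled examples can be turned into contrastive pairs (the fine-tuning itself is omitted), and step 2 fits a small logistic-regression head by gradient descent on the fixed embeddings.

```python
import math
from itertools import combinations

# Hypothetical toy data: (text, label) pairs, not taken from the training set.
DATA = [
    ("answer is grounded and relevant", 1),
    ("answer is concise and accurate", 1),
    ("answer is evasive and vague", 0),
    ("answer is wrong and vague", 0),
]
VOCAB = ["grounded", "relevant", "concise", "accurate", "evasive", "vague", "wrong"]


def embed(text):
    """Hypothetical stand-in for the embedding body: keyword indicator vector."""
    words = set(text.split())
    return [1.0 if w in words else 0.0 for w in VOCAB]


def make_pairs(data):
    """Step 1 input: same-label pairs get target 1.0, mixed-label pairs 0.0."""
    return [
        (a, b, 1.0 if la == lb else 0.0)
        for (a, la), (b, lb) in combinations(data, 2)
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_head(features, labels, lr=0.5, epochs=200):
    """Step 2: logistic-regression head trained by batch gradient descent."""
    w = [0.0] * len(features[0])
    b = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, text):
    """Return 1 if the head's probability is at least 0.5, else 0."""
    logit = sum(wi * xi for wi, xi in zip(w, embed(text))) + b
    return int(sigmoid(logit) >= 0.5)


pairs = make_pairs(DATA)
w, b = fit_head([embed(t) for t, _ in DATA], [y for _, y in DATA])
predictions = [predict(w, b, t) for t, _ in DATA]
print(len(pairs), predictions)
```

In the real pipeline the pairs would drive a contrastive loss that updates the embedding model before the head is fitted; here the embedding is fixed so the example stays self-contained.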
## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 2 classes

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label | Examples |
|:------|:---------|
| 0     |          |
| 1     |          |

## Evaluation

### Metrics
| Label   | Accuracy |
|:--------|:---------|
| **all** | 0.7042   |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_cybereason_gpt-4o_improved-cot-instructions_two_reasoning_remove_final_ev")

# Build the input from adjacent string literals; the raw literal keeps the
# backslashes in the URL path intact.
text = (
    "Reasoning why the answer may be good: "
    "- The answer provides a specific URL, which is required by the question. "
    "- It appears to be in the format expected for image URLs as hinted at in the document. "
    "Reasoning why the answer may be bad: "
    "- The provided answer does not match the precise URL given in the document. "
    r"- The correct URL for the second query should be `..\/..\/_images\/hunting_http://miller.co`, "
    "while the answer contains `hunting_http://www.flores.net/`, which is not mentioned in the document. "
    "- The answer does not reflect careful cross-referencing with the provided document. "
    "Final result:"
)

# Run inference
preds = model(text)
```

## Training Details

### Training Set Metrics
| Training set | Min | Median   | Max |
|:-------------|:----|:---------|:----|
| Word count   | 45  | 136.1487 | 302 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0     | 311                   |
| 1     | 321                   |

### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False

### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0006 | 1    | 0.1893        | -               |
| 0.0316 | 50   | 0.2645        | -               |
| 0.0633 | 100  | 0.2581        | -               |
| 0.0949 | 150  | 0.2491        | -               |
| 0.1266 | 200  | 0.2544        | -               |
| 0.1582 | 250  | 0.2538        | -               |
| 0.1899 | 300  | 0.2413        | -               |
| 0.2215 | 350  | 0.1942        | -               |
| 0.2532 | 400  | 0.1354        | -               |
| 0.2848 | 450  | 0.0857        | -               |
| 0.3165 | 500  | 0.0544        | -               |
| 0.3481 | 550  | 0.0412        | -               |
| 0.3797 | 600  | 0.0313        | -               |
| 0.4114 | 650  | 0.0239        | -               |
| 0.4430 | 700  | 0.018         | -               |
| 0.4747 | 750  | 0.0268        | -               |
| 0.5063 | 800  | 0.0185        | -               |
| 0.5380 | 850  | 0.0245        | -               |
| 0.5696 | 900  | 0.0255        | -               |
| 0.6013 | 950  | 0.0201        | -               |
| 0.6329 | 1000 | 0.0187        | -               |
| 0.6646 | 1050 | 0.0132        | -               |
| 0.6962 | 1100 | 0.0129        | -               |
| 0.7278 | 1150 | 0.0065        | -               |
| 0.7595 | 1200 | 0.004         | -               |
| 0.7911 | 1250 | 0.0029        | -               |
| 0.8228 | 1300 | 0.0028        | -               |
| 0.8544 | 1350 | 0.0026        | -               |
| 0.8861 | 1400 | 0.0022        | -               |
| 0.9177 | 1450 | 0.0021        | -               |
| 0.9494 | 1500 | 0.0021        | -               |
| 0.9810 | 1550 | 0.0019        | -               |

### Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.1.1
- Transformers: 4.44.0
- PyTorch: 2.4.0+cu121
- Datasets: 3.0.0
- Tokenizers: 0.19.1

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```
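
As a cross-check on the Training Results table, the sketch below estimates how many optimisation steps the single embedding epoch contains. It assumes that, with `num_iterations: 20`, the trainer builds 20 positive and 20 negative pairs per training example; that is an assumption about the pair-generation scheme rather than something stated in this card. The sample counts and batch size are taken from the Training Details section.

```python
import math

# Values taken from this card.
samples_per_label = {0: 311, 1: 321}
num_iterations = 20
batch_size = 16

# Assumption: each example yields num_iterations positive and
# num_iterations negative pairs.
num_samples = sum(samples_per_label.values())
num_pairs = 2 * num_iterations * num_samples
steps_per_epoch = math.ceil(num_pairs / batch_size)


def epoch_fraction(step):
    """Fraction of the epoch completed after `step` optimisation steps."""
    return step / steps_per_epoch


print(num_samples, num_pairs, steps_per_epoch)
print(round(epoch_fraction(50), 4), round(epoch_fraction(1550), 4))
```

Under this assumption the estimate is 1580 steps per epoch, which agrees with the table: step 50 falls at epoch 0.0316 and step 1550 at epoch 0.9810.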