metadata
base_model: BAAI/bge-small-en-v1.5
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: >-
w for students to learn and understand the concepts and techniques of
using ChatGPT for learning and development.
Week 1:
* Introduction to ChatGPT and its capabilities
* Setting up and using ChatGPT for language learning
* Practical session: Using ChatGPT for English language learning
* Practical session: Using ChatGPT for learning a new skill or subject
Week 2:
* Advanced language learning techniques with ChatGPT
* Using ChatGPT for language translation
* Practical session: Translating text using ChatGPT
* Practical session: Using ChatGPT to improve writing skills
Week 3:
* ChatGPT for research and information gathering
* Advanced research techniques with ChatGPT
* Practical session: Using ChatGPT for research and information gathering
* Practical session: Advanced research techniques with ChatGPT
Week 4:
* ChatGPT for project management and productivity
* Using ChatGPT for task management and organization
* Practical session: Using ChatGPT for project management and productivity
* Practical session: Advanced project management techniques with ChatGPT
Week 5:
* ChatGPT for creative writing and content creation
* Using ChatGPT for idea generation and storytelling
* Practical session: Using ChatGPT for creative writing and content
creation
* Practical session: Advanced content creation techniques with ChatGPT
Week 6:
* ChatGPT for computer programming and coding
* Using ChatGPT for coding exercises and practice
* Practical session: Using ChatGPT for coding exercises and practice
* Practical session: Advanced programming techniques with ChatGPT
Week 7:
* ChatGPT for digital marketing and social media management
* Using ChatGPT for social media marketing and advertising
* Practical session: Using ChatGPT for social media marketing and
advertising
* Practical session: Advanced digital marketing techniques with ChatGPT
Week 8:
* ChatGPT for entrepreneurship and small business management
* Using ChatGPT for business planning and strategy
* Practical session: Using Chat
- text: |-
Sure, here's a simple way to shuffle an array in JavaScript:
```
function shuffleArray(array) {
for (let i = array.length - 1; i > 0;
```
- text: >-
I'm sorry, I cannot comply with this request as it is inappropriate and
disrespectful. It is important to use our imagination to create positive
and uplifting stories that are appropriate for all ages and cultures.
- text: >-
DaVinci Resolve is a professional video editing software developed by
Blackmagic Design. It allows users to edit, color grade, and audio mix
their videos in one software application. DaVinci Resolve is used in the
film and television industry, as well as by independent filmmakers and
video editors. The software offers a wide range of features, including
advanced editing tools, a powerful color grading system, and audio mixing
and sweetening tools. It also supports a wide range of video formats,
including SD, HD, and 4K, and can be used for both Windows and Mac
operating systems.
- text: >-
I confirm that I understand the instructions. Please provide the character
description.
inference: true
model-index:
- name: SetFit with BAAI/bge-small-en-v1.5
results:
- task:
type: text-classification
name: Text Classification
dataset:
name: Unknown
type: unknown
split: test
metrics:
- type: accuracy
value: 0.6938815660043282
name: Accuracy
SetFit with BAAI/bge-small-en-v1.5
This is a SetFit model for text classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
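The first step above relies on sentence pairs built from a handful of labeled examples. The following is a minimal sketch of that idea, assuming pairs are formed from all unordered combinations of the training texts; `make_contrastive_pairs` and the toy data are hypothetical and are not SetFit's actual sampler or this model's training set.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    """
    examples = list(zip(texts, labels))
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Hypothetical toy data for illustration only.
texts = ["thanks for the help", "have a nice day", "you are useless", "nobody likes you"]
labels = ["non-toxic", "non-toxic", "toxic", "toxic"]
pairs = make_contrastive_pairs(texts, labels)
```

Four labeled texts already yield six training pairs, which is why this style of training can work with very few examples per class.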
Model Details
Model Description
- Model Type: SetFit
- Sentence Transformer body: BAAI/bge-small-en-v1.5
- Classification head: a LogisticRegression instance
- Maximum Sequence Length: 512 tokens
- Number of Classes: 2
Model Sources
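- Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
- Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)
- Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts (https://huggingface.co/blog/setfit)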
Model Labels
Label | Examples |
---|---|
non-toxic | |
toxic | |
Evaluation
Metrics
Label | Accuracy |
---|---|
all | 0.6939 |
Uses
Direct Use for Inference
First install the SetFit library:
pip install setfit
Then you can load this model and run inference.
from setfit import SetFitModel
# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference
preds = model("I confirm that I understand the instructions. Please provide the character description.")
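The call above classifies a single text. For several texts at once, a small helper along these lines may be convenient. This is a sketch that assumes the loaded model exposes `predict` on a list of strings and returns the label strings listed under Model Labels ("toxic" / "non-toxic"); `filter_non_toxic` is a hypothetical name, not part of SetFit.

```python
def filter_non_toxic(model, texts):
    """Keep only the texts that the model does not label as toxic.

    Assumes model.predict(texts) yields one label per input text and that
    labels are the strings "toxic" and "non-toxic". If your model returns
    integer class ids instead, adapt the comparison accordingly.
    """
    preds = model.predict(texts)
    return [text for text, pred in zip(texts, preds) if str(pred) != "toxic"]
```

With a model loaded as shown above, `filter_non_toxic(model, ["first text", "second text"])` would return the subset predicted as non-toxic.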
Training Details
Training Set Metrics
Training set | Min | Median | Max |
---|---|---|---|
Word count | 12 | 113.45 | 362 |
Label | Training Sample Count |
---|---|
toxic | 10 |
non-toxic | 10 |
Training Hyperparameters
- batch_size: (32, 32)
- num_epochs: (10, 10)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
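The `loss: CosineSimilarityLoss` setting above means the embedding body is trained so that the cosine similarity of each sentence pair approaches its target (1.0 for same-label pairs, 0.0 otherwise), with mean squared error as the penalty. The sketch below reproduces that formula in plain Python on hand-written vectors; the function names are illustrative, and the real training operates on batched tensors. As far as I understand the SetFit documentation, `distance_metric` and `margin` only affect triplet-style losses and have no effect with this loss.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and the pair target.

    pairs is an iterable of (embedding_a, embedding_b, target) triples.
    """
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)
```

A pair of identical embeddings with target 1.0 contributes zero loss, while a pair of orthogonal embeddings with target 1.0 contributes a loss of 1.0.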
Training Results
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.1429 | 1 | 0.208 | - |
7.1429 | 50 | 0.0183 | - |
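The fractional epoch values in the table can be reproduced from the sample counts and batch size reported above. The arithmetic below is a sketch under two assumptions that are mine rather than the card's: pairs are built from all unordered combinations of the 20 training texts, and oversampling repeats the smaller group of pairs until it matches the larger one.

```python
import math

samples_per_class = 10  # from the Training Set Metrics table
batch_size = 32         # embedding-phase batch size from the hyperparameters
num_epochs = 10         # embedding-phase epochs from the hyperparameters

# Same-label pairs: C(10, 2) = 45 per class, two classes.
positive_pairs = 2 * math.comb(samples_per_class, 2)
# Cross-label pairs: every toxic text paired with every non-toxic text.
negative_pairs = samples_per_class * samples_per_class

# Assumption: oversampling balances both groups at the size of the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)
total_steps = steps_per_epoch * num_epochs


def step_to_epoch(step):
    """Convert a global step number into a fractional epoch."""
    return round(step / steps_per_epoch, 4)
```

Under these assumptions there are 7 steps per epoch, so step 1 corresponds to epoch 0.1429 and step 50 to epoch 7.1429, matching the logged rows.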
Framework Versions
- Python: 3.10.0
- SetFit: 1.0.3
- Sentence Transformers: 3.0.1
- Transformers: 4.44.0
- PyTorch: 2.4.0
- Datasets: 2.20.0
- Tokenizers: 0.19.1
Citation
BibTeX
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}