---
library_name: setfit
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
metrics:
- accuracy
widget:
- text: Who was Cleopatra? She was a queen of ancient Egypt.
- text: Did you go anywhere interesting this weekend? Yes, I went to the zoo.
- text: Can robots think like humans? Not exactly, but AI can mimic some thinking processes.
- text: Can you name an adjective? 'Quick' is an adjective because it describes.
- text: How does the water cycle work? Water evaporates, condenses into clouds, and then precipitates back to the ground.
pipeline_tag: text-classification
inference: true
base_model: BAAI/bge-small-en-v1.5
---

# SetFit with BAAI/bge-small-en-v1.5

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
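
The contrastive step above works on pairs of training sentences rather than on single sentences. The following is a minimal sketch of how such pairs could be built, assuming the convention used with `CosineSimilarityLoss` (a target of 1.0 for two sentences with the same label, 0.0 otherwise). The helper `make_contrastive_pairs` is hypothetical and not part of the SetFit API. SetFit's own sampler also balances positive and negative pairs (the `oversampling` strategy), which this exhaustive enumeration does not do.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Hypothetical helper: build (text_a, text_b, target) triples.

    target is 1.0 when both sentences share a label and 0.0 otherwise,
    which is the kind of similarity target a cosine-similarity loss expects.
    """
    pairs = []
    # Enumerate every unordered pair of labelled sentences exactly once.
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Three training-style sentences (two "Math", one "Art").
texts = [
    "What is 8 times 9? It's 72.",
    "How do you find the area of a rectangle? Multiply the length by the width.",
    "Who painted the Mona Lisa? Leonardo da Vinci painted it.",
]
labels = ["Math", "Math", "Art"]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(target, "|", text_a[:30], "|", text_b[:30])
```

Fine-tuning the embedding model on such pairs pulls same-label sentences together and pushes different-label sentences apart, so that a simple classifier such as logistic regression can separate the resulting embeddings in the second step.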
## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 7 classes

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label      | Examples |
|:-----------|:---------|
| English    | <ul><li>"Can you tell me about your favorite book? I love 'Harry Potter' because it's full of magic and adventure."</li><li>'What did you learn about poems today? We learned about rhymes and how they create a rhythm in poems.'</li><li>"Can you make a sentence using the word 'enigmatic'? The old man's smile was enigmatic, making me wonder what secrets he hid."</li></ul> |
| Math       | <ul><li>"What is 8 times 9? It's 72."</li><li>'How do you find the area of a rectangle? Multiply the length by the width.'</li><li>"What's the difference between a prime number and a composite number? A prime number has only two factors, 1 and itself, while a composite number has more than two factors."</li></ul> |
| Art        | <ul><li>'What colors do you mix to make green? Yellow and blue make green.'</li><li>'Who painted the Mona Lisa? Leonardo da Vinci painted it.'</li><li>"What's the difference between sculpture and pottery? Sculpture is the art of making figures while pottery is specifically making vessels from clay."</li></ul> |
| Science    | <ul><li>"What is photosynthesis? It's the process by which plants make their food using sunlight."</li><li>'Can you name the planets in our solar system? Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.'</li><li>"What's the difference between a solid and a liquid? A solid has a fixed shape while a liquid takes the shape of its container."</li></ul> |
| History    | <ul><li>'Who was the first president of the United States? George Washington was the first president.'</li><li>'Can you tell me about the Egyptian pyramids? They were massive tombs built for pharaohs, the biggest is the Pyramid of Giza.'</li><li>'What was the Renaissance? It was a period of great cultural and scientific advancement in Europe.'</li></ul> |
| Technology | <ul><li>"What is the Internet? It's a global network of computers that can share information."</li><li>'Can you name a famous computer scientist? Alan Turing is known as one of the fathers of computer science.'</li><li>"What does 'AI' stand for? It stands for Artificial Intelligence."</li></ul> |
| NONE       | <ul><li>'What did you have for lunch today? I had a sandwich and some fruit.'</li><li>'Do you like playing outside? Yes, I love playing soccer with my friends.'</li><li>"What's your favorite TV show? I love watching 'SpongeBob SquarePants'."</li></ul> |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("bew/setfit-subject-model-basic")
# Run inference
preds = model("Who was Cleopatra? She was a queen of ancient Egypt.")
```

## Training Details

### Training Set Metrics
| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 6   | 14.1333 | 30  |

| Label      | Training Sample Count |
|:-----------|:----------------------|
| Art        | 10                    |
| English    | 10                    |
| History    | 10                    |
| Math       | 10                    |
| NONE       | 15                    |
| Science    | 10                    |
| Technology | 10                    |

### Training Hyperparameters
- batch_size: (32, 32)
- num_epochs: (10, 10)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False

### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0067 | 1    | 0.1987        | -               |
| 0.3333 | 50   | 0.1814        | -               |
| 0.6667 | 100  | 0.128         | -               |
| 1.0    | 150  | 0.0146        | -               |
| 1.3333 | 200  | 0.006         | -               |
| 1.6667 | 250  | 0.0037        | -               |
| 2.0    | 300  | 0.0031        | -               |
| 2.3333 | 350  | 0.0027        | -               |
| 2.6667 | 400  | 0.0024        | -               |
| 3.0    | 450  | 0.0024        | -               |
| 3.3333 | 500  | 0.002         | -               |
| 3.6667 | 550  | 0.002         | -               |
| 4.0    | 600  | 0.0017        | -               |
| 4.3333 | 650  | 0.0019        | -               |
| 4.6667 | 700  | 0.0018        | -               |
| 5.0    | 750  | 0.0014        | -               |
| 5.3333 | 800  | 0.0013        | -               |
| 5.6667 | 850  | 0.0014        | -               |
| 6.0    | 900  | 0.0014        | -               |
| 6.3333 | 950  | 0.0014        | -               |
| 6.6667 | 1000 | 0.0016        | -               |
| 7.0    | 1050 | 0.0013        | -               |
| 7.3333 | 1100 | 0.0013        | -               |
| 7.6667 | 1150 | 0.0012        | -               |
| 8.0    | 1200 | 0.0014        | -               |
| 8.3333 | 1250 | 0.001         | -               |
| 8.6667 | 1300 | 0.0012        | -               |
| 9.0    | 1350 | 0.0014        | -               |
| 9.3333 | 1400 | 0.0012        | -               |
| 9.6667 | 1450 | 0.0012        | -               |
| 10.0   | 1500 | 0.0011        | -               |

### Framework Versions
- Python: 3.10.12
- SetFit: 1.0.3
- Sentence Transformers: 2.3.1
- Transformers: 4.35.2
- PyTorch: 2.1.0+cu121
- Datasets: 2.17.0
- Tokenizers: 0.15.2

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```