SetFit with ppsingh/TAPP-multilabel-mpnet

This is a SetFit model for text classification. It uses ppsingh/TAPP-multilabel-mpnet as the Sentence Transformer embedding model and a SetFitHead instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
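The first step can be pictured as turning a handful of labelled passages into sentence pairs for contrastive learning. The sketch below is a simplified illustration, not SetFit's actual sampler: `make_contrastive_pairs` is a hypothetical helper and the toy passages are invented for the example. It assumes the usual setup for a cosine-similarity loss, where same-label pairs get a target of 1.0 and different-label pairs get 0.0.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Pair every two labelled texts: same label -> 1.0, different labels -> 0.0."""
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy passages (hypothetical, not taken from the training set).
texts = [
    "By 2030, reduce emissions by 30% compared to BAU.",
    "By 2025, restore 10,000 hectares of forest.",
    "Figure 2 shows emissions by sector.",
    "The report was submitted to the UNFCCC.",
]
labels = ["TARGET", "TARGET", "NEGATIVE", "NEGATIVE"]

pairs = make_contrastive_pairs(texts, labels)
positive_pairs = [p for p in pairs if p[2] == 1.0]
negative_pairs = [p for p in pairs if p[2] == 0.0]
```

Even four labelled passages yield six training pairs, which is why this approach suits few-shot settings: the number of pairs grows quadratically with the number of examples.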

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: ppsingh/TAPP-multilabel-mpnet
Classification head: a SetFitHead instance
Maximum Sequence Length: 512 tokens
Number of Classes: 2 classes

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
NEGATIVE	'(p 70-1).Antigua and Barbuda’s 2021 update to the first Nationally Determined Contribution the most vulnerable in society have been predominantly focused on adaptation measures like building resilience to flooding and hurricanes. The updated NDC ambition provides an opportunity to focus more intently on enabling access to energy efficiency and renewable energy for the most vulnerable, particularly women who are most affected when electricity is not available since the grid is down after an extreme weather event. Nationally, Antigua and Barbuda intends to utilize the SIRF Fund as a mechanism primarily to catalyse and leverage investment in the transition for NGOs, MSMEs and informal sectors that normally cannot access traditional local commercial financing due to perceived high risks.' 'The transport system cost will be increased by 16.2% compared to the BAU level. Electric trucks and electric pick-ups will account for the highest share of investment followed by electric buses and trucks. In the manufacturing industries, the energy efficiency improvement in the heating and the motor systems and the deployment of CCS require the highest investment in the non-metallic and the chemical industries in 2050. The manufacturing industries system cost will be increased by 15.3% compared to the BAU level.' 'Figure 1-9: Total GHG emissions by sector (excluding LULUCF) 2000 and 2016 1.2.2 Greenhouse Gas Emission by Sector • Energy Total direct GHG emissions from the Energy sector in 2016 were estimated to be 253,895.61 eq. The majority of GHG emissions in the Energy sector were generated by fuel combustion, consisting mostly of grid-connected electricity and heat production at around eq (42.84%). GHG emissions from Transport, Manufacturing Industries and Construction, and other sectors were 68,260.17 GgCO2 eq eq (6.10%), respectively. Fugitive Emissions from fuel eq or a little over 4.33% of total GHG emissions from the Energy sector. Details of GHG emissions in the Energy sector by gas type and source in 2016 are presented in Figure 1-10. Source: Thailand Third Biennial Update Report, UNFCCC 2020.'
TARGET	'DNPM, NFA,. Cocoa. Board,. Spice Board,. Provincial. gov-ernments. in the. Momase. region. Ongoing -. 2025. 340. European Union. Support committed. Priority Sector: Health. By 2030, 100% of the population benefit from introduced health measures to respond to malaria and other climate-sensitive diseases in PNG. Action or Activity. Indicator. Status. Lead. Implementing. Agencies. Supporting. Agencies. Time Frame. Budget (USD). Funding Source. (Existing/Potential). Other Support. Improve vector control. measures, with a priority. of all households having. access to a long-lasting. insecticidal net (LLIN).' 'Conditionality: With national effort it is intended to increase the attention to vulnerable groups in case of disasters and/or emergencies up to 50% of the target and 100% of the target with international cooperation. Description: In this goal, it is projected to increase coverage from 33% to 50% (211,000 families) of agricultural insurance in attention to the number of families, whose crops were affected by various adverse weather events (flood, drought, frost, hailstorm, among others), in addition to the implementation of comprehensive actions for risk management and adaptation to Climate Change.' 'By 2030, upgrade watershed health and vitality in at least 20 districts to a higher condition category. By 2030, create an inventory of wetlands in Nepal and sustainably manage vulnerable wetlands. By 2025, enhance the sink capacity of the landuse sector by instituting the Forest Development Fund (FDF) for compensation of plantations and forest restoration. Increase growing stock including Mean Annual Increment in Tarai, Hills and Mountains. Afforest/reforest viable public and private lands, including agroforestry.'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("ppsingh/iki_target_setfit")
# Run inference
preds = model("In the oil sector, the country has benefited from 372 million dollars for the reduction of gas flaring at the initiative (GGFR - \"Global Gas Flaring Reduction\") of the World Bank after having adopted in November 2015 a national reduction plan flaring and associated gas upgrading. In the electricity sector, the NDC highlights the development of hydroelectricity which should make it possible to cover 80% of production in 2025, the remaining 20% being covered by gas and other renewable energies.")
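For more than one passage, a small helper can keep inputs and predictions together. The sketch below assumes only that the loaded model exposes `predict()` on a list of strings, as `SetFitModel` does. `load_model` and `classify_passages` are hypothetical helper names, and whether the returned labels are strings or class indices depends on how the model was saved.

```python
def load_model(repo_id="ppsingh/iki_target_setfit"):
    """Download the model from the Hub (requires `pip install setfit`)."""
    # Imported lazily so classify_passages() has no hard dependency on setfit.
    from setfit import SetFitModel

    return SetFitModel.from_pretrained(repo_id)


def classify_passages(model, passages):
    """Run the model once over all passages and pair each passage with its prediction."""
    predictions = model.predict(passages)
    return list(zip(passages, predictions))


# Usage (downloads the model, so it is left as a comment here):
# model = load_model()
# for passage, label in classify_passages(model, ["By 2030, ...", "Figure 1-9: ..."]):
#     print(label, "|", passage[:60])
```

Passing the passages as one list lets the model batch them internally instead of encoding them one call at a time.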

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	58	116.6632	508

Label	Training Sample Count
NEGATIVE	51
TARGET	44

Training Hyperparameters

batch_size: (8, 2)
num_epochs: (1, 0)
max_steps: -1
sampling_strategy: undersampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
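Several of the values above are tuples. In SetFit's training arguments, a tuple holds one value for the embedding fine-tuning phase and one for the classifier training phase, in that order. The sketch below only unpacks the tuple-valued settings listed here into per-phase dictionaries to make that reading explicit; `split_by_phase` is a hypothetical helper, not part of SetFit.

```python
# Tuple-valued hyperparameters copied from the list above.
hyperparameters = {
    "batch_size": (8, 2),
    "num_epochs": (1, 0),
    "body_learning_rate": (2e-05, 1e-05),
}


def split_by_phase(params):
    """Split (embedding, classifier) tuples into one dict per training phase."""
    embedding_phase, classifier_phase = {}, {}
    for name, (embedding_value, classifier_value) in params.items():
        embedding_phase[name] = embedding_value
        classifier_phase[name] = classifier_value
    return embedding_phase, classifier_phase


embedding_phase, classifier_phase = split_by_phase(hyperparameters)
```

Read this way, the embedding phase ran for one epoch with a batch size of 8 and a body learning rate of 2e-05, while the classifier phase was configured for zero epochs.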

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0018	1	0.3343	-
0.1783	100	0.0026	0.1965
0.3565	200	0.0001	0.1995
0.5348	300	0.0001	0.2105
0.7130	400	0.0001	0.2153
0.8913	500	0.0	0.1927
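The Epoch and Step columns can be cross-checked against the training set size. The sketch below divides step by epoch for each logged row and multiplies by the embedding-phase batch size of 8. It assumes each step consumes one full batch of sentence pairs. Linking the result to the 51 NEGATIVE and 44 TARGET samples is one reading, not something the source confirms.

```python
# (epoch, step) rows copied from the table above; step 1 is omitted because
# its epoch value (0.0018) is rounded too coarsely for this estimate.
rows = [(0.1783, 100), (0.3565, 200), (0.5348, 300), (0.7130, 400), (0.8913, 500)]
batch_size = 8  # embedding-phase batch size from the hyperparameters

# Each row gives an independent estimate of the number of steps in one epoch.
estimates = [round(step / epoch) for epoch, step in rows]
steps_per_epoch = estimates[0]

# Assumption: every step processes one full batch of sentence pairs.
pairs_per_epoch = steps_per_epoch * batch_size

# Number of cross-label (NEGATIVE, TARGET) pairs available in the training set.
cross_label_pairs = 51 * 44

print(estimates, pairs_per_epoch, 2 * cross_label_pairs)
```

All five rows give the same estimate of steps per epoch, and the implied number of pairs equals twice the number of cross-label pairs. That is consistent with the undersampling strategy drawing equal numbers of same-label and different-label pairs, although the card itself does not state this.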

Training Results Classifier

Class representation in test data: TARGET: 9, NEGATIVE: 8
F1-score: 87.8%
Accuracy: 88.2%

Environmental Impact

Carbon emissions were measured using CodeCarbon.

Carbon Emitted: 0.006 kg of CO2
Hours Used: 0.185 hours

Training Hardware

On Cloud: No
GPU Model: 1 x Tesla T4
CPU Model: Intel(R) Xeon(R) CPU @ 2.00GHz
RAM Size: 12.67 GB

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 2.3.1
Transformers: 4.35.2
PyTorch: 2.1.0+cu121
Datasets: 2.3.0
Tokenizers: 0.15.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
