	---
	license: apache-2.0
	base_model: climatebert/distilroberta-base-climate-f
	tags:
	- generated_from_trainer
	model-index:
	- name: SECTOR-multilabel-climatebert
	  results: []
	datasets:
	- GIZ/policy_classification

	co2_eq_emissions:
	  emissions: 23.3572576873636
	  source: codecarbon
	  training_type: fine-tuning
	  on_cloud: true
	  cpu_model: Intel(R) Xeon(R) CPU @ 2.00GHz
	  ram_total_size: 12.6747894287109
	  hours_used: 0.529
	  hardware_used: 1 x Tesla T4
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# SECTOR-multilabel-climatebert

	This model is a fine-tuned version of [climatebert/distilroberta-base-climate-f](https://huggingface.co/climatebert/distilroberta-base-climate-f) on the [Policy-Classification](https://huggingface.co/datasets/GIZ/policy_classification) dataset.

	The loss function `BCEWithLogitsLoss` is configured with `pos_weight` to emphasize recall. Because this weighting skews the loss toward missed positives, the evaluation metrics, rather than the loss, are used to assess model performance during training.
	It achieves the following results on the evaluation set:
	- Loss: 0.6028
	- Precision-micro: 0.6395
	- Precision-samples: 0.7543
	- Precision-weighted: 0.6475
	- Recall-micro: 0.7762
	- Recall-samples: 0.8583
	- Recall-weighted: 0.7762
	- F1-micro: 0.7012
	- F1-samples: 0.7655
	- F1-weighted: 0.7041
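The effect of `pos_weight` mentioned above can be illustrated with a minimal pure-Python sketch of the per-element formula documented for `torch.nn.BCEWithLogitsLoss`. The function name and the example logits and weight below are hypothetical; the card does not state which `pos_weight` values were used in training.

```python
import math


def weighted_bce_with_logits(logit: float, target: float, pos_weight: float) -> float:
    """Per-element weighted BCE: -(pos_weight * y * log(s) + (1 - y) * log(1 - s)), s = sigmoid(logit)."""
    # Numerically stable log(sigmoid(logit)).
    if logit >= 0:
        log_sig = -math.log1p(math.exp(-logit))
    else:
        log_sig = logit - math.log1p(math.exp(logit))
    # log(1 - sigmoid(x)) = log(sigmoid(x)) - x
    log_one_minus_sig = log_sig - logit
    return -(pos_weight * target * log_sig + (1.0 - target) * log_one_minus_sig)


# Hypothetical values: a missed positive (negative logit, target 1)
# and a false positive (positive logit, target 0).
missed_positive_plain = weighted_bce_with_logits(-2.0, 1.0, pos_weight=1.0)
missed_positive_weighted = weighted_bce_with_logits(-2.0, 1.0, pos_weight=3.0)
false_positive_plain = weighted_bce_with_logits(2.0, 0.0, pos_weight=1.0)
false_positive_weighted = weighted_bce_with_logits(2.0, 0.0, pos_weight=3.0)

print(missed_positive_plain, missed_positive_weighted)
print(false_positive_plain, false_positive_weighted)
```

With `pos_weight` above 1, a missed positive costs proportionally more while a false positive costs the same, which pushes the model toward higher recall at some expense of precision.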

	## Model description

	This model performs multi-label classification: for a given input text, it predicts one or more Sector labels simultaneously. The Sector labels are: Agriculture, Buildings, Coastal Zone, Cross-Cutting Area, Disaster Risk Management (DRM), Economy-wide, Education, Energy, Environment, Health, Industries, LULUCF/Forestry, Social Development, Tourism, Transport, Urban, Waste, Water.
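As a rough illustration of how multi-label predictions are typically decoded, the sketch below applies a sigmoid to each logit independently and keeps every label whose probability clears a threshold. The label order, the 0.5 threshold, the helper name `decode_multilabel`, and the example logits are all assumptions for illustration; the actual index-to-label mapping should be read from the model's configuration.

```python
import math

# Label names taken from the model description; the ORDER is an assumption.
SECTOR_LABELS = [
    "Agriculture", "Buildings", "Coastal Zone", "Cross-Cutting Area",
    "Disaster Risk Management (DRM)", "Economy-wide", "Education", "Energy",
    "Environment", "Health", "Industries", "LULUCF/Forestry",
    "Social Development", "Tourism", "Transport", "Urban", "Waste", "Water",
]


def decode_multilabel(logits, labels=SECTOR_LABELS, threshold=0.5):
    """Return every label whose sigmoid probability is at or above the threshold."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probabilities = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [label for label, p in zip(labels, probabilities) if p >= threshold]


# Hypothetical logits: strongly negative everywhere except two positions.
example_logits = [-3.0] * len(SECTOR_LABELS)
example_logits[7] = 2.1    # position of "Energy" in the assumed order
example_logits[14] = 0.4   # position of "Transport" in the assumed order

predicted = decode_multilabel(example_logits)
print(predicted)
```

Unlike a softmax classifier, each label is scored independently, so a text can receive zero, one, or several Sector labels.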

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	- Training Dataset: 10031

	| Class | Positive Count of Class |
	|:-------------|:--------|
	| Action | 5416 |
	| Plans | 2140 |
	| Policy | 1396 |
	| Target | 2911 |
	- Validation Dataset: 932

	| Class | Positive Count of Class |
	|:-------------|:--------|
	| Action | 513 |
	| Plans | 198 |
	| Policy | 122 |
	| Target | 256 |

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 9.07e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_steps: 300
	- num_epochs: 7
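To make the interaction between `learning_rate`, `lr_scheduler_warmup_steps`, and the cosine scheduler concrete, here is a small sketch of a linear-warmup, half-cosine-decay schedule. It assumes the schedule spans the 4431 optimizer steps reported at the end of training and that the decay follows the usual half-cosine shape; the function name `cosine_lr` is hypothetical.

```python
import math


def cosine_lr(step, base_lr=9.07e-05, warmup_steps=300, total_steps=4431):
    """Learning rate at a given step: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / warmup_steps
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Print the rate at a few milestones, including the end of each epoch (633 steps).
for step in (0, 150, 300, 633, 2532, 4431):
    print(step, f"{cosine_lr(step):.3e}")
```

Under these assumptions the rate peaks at 9.07e-05 at step 300, which is still inside the first epoch, and decays smoothly to zero by the final step.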

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Precision-micro | Precision-samples | Precision-weighted | Recall-micro | Recall-samples | Recall-weighted | F1-micro | F1-samples | F1-weighted |
	|:-------------:|:-----:|:----:|:---------------:|:---------------:|:-----------------:|:------------------:|:------------:|:--------------:|:---------------:|:--------:|:----------:|:-----------:|
	| 0.6978 | 1.0 | 633 | 0.5968 | 0.3948 | 0.5274 | 0.4982 | 0.7873 | 0.8675 | 0.7873 | 0.5259 | 0.5996 | 0.5793 |
	| 0.485 | 2.0 | 1266 | 0.5255 | 0.5089 | 0.6365 | 0.5469 | 0.7984 | 0.8749 | 0.7984 | 0.6216 | 0.6907 | 0.6384 |
	| 0.3657 | 3.0 | 1899 | 0.5248 | 0.4984 | 0.6617 | 0.5397 | 0.8141 | 0.8769 | 0.8141 | 0.6183 | 0.7066 | 0.6393 |
	| 0.2585 | 4.0 | 2532 | 0.5457 | 0.5807 | 0.7148 | 0.5992 | 0.8007 | 0.8752 | 0.8007 | 0.6732 | 0.7449 | 0.6813 |
	| 0.1841 | 5.0 | 3165 | 0.5551 | 0.6016 | 0.7426 | 0.6192 | 0.7937 | 0.8677 | 0.7937 | 0.6844 | 0.7590 | 0.6917 |
	| 0.1359 | 6.0 | 3798 | 0.5913 | 0.6349 | 0.7506 | 0.6449 | 0.7844 | 0.8676 | 0.7844 | 0.7018 | 0.7667 | 0.7057 |
	| 0.1133 | 7.0 | 4431 | 0.6028 | 0.6395 | 0.7543 | 0.6475 | 0.7762 | 0.8583 | 0.7762 | 0.7012 | 0.7655 | 0.7041 |

	| Label | Precision | Recall | F1-score | Support |
	|:-------------:|:---------:|:-----:|:------:|:------:|
	| Action | 0.828 | 0.807 | 0.817 | 513 |
	| Plans | 0.560 | 0.707 | 0.625 | 198 |
	| Policy | 0.727 | 0.786 | 0.756 | 122 |
	| Target | 0.741 | 0.886 | 0.808 | 256 |
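The micro-averaged and per-label F1 scores in the tables above can be sanity-checked against their precision and recall, since in both cases F1 is the harmonic mean of the two. The sketch below recomputes them from the rounded values reported in this card, so small differences in the last digit are expected; the helper name `f1_score` is hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1), copied from the tables in this card.
reported = {
    "micro (epoch 7)": (0.6395, 0.7762, 0.7012),
    "Action": (0.828, 0.807, 0.817),
    "Plans": (0.560, 0.707, 0.625),
    "Policy": (0.727, 0.786, 0.756),
    "Target": (0.741, 0.886, 0.808),
}

recomputed = {name: f1_score(p, r) for name, (p, r, _) in reported.items()}

for name, (_, _, reported_f1) in reported.items():
    print(f"{name}: recomputed {recomputed[name]:.3f}, reported {reported_f1}")
```

This check does not apply to the samples-averaged and weighted columns: those F1 values are averages of per-sample or per-label F1 scores, not the harmonic mean of the averaged precision and recall.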

	### Environmental Impact
	Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
	- Carbon Emitted: 0.02336 kg of CO2
	- Hours Used: 0.529 hours

	### Training Hardware
	- On Cloud: yes
	- GPU Model: 1 x Tesla T4
	- CPU Model: Intel(R) Xeon(R) CPU @ 2.00GHz
	- RAM Size: 12.67 GB


	### Framework versions

	- Transformers 4.38.1
	- PyTorch 2.1.0+cu121
	- Datasets 2.18.0
	- Tokenizers 0.15.2