	---
	base_model: BAAI/bge-base-en-v1.5
	datasets: []
	language:
	- en
	library_name: sentence-transformers
	license: apache-2.0
	pipeline_tag: sentence-similarity
	tags:
	- sentence-transformers
	- sentence-similarity
	- feature-extraction
	- generated_from_trainer
	- dataset_size:6300
	- loss:MatryoshkaLoss
	- loss:MultipleNegativesRankingLoss
	widget:
	- source_sentence: Consumer Products segment decreased 10% to $3,572.5 million.
	  sentences:
	  - What was the impact of the Federal Reserve’s policy changes on Schwab money market
	    funds in 2022?
	  - What was the total revenue of Hasbro's Consumer Products segment in 2022?
	  - How much did the company's currently payable U.S. taxes amount to in 2023?
	- source_sentence: PricewaterhouseCoopers LLP is mentioned as the Firm’s independent
	    registered public accounting firm (PCAOB ID 238) in the audit of the Consolidated
	    Financial Statements.
	  sentences:
	  - Where in the document can the Consolidated Financial Statements be found as mentioned
	    in a 2024 report?
	  - What type of firm is PricewaterhouseCoopers LLP as mentioned in the context of
	    auditing?
	  - Which note in the report provides details about legal proceedings?
	- source_sentence: If, in the future, foreign exchange or capital control restrictions
	    were to be imposed and become applicable to us, such restrictions could potentially
	    reduce the amounts that we would be able to receive from our Macao, Hong Kong
	    and mainland China subsidiaries.
	  sentences:
	  - What are the potential consequences for the parent company if foreign exchange
	    or capital control restrictions were imposed in the future?
	  - What is described under Item 8 in the context of a financial document?
	  - What types of investments are primarily included in the Goldman Sachs' investments
	    in funds at NAV as of December 2023?
	- source_sentence: Determining income tax provisions involves forecasting future financial
	    results, planning potential tax strategies, and evaluating the probability of
	    sustaining tax positions against audits.
	  sentences:
	  - What type of company is Johnson & Johnson described as?
	  - What determines the fair value of available-for-sale short-term investments?
	  - What factors influence the determination of income tax provisions and related
	    tax balances?
	- source_sentence: During the fiscal year ended March 31, 2023, a $118 million tax
	    charge increased the valuation allowance on Swiss deferred tax assets, leading
	    to a higher effective tax rate.
	  sentences:
	  - What accounted for the significant tax rate increase in fiscal year 2023?
	  - What percentage of the box office revenue in the U.S./Canada was generated by
	    the three largest exhibitors in 2023?
	  - What percentage of eBay's 2023 net revenues were attributed to international markets?
	---

	# BGE base Financial Matryoshka

	This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

	## Model Details

	### Model Description
	- Model Type: Sentence Transformer
	- Base model: [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
	- Maximum Sequence Length: 512 tokens
	- Output Dimensionality: 768 dimensions
	- Similarity Function: Cosine Similarity
	<!-- - Training Dataset: Unknown -->
	- Language: en
	- License: apache-2.0

	### Model Sources

	- Documentation: [Sentence Transformers Documentation](https://sbert.net)
	- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
	- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

	### Full Model Architecture

	```
	SentenceTransformer(
	  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
	  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	  (2): Normalize()
	)
	```
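
The last two modules above can be read as "take the `[CLS]` token vector, then L2-normalize it". The following is a minimal NumPy sketch of that reading, not the library's implementation: the function name `cls_pool_and_normalize` is hypothetical, and random numbers stand in for the `BertModel` token outputs.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Pool each sequence to its first ([CLS]) token vector and L2-normalize it.

    token_embeddings: array of shape (batch, seq_len, hidden); here a random
    stand-in for the transformer output (assumption for illustration).
    """
    # Pooling with pooling_mode_cls_token=True: keep only position 0.
    cls_vectors = token_embeddings[:, 0, :]
    # Normalize(): scale every vector to unit L2 length.
    norms = np.linalg.norm(cls_vectors, axis=1, keepdims=True)
    return cls_vectors / np.clip(norms, 1e-12, None)

rng = np.random.default_rng(0)
fake_token_embeddings = rng.normal(size=(3, 10, 768))  # 3 sentences, 10 tokens each
sentence_embeddings = cls_pool_and_normalize(fake_token_embeddings)
print(sentence_embeddings.shape)
```

Because the output vectors have unit length, their dot product equals their cosine similarity.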

	## Usage

	### Direct Usage (Sentence Transformers)

	First install the Sentence Transformers library:

	```bash
	pip install -U sentence-transformers
	```

	Then you can load this model and run inference.
	```python
	from sentence_transformers import SentenceTransformer

	# Download from the 🤗 Hub
	model = SentenceTransformer("ValentinaKim/bge-base-financial-matryoshka4")
	# Run inference
	sentences = [
	    'During the fiscal year ended March 31, 2023, a $118 million tax charge increased the valuation allowance on Swiss deferred tax assets, leading to a higher effective tax rate.',
	    'What accounted for the significant tax rate increase in fiscal year 2023?',
	    'What percentage of the box office revenue in the U.S./Canada was generated by the three largest exhibitors in 2023?',
	]
	embeddings = model.encode(sentences)
	print(embeddings.shape)
	# (3, 768)

	# Get the similarity scores for the embeddings
	similarities = model.similarity(embeddings, embeddings)
	print(similarities.shape)
	# torch.Size([3, 3])
	```
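
Since the model is tagged with `loss:MatryoshkaLoss`, the leading dimensions of each embedding should remain usable on their own. The sketch below shows one way to truncate embeddings to 256 dimensions, re-normalize them, and compare them by cosine similarity. It is an illustration under assumptions: the helper names are hypothetical, and random vectors stand in for the output of `model.encode` so the snippet runs without downloading the model.

```python
import numpy as np

def truncate_and_normalize(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of each row, then rescale to unit length."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of `a` and the rows of `b`."""
    a_unit = a / np.clip(np.linalg.norm(a, axis=1, keepdims=True), 1e-12, None)
    b_unit = b / np.clip(np.linalg.norm(b, axis=1, keepdims=True), 1e-12, None)
    return a_unit @ b_unit.T

rng = np.random.default_rng(0)
full_embeddings = rng.normal(size=(3, 768))  # stand-in for model.encode(sentences)

small_embeddings = truncate_and_normalize(full_embeddings, dim=256)
small_similarities = cosine_similarity_matrix(small_embeddings, small_embeddings)
print(small_embeddings.shape, small_similarities.shape)
```

Recent Sentence Transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`, which may be more convenient than truncating by hand.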

	## Training Details

	### Training Dataset

	#### Unnamed Dataset


	* Size: 6,300 training samples
	* Columns: <code>positive</code> and <code>anchor</code>
	* Approximate statistics based on the first 1000 samples:
	  |         | positive | anchor |
	  |:--------|:---------|:-------|
	  | type    | string   | string |
	  | details | <ul><li>min: 2 tokens</li><li>mean: 46.25 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 2 tokens</li><li>mean: 20.35 tokens</li><li>max: 51 tokens</li></ul> |
	* Samples:
	  | positive | anchor |
	  |:---------|:-------|
	  | <code>For the year ended December 31, 2023, net cash used in financing activities included $1.8 billion for dividends to GM, which are eliminated within the consolidated statements of cash flows.</code> | <code>What amount of dividends to GM were included in the net cash used in financing activities for GM Financial for the year ended December 31, 2023?</code> |
	  | <code>Assets and liabilities of these foreign entities are translated at exchange rates in effect as of the balance sheet date.</code> | <code>At what values are assets and liabilities of foreign entities translated in financial statements?</code> |
	  | <code>The 21st Century Cures Act broadened patient access to certain enhanced benefits offered by Medicare Advantage plans, increasing the percentage of patients on these plans.</code> | <code>How did the 21st Century Cures Act affect patient access to Medicare Advantage plans?</code> |
	* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
	```json
	{
	    "loss": "MultipleNegativesRankingLoss",
	    "matryoshka_dims": [768, 512, 256, 128, 64],
	    "matryoshka_weights": [1, 1, 1, 1, 1],
	    "n_dims_per_step": -1
	}
	```
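
To make these parameters concrete, here is a rough NumPy sketch of how a Matryoshka-style objective can wrap an in-batch-negatives ranking loss: the inner loss is evaluated on each truncated prefix of the embeddings and the results are combined using the weights. This is a simplified illustration, not the library code. The function names are hypothetical, `scale=20.0` is an assumed value for the similarity scaling, random vectors stand in for anchor and positive embeddings, and `n_dims_per_step = -1` is read here as "use every dimension at every step".

```python
import numpy as np

def in_batch_ranking_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """Cross-entropy over scaled cosine similarities; row i's correct match is column i."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                           # (batch, batch) similarity logits
    scores = scores - scores.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))           # average negative log-likelihood of the true pairs

def matryoshka_style_loss(anchors, positives,
                          dims=(768, 512, 256, 128, 64),
                          weights=(1, 1, 1, 1, 1)) -> float:
    """Weighted sum of the ranking loss computed on each leading-prefix truncation."""
    return sum(w * in_batch_ranking_loss(anchors[:, :d], positives[:, :d])
               for d, w in zip(dims, weights))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 768))
matched_positives = anchors + 0.1 * rng.normal(size=(8, 768))  # close to their anchors
unrelated_positives = rng.normal(size=(8, 768))                # no relation to the anchors

loss_matched = matryoshka_style_loss(anchors, matched_positives)
loss_unrelated = matryoshka_style_loss(anchors, unrelated_positives)
print(loss_matched, loss_unrelated)
```

In this toy setup the matched pairs yield a much lower loss than the unrelated ones at every prefix length, which is the behavior the training objective rewards.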

	### Framework Versions
	- Python: 3.10.14
	- Sentence Transformers: 3.0.1
	- Transformers: 4.41.2
	- PyTorch: 2.1.2+cu121
	- Accelerate: 0.33.0
	- Datasets: 2.19.1
	- Tokenizers: 0.19.1

	## Citation

	### BibTeX

	#### Sentence Transformers
	```bibtex
	@inproceedings{reimers-2019-sentence-bert,
	title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
	author = "Reimers, Nils and Gurevych, Iryna",
	booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
	month = "11",
	year = "2019",
	publisher = "Association for Computational Linguistics",
	url = "https://arxiv.org/abs/1908.10084",
	}
	```

	#### MatryoshkaLoss
	```bibtex
	@misc{kusupati2024matryoshka,
	title={Matryoshka Representation Learning},
	author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
	year={2024},
	eprint={2205.13147},
	archivePrefix={arXiv},
	primaryClass={cs.LG}
	}
	```

	#### MultipleNegativesRankingLoss
	```bibtex
	@misc{henderson2017efficient,
	title={Efficient Natural Language Response Suggestion for Smart Reply},
	author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
	year={2017},
	eprint={1705.00652},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```
