Update README.md
README.md
CHANGED
@@ -26,7 +26,7 @@ FinanceConnect is a state-of-the-art, open-source chat model tailored for financ
|
|
26 |
|
27 |
Drawing strength from the FinTalk-19k and Alpaca datasets, a curated collection focused on financial knowledge, this model provides insights and information related to the finance industry. For a deeper dive into the datasets, visit: [FinTalk-19k](https://huggingface.co/datasets/ceadar-ie/FinTalk-19k), [Alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca)
|
28 |
|
29 |
-
|
30 |
|
31 |
- **Developed by:** CeADAR Connect Group
|
32 |
- **Model type:** Large Language Model
|
@@ -35,21 +35,16 @@ Drawing strength from the FinTalk-19k and Alpaca dataset, a curated collection f
|
|
35 |
|
36 |
## Key Features and Functionalities
|
37 |
|
38 |
-
|
39 |
-
|
40 |
-
|
41 |
-
|
42 |
-
### Performance Optimisation
|
43 |
-
Efficient performance across both CPU and GPU platforms.
|
44 |
-
### Data Representation
|
45 |
-
Utilises a combination of comprehensive Finance dataset, enabling content generation to professional standards.
|
46 |
|
47 |
## Model Usage
|
48 |
-
|
49 |
Experience the capabilities of the FinanceConnect model through a well-structured Python interface. To kick-start your exploration, follow the steps and snippets given below:
|
50 |
|
51 |
-
|
52 |
-
|
53 |
|
54 |
```python
|
55 |
import torch
|
@@ -67,7 +62,7 @@ from transformers import (
|
|
67 |
import time
|
68 |
```
|
69 |
|
70 |
-
|
71 |
|
72 |
```python
|
73 |
model_name = "ceadar-ie/FinanceConnect-13B"
|
@@ -75,7 +70,7 @@ tokenizer = AutoTokenizer.from_pretrained(model_name)
|
|
75 |
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, load_in_8bit = True, device_map = "auto", trust_remote_code=True)
|
76 |
```
|
77 |
|
78 |
-
|
79 |
|
80 |
```python
|
81 |
def generate_text(input_text):
|
@@ -104,11 +99,10 @@ def generate_text(input_text):
|
|
104 |
```
|
105 |
|
106 |
## Example Evaluation and Use
|
107 |
-
|
108 |
### Example Prompt 1:
|
109 |
List in detail ten key factors influencing the current state of the global economy.
|
110 |
|
111 |
-
|
112 |
The current state of the global economy is influenced by several factors, including:
|
113 |
|
114 |
1. Globalization: The increasing interconnectedness of economies through trade, investment, and the movement of people.
|
@@ -125,10 +119,10 @@ The current state of the global economy is influenced by several factors, includ
|
|
125 |
### Example Prompt 2:
|
126 |
Explain the concept of quantitative easing and its impact on financial markets.
|
127 |
|
128 |
-
|
129 |
Quantitative easing is a monetary policy tool used by central banks to stimulate economic growth and combat inflation. It involves the purchase of assets such as government bonds, mortgage-backed securities, and corporate bonds by a central bank. This increases the amount of money in circulation, lowers interest rates, and encourages banks to lend more, which can boost economic growth. However, quantitative easing can also lead to inflation if it is not implemented carefully, as it can cause prices to rise more than wages can keep up with.
|
130 |
-
## Training Details
|
131 |
|
|
|
132 |
### Training Hyperparameters
|
133 |
- per_device_train_batch_size = 10
|
134 |
- gradient_accumulation_steps = 4
|
@@ -139,20 +133,24 @@ Quantitative easing is a monetary policy tool used by central banks to stimulate
|
|
139 |
|
140 |
## Model Limitations
|
141 |
Potential Biases: Because fine-tuning centered on financial conversation sources, inherent biases from these sources may be reflected in the model's outputs.
|
|
|
142 |
## Licensing
|
143 |
The FinanceConnect model, developed by CeADAR Connect Group, combines the licensing frameworks of Llama2, FinTalk-19k and Alpaca. Under Meta's terms, users are granted a non-exclusive, worldwide, non-transferable, royalty-free limited license for the use and modification of Llama Materials, inclusive of the Llama2 model and its associated documentation. When redistributing, the provided Agreement and a specific attribution notice must be included. In alignment with the FinTalk dataset's licensing and Alpaca dataset's licensing, the model is also distributed under the "cc-by-nc-4.0" license.
|
|
|
144 |
## Out-of-Scope Use
|
145 |
FinanceConnect is specifically tailored for financial discussions and knowledge. It is not optimized for:
|
146 |
- General, non-AI-related conversations.
|
147 |
- Domain-specific tasks outside financial tasks.
|
148 |
- Direct interfacing with physical devices or applications.
|
|
|
149 |
## Bias, Risks, and Limitations
|
150 |
- Dataset Biases: The FinTalk-19k and Alpaca datasets may contain inherent biases that influence the model's outputs.
|
151 |
- Over-reliance: The model is an aid, not a replacement for human expertise. Decisions should be made with careful consideration.
|
152 |
- Content Understanding: The model lacks human-like understanding and cannot judge the veracity of knowledge.
|
153 |
- Language Limitations: The model's primary language is English. Performance may decrease with other languages.
|
154 |
- Knowledge Cut-off: The model may not be aware of events or trends after its last training update.
|
155 |
-
## Citation:
|
156 |
|
157 |
-
##
|
|
|
|
|
158 |
For any further inquiries or feedback concerning FinanceConnect, please forward your communications to [email protected]
|
|
|
26 |
|
27 |
Drawing strength from the FinTalk-19k and Alpaca datasets, a curated collection focused on financial knowledge, this model provides insights and information related to the finance industry. For a deeper dive into the datasets, visit: [FinTalk-19k](https://huggingface.co/datasets/ceadar-ie/FinTalk-19k), [Alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca)
|
28 |
|
29 |
+
## Model Specification
|
30 |
|
31 |
- **Developed by:** CeADAR Connect Group
|
32 |
- **Model type:** Large Language Model
|
|
|
35 |
|
36 |
## Key Features and Functionalities
|
37 |
|
38 |
+
**Domain Specialization:** The FinanceConnect model is specialized in finance conversations, serving as a resource for financial researchers and enthusiasts.
|
39 |
+
**Model API Accessibility:** Offers a straightforward Python integration for generating financial content insights.
|
40 |
+
**Performance Optimisation:** Efficient performance across both CPU and GPU platforms.
|
41 |
+
**Data Representation:** Utilises a combination of comprehensive finance datasets, enabling content generation to professional standards.
|
|
|
|
|
|
|
|
|
42 |
|
43 |
## Model Usage
|
|
|
44 |
Experience the capabilities of the FinanceConnect model through a well-structured Python interface. To kick-start your exploration, follow the steps and snippets given below:
|
45 |
|
46 |
+
### Prerequisites
|
47 |
+
#### 1. Ensure required packages are available
|
48 |
|
49 |
```python
|
50 |
import torch
|
|
|
62 |
import time
|
63 |
```
|
64 |
|
65 |
+
#### 2. Initiate the model and tokenizer
|
66 |
|
67 |
```python
|
68 |
model_name = "ceadar-ie/FinanceConnect-13B"
|
|
|
70 |
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, load_in_8bit = True, device_map = "auto", trust_remote_code=True)
|
71 |
```
|
72 |
|
73 |
+
#### 3. Create a function for generating text
|
74 |
|
75 |
```python
|
76 |
def generate_text(input_text):
|
|
|
99 |
```
|
100 |
|
101 |
## Example Evaluation and Use
|
|
|
102 |
### Example Prompt 1:
|
103 |
List in detail ten key factors influencing the current state of the global economy.
|
104 |
|
105 |
+
### Generated Output:
|
106 |
The current state of the global economy is influenced by several factors, including:
|
107 |
|
108 |
1. Globalization: The increasing interconnectedness of economies through trade, investment, and the movement of people.
|
|
|
119 |
### Example Prompt 2:
|
120 |
Explain the concept of quantitative easing and its impact on financial markets.
|
121 |
|
122 |
+
### Generated Output:
|
123 |
Quantitative easing is a monetary policy tool used by central banks to stimulate economic growth and combat inflation. It involves the purchase of assets such as government bonds, mortgage-backed securities, and corporate bonds by a central bank. This increases the amount of money in circulation, lowers interest rates, and encourages banks to lend more, which can boost economic growth. However, quantitative easing can also lead to inflation if it is not implemented carefully, as it can cause prices to rise more than wages can keep up with.
|
|
|
124 |
|
125 |
+
## Training Details
|
126 |
### Training Hyperparameters
|
127 |
- per_device_train_batch_size = 10
|
128 |
- gradient_accumulation_steps = 4
|
|
|
133 |
|
134 |
## Model Limitations
|
135 |
Potential Biases: Because fine-tuning centered on financial conversation sources, inherent biases from these sources may be reflected in the model's outputs.
|
136 |
+
|
137 |
## Licensing
|
138 |
The FinanceConnect model, developed by CeADAR Connect Group, combines the licensing frameworks of Llama2, FinTalk-19k and Alpaca. Under Meta's terms, users are granted a non-exclusive, worldwide, non-transferable, royalty-free limited license for the use and modification of Llama Materials, inclusive of the Llama2 model and its associated documentation. When redistributing, the provided Agreement and a specific attribution notice must be included. In alignment with the FinTalk dataset's licensing and Alpaca dataset's licensing, the model is also distributed under the "cc-by-nc-4.0" license.
|
139 |
+
|
140 |
## Out-of-Scope Use
|
141 |
FinanceConnect is specifically tailored for financial discussions and knowledge. It is not optimized for:
|
142 |
- General, non-AI-related conversations.
|
143 |
- Domain-specific tasks outside financial tasks.
|
144 |
- Direct interfacing with physical devices or applications.
|
145 |
+
|
146 |
## Bias, Risks, and Limitations
|
147 |
- Dataset Biases: The FinTalk-19k and Alpaca datasets may contain inherent biases that influence the model's outputs.
|
148 |
- Over-reliance: The model is an aid, not a replacement for human expertise. Decisions should be made with careful consideration.
|
149 |
- Content Understanding: The model lacks human-like understanding and cannot judge the veracity of knowledge.
|
150 |
- Language Limitations: The model's primary language is English. Performance may decrease with other languages.
|
151 |
- Knowledge Cut-off: The model may not be aware of events or trends after its last training update.
|
|
|
152 |
|
153 |
+
## Citation
|
154 |
+
|
155 |
+
## Contact
|
156 |
For any further inquiries or feedback concerning FinanceConnect, please forward your communications to [email protected]
|