|
|
|
<h1><center>Falcon-7B-Hawkish-Lora</center></h1> |
|
|
|
<font size="1">rev1-119</font> |
|
|
|
<br/> |
|
|
|
<center>A GPT-4-distilled 7B instruction model, trained on 9,000+ curated financial and economic instruction prompts. It shows improved reasoning and accuracy over ChatGPT on some financial questions and benchmarks, and scores comparable to BloombergGPT (https://arxiv.org/pdf/2303.17564.pdf) on public finance NLP benchmarks while having about 86% fewer parameters (7B vs 50B). The released adapter weights cover only 0.2 of an epoch, and further work remains to be done.</center>
|
|
|
<h2><center> Hawkish vs BloombergGPT </center></h2> |
|
<center>The financial sentiment analysis benchmarks were set up to match the BloombergGPT paper as closely as possible: 5-shot prompting, scored with weighted F1.</center>
|
|

| FinBenchmark | Hawkish-7B | BloombergGPT |
| :------------ | :------------- | :----------- |
| Headline (SA)* | 71.29 | **82.20** |
| FiQA SA | **77.01** | 75.07 |
| FPB | **73.41** | 51.07 |

|
|
|
|
<center><font size="1">*Sentiment analysis task only</font> </center> |
|
<br/> |
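
For reference, weighted F1 averages the per-class F1 scores, using each class's share of the true labels as its weight. The snippet below is a minimal pure-Python sketch of that metric, not the evaluation harness used for the table above; the function name `weighted_f1` and the toy labels are illustrative assumptions.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (hypothetical helper)."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n_true - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Weight each class by its share of the true labels
        score += (n_true / total) * f1
    return score


# Toy labels for illustration only (not benchmark data)
y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "negative", "positive", "positive"]
print(round(100 * weighted_f1(y_true, y_pred), 2))
```

Classes that appear only in the predictions have zero support and so contribute nothing to the weighted score, which is why the loop only walks the true labels.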
|
|
|
<h2><center> Hawkish vs CFA Level 1 Mock Exam</center></h2> |
|
|
|
<br/> |
|
<center>The new model surpasses Falcon-Instruct and ChatGPT on a publicly available CFA Level 1 mock exam, with one-shot prompting. Because both models showed discrepancies between samples, each was sampled twice and its best answers were kept. First-answer scores were 46.52 (ChatGPT) and 49.46 (Hawkish).</center>
|
<br/> |
|
|
|
|

| CFA Level 1 Past Paper | # Questions | ChatGPT (%) | Falcon-Instruct (%) | Hawkish-7B (%) | Exam Weighting |
| :---------------------------- | :--------: | :---------: | :----------------: | :------------: | :------------: |
| Ethical and Professional | 18 | **66.66** | 33.3 | 50.0 | 0.15 |
| Quantitative Methods | 14 | 57.15 | 50.0 | **78.57** | 0.1 |
| Economics | 12 | **58.33** | 25.0 | 50.0 | 0.1 |
| Financial Reporting | 24 | 37.5 | 25.0 | **58.3** | 0.15 |
| Corporate Finance | 10 | 40.0 | 20.0 | **60.0** | 0.1 |
| Equity Investments | 12 | 58.3 | 33.3 | **66.6** | 0.11 |
| Fixed Income | 14 | 50.0 | 35.7 | **57.15** | 0.11 |
| Derivatives | 6 | **66.6** | 16.7 | 50.0 | 0.06 |
| Alternative Investments | 4 | **50.0** | 0.0 | 25.0 | 0.06 |
| Portfolio Management | 6 | 0.0 | **50.0** | **50.0** | 0.06 |
| Weighted Average | - | 50.1 | 29.8 | **56.2** | |

<br/> |
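
The "Weighted Average" row can be reproduced from the per-section scores and the exam weightings in the table above. The sketch below assumes each average is simply the sum of section score times exam weighting; the names `ROWS` and `weighted_average` are illustrative, and the numbers are copied from the table.

```python
# (section, ChatGPT %, Falcon-Instruct %, Hawkish-7B %, exam weighting),
# copied from the CFA Level 1 table
ROWS = [
    ("Ethical and Professional", 66.66, 33.3, 50.0, 0.15),
    ("Quantitative Methods", 57.15, 50.0, 78.57, 0.10),
    ("Economics", 58.33, 25.0, 50.0, 0.10),
    ("Financial Reporting", 37.5, 25.0, 58.3, 0.15),
    ("Corporate Finance", 40.0, 20.0, 60.0, 0.10),
    ("Equity Investments", 58.3, 33.3, 66.6, 0.11),
    ("Fixed Income", 50.0, 35.7, 57.15, 0.11),
    ("Derivatives", 66.6, 16.7, 50.0, 0.06),
    ("Alternative Investments", 50.0, 0.0, 25.0, 0.06),
    ("Portfolio Management", 0.0, 50.0, 50.0, 0.06),
]


def weighted_average(rows, column):
    """Sum of score * weighting for one model column.

    column: 1 = ChatGPT, 2 = Falcon-Instruct, 3 = Hawkish-7B.
    """
    return sum(row[column] * row[4] for row in rows)


for name, column in [("ChatGPT", 1), ("Falcon-Instruct", 2), ("Hawkish-7B", 3)]:
    print(name, round(weighted_average(ROWS, column), 1))
```

The weightings sum to 1.0, so no further normalisation is needed.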
|
|
|
|
|
<h3><center> Disclaimer & Intended Uses </center></h3> |
|
|
|
<center>This model is intended as a research artifact, exclusively for research purposes, to study the influence of financial data in training. It has not been fully evaluated, so using it in high-risk applications (e.g. educational or vocational training, product safety components, or other uses that may impact the well-being of individuals) is not recommended. The opinions expressed here are solely those of the author and do not reflect the views of the evaluated exam boards or any affiliated entities, including the authors' employers. This research is independent and neither endorsed nor sponsored by these bodies. Any inaccuracies or omissions are entirely the authors' responsibility. Any findings are for academic discourse only and should be used at the reader's discretion. All trademarks and logos belong to their respective owners.</center>
|
|
|
|
|
|
|
## Inference Code |
|
|
|
Use the provided instruction template for prompting. For better performance on finance and economics instructions, change the system prompt to "You are an AI assistant. You are a financial expert." A minimum of 10GB VRAM and 15GB CPU RAM is required.
|
|
|
```
import torch
from transformers import BitsAndBytesConfig, AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# The original snippet targets a CUDA GPU
device = "cuda"

# 4-bit NF4 quantization so the 7B base model fits in limited VRAM
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b", quantization_config=bnb_config, device_map="auto", trust_remote_code=True
)
# Attach the LoRA adapter weights to the quantized base model
model = PeftModel.from_pretrained(model, "mukaj/falcon-7b-hawkish-lora")

def PROMPT_TEMPLATE(system, human):
    return f"### System: {system}\n### Human: {human}\n### Assistant:"

PROMPT = PROMPT_TEMPLATE("You are a helpful assistant.", "Which bird is on the Albanian flag?")

# Move both input_ids and attention_mask to the GPU
inputs = tokenizer(PROMPT, return_tensors="pt").to(device)

outputs = model.generate(input_ids=inputs["input_ids"],
                         attention_mask=inputs["attention_mask"],
                         max_new_tokens=10,
                         pad_token_id=tokenizer.eos_token_id,
                         do_sample=True, temperature=0.8, top_p=0.9)

# The decoded text contains the prompt followed by the completion
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

### The bird on the Albanian flag is the eagle.
```
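
Since `generate` returns the prompt tokens followed by the new tokens, the decoded string echoes the prompt before the answer. The sketch below shows one way to build the finance-specific prompt and keep only the assistant's reply. The helper names `build_prompt` and `extract_reply` are hypothetical, and the `decoded` string is a hand-written stand-in for real model output so the example runs without a GPU.

```python
FINANCE_SYSTEM_PROMPT = "You are an AI assistant. You are a financial expert."


def build_prompt(system, human):
    """Same template as PROMPT_TEMPLATE in the inference code."""
    return f"### System: {system}\n### Human: {human}\n### Assistant:"


def extract_reply(decoded, prompt):
    """Return only the assistant's text from a decoded generation."""
    # Drop the echoed prompt if the decoded text starts with it
    reply = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    # Cut off any follow-on turn the model may have started on its own
    for marker in ("### Human:", "### System:"):
        reply = reply.split(marker)[0]
    return reply.strip()


prompt = build_prompt(FINANCE_SYSTEM_PROMPT, "What does a liquidity ratio measure?")
# Stand-in for tokenizer.decode(outputs[0], skip_special_tokens=True)
decoded = prompt + " A company's ability to meet short-term obligations.\n### Human: Thanks"
print(extract_reply(decoded, prompt))
```

The `startswith` guard is there because decoding does not always round-trip the prompt character for character; in that case the helper falls back to trimming at the turn markers only.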
|
|
|
|
|
# Financial Domain Completion Examples |
|
|
|
Equities |
|
``` |
|
### System: You are an AI assistant. You are a financial expert and answer questions with respect to finance and economic domains. |
|
### Human: Explain the types of standard ratios used in company analysis. |
|
### Assistant: The types of standard ratios used in company analysis are: |
|
|
|
1. Liquidity ratios: These ratios measure a company's ability to meet its short-term obligations. |
|
2. Profitability ratios: These ratios measure a company's ability to generate profits. |
|
3. Efficiency ratios: These ratios measure a company's ability to use its assets and liabilities effectively. |
|
4. Market ratios: These ratios measure a company's ability to meet the expectations of its investors. |
|
5. Financial leverage ratios: These ratios measure a company's ability to use debt to finance its assets. |
|
6. Activity ratios: These ratios measure a company's ability to generate sales and profits. |
|
7. Asset turnover ratios: These ratios measure a company's ability to generate sales and profits from its assets. |
|
8. Investment ratios: These ratios measure a company's ability to generate profits from its investments. |
|
9. Coverage ratios: These ratios measure a company's ability to meet its short-term obligations. |
|
10. Du Pont ratios: These ratios measure a company's ability to generate profits from its assets and investments. |
|
``` |
|
|
|
<hr style="border:1px solid gray"> |
|
|
|
Fixed Income |
|
``` |
|
### System: You are an AI assistant. You are a financial expert and answer questions with respect to finance and economic domains. |
|
### Human: What is CMBS-level call protection, and is a Residual Tranche an example of one or is a Prepayment lockout an example? Or is that a individual loan level protection? |
|
### Assistant: CMBS-level call protection is a level of protection that covers all individual loans in a CMBS transaction, |
|
and ensures that any single prepayment on any class of assets covered by the loan level protection, will not result in a call of the remaining underlying assets. |
|
A Residual Tranche is an example of CMBS-level call protection, which is more commonly employed in public deals, as opposed to private transactions. |
|
``` |
|
<hr style="border:1px solid gray"> |
|
|
|
Economics |
|
``` |
|
### System: You are an AI assistant. You are a financial expert and answer questions with respect to finance and economic domains. |
|
### Human: A new country is formed in Wakanda. They introduce a new currency. How does the international market value what it is worth?. |
|
### Assistant: The value of such a novel economy could be difficult to determine. Assume a nominal return of 0.4%, constant to inflation based on the national nominal wage. |
|
Assume nominal wages of both countries have a 200% price level (real price would be less). GDP of the novel sector will be about 5% (20% of the total GDP), assuming labor productivity is unchanged. National product will decrease. |
|
With trade of a factor of the nominal price level, the novel GDP will be about 5%x20(50%). Nominal wage in the novel economy can be the same as for the main economy, resulting in the price level equal to (3/5)*(200/100) = 60/. |
|
One of the challenges in valuing this asset will be deciding if the novel sector should be valued more as the capital market or the labor market. Assume a 30-year horizon. |
|
Real wage of labor will be about 8% with national real interest rate at 5%. Assume that nominal returns and capital gains yield a 7/12 interest rate. |
|
The present value of the factor-adjusted wage-based return would be: $1.4(7/12)×15(5). |
|
It would be a risky asset yielding a return of about 4.6%. This is also equivalent to investing 0.4 + $1.4×(30) = $50 at risk in 30 years. |
|
Of course, if the rate is higher, the value of the real wage is correspondingly higher. The return to holding labor is the wage rate, plus a risk premium equal to 0.5/4.4/3*(30). |
|
The present value of expected cash flows and the intrinsic value of capital are assumed equal. In general, the expected discounted cash flows will be the market price, less expected inflation and real interest rates for those sectors. |
|
The present value is therefore determined by multiplying the cash flows by an appropriate discount rate which can be estimated based by using historical inflation-adjusted returns, |
|
and discounting cash flows at the domestic risk-adjusted risk premium and discounting future flows at the domestic real interest rate. |
|
``` |
|
<hr style="border:1px solid gray"> |
|
|
|
Real Estate |
|
``` |
|
### System: You are an AI assistant. You are a financial expert and answer questions with respect to finance and economic domains. |
|
### Human: Joshua Petersen is a real estate analyst that wants to make appropriate adjustments to a capitalization rate. He predicts that vacancy rates will increase and that the availability of credit will decrease. Based on these views, it is most likely that Petersen: |
|
|
|
1) should increase the capitalization rate. |
|
|
|
2) may need to either increase or decrease the capitalization rate because the two predictions have offsetting effects. |
|
|
|
3) should decrease the capitalization rate. |
|
### Assistant: B) To determine the impact of vacancy rate fluctuations and availability of credit the two views must be considered as opposed to simply adjusting due to increases in vacancies for vacant rates to go down |
|
With these two views are the vacancy increases and decreases likely to result in a decline in available capital with credit getting more selective? To make a more robust conclusion, we'll use some of this input. |
|
We will assume vacancy rates remain unchanged and assume available lines of credit are unchanged. In those two cases, vacancy increases can have a much greater effect leading to even higher and worse risks to asset value. |
|
We're also assuming here that there are no issues with interest in rates, so higher vacancy rate and limited credit availability are really enough to put real assets assets out of consideration until conditions look better. |
|
A decline in capital appreciation potential due to falling value of real estate, may require the investor to exit and switch to safer investments or wait out the conditions and have the potential of a huge capital gain when conditions improve. |
|
From looking to real time statistics the conditions mentioned probably wouldn't be significant in most real contexts of what to capitalize on at real rates. |
|
Real estate investment in certain large metros may see value drops, other investment categories will show some movement up while still some will look flat. |
|
|
|
Here's where your analysis is missing, an important point not to skip that impacts whether all investors leave because of real estate volatility being high. |
|
With vacancy rates and credit issues combined, it is true that assets would be viewed as poor assets to capitalize rates or real assets in general at that moment, |
|
because so much money leaves market causing lack of market liquidity which makes investing an almost irrational thing at points. |
|
You have a good understanding with this but need the market understanding and data side, then add it to a robust analysis of macro outlook. |
|
Your conclusion is based on real market conditions and future conditions is uncertain so no hard conclusions should be drawn, |
|
there can be multiple markets over same period, in your same industry, or across the whole industry that are going down for various reasons which you need to determine prior to assuming. |
|
It gets more granular and looks deeper than numbers on a piece of paper for individual assets but instead it relates it with a market condition. Hope it helps. |
|
``` |
|
## Limitations & Further Work |
|
|
|
The model has not been fully evaluated. It is still a work in progress and has only been trained for 0.2 epochs.
|
|
|
<ol> |
|
<li>Repeat the experiment with a larger model.</li>

<p>Larger models have been shown to be better all-around, so a larger base model should help considerably.</p>

<li>Filter the data further.</li>

<p>Less is more: filter down to the highest-quality samples that still maintain the highest diversity, then compare results.</p>

<li>Compare data mixes.</li>

<p>Different weightings of finance data in the training mix need to be tried to find the best performance.</p>
|
</ol> |
|
|
|
|
|
### Compute |
|
|
|
Hardware: |
|
|
|
1 x RTX 4090 |
|
|
|
Training Software: |
|
|
|
https://huggingface.co/docs/trl/main/en/sft_trainer |
|
|
|
Max Seq Length: 2048 |
|
|
|
Learning Rate: 2e-4 |
|
|
|
|
|
## Attributions & References |
|
|
|
Base Model: https://huggingface.co/tiiuae/falcon-7b |
|
|
|
Trained with QLoRA: https://arxiv.org/abs/2305.14314 |
|
|
|
OpenOrca Data: https://huggingface.co/datasets/Open-Orca/OpenOrca |
|
|
|
Guanaco Data: https://huggingface.co/datasets/timdettmers/openassistant-guanaco |
|
|
|
CodeParrot Self Instruct: https://huggingface.co/datasets/codeparrot/self-instruct-starcoder |
|
|
|
BloombergGPT Paper: https://arxiv.org/pdf/2303.17564.pdf |
|
|
|
|