metadata

license: apache-2.0
datasets:
  - gbharti/finance-alpaca
language:
  - en
library_name: transformers
tags:
  - finance
widget:
  - text: >-
      Is this headline positive or negative? Headline: Australian Tycoon Forrest
      Shuts Nickel Mines After Prices Crash.
    example_title: Sentiment analysis
  - text: >-
      Aluminum price per KG is 50$. Forecast max: +1$ min:+0.3$. What should be
      the current price of aluminum?
    example_title: Forecast

Fin-RWKV: Attention-Free Financial Expert (WIP)

Fin-RWKV is an attention-free model designed for financial analysis and prediction. Developed as part of a MindsDB Hackathon, it uses the simple and efficient RWKV architecture to process financial text and produce insights and forecasts. It is aimed at finance professionals and enthusiasts who want to bring deep learning techniques into their analyses.

Features

Attention-Free Architecture: Uses the RWKV (Receptance Weighted Key Value) architecture, which avoids the complexity of attention mechanisms while maintaining high performance.
Lower Costs: 10x to over 100x lower inference cost and 2x to 10x lower training cost.
Tinyyyy: Lightweight enough to run on a CPU in real time, with no GPU needed, so it can run on your laptop today.
Finance-Specific Training: Trained on the gbharti/finance-alpaca dataset, ensuring that the model is finely tuned for financial data analysis.
Transformers Library Integration: Built on the popular 'transformers' library, ensuring easy integration with existing ML pipelines and applications.
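
As a rough illustration of the 'transformers' integration, the sketch below builds a prompt and generates a reply. Several things here are assumptions rather than facts from this card: the Alpaca-style prompt layout (chosen because the training set is gbharti/finance-alpaca), the helper names `build_prompt` and `generate_answer`, and the idea that the checkpoint loads through `AutoModelForCausalLM`. The `model_id` argument is a placeholder that the caller must supply.

```python
def build_prompt(instruction: str) -> str:
    """Wrap a question in an Alpaca-style template (assumed, not confirmed by this card)."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


def generate_answer(model_id: str, instruction: str, max_new_tokens: int = 64) -> str:
    """Generate a reply from a causal LM checkpoint identified by model_id.

    model_id is a placeholder: pass the repository id or local path of a
    checkpoint stored in a format the transformers library can load.
    """
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    output_ids = model.generate(inputs["input_ids"], max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_answer` with a checkpoint id and one of the widget prompts from the metadata (for example the headline sentiment question) should return the prompt followed by the model's continuation, provided the checkpoint is compatible.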

Competing Against

Name	Param Count	Training Cost	Inference Cost
Fin-RWKV	169M	$1.45	Free on HuggingFace 🤗 & Low-End CPU
BloombergGPT	50 Billion	$1.3 million	Enterprise GPUs
FinGPT	7 Billion	$302.4	Consumer GPUs

Architecture	Status	Compute Efficiency	Largest Model	Trained Tokens	Link
(Fin)RWKV	In Production	O(N)	14B	500B++ (The Pile+)	Paper
RetNet (Microsoft)	Research	O(N)	6.7B	100B (mixed)	Paper
State Space (Stanford)	Prototype	O(log N)	355M	15B (The Pile, subset)	Paper
Liquid (MIT)	Research	-	<1M	-	Paper
Transformer Architecture (included for contrasting reference)	In Production	O(N^2)	800B (est)	13T++ (est)	-
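
The O(N) entry for RWKV comes from its recurrent formulation. The sketch below is a simplified, single-channel version of the weighted key-value (WKV) mixing used in RWKV-4-style models, written without the numerical-stability tricks a real implementation needs. The function names and the scalar `w` (decay) and `u` (current-token bonus) parameters are illustrative assumptions; this card does not state which RWKV version Fin-RWKV uses.

```python
import math
from typing import List


def wkv_direct(k: List[float], v: List[float], w: float, u: float) -> List[float]:
    """Reference form: every step re-sums over all earlier tokens (O(N^2) total)."""
    out = []
    for t in range(len(k)):
        # The current token gets a separate bonus weight exp(u + k_t).
        num = math.exp(u + k[t]) * v[t]
        den = math.exp(u + k[t])
        for i in range(t):
            # Earlier tokens decay exponentially with their distance from t.
            weight = math.exp(-(t - 1 - i) * w + k[i])
            num += weight * v[i]
            den += weight
        out.append(num / den)
    return out


def wkv_recurrent(k: List[float], v: List[float], w: float, u: float) -> List[float]:
    """Same outputs using a fixed-size running state (a, b), so O(N) total."""
    a, b = 0.0, 0.0  # running weighted sum of values, and sum of weights
    decay = math.exp(-w)
    out = []
    for k_t, v_t in zip(k, v):
        bonus = math.exp(u + k_t)
        out.append((a + bonus * v_t) / (b + bonus))
        # Fold the current token into the state for later steps.
        a = decay * a + math.exp(k_t) * v_t
        b = decay * b + math.exp(k_t)
    return out
```

Both functions return the same values; the recurrent one only carries two numbers between tokens, which is why per-token cost does not grow with context length.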

Inference computational cost vs. Number of tokens
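
To make the scaling concrete, here is a toy cost model based only on the O(N) and O(N^2) entries in the table above. It counts abstract per-token steps, not FLOPs or measured latency, and the function name and the two architecture labels are made up for this illustration.

```python
def cumulative_cost(num_tokens: int, architecture: str) -> int:
    """Count abstract steps needed to process num_tokens tokens one at a time."""
    if architecture == "recurrent":
        # One fixed-size state update per token.
        return num_tokens
    if architecture == "attention":
        # Token t looks at itself and the t - 1 tokens before it.
        return num_tokens * (num_tokens + 1) // 2
    raise ValueError(f"unknown architecture: {architecture}")


# Doubling the token count doubles the recurrent cost but roughly
# quadruples the attention cost.
for n in (1_000, 2_000, 4_000):
    print(n, cumulative_cost(n, "recurrent"), cumulative_cost(n, "attention"))
```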

Note: This model needs more data and training. It is intended for testing purposes only.