---
license: cdla-permissive-2.0
---

# Model Card for TTM

TTM is the initial open-source release of pre-trained TinyTimeMixer models from IBM Research. With less than 1 million parameters, TTM introduces the first-ever “tiny” pre-trained models for time-series forecasting. TTM outperforms several popular benchmarks that demand billions of parameters in zero-shot and few-shot forecasting. TTM is pre-trained on diverse public time-series datasets and can be easily fine-tuned on your target data. Refer to our [paper](https://arxiv.org/pdf/2401.03955.pdf) for more details.

**Note that zero-shot, fine-tuning, and inference tasks with TTM can easily be executed on a single-GPU machine or even on a laptop.**

## Model Description

TTM falls under the category of “focused pre-trained models”: each pre-trained TTM is tailored to a particular forecasting setting, governed by the context length and forecast length. Instead of building one massive model that supports every forecasting setting, we construct smaller pre-trained models, each focused on a specific setting, thereby yielding more accurate results. This approach also keeps our models extremely small and exceptionally fast, facilitating easy deployment without demanding substantial resources.

Hence, in this model card, we plan to release several pre-trained TTMs that cater to many common forecasting settings in practice. We have also released our source code along with our pretraining scripts, which users can utilize to pretrain models on their own. Pretraining a TTM is easy and fast, taking only 3-6 hours on 6 A100 GPUs, as opposed to several days or weeks with traditional approaches.

## Model Releases (along with the branch name where the models are stored)

- 512-96: Given the last 512 time points (i.e., the context length), this model can forecast the next 96 time points (i.e., the forecast length). Recommended for hourly and minutely forecasts (e.g., resolutions of 5 min, 10 min, 15 min, etc.). (branch name: main)

- 1024-96: Given the last 1024 time points (i.e., the context length), this model can forecast the next 96 time points (i.e., the forecast length). Recommended for hourly and minutely forecasts (e.g., resolutions of 5 min, 10 min, 15 min, etc.). (branch name: 1024-96-v1)

- Stay tuned for more models! (A loading sketch for these releases follows below.)

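To select a specific release by its branch name, you can pass the branch as the Hub `revision` when loading. The sketch below is illustrative only: it assumes this repository's Hub id is `ibm/TTM` and that the `TinyTimeMixerForPrediction` class is importable from IBM's tsfm library at the path shown; check the repository and notebooks for the authoritative usage.

```python
# Hedged sketch: load a specific TTM release by branch (Hub "revision").
# Assumptions (not confirmed by this card): the import path and class name below.
from tsfm_public.models.tinytimemixer import TinyTimeMixerForPrediction

# The 512-96 model lives on the default branch ("main").
model_512_96 = TinyTimeMixerForPrediction.from_pretrained("ibm/TTM", revision="main")

# The 1024-96 model lives on its own branch.
model_1024_96 = TinyTimeMixerForPrediction.from_pretrained("ibm/TTM", revision="1024-96-v1")
```
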
## Benchmark Highlights

- TTM outperforms pre-trained GPT4TS (NeurIPS 23) by …
- TTM outperforms pre-trained LLMTime (NeurIPS 23) by …
- TTM outperforms pre-trained Time-LLM (NeurIPS 23) by …
- TTM outperforms pre-trained MOIRAI by …
- TTM outperforms other popular benchmarks by …
- TTM also outperforms the hard statistical baselines (statistical ensemble and S-Naive) on the M4-hourly dataset, which pre-trained TS models have found hard to beat.

## Model Details

For more details on the TTM architecture and benchmarks, refer to our [paper](https://arxiv.org/pdf/2401.03955.pdf).

TTM-1 currently supports two modes:

- Zero-shot forecasting: Directly apply the pre-trained model to your target data to get an initial forecast, with no training (see the sketch below).

- Fine-tuned forecasting: Fine-tune the pre-trained model on your target data to further improve the forecast.

**Since TTM models are extremely small and fast, it is practically very easy to fine-tune the model on your available target data to get more accurate forecasts.**

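To make the zero-shot mode concrete, here is a minimal sketch of running the 512-96 model on a standard-scaled context window. The Hub id (`ibm/TTM`), the `TinyTimeMixerForPrediction` import, the `past_values` forward argument, and the `prediction_outputs` output field are assumptions based on similar Hugging Face time-series models and are not confirmed by this card; the released notebooks remain the authoritative reference.

```python
import torch
# Assumed import path; check IBM's tsfm repository for the exact location.
from tsfm_public.models.tinytimemixer import TinyTimeMixerForPrediction

model = TinyTimeMixerForPrediction.from_pretrained("ibm/TTM", revision="main")
model.eval()

# One batch, 512 past time points (the context length), 3 channels (variates),
# already standard-scaled per channel as recommended in "Recommended Use" below.
past_values = torch.randn(1, 512, 3)

with torch.no_grad():
    # Argument and output names are assumptions, not confirmed by this card.
    output = model(past_values=past_values)

forecast = output.prediction_outputs  # expected shape: (1, 96, 3)
print(forecast.shape)
```
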
The current release supports multivariate forecasting via both channel-independence and channel-mixing approaches. Decoder channel-mixing can be enabled during fine-tuning to capture strong channel-correlation patterns across time-series variates, a critical capability lacking in existing counterparts.

In addition, TTM supports exogenous infusion and categorical data, which are not released as part of this version. Stay tuned for these extended features.

## Recommended Use

1. Users should standard-scale their data before feeding it to the model (refer to TSP, our data processing utility, for data scaling); a minimal scaling sketch follows this list.
2. Upsampling or prepending zeros to virtually increase the context length is not recommended and will impact model performance.

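The sketch below illustrates the standard-scaling recommendation in item 1 using scikit-learn's `StandardScaler` purely as a stand-in for TSP; the file name, column names, and train/forecast split are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical multivariate target data with one column per channel.
df = pd.read_csv("my_target_data.csv", parse_dates=["timestamp"])
channels = ["load", "temperature", "wind_speed"]  # hypothetical channel names

# Fit the scaler on the training portion only, then apply it everywhere,
# so the model always sees standard-scaled inputs.
train_df = df.iloc[:-96]
scaler = StandardScaler().fit(train_df[channels])
df[channels] = scaler.transform(df[channels])

# ... run zero-shot or fine-tuned forecasting on the scaled values ...
# Forecasts come back in the scaled space; invert the scaling to report
# them in the original units, e.g.:
# forecast_original = scaler.inverse_transform(forecast_scaled)
```
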
### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** [More Information Needed]
- **Paper:** https://arxiv.org/pdf/2401.03955.pdf

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

[More Information Needed]

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

[More Information Needed]

## How to Get Started with the Model

[Point notebooks]

## Benchmarks

## Training Data

The TTM models were trained on a collection of datasets from the Monash Time Series Forecasting repository. The datasets used include:

- Australian Electricity Demand: https://zenodo.org/records/4659727
- Australian Weather: https://zenodo.org/records/4654822
- Bitcoin: https://zenodo.org/records/5122101
- KDD Cup 2018: https://zenodo.org/records/4656756
- London Smart Meters: https://zenodo.org/records/4656091
- Saugeen River Flow: https://zenodo.org/records/4656058
- Solar Power: https://zenodo.org/records/4656027
- Sunspots: https://zenodo.org/records/4654722
- Solar: https://zenodo.org/records/4656144
- US Births: https://zenodo.org/records/4656049
- Wind Farms Production: https://zenodo.org/records/4654858
- Wind Power: https://zenodo.org/records/4656032

## Citation [optional]

Kindly cite the following paper if you intend to use our model or its associated architectures/approaches in your work.

**BibTeX:**

@article{ekambaram2024ttms,
  title={TTMs: Fast Multi-level Tiny Time Mixers for Improved Zero-shot and Few-shot Forecasting of Multivariate Time Series},
  author={Ekambaram, Vijay and Jati, Arindam and Nguyen, Nam H and Dayama, Pankaj and Reddy, Chandra and Gifford, Wesley M and Kalagnanam, Jayant},
  journal={arXiv preprint arXiv:2401.03955},
  year={2024}
}

**APA:**

Ekambaram, V., Jati, A., Nguyen, N. H., Dayama, P., Reddy, C., Gifford, W. M., & Kalagnanam, J. (2024). TTMs: Fast Multi-level Tiny Time Mixers for Improved Zero-shot and Few-shot Forecasting of Multivariate Time Series. arXiv preprint arXiv:2401.03955.

141
+
142
+ ## Model Card Authors [optional]
143
+
144
+ [More Information Needed]
145
+
146
+ ## Model Card Contact
147
+
148
+ [More Information Needed]
## IBM Public Repository Disclosure

All content in this repository, including code, has been provided by IBM under the associated open source software license, and IBM is under no obligation to provide enhancements, updates, or support. IBM developers produced this code as an open source project (not as an IBM product), and IBM makes no assertions as to the level of quality or security and will not be maintaining this code going forward.