vijaye12 committed on
Commit
ba1f9d5
1 Parent(s): 1f5d691

Update README.md

Files changed (1)
  1. README.md +47 -21
README.md CHANGED
@@ -25,9 +25,9 @@ version supports point forecasting use-cases ranging from minutely to hourly res
25
  - Zero-shot results of TTM surpass the *few-shot results of many popular SOTA approaches* including
26
  PatchTST (ICLR 23), PatchTSMixer (KDD 23), TimesNet (ICLR 23), DLinear (AAAI 23) and FEDFormer (ICML 22).
27
  - TTM (1024-96, released in this model card with 1M parameters) outperforms pre-trained MOIRAI-Small (14M parameters) by 10%, MOIRAI-Base (91M parameters) by 2% and
28
- MOIRAI-Large (311M parameters) by 3% on zero-shot forecasting (fl = 96). (TODO: add notebook)
29
  - TTM quick fine-tuning also outperforms the hard statistical baselines (Statistical ensemble and S-Naive) in
30
- M4-hourly dataset which existing pretrained TS models are finding hard to outperform. (TODO: add notebook)
31
  - TTM takes only a *few seconds for zero-shot/inference* and a *few minutes for fine-tuning* on a single GPU machine, as
32
  opposed to the long runtimes and heavy compute infrastructure needs of other existing pre-trained models.
33
 
@@ -45,12 +45,15 @@ TTMs that can cater to many common forecasting settings in practice. Additionall
45
  our pretraining scripts that users can utilize to pretrain models on their own. Pretraining TTMs is very easy and fast, taking
46
  only 3-6 hours using 6 A100 GPUs, as opposed to several days or weeks in traditional approaches.
47
48
  ## Model Releases (along with the branch name where the models are stored):
49
 
50
- - 512-96: Given the last 512 time-points (i.e. context length), this model can forecast up to next 96 time-points (i.e. forecast length)
51
  in the future. Recommended for hourly and minutely forecasts (e.g., resolutions of 5 min, 10 min, 15 min, 1 hour) (branch name: main)
52
 
53
- - 1024-96: Given the last 1024 time-points (i.e. context length), this model can forecast up to next 96 time-points (i.e. forecast length)
54
  in the future. Recommended for hourly and minutely forecasts (e.g., resolutions of 5 min, 10 min, 15 min, 1 hour) (branch name: 1024-96-v1)
55
 
56
  - Stay tuned for more models!
@@ -76,40 +79,61 @@ In addition, TTM also supports exogenous infusion and categorical data which is
76
  Stay tuned for these extended features.
77
 
78
  ## Recommended Use
79
- 1. Users have to externally standard scale their data before feeding it to the model (Refer to TSP, our data processing utility for data scaling.)
80
- 2. Enabling any upsampling or prepending zeros to virtually increase the context length is not recommended and will
81
- impact the model performance.
82
 
83
 
84
- ### Model Sources [optional]
85
-
86
- <!-- Provide the basic links for the model. -->
87
 
88
- - **Repository:** [More Information Needed]
89
- - **Paper [optional]:** [More Information Needed]
90
 
91
 
92
  ## Uses
93
 
94
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 
95
 
96
- ### Direct Use
 
 
97
 
98
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app -->
99
 
100
- [More Information Needed]
101
 
102
- ### Downstream Use [optional]
103
 
104
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
105
 
106
- [More Information Needed]
107
 
108
  ## How to Get Started with the Model
109
 
110
- [Point notebooks]
111
 
112
- ## Benchmarks
113
 
114
  ## Training Data
115
 
@@ -134,12 +158,14 @@ work
134
 
135
  **BibTeX:**
136
 
 
137
  @article{ekambaram2024ttms,
138
  title={TTMs: Fast Multi-level Tiny Time Mixers for Improved Zero-shot and Few-shot Forecasting of Multivariate Time Series},
139
  author={Ekambaram, Vijay and Jati, Arindam and Nguyen, Nam H and Dayama, Pankaj and Reddy, Chandra and Gifford, Wesley M and Kalagnanam, Jayant},
140
  journal={arXiv preprint arXiv:2401.03955},
141
  year={2024}
142
  }
 
143
 
144
  **APA:**
145
 
 
25
  - Zero-shot results of TTM surpass the *few-shot results of many popular SOTA approaches* including
26
  PatchTST (ICLR 23), PatchTSMixer (KDD 23), TimesNet (ICLR 23), DLinear (AAAI 23) and FEDFormer (ICML 22).
27
  - TTM (1024-96, released in this model card with 1M parameters) outperforms pre-trained MOIRAI-Small (14M parameters) by 10%, MOIRAI-Base (91M parameters) by 2% and
28
+ MOIRAI-Large (311M parameters) by 3% on zero-shot forecasting (fl = 96). [[notebook]](https://github.com/IBM/tsfm/blob/main/notebooks/hfdemo/tinytimemixer/ttm_benchmarking_1024_96.ipynb)
29
  - TTM quick fine-tuning also outperforms the hard statistical baselines (Statistical ensemble and S-Naive) in
30
+ the M4-hourly dataset, which existing pre-trained TS models find hard to outperform. [[notebook]](https://github.com/IBM/tsfm/blob/main/notebooks/hfdemo/tinytimemixer/ttm_m4_hourly.ipynb)
31
  - TTM takes only a *few seconds for zero-shot/inference* and a *few minutes for fine-tuning* on a single GPU machine, as
32
  opposed to the long runtimes and heavy compute infrastructure needs of other existing pre-trained models.
33
 
 
45
  our pretraining scripts that users can utilize to pretrain models on their own. Pretraining TTMs is very easy and fast, taking
46
  only 3-6 hours using 6 A100 GPUs, as opposed to several days or weeks in traditional approaches.
47
 
48
+ Each pre-trained model is released in a different branch of this model card. Access the required model through our
49
+ getting started [notebook](https://github.com/IBM/tsfm/blob/main/notebooks/hfdemo/ttm_getting_started.ipynb) by specifying the branch name (see the loading example after the list below).
50
+
51
  ## Model Releases (along with the branch name where the models are stored):
52
 
53
+ - **512-96:** Given the last 512 time-points (i.e., the context length), this model can forecast up to the next 96 time-points (i.e., the forecast length)
54
  in the future. Recommended for hourly and minutely forecasts (e.g., resolutions of 5 min, 10 min, 15 min, 1 hour) (branch name: main)
55
 
56
+ - **1024-96:** Given the last 1024 time-points (i.e., the context length), this model can forecast up to the next 96 time-points (i.e., the forecast length)
57
  in the future. Recommended for hourly and minutely forecasts (e.g., resolutions of 5 min, 10 min, 15 min, 1 hour) (branch name: 1024-96-v1)
58
 
59
  - Stay tuned for more models!
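As a minimal sketch of branch-based loading (here the 1024-96 variant from its `1024-96-v1` branch; the import path assumes the IBM/tsfm repository layout linked under Model Sources below):

```python
from tsfm_public.models.tinytimemixer import TinyTimeMixerForPrediction

# The `revision` argument selects the branch that stores the desired model variant.
model = TinyTimeMixerForPrediction.from_pretrained("ibm/TTM", revision="1024-96-v1")
```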
 
79
  Stay tuned for these extended features.
80
 
81
  ## Recommended Use
82
+ 1. Users have to externally standard-scale their data independently for every channel before feeding it to the model, as sketched below (refer to [TSP](https://github.com/IBM/tsfm/blob/main/tsfm_public/toolkit/time_series_preprocessor.py), our data processing utility for data scaling).
83
+ 2. Upsampling or prepending zeros to virtually increase the context length for shorter-length datasets is not recommended and will
84
+ degrade the model performance.
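A minimal sketch of the kind of per-channel standard scaling meant in item 1 above (this is not the TSP utility itself; the DataFrame, channel names, and scikit-learn usage are illustrative assumptions):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical multivariate series: one column per channel.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(1000, 2)), columns=["channel_0", "channel_1"])
train_df, test_df = df.iloc[:800].copy(), df.iloc[800:].copy()

# Fit the scaler on the training split only; each column (channel) is scaled independently.
scaler = StandardScaler()
train_df[:] = scaler.fit_transform(train_df)
test_df[:] = scaler.transform(test_df)  # reuse the training statistics at inference time
```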
85
 
86
 
87
+ ### Model Sources
 
 
88
 
89
+ - **Repository:** https://github.com/IBM/tsfm/tree/main/tsfm_public/models/tinytimemixer
90
+ - **Paper:** https://arxiv.org/pdf/2401.03955.pdf
91
 
92
 
93
  ## Uses
94
 
95
```python
# Load the model from the HF Model Hub, selecting the branch via the `revision` field.
# Import paths assume the IBM/tsfm repository linked above; `zeroshot_forecast_args`,
# `finetune_forecast_args`, the datasets, callbacks, optimizer, and scheduler are
# user-defined (see the getting started notebook for complete definitions).
from transformers import Trainer
from tsfm_public.models.tinytimemixer import TinyTimeMixerForPrediction

model = TinyTimeMixerForPrediction.from_pretrained("ibm/TTM", revision="main")

# Zero-shot evaluation
zeroshot_trainer = Trainer(
    model=model,
    args=zeroshot_forecast_args,
)
zeroshot_output = zeroshot_trainer.evaluate(dset_test)

# Freeze the backbone and enable few-shot fine-tuning
for param in model.backbone.parameters():
    param.requires_grad = False

finetune_forecast_trainer = Trainer(
    model=model,
    args=finetune_forecast_args,
    train_dataset=dset_train,
    eval_dataset=dset_val,
    callbacks=[early_stopping_callback, tracking_callback],
    optimizers=(optimizer, scheduler),
)
finetune_forecast_trainer.train()
fewshot_output = finetune_forecast_trainer.evaluate(dset_test)
```
130
+
131
 
 
132
 
133
  ## How to Get Started with the Model
134
 
135
+ [Getting Started Notebook](https://github.com/IBM/tsfm/blob/main/notebooks/hfdemo/ttm_getting_started.ipynb)
136
 
 
137
 
138
  ## Training Data
139
 
 
158
 
159
  **BibTeX:**
160
 
161
+ ```
162
  @article{ekambaram2024ttms,
163
  title={TTMs: Fast Multi-level Tiny Time Mixers for Improved Zero-shot and Few-shot Forecasting of Multivariate Time Series},
164
  author={Ekambaram, Vijay and Jati, Arindam and Nguyen, Nam H and Dayama, Pankaj and Reddy, Chandra and Gifford, Wesley M and Kalagnanam, Jayant},
165
  journal={arXiv preprint arXiv:2401.03955},
166
  year={2024}
167
  }
168
+ ```
169
 
170
  **APA:**
171