Commit bf028b1 by TristanBehrens
1 parent: e5f231a

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +52 -0
README.md ADDED
@@ -0,0 +1,52 @@
+ ---
+ language:
+ - en
+ tags:
+ - NLP
+ license: mit
+ datasets:
+ - TristanBehrens/bach_garland_2024-100K
+ base_model: None
+ ---
+
+ # bach_garland_transformer - An xLSTM Model
+
+ ![Trained with Helibrunna](banner.jpg)
+
+ Trained with [Helibrunna](https://github.com/AI-Guru/helibrunna) by [Dr. Tristan Behrens](https://de.linkedin.com/in/dr-tristan-behrens-734967a2).
+
+ ## Configuration
+
+ ```
+ training:
+   model_name: bach_garland_transformer
+   batch_size: 40
+   lr: 0.001
+   lr_warmup_steps: 1000
+   lr_decay_until_steps: 10000
+   lr_decay_factor: 0.001
+   weight_decay: 0.1
+   amp_precision: bfloat16
+   weight_precision: float32
+   enable_mixed_precision: true
+   num_epochs: 8
+   output_dir: output/bach_garland_transformer
+   save_every_step: 500
+   log_every_step: 10
+   wandb_project: bach_garland
+   torch_compile: false
+ model:
+   type: transformer
+   dim: 64
+   n_layers: 4
+   n_heads: 4
+   fc_scale: 2
+   context_length: 2048
+   vocab_size: 178
+ dataset:
+   hugging_face_id: TristanBehrens/bach_garland_2024-100K
+ tokenizer:
+   type: whitespace
+   fill_token: '[EOS]'
+
+ ```
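
The learning-rate fields in the configuration (`lr`, `lr_warmup_steps`, `lr_decay_until_steps`, `lr_decay_factor`) can be read as describing a warmup-then-decay schedule. The sketch below illustrates one plausible interpretation: a linear warmup to `lr`, a cosine decay down to `lr * lr_decay_factor`, and a constant floor afterwards. The README does not state the schedule shape, so the cosine form and the function name `learning_rate` are assumptions made for illustration, not Helibrunna's confirmed behavior.

```python
import math

# Values copied from the `training` section of the configuration.
LR = 0.001
LR_WARMUP_STEPS = 1000
LR_DECAY_UNTIL_STEPS = 10000
LR_DECAY_FACTOR = 0.001


def learning_rate(step: int) -> float:
    """Hypothetical schedule: linear warmup, cosine decay, constant floor.

    The shape is an assumption; only the four constants come from the config.
    """
    min_lr = LR * LR_DECAY_FACTOR  # assumed floor: 0.001 * 0.001 = 1e-6
    if step < LR_WARMUP_STEPS:
        # Linear ramp from 0 up to LR over the warmup steps.
        return LR * step / LR_WARMUP_STEPS
    if step >= LR_DECAY_UNTIL_STEPS:
        # After the decay window, hold the learning rate at the floor.
        return min_lr
    # Cosine interpolation from LR down to min_lr across the decay window.
    progress = (step - LR_WARMUP_STEPS) / (LR_DECAY_UNTIL_STEPS - LR_WARMUP_STEPS)
    return min_lr + 0.5 * (LR - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 500, 1000, 5500, 10000):
        print(step, learning_rate(step))
```

Under these assumptions the rate peaks at step 1000 and reaches its floor at step 10000, well before training ends if the run lasts longer than that.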