The learning rate was not displayed correctly.

#1
by kapllan - opened
Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -75,7 +75,7 @@ For further details see [Niklaus et al. 2023](https://arxiv.org/abs/2306.02069?u
  - batche size: 512 samples
  - Number of steps: 1M/500K for the base/large model
  - Warm-up steps for the first 5\% of the total training steps
- - Learning rate: (linearly increasing up to) $1e\!-\!4$
+ - Learning rate: (linearly increasing up to) 1e-4
  - Word masking: increased 20/30\% masking rate for base/large models respectively
 
  ## Evaluation
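
For readers checking what the corrected line describes, here is a minimal sketch of a linear warm-up to a peak learning rate of 1e-4 over the first 5% of training steps, as listed in the README. The function name `warmup_lr` and its parameters are hypothetical and not taken from the training code. The README does not say what happens after warm-up, so holding the rate at the peak afterwards is purely an assumption made to keep the example short.

```python
def warmup_lr(step: int, total_steps: int,
              peak_lr: float = 1e-4, warmup_frac: float = 0.05) -> float:
    """Learning rate at a given 0-indexed step (illustrative sketch only)."""
    # Number of warm-up steps: 5% of the total, at least one step.
    warmup_steps = max(1, round(total_steps * warmup_frac))
    if step < warmup_steps:
        # Linear ramp that reaches peak_lr on the last warm-up step.
        return peak_lr * (step + 1) / warmup_steps
    # Assumption: constant at the peak after warm-up (not stated in the README).
    return peak_lr


# Base model: 1M steps, so warm-up covers the first 50,000 steps.
for s in (0, 24_999, 49_999, 600_000):
    print(s, warmup_lr(s, total_steps=1_000_000))
```

With the 500K-step setting for the large model, the same call with `total_steps=500_000` gives a 25,000-step warm-up.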