Commit ad6e42c by DeBERTa
Parent(s): f962ff9

Update README.md

Files changed (1): README.md (+3 -4)
README.md CHANGED:

````diff
@@ -11,8 +11,7 @@ license: mit
 
 Please check the [official repository](https://github.com/microsoft/DeBERTa) for more details and updates.
 
-This is the DeBERTa V2 xxlarge model with 48 layers, 1536 hidden size. Total parameters 1.5B. It's trained with 160GB data.
-
+This is the DeBERTa V2 xlarge model with 24 layers, 1536 hidden size. The total parameters are 900M and it is trained with 160GB raw data.
 
 ### Fine-tuning on NLU tasks
 
@@ -36,8 +35,8 @@ We present the dev results on SQuAD 1.1/2.0 and several GLUE benchmark tasks.
 ```bash
 cd transformers/examples/text-classification/
 export TASK_NAME=mrpc
-python -m torch.distributed.launch --nproc_per_node=8 run_glue.py --model_name_or_path microsoft/deberta-v2-xxlarge \\
--task_name $TASK_NAME --do_train --do_eval --max_seq_length 128 --per_device_train_batch_size 4 \\
+python -m torch.distributed.launch --nproc_per_node=8 run_glue.py --model_name_or_path microsoft/deberta-v2-xxlarge \\\\
+--task_name $TASK_NAME --do_train --do_eval --max_seq_length 128 --per_device_train_batch_size 4 \\\\
 --learning_rate 3e-6 --num_train_epochs 3 --output_dir /tmp/$TASK_NAME/ --overwrite_output_dir --sharded_ddp --fp16
 ```
 
````
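
For a quick sanity check of the checkpoint the updated description refers to, here is a minimal sketch (not part of this commit, and assuming the card belongs to the `microsoft/deberta-v2-xlarge` repository) that loads the model through the Transformers `AutoModel` API and verifies the 1536-dimensional hidden size stated in the README:

```python
# Minimal sketch (not part of this commit): load the checkpoint this card
# describes and confirm the 1536-dim hidden size mentioned in the README.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v2-xlarge")
model = AutoModel.from_pretrained("microsoft/deberta-v2-xlarge")

inputs = tokenizer("DeBERTa improves BERT with disentangled attention.",
                   return_tensors="pt")
outputs = model(**inputs)

# Expect torch.Size([1, sequence_length, 1536]) for the V2 xlarge config.
print(outputs.last_hidden_state.shape)
```

Note that the DeBERTa V2 tokenizer is SentencePiece-based, so the `sentencepiece` package must be installed alongside `transformers`.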