+ deepspeed --num_nodes=1 --num_gpus=4 --master_port 47607 --module safe_rlhf.values.reward --train_datasets PKU-SafeRLHF/train:1.0:PKU-SafeRLHF-harmless-only-30k --eval_datasets PKU-SafeRLHF/test --model_name_or_path output/sft --max_length 512 --trust_remote_code True --loss_type sequence-wise --epochs 2 --per_device_train_batch_size 16 --per_device_eval_batch_size 16 --gradient_accumulation_steps 2 --gradient_checkpointing --normalize_score_during_training False --normalizer_type ExponentialMovingAverage --normalizer_momentum 0.9 --learning_rate 2e-5 --lr_scheduler_type cosine --lr_warmup_ratio 0.03 --weight_decay 0.1 --seed 42 --eval_strategy epoch --output_dir /data/jiongxiao_wang/rlhf_attack/safe-rlhf/output/rm_30k --log_type wandb --log_project Safe-RLHF-RM --zero_stage 3 --bf16 True --tf32 True
2024-01-05 20:02:46.835068: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-05 20:02:46.835067: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-05 20:02:46.835067: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-05 20:02:46.835114: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-05 20:02:46.835114: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-05 20:02:46.835114: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-05 20:02:46.835826: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-05 20:02:46.835865: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-05 20:02:46.836421: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-05 20:02:46.836422: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-05 20:02:46.836424: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-05 20:02:46.836771: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-05 20:02:48.497891: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-01-05 20:02:48.498124: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-01-05 20:02:48.498360: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-01-05 20:02:48.498588: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Some weights of the model checkpoint at output/sft were not used when initializing LlamaModelForScore: ['lm_head.weight']
- This IS expected if you are initializing LlamaModelForScore from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LlamaModelForScore from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of the model checkpoint at output/sft were not used when initializing LlamaModelForScore: ['lm_head.weight']
- This IS expected if you are initializing LlamaModelForScore from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LlamaModelForScore from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of the model checkpoint at output/sft were not used when initializing LlamaModelForScore: ['lm_head.weight']
- This IS expected if you are initializing LlamaModelForScore from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LlamaModelForScore from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of LlamaModelForScore were not initialized from the model checkpoint at output/sft and are newly initialized: ['normalizer.count', 'normalizer.mean', 'normalizer.var']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of LlamaModelForScore were not initialized from the model checkpoint at output/sft and are newly initialized: ['normalizer.count', 'normalizer.var', 'normalizer.mean']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of LlamaModelForScore were not initialized from the model checkpoint at output/sft and are newly initialized: ['normalizer.mean', 'normalizer.var', 'normalizer.count']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
You are using the legacy behaviour of the . This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at https://github.com/huggingface/transformers/pull/24565
You are using the legacy behaviour of the . This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at https://github.com/huggingface/transformers/pull/24565
You are using the legacy behaviour of the . This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at https://github.com/huggingface/transformers/pull/24565
Some weights of the model checkpoint at output/sft were not used when initializing LlamaModelForScore: ['lm_head.weight']
- This IS expected if you are initializing LlamaModelForScore from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LlamaModelForScore from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of LlamaModelForScore were not initialized from the model checkpoint at output/sft and are newly initialized: ['normalizer.var', 'normalizer.mean', 'normalizer.count']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
You are using the legacy behaviour of the . This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at https://github.com/huggingface/transformers/pull/24565
Using /data/jiongxiao_wang/.cache/torch_extensions/py310_cu117 as PyTorch extensions root...
Using /data/jiongxiao_wang/.cache/torch_extensions/py310_cu117 as PyTorch extensions root...
Using /data/jiongxiao_wang/.cache/torch_extensions/py310_cu117 as PyTorch extensions root...
Using /data/jiongxiao_wang/.cache/torch_extensions/py310_cu117 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /data/jiongxiao_wang/.cache/torch_extensions/py310_cu117/fused_adam/build.ninja...
Building extension module fused_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...
wandb: Currently logged in as: jayfeather (jayfeather1024). Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.16.1
wandb: Run data is saved locally in /data/jiongxiao_wang/rlhf_attack/safe-rlhf/output/rm_30k/wandb/run-20240105_200327-0bh9htd8
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run reward-2024-01-05-20-03-25
wandb: ⭐️ View project at https://wandb.ai/jayfeather1024/Safe-RLHF-RM
wandb: 🚀 View run at https://wandb.ai/jayfeather1024/Safe-RLHF-RM/runs/0bh9htd8
Training 1/2 epoch: 0%| | 0/840 [00:00