sfairXC/FsfairX-LLaMA3-RM-v0.1

johnowhitaker committed 22 days ago

Commit 94fad49 • 1 Parent(s): 883c72b

Update README.md (#4)

- Update README.md (5eefeb2345a1f3399427c8faf937d9ad38028c3d)

Co-authored-by: Jonathan Whitaker <[email protected]>

Files changed (1)

README.md CHANGED

@@ -43,7 +43,7 @@ We use the training script at `https://github.com/WeiXiongUST/RLHF-Reward-Model
    {"role": "user", "content": "I'd like to show off how chat templating works!"},
   ]
-  test_texts = [tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=False).replace(tokenizer.bos_token, "")]
+  test_texts = [rm_tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=False).replace(rm_tokenizer.bos_token, "")]
   pipe_outputs = rm_pipe(test_texts, **pipe_kwargs)
   rewards = [output[0]["score"] for output in pipe_outputs]
 ```