Di Zhang

qq8933

AI & ML interests

AI4Chem, LLM, Green LLM

Recent Activity

updated a dataset 14 minutes ago
qq8933/OpenLongCoT-Pretrain-v2
updated a dataset about 19 hours ago
qq8933/llama_o1_offline_training_data_v1
New activity 2 days ago
qq8933/OpenLongCoT-Pretrain

Organizations

Posts 16

view post
Post
2268
Discovered an outrageous bug on the ChatGPT official website, especially for those using ad-blocking plugins. Please make sure to add browser-intake-datadoghq.com to your ad block whitelist. The ChatGPT webpage relies on this site for heartbeat detection, but since it belongs to an ad tracking network, it's included in major ad-blocking lists. (If you're using Clash, also remember to add it to the whitelist.) Failing to do so may cause the ChatGPT web interface to display a greyed-out send button after clicking, with no response.

For users with Chinese IP addresses, consider adding this URL to the rules of your U.S. node, as the response headers from this site will report the user's physical location to GPT.
view post
Post
5509
LLaMA-O1: Open Large Reasoning Model Frameworks For Training, Inference and Evaluation With PyTorch and HuggingFace
Large Reasoning Models powered by Monte Carlo Tree Search (MCTS), Self-Play Reinforcement Learning, PPO, AlphaGo Zero's dua policy paradigm and Large Language Models!
https://github.com/SimpleBerry/LLaMA-O1/

What will happen when you compound MCTS ❤ LLM ❤ Self-Play ❤RLHF?
Just a little bite of strawberry!🍓

Past related works:
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (2410.02884)
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B (2406.07394)