Cannot reproduce the reported values
Hi, I am trying to reproduce the results of retrosynthesis, property prediction, and molecule captioning using MolInstructions. Unfortunately, I cannot reproduce the results of Llama3 that you have reported in your GitHub. Isn't it right that setting 1) base_model = "meta-llama/Meta-Llama-3-8B-Instruct", 2) lora_weights= "zjunlp/llama3-instruct-molinst-molecule-8b" in generate.py, and 3) change LlamaTokenizer to AutoTokenizer enough to change the settings from Llama2 to Llama3? Is there any additional settings or details required?
The reported values seem to be very high but when I input the first test data of molecule captioning, it generates totally different caption from the ground truth. Even using the training data does not yield reasonable captions.
Thank you for your help.
Hi, thank you for your interest.
We are using the training and generation code for LLaMA3 provided at https://github.com/hiyouga/LLaMA-Factory, which you may refer to. The code on our GitHub is specifically for LLaMA2.