Model Card for ReactionT5v2-forward

This is a ReactionT5 model pre-trained to predict the products of chemical reactions. An interactive demo is linked under Model Sources.

Model Sources

Repository: https://github.com/sagawatatsuya/ReactionT5v2
Paper: https://arxiv.org/abs/2311.06708
Demo: https://huggingface.co/spaces/sagawa/ReactionT5_task_forward

Uses

You can use this model directly for forward reaction prediction, or fine-tune it on your own dataset.

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and the pre-trained forward-prediction model.
tokenizer = AutoTokenizer.from_pretrained("sagawa/ReactionT5v2-forward")
model = AutoModelForSeq2SeqLM.from_pretrained("sagawa/ReactionT5v2-forward")

# Input format: 'REACTANT:<SMILES>REAGENT:<SMILES>', with molecules separated by '.'.
inp = tokenizer('REACTANT:COC(=O)C1=CCCN(C)C1.O.[Al+3].[H-].[Li+].[Na+].[OH-]REAGENT:C1CCOC1', return_tensors='pt')
output = model.generate(**inp, num_beams=1, num_return_sequences=1, return_dict_in_generate=True, output_scores=True)
# Decode the generated sequence, then remove tokenizer spaces and any trailing '.'.
output = tokenizer.decode(output['sequences'][0], skip_special_tokens=True).replace(' ', '').rstrip('.')
print(output)  # CN1CCC=C(CO)C1
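
The input string in the example above follows a REACTANT:...REAGENT:... pattern, with molecules joined by '.'. As a minimal sketch, the helpers below assemble such a string from lists of SMILES and apply the same post-processing to decoded text. The names `format_forward_input` and `clean_prediction` are hypothetical and not part of the ReactionT5 repository. The sketch assumes at least one reagent is given; how the model expects an empty reagent field is not covered here.

```python
from typing import Sequence


def format_forward_input(reactants: Sequence[str], reagents: Sequence[str]) -> str:
    """Build a 'REACTANT:...REAGENT:...' string (hypothetical helper)."""
    if not reactants or not reagents:
        raise ValueError("this sketch expects at least one reactant and one reagent")
    return "REACTANT:" + ".".join(reactants) + "REAGENT:" + ".".join(reagents)


def clean_prediction(decoded: str) -> str:
    """Mirror the post-processing above: drop spaces and any trailing '.'."""
    return decoded.replace(" ", "").rstrip(".")


# Rebuild the input string used in the example above.
text = format_forward_input(
    ["COC(=O)C1=CCCN(C)C1", "O", "[Al+3]", "[H-]", "[Li+]", "[Na+]", "[OH-]"],
    ["C1CCOC1"],
)
```

The resulting `text` can be passed to the tokenizer in place of the hard-coded string, and `clean_prediction` applied to the decoded output.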

Training Details

Training Procedure

The model was trained on the Open Reaction Database (ORD) dataset. The USPTO_MIT test split was also supplied to the training script (the --USPTO_test_data_path argument) to prevent data leakage. The training command is shown below. For details on data preprocessing and training, see the paper and the GitHub repository.

cd task_forward
python train.py \
    --output_dir='t5' \
    --epochs=100 \
    --lr=1e-3 \
    --batch_size=32 \
    --input_max_len=150 \
    --target_max_len=100 \
    --weight_decay=0.01 \
    --evaluation_strategy='epoch' \
    --save_strategy='epoch' \
    --logging_strategy='epoch' \
    --train_data_path='../data/preprocessed_ord_train.csv' \
    --valid_data_path='../data/preprocessed_ord_valid.csv' \
    --test_data_path='../data/preprocessed_ord_test.csv' \
    --USPTO_test_data_path='../data/USPTO_MIT/MIT_separated/test.csv' \
    --disable_tqdm \
    --pretrained_model_name_or_path='sagawa/CompoundT5'
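
One common way to guard against the kind of data leakage mentioned above is to drop training reactions that also appear in a held-out test split. The sketch below illustrates that idea only; it is not the repository's implementation. It assumes each reaction is a plain dictionary with hypothetical `input` and `product` keys, and that exact string matching is enough to identify duplicates (a real pipeline would more likely compare canonicalized SMILES). The example strings are toy data, not real reactions.

```python
from typing import Dict, List

Reaction = Dict[str, str]


def drop_test_overlap(train: List[Reaction], test: List[Reaction]) -> List[Reaction]:
    """Return the training reactions that do not also occur in the test split."""
    # Index the held-out reactions by (input, product) for fast membership checks.
    held_out = {(r["input"], r["product"]) for r in test}
    return [r for r in train if (r["input"], r["product"]) not in held_out]


# Toy data: the second training reaction duplicates the test reaction.
train = [
    {"input": "REACTANT:CCOREAGENT:O", "product": "CC=O"},
    {"input": "REACTANT:CCBrREAGENT:O", "product": "CCO"},
]
test = [{"input": "REACTANT:CCBrREAGENT:O", "product": "CCO"}]
filtered = drop_test_overlap(train, test)
```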

Results

Model	Training set	Test set	Top-1 [% acc.]	Top-2 [% acc.]	Top-3 [% acc.]	Top-5 [% acc.]
Sequence-to-sequence	USPTO_MIT	USPTO_MIT	80.3	84.7	86.2	87.5
WLDN	USPTO_MIT	USPTO_MIT	80.6 (85.6)	90.5	92.8	93.4
Molecular Transformer	USPTO_MIT	USPTO_MIT	88.8	92.6	–	94.4
T5Chem	USPTO_MIT	USPTO_MIT	90.4	94.2	–	96.4
CompoundT5	USPTO_MIT	USPTO_MIT	86.6	89.5	90.4	91.2
ReactionT5 (This model)	-	USPTO_MIT	92.8	95.6	96.4	97.1
ReactionT5	USPTO_MIT	USPTO_MIT	97.5	98.6	98.8	99.0

Performance comparison of CompoundT5, ReactionT5, and other models in product prediction on the USPTO_MIT test set.
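
For reference, Top-k accuracy counts a prediction as correct when the ground-truth product appears among the model's k highest-ranked candidates. The following is a minimal sketch, assuming candidates are already ranked best-first and compared by exact string match. That matching rule is an assumption; the paper's evaluation details, such as SMILES canonicalization, are not reproduced here. The function name `top_k_accuracy` and the toy data are illustrative.

```python
from typing import Sequence


def top_k_accuracy(ranked: Sequence[Sequence[str]], targets: Sequence[str], k: int) -> float:
    """Percentage of targets found among the first k ranked candidates."""
    if len(ranked) != len(targets):
        raise ValueError("ranked and targets must have the same length")
    if not targets:
        raise ValueError("need at least one example")
    hits = sum(t in cands[:k] for cands, t in zip(ranked, targets))
    return 100.0 * hits / len(targets)


# Toy data: four examples, each with two candidates ranked best-first.
ranked = [["CCO", "CC=O"], ["CCN", "CCO"], ["CC", "C"], ["CO", "CCO"]]
targets = ["CCO", "CCO", "CCC", "CO"]
top1 = top_k_accuracy(ranked, targets, k=1)  # 2 of 4 correct
top2 = top_k_accuracy(ranked, targets, k=2)  # 3 of 4 correct
```

With a Hugging Face sequence-to-sequence model, ranked candidates of this kind could be obtained by calling `model.generate` with `num_beams` and `num_return_sequences` both set to k.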

Citation

arXiv link: https://arxiv.org/abs/2311.06708

@misc{sagawa2023reactiont5,  
      title={ReactionT5: a large-scale pre-trained model towards application of limited reaction data}, 
      author={Tatsuya Sagawa and Ryosuke Kojima},  
      year={2023},  
      eprint={2311.06708},  
      archivePrefix={arXiv},  
      primaryClass={physics.chem-ph}  
}
