This BERT model was trained on Japanese data from wikimedia/wikipedia. Training took about a day on a single RTX 3080.