
liwii/factual-consistency-regression-ja

This model is a fine-tuned version of line-corporation/line-distilbert-base-japanese on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0615

Model description

More information needed

Intended uses & limitations

More information needed
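The card leaves usage undocumented. Judging by the model name, the model scores the factual consistency of a claim against a source text; the sketch below is a minimal, hedged example under that assumption. The paired-sentence input format, the single-logit regression head, and the trust_remote_code requirement (inherited from the LINE DistilBERT tokenizer, which also needs fugashi, unidic-lite, and sentencepiece) are assumptions, not documented behavior.

```python
# Minimal inference sketch. Assumptions: the checkpoint exposes a
# single-logit sequence-classification (regression) head, and the base
# tokenizer (LINE DistilBERT) requires trust_remote_code=True plus the
# fugashi/unidic-lite/sentencepiece dependencies.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "liwii/factual-consistency-regression-ja"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Hypothetical input format: a source sentence paired with a claim.
source = "今日は東京で雨が降っています。"
claim = "東京は今日雨です。"
inputs = tokenizer(source, claim, return_tensors="pt", truncation=True)

with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()

# The score range (e.g. 0-1 vs. unbounded) is not documented on the card.
print(f"consistency score: {score:.4f}")
```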

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged code sketch reconstructing them follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 64
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: tpu
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
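For reproducibility, the reported values map onto transformers.TrainingArguments roughly as sketched below. This is an approximation, not the original training script: the card does not document the dataset, the model setup, or the TPU launch details, and output_dir and the TPU core count are placeholders.

```python
# Hedged reconstruction of the reported hyperparameters with
# transformers.Trainer. Dataset and model setup are omitted because
# the card does not document them.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="factual-consistency-regression-ja",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",  # linear decay, as reported
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    tpu_num_cores=8,  # assumption: distributed_type "tpu"; core count not reported
)
```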

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| No log | 1.0 | 306 | 0.0856 |
| 0.1041 | 2.0 | 612 | 0.0819 |
| 0.1041 | 3.0 | 918 | 0.0795 |
| 0.0919 | 4.0 | 1224 | 0.0781 |
| 0.0876 | 5.0 | 1530 | 0.0770 |
| 0.0876 | 6.0 | 1836 | 0.0758 |
| 0.0845 | 7.0 | 2142 | 0.0751 |
| 0.0845 | 8.0 | 2448 | 0.0750 |
| 0.083 | 9.0 | 2754 | 0.0737 |
| 0.0809 | 10.0 | 3060 | 0.0732 |
| 0.0809 | 11.0 | 3366 | 0.0727 |
| 0.0802 | 12.0 | 3672 | 0.0722 |
| 0.0802 | 13.0 | 3978 | 0.0717 |
| 0.0797 | 14.0 | 4284 | 0.0721 |
| 0.078 | 15.0 | 4590 | 0.0711 |
| 0.078 | 16.0 | 4896 | 0.0707 |
| 0.0765 | 17.0 | 5202 | 0.0703 |
| 0.0774 | 18.0 | 5508 | 0.0699 |
| 0.0774 | 19.0 | 5814 | 0.0698 |
| 0.0762 | 20.0 | 6120 | 0.0696 |
| 0.0762 | 21.0 | 6426 | 0.0692 |
| 0.0756 | 22.0 | 6732 | 0.0691 |
| 0.0756 | 23.0 | 7038 | 0.0688 |
| 0.0756 | 24.0 | 7344 | 0.0687 |
| 0.075 | 25.0 | 7650 | 0.0680 |
| 0.075 | 26.0 | 7956 | 0.0680 |
| 0.0742 | 27.0 | 8262 | 0.0678 |
| 0.0738 | 28.0 | 8568 | 0.0677 |
| 0.0738 | 29.0 | 8874 | 0.0672 |
| 0.0742 | 30.0 | 9180 | 0.0673 |
| 0.0742 | 31.0 | 9486 | 0.0669 |
| 0.0733 | 32.0 | 9792 | 0.0669 |
| 0.0732 | 33.0 | 10098 | 0.0667 |
| 0.0732 | 34.0 | 10404 | 0.0664 |
| 0.0722 | 35.0 | 10710 | 0.0665 |
| 0.0728 | 36.0 | 11016 | 0.0662 |
| 0.0728 | 37.0 | 11322 | 0.0660 |
| 0.0719 | 38.0 | 11628 | 0.0659 |
| 0.0719 | 39.0 | 11934 | 0.0655 |
| 0.072 | 40.0 | 12240 | 0.0655 |
| 0.0721 | 41.0 | 12546 | 0.0654 |
| 0.0721 | 42.0 | 12852 | 0.0651 |
| 0.0711 | 43.0 | 13158 | 0.0651 |
| 0.0711 | 44.0 | 13464 | 0.0649 |
| 0.0715 | 45.0 | 13770 | 0.0651 |
| 0.0709 | 46.0 | 14076 | 0.0645 |
| 0.0709 | 47.0 | 14382 | 0.0644 |
| 0.0706 | 48.0 | 14688 | 0.0644 |
| 0.0706 | 49.0 | 14994 | 0.0642 |
| 0.0703 | 50.0 | 15300 | 0.0642 |
| 0.0706 | 51.0 | 15606 | 0.0641 |
| 0.0706 | 52.0 | 15912 | 0.0641 |
| 0.07 | 53.0 | 16218 | 0.0638 |
| 0.07 | 54.0 | 16524 | 0.0635 |
| 0.07 | 55.0 | 16830 | 0.0634 |
| 0.0695 | 56.0 | 17136 | 0.0634 |
| 0.0695 | 57.0 | 17442 | 0.0634 |
| 0.0701 | 58.0 | 17748 | 0.0633 |
| 0.0696 | 59.0 | 18054 | 0.0630 |
| 0.0696 | 60.0 | 18360 | 0.0637 |
| 0.0688 | 61.0 | 18666 | 0.0630 |
| 0.0688 | 62.0 | 18972 | 0.0629 |
| 0.0691 | 63.0 | 19278 | 0.0628 |
| 0.0692 | 64.0 | 19584 | 0.0627 |
| 0.0692 | 65.0 | 19890 | 0.0630 |
| 0.0694 | 66.0 | 20196 | 0.0625 |
| 0.0687 | 67.0 | 20502 | 0.0628 |
| 0.0687 | 68.0 | 20808 | 0.0623 |
| 0.0696 | 69.0 | 21114 | 0.0625 |
| 0.0696 | 70.0 | 21420 | 0.0624 |
| 0.0675 | 71.0 | 21726 | 0.0624 |
| 0.0688 | 72.0 | 22032 | 0.0622 |
| 0.0688 | 73.0 | 22338 | 0.0622 |
| 0.0682 | 74.0 | 22644 | 0.0621 |
| 0.0682 | 75.0 | 22950 | 0.0620 |
| 0.0683 | 76.0 | 23256 | 0.0620 |
| 0.0683 | 77.0 | 23562 | 0.0620 |
| 0.0683 | 78.0 | 23868 | 0.0620 |
| 0.0679 | 79.0 | 24174 | 0.0620 |
| 0.0679 | 80.0 | 24480 | 0.0619 |
| 0.0678 | 81.0 | 24786 | 0.0619 |
| 0.0679 | 82.0 | 25092 | 0.0618 |
| 0.0679 | 83.0 | 25398 | 0.0618 |
| 0.068 | 84.0 | 25704 | 0.0618 |
| 0.0684 | 85.0 | 26010 | 0.0617 |
| 0.0684 | 86.0 | 26316 | 0.0616 |
| 0.0676 | 87.0 | 26622 | 0.0617 |
| 0.0676 | 88.0 | 26928 | 0.0617 |
| 0.0676 | 89.0 | 27234 | 0.0617 |
| 0.0679 | 90.0 | 27540 | 0.0616 |
| 0.0679 | 91.0 | 27846 | 0.0616 |
| 0.0677 | 92.0 | 28152 | 0.0616 |
| 0.0677 | 93.0 | 28458 | 0.0616 |
| 0.067 | 94.0 | 28764 | 0.0615 |
| 0.0678 | 95.0 | 29070 | 0.0615 |
| 0.0678 | 96.0 | 29376 | 0.0615 |
| 0.067 | 97.0 | 29682 | 0.0615 |
| 0.067 | 98.0 | 29988 | 0.0615 |
| 0.0682 | 99.0 | 30294 | 0.0615 |
| 0.0681 | 100.0 | 30600 | 0.0615 |

Framework versions

  • Transformers 4.34.0
  • Pytorch 2.0.0+cu118
  • Datasets 2.14.5
  • Tokenizers 0.14.0
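A quick sanity check that a local environment matches the pinned versions above might look like the following; note that the reported torch build string also carries the CUDA tag (2.0.0+cu118).

```python
# Verify the local environment against the reported framework versions.
import datasets
import tokenizers
import torch
import transformers

expected = {
    transformers: "4.34.0",
    torch: "2.0.0",  # reported build: 2.0.0+cu118
    datasets: "2.14.5",
    tokenizers: "0.14.0",
}
for module, version in expected.items():
    installed = module.__version__
    status = "OK" if installed.startswith(version) else "MISMATCH"
    print(f"{module.__name__}: {installed} (expected {version}) {status}")
```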