	---
	language:
	- ja
	license: mit
	tags:
	- question-answering
	- generated_from_trainer
	- bert
	- jaquad
	datasets: SkelterLabsInc/JaQuAD
	inference:
	  parameters:
	    align_to_words: false
	widget:
	- text: 決勝トーナメントで日本に勝ったのはどこでしたか。
	  context: 日本は予選リーグで強豪のドイツとスペインに勝って決勝トーナメントに進んだが、クロアチアと対戦して敗れた。
	- text: 8世紀に日本の首都はどこでしたか。
	  context: 8世紀に日本の首都であった奈良を代表する寺院である東大寺は、「古都奈良の文化財」の一部として世界遺産に登録されている。東大寺には、「奈良の大仏」として知られる、高さ約15メートルの盧舎那仏像をはじめ、日本仏教美術史を代表する著名作品が多く所蔵されている。
	- text: 「奈良の大仏」の高さは何メートルなの?
	  context: 8世紀に日本の首都であった奈良を代表する寺院である東大寺は、「古都奈良の文化財」の一部として世界遺産に登録されている。東大寺には、「奈良の大仏」として知られる、高さ約15メートルの盧舎那仏像をはじめ、日本仏教美術史を代表する著名作品が多く所蔵されている。
	base_model: rinna/japanese-roberta-base
	model-index:
	- name: roberta_qa_japanese
	  results: []
	---

	# roberta_qa_japanese

	(Japanese caption: 日本語の (抽出型) 質問応答のモデル, i.e. "a Japanese (extractive) question answering model")

	This model is a fine-tuned version of [rinna/japanese-roberta-base](https://huggingface.co/rinna/japanese-roberta-base) (a pre-trained RoBERTa model provided by rinna Co., Ltd.), trained for extractive question answering.

	The model is fine-tuned on the [JaQuAD](https://huggingface.co/datasets/SkelterLabsInc/JaQuAD) dataset provided by Skelter Labs, whose data is collected from Japanese Wikipedia articles and annotated by humans.

	## Intended uses

	To run the model with the dedicated question-answering pipeline:

	```python
	from transformers import pipeline

	model_name = "tsmatz/roberta_qa_japanese"
	qa_pipeline = pipeline(
	    "question-answering",
	    model=model_name,
	    tokenizer=model_name,
	)
	result = qa_pipeline(
	    question="決勝トーナメントで日本に勝ったのはどこでしたか。",
	    context="日本は予選リーグで強豪のドイツとスペインに勝って決勝トーナメントに進んだが、クロアチアと対戦して敗れた。",
	    align_to_words=False,
	)
	print(result)
	```

	To run the forward pass manually:

	```python
	import torch
	import numpy as np
	from transformers import AutoModelForQuestionAnswering, AutoTokenizer

	model_name = "tsmatz/roberta_qa_japanese"
	model = (AutoModelForQuestionAnswering
	.from_pretrained(model_name))
	tokenizer = AutoTokenizer.from_pretrained(model_name)

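	def inference_answer(question, context):
	    # Tokenize the question/context pair; truncate only the context if it is too long.
	    test_feature = tokenizer(
	        question,
	        context,
	        max_length=318,
	        truncation="only_second",
	    )
	    # Run the model without tracking gradients.
	    with torch.no_grad():
	        outputs = model(torch.tensor([test_feature["input_ids"]]))
	    start_logits = outputs.start_logits.cpu().numpy()
	    end_logits = outputs.end_logits.cpu().numpy()
	    # Pick the most likely start and end token positions independently.
	    answer_ids = test_feature["input_ids"][np.argmax(start_logits):np.argmax(end_logits)+1]
	    # Decode each token id and join the pieces into the answer string.
	    return "".join(tokenizer.batch_decode(answer_ids))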

	question = "決勝トーナメントで日本に勝ったのはどこでしたか。"
	context = "日本は予選リーグで強豪のドイツとスペインに勝って決勝トーナメントに進んだが、クロアチアと対戦して敗れた。"
	answer_pred = inference_answer(question, context)
	print(answer_pred)
	```
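
Taking the argmax of the start and end logits independently, as `inference_answer` does, can occasionally yield an end position that precedes the start position, which produces an empty answer. The following is a minimal sketch of one common alternative: score every valid (start, end) pair jointly and keep the best one. The function name `best_span`, the `max_answer_len` parameter, and the toy logits are illustrative assumptions and are not part of this model card or of the transformers API.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return the (start, end) pair with the highest combined logit,
    subject to start <= end < start + max_answer_len."""
    best_score = float("-inf")
    best = (0, 0)
    for s, s_logit in enumerate(start_logits):
        # Only consider end positions at or after the start, within the length cap.
        last = min(s + max_answer_len, len(end_logits))
        for e in range(s, last):
            score = s_logit + end_logits[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best

# Toy logits (hypothetical values). Independent argmax would give
# start=3 and end=1, which is not a valid span.
start_logits = [0.1, 0.2, 1.5, 2.0, 0.3]
end_logits = [0.0, 2.5, 0.4, 1.0, 2.2]

start, end = best_span(start_logits, end_logits)
print(start, end)  # 3 4
```

With real model outputs, `outputs.start_logits[0].tolist()` and `outputs.end_logits[0].tolist()` could be passed in, and the returned indices used to slice `test_feature["input_ids"]` in the same way as before.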

	## Training procedure

	The source code for fine-tuning is available in [this notebook](https://github.com/tsmatz/huggingface-finetune-japanese/blob/master/03-question-answering.ipynb).

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 7e-05
	- train_batch_size: 2
	- eval_batch_size: 1
	- seed: 42
	- gradient_accumulation_steps: 16
	- total_train_batch_size: 32
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 100
	- num_epochs: 3
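
The relationship between these values can be checked with a little arithmetic. The sketch below assumes a single training device (the card does not state the device count, but one device is consistent with the reported total of 32) and reproduces the warmup-then-decay shape of a linear schedule with warmup. `linear_lr` is a hypothetical helper and `total_steps` is a placeholder value, since the card does not report the exact number of optimizer steps.

```python
learning_rate = 7e-05
train_batch_size = 2
gradient_accumulation_steps = 16
warmup_steps = 100
num_devices = 1  # assumption: not stated in the card

# Effective batch size per optimizer update.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 32

def linear_lr(step, total_steps, base_lr=learning_rate, warmup=warmup_steps):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup))

total_steps = 3400  # hypothetical placeholder, not taken from the card
for step in (0, 50, 100, 1750, 3400):
    print(step, linear_lr(step, total_steps))
```

The rate rises from zero to the peak of 7e-05 over the first 100 steps and then falls linearly to zero at the final step.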

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 2.1293 | 0.13 | 150 | 1.0311 |
	| 1.1965 | 0.26 | 300 | 0.6723 |
	| 1.022 | 0.39 | 450 | 0.4838 |
	| 0.9594 | 0.53 | 600 | 0.5174 |
	| 0.9187 | 0.66 | 750 | 0.4671 |
	| 0.8229 | 0.79 | 900 | 0.4650 |
	| 0.71 | 0.92 | 1050 | 0.2648 |
	| 0.5436 | 1.05 | 1200 | 0.2665 |
	| 0.5045 | 1.19 | 1350 | 0.2686 |
	| 0.5025 | 1.32 | 1500 | 0.2082 |
	| 0.5213 | 1.45 | 1650 | 0.1715 |
	| 0.4648 | 1.58 | 1800 | 0.1563 |
	| 0.4698 | 1.71 | 1950 | 0.1488 |
	| 0.4823 | 1.84 | 2100 | 0.1050 |
	| 0.4482 | 1.97 | 2250 | 0.0821 |
	| 0.2755 | 2.11 | 2400 | 0.0898 |
	| 0.2834 | 2.24 | 2550 | 0.0964 |
	| 0.2525 | 2.37 | 2700 | 0.0533 |
	| 0.2606 | 2.5 | 2850 | 0.0561 |
	| 0.2467 | 2.63 | 3000 | 0.0601 |
	| 0.2799 | 2.77 | 3150 | 0.0562 |
	| 0.2497 | 2.9 | 3300 | 0.0516 |


	### Framework versions

	- Transformers 4.23.1
	- PyTorch 1.12.1+cu102
	- Datasets 2.6.1
	- Tokenizers 0.13.1