---
license: apache-2.0
base_model: google/electra-small-discriminator
tags:
- generated_from_keras_callback
model-index:
- name: nguyennghia0902/electra-small-discriminator_0.0001_16_15e
results: []
language:
- vi
- en
metrics:
- accuracy
pipeline_tag: question-answering
datasets:
- nguyennghia0902/project02_textming_dataset
---
# nguyennghia0902/electra-small-discriminator_0.0001_16_15e
This model is a fine-tuned version of [google/electra-small-discriminator](https://huggingface.co/google/electra-small-discriminator) on a [Vietnamese question-answering dataset](https://www.kaggle.com/datasets/duyminhnguyentran/csc15105).
After 15 epochs of training, it achieves the following results:
- Train Loss: 0.4315
- Train End Logits Accuracy: 0.8714
- Train Start Logits Accuracy: 0.8580
- Validation Loss: 0.1470
- Validation End Logits Accuracy: 0.9577
- Validation Start Logits Accuracy: 0.9542
- Test Matching Accuracy: 0.90209
- Training time: 21920.9752 seconds (~6.09 hours)
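The preprocessed fine-tuning data is also mirrored on the Hub as [nguyennghia0902/project02_textming_dataset](https://huggingface.co/datasets/nguyennghia0902/project02_textming_dataset); a minimal loading sketch with 🤗 Datasets (the split layout is an assumption, so inspect the returned object):
```python
from datasets import load_dataset

# Assumption: the repo exposes standard named splits; print `ds` to confirm.
ds = load_dataset("nguyennghia0902/project02_textming_dataset")
print(ds)
```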
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- Learning rate: 1e-4
- Batch size: 16
- Optimizer: Adam (epsilon 1e-08) with a `keras.optimizers.schedules.PolynomialDecay` learning-rate schedule: initial_learning_rate 0.0001, decay_steps 46905, end_learning_rate 0.0, power 1.0, cycle False
- Training precision: float32
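For reference, the logged optimizer config can be reconstructed in TensorFlow/Keras roughly as follows (a minimal sketch from the config above, not the original training script):
```python
import tensorflow as tf

# Linear decay (power=1.0, cycle=False) from 1e-4 to 0.0 over 46,905 steps,
# matching the PolynomialDecay config logged for this run.
lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=1e-4,
    decay_steps=46905,
    end_learning_rate=0.0,
    power=1.0,
    cycle=False,
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule, epsilon=1e-8)
```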
### Training results
| Train Loss | Train End Logits Accuracy | Train Start Logits Accuracy | Validation Loss | Validation End Logits Accuracy | Validation Start Logits Accuracy | Epoch |
|:----------:|:-------------------------:|:---------------------------:|:---------------:|:------------------------------:|:--------------------------------:|:-----:|
| 2.9418 | 0.3441 | 0.3115 | 2.1831 | 0.4777 | 0.4649 | 0 |
| 2.2767 | 0.4696 | 0.4357 | 1.7802 | 0.5643 | 0.5481 | 1 |
| 1.9907 | 0.5234 | 0.4941 | 1.5055 | 0.6229 | 0.6068 | 2 |
| 1.7630 | 0.5690 | 0.5440 | 1.2348 | 0.6824 | 0.6708 | 3 |
| 1.5637 | 0.6086 | 0.5842 | 1.0345 | 0.7291 | 0.7190 | 4 |
| 1.3785 | 0.6500 | 0.6241 | 0.8309 | 0.7823 | 0.7724 | 5 |
| 1.2118 | 0.6880 | 0.6604 | 0.6918 | 0.8105 | 0.8116 | 6 |
| 1.0610 | 0.7222 | 0.6963 | 0.5471 | 0.8490 | 0.8476 | 7 |
| 0.9249 | 0.7495 | 0.7272 | 0.4426 | 0.8770 | 0.8763 | 8 |
| 0.8085 | 0.7777 | 0.7585 | 0.3695 | 0.8919 | 0.8908 | 9 |
| 0.7062 | 0.8018 | 0.7843 | 0.2773 | 0.9194 | 0.9198 | 10 |
| 0.6182 | 0.8232 | 0.8043 | 0.2323 | 0.9343 | 0.9302 | 11 |
| 0.5422 | 0.8414 | 0.8267 | 0.1807 | 0.9470 | 0.9470 | 12 |
| 0.4797 | 0.8588 | 0.8443 | 0.1570 | 0.9530 | 0.9515 | 13 |
| 0.4315 | 0.8714 | 0.8580 | 0.1470 | 0.9577 | 0.9542 | 14 |
### Framework versions
- Transformers 4.39.3
- TensorFlow 2.15.0
- Datasets 2.18.0
- Tokenizers 0.15.2
## How to use
```python
import tensorflow as tf
from transformers import ElectraTokenizerFast, TFElectraForQuestionAnswering

model_hf = "nguyennghia0902/electra-small-discriminator_0.0001_16_15e"
tokenizer = ElectraTokenizerFast.from_pretrained(model_hf)
reload_model = TFElectraForQuestionAnswering.from_pretrained(model_hf)

# Question (English gist): "How many sections does the VNU-HCM dormitory consist of?"
question = "Ký túc xá Đại học Quốc gia Thành phố Hồ Chí Minh bao gồm có bao nhiêu khu?"
context = "Ký túc xá Đại học Quốc gia Thành phố Hồ Chí Minh (Ký túc xá ĐHQG-TPHCM) là hệ thống ký túc xá xây tại Khu đô thị Đại học Quốc gia Thành phố Hồ Chí Minh (còn gọi với tên phổ biến: Khu đô thị ĐHQG-HCM hay Làng Đại học Thủ Đức). Ký túc xá ĐHQG-TPHCM gồm có 02 khu: A và B. Địa chỉ: Đường Tạ Quang Bửu, Khu phố 6, phường Linh Trung, thành phố Thủ Đức, Thành phố Hồ Chí Minh, điện thoại: 1900 05 55 59 (111). "

# Keep the offset mapping so token indices can be mapped back to characters.
inputs = tokenizer(question, context, return_offsets_mapping=True, return_tensors="tf", max_length=512, truncation=True)
offset_mapping = inputs.pop("offset_mapping")

outputs = reload_model(**inputs)

# Most likely start/end token positions of the answer span.
answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1)[0])
answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1)[0])

# Convert token indices back to character offsets in the original context.
start_char = int(offset_mapping[0][answer_start_index][0])
end_char = int(offset_mapping[0][answer_end_index][1])
predicted_answer_text = context[start_char:end_char]
print(predicted_answer_text)
```
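Alternatively, the checkpoint should also work with the high-level `pipeline` API (a hedged sketch; its span post-processing differs slightly from the manual decoding above):
```python
from transformers import pipeline

model_hf = "nguyennghia0902/electra-small-discriminator_0.0001_16_15e"
# framework="tf" selects the TensorFlow weights of this checkpoint.
qa = pipeline("question-answering", model=model_hf, tokenizer=model_hf, framework="tf")

# Same example question and context as above.
result = qa(
    question="Ký túc xá Đại học Quốc gia Thành phố Hồ Chí Minh bao gồm có bao nhiêu khu?",
    context="Ký túc xá Đại học Quốc gia Thành phố Hồ Chí Minh (Ký túc xá ĐHQG-TPHCM) là hệ thống ký túc xá xây tại Khu đô thị Đại học Quốc gia Thành phố Hồ Chí Minh (còn gọi với tên phổ biến: Khu đô thị ĐHQG-HCM hay Làng Đại học Thủ Đức). Ký túc xá ĐHQG-TPHCM gồm có 02 khu: A và B. Địa chỉ: Đường Tạ Quang Bửu, Khu phố 6, phường Linh Trung, thành phố Thủ Đức, Thành phố Hồ Chí Minh, điện thoại: 1900 05 55 59 (111). ",
)
print(result["answer"], result["score"])
```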