
FLAN-T5-Base fine-tuned for Historical Paragraph Completion

This model is a fine-tuned version of google/flan-t5-base, trained on a history dataset for paragraph completion tasks.

Model description

This model is designed to complete historical paragraphs, either by providing the ending or the beginning of a given text.

Intended uses & limitations

This model is intended for educational purposes and to assist in creating history-related content. It is trained specifically to complete historical paragraphs and should not be used for question generation or question answering.
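A minimal inference sketch using the `transformers` library is shown below. The repository id is a placeholder (the card does not state the published model name), and the instruction phrasing is illustrative, since the card notes that varied instructions were used during training.

```python
# Placeholder repo id -- substitute the actual model repository name.
MODEL_ID = "your-username/flan-t5-base-history-completion"

def build_prompt(text: str) -> str:
    # Illustrative instruction phrasing; the training used varied instructions.
    return f"Complete the following historical paragraph: {text}"

def complete_paragraph(text: str, model_id: str = MODEL_ID,
                       max_new_tokens: int = 128) -> str:
    """Generate a completion for a partial historical paragraph."""
    # Imported lazily so build_prompt works even without transformers installed.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `complete_paragraph("The Treaty of Versailles was signed in 1919, ...")` would then return a generated continuation.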

Training and evaluation data

The model was trained on the dataset ambrosfitz/just_history_xl.

Training procedure

The model was trained using the following key strategies:

  1. Varied Task Formulation: Randomly choosing between completing the end or beginning of paragraphs
  2. Task Instruction: Including varied task instructions to prevent overfitting
  3. Full Dataset Usage: Utilizing the entire dataset to maximize historical knowledge
  4. Controlled Training: Using a single epoch with a low learning rate to prevent overfitting
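The first two strategies can be sketched as a preprocessing step. This is a hypothetical reconstruction, not the card author's actual code: the instruction templates, the sentence-level split point, and the 50/50 direction choice are all assumptions.

```python
import random

# Illustrative instruction templates (the actual training used its own varied set).
END_TEMPLATES = [
    "Complete the ending of this historical paragraph:",
    "Finish this paragraph:",
]
BEGIN_TEMPLATES = [
    "Write the beginning of this historical paragraph:",
    "Provide an opening for this paragraph ending:",
]

def make_example(paragraph: str, rng: random.Random) -> dict:
    """Turn one paragraph into a complete-the-ending or complete-the-beginning example."""
    sentences = paragraph.split(". ")
    split = max(1, len(sentences) // 2)
    head = ". ".join(sentences[:split])
    tail = ". ".join(sentences[split:])
    if rng.random() < 0.5:
        # Model sees the beginning and learns to produce the ending.
        return {"input": f"{rng.choice(END_TEMPLATES)} {head}", "target": tail}
    # Model sees the ending and learns to produce the beginning.
    return {"input": f"{rng.choice(BEGIN_TEMPLATES)} {tail}", "target": head}
```

Randomizing both the direction and the instruction text gives the model varied supervision from the same paragraphs, which is the stated guard against overfitting to a single prompt format.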

Hyperparameters:

  • Number of epochs: 1
  • Learning rate: 0.0003
  • Batch size: 4
  • Gradient Accumulation Steps: 4
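With a per-device batch size of 4 and 4 gradient accumulation steps, each optimizer update effectively sees 16 examples:

```python
batch_size = 4
grad_accum_steps = 4
# Gradients from 4 micro-batches of 4 are summed before each optimizer step.
effective_batch_size = batch_size * grad_accum_steps
print(effective_batch_size)  # 16
```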

Results

Test set results:

  • Eval loss: 0.2923
  • Eval runtime: 119.2 s (78.9 samples/s, 19.7 steps/s)
  • Epoch: 1.0

Model details

Safetensors checkpoint, 248M parameters, F32 tensors.