
FLAN-T5-Base fine-tuned for Historical Paragraph Completion

This model is a fine-tuned version of google/flan-t5-base, trained on a history dataset for paragraph completion tasks.

Model description

This model is designed to complete historical paragraphs, either by providing the ending or the beginning of a given text.

Intended uses & limitations

This model is intended for educational purposes and to assist in creating history-related content. It is trained specifically to complete historical paragraphs and should not be used for question generation or question answering.
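A minimal inference sketch using the `transformers` library is shown below. The repository id is a placeholder (the card does not state the published model name), and the instruction phrasing is illustrative, since the card notes that varied instructions were used during training.

```python
# Placeholder repo id -- substitute the actual model repository name.
MODEL_ID = "your-username/flan-t5-base-history-completion"

def build_prompt(text: str) -> str:
    # Illustrative instruction phrasing; the training used varied instructions.
    return f"Complete the following historical paragraph: {text}"

def complete_paragraph(text: str, model_id: str = MODEL_ID,
                       max_new_tokens: int = 128) -> str:
    """Generate a completion for a partial historical paragraph."""
    # Imported lazily so build_prompt works even without transformers installed.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `complete_paragraph("The Treaty of Versailles was signed in 1919, ...")` would then return a generated continuation.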

Training and evaluation data

The model was trained on the dataset ambrosfitz/just_history_xl.

Training procedure

The model was trained using the following key strategies:

  1. Varied Task Formulation: Randomly choosing between completing the end or beginning of paragraphs
  2. Task Instruction: Including varied task instructions to prevent overfitting
  3. Full Dataset Usage: Utilizing the entire dataset to maximize historical knowledge
  4. Controlled Training: Using a single epoch with a low learning rate to prevent overfitting
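The first two strategies can be sketched as a preprocessing step. This is a hypothetical reconstruction, not the card author's actual code: the instruction templates, the sentence-level split point, and the 50/50 direction choice are all assumptions.

```python
import random

# Illustrative instruction templates (the actual training used its own varied set).
END_TEMPLATES = [
    "Complete the ending of this historical paragraph:",
    "Finish this paragraph:",
]
BEGIN_TEMPLATES = [
    "Write the beginning of this historical paragraph:",
    "Provide an opening for this paragraph ending:",
]

def make_example(paragraph: str, rng: random.Random) -> dict:
    """Turn one paragraph into a complete-the-ending or complete-the-beginning example."""
    sentences = paragraph.split(". ")
    split = max(1, len(sentences) // 2)
    head = ". ".join(sentences[:split])
    tail = ". ".join(sentences[split:])
    if rng.random() < 0.5:
        # Model sees the beginning and learns to produce the ending.
        return {"input": f"{rng.choice(END_TEMPLATES)} {head}", "target": tail}
    # Model sees the ending and learns to produce the beginning.
    return {"input": f"{rng.choice(BEGIN_TEMPLATES)} {tail}", "target": head}
```

Randomizing both the direction and the instruction text gives the model varied supervision from the same paragraphs, which is the stated guard against overfitting to a single prompt format.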

Hyperparameters:

  • Number of epochs: 1
  • Learning rate: 0.0003
  • Batch size: 4
  • Gradient Accumulation Steps: 4
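With a per-device batch size of 4 and 4 gradient accumulation steps, each optimizer update effectively sees 16 examples:

```python
batch_size = 4
grad_accum_steps = 4
# Gradients from 4 micro-batches of 4 are summed before each optimizer step.
effective_batch_size = batch_size * grad_accum_steps
print(effective_batch_size)  # 16
```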

Results

Test set results:

  • Eval loss: 0.2923
  • Eval runtime: 119.2 s (78.9 samples/s, 19.7 steps/s)
  • Epoch: 1.0

Model details

Safetensors checkpoint, 248M parameters, F32 tensors.