library_name: transformers
base_model: EleutherAI/pythia-14m
tags:
  - generated_from_trainer
model-index:
  - name: pythia-14m-finewebedu-sentences
    results: []

pythia-14m-finewebedu-sentences

  • Generates half-intelligible English sentences using a small GPT-like model.
  • Outputs one sentence at a time.

This model is a fine-tuned version of EleutherAI/pythia-14m on the agentlans/finewebedu-sentences dataset.

Model description

To generate 10 random sentences starting from an empty string on a CUDA device:

from transformers import pipeline, set_seed

generator = pipeline('text-generation', model='agentlans/pythia-14m-finewebedu-sentences', device='cuda')

set_seed(1234)
results = generator("", max_length=100, num_return_sequences=10, do_sample=True)

for x in results:
    print(x['generated_text'])

Output:

The main outcome is that the group is associated with other domains.
If you're planning to get a long-term answer, you can check the link "watch" and see that you can change your website.
They are very difficult to make it easy to understand how it works as a healthy.
In the most cases, the prevalence of DTP is reduced from 5-HT1.
It is a significant difference between the risk of injury, and there is no need to be a single complication.
The time of taking too high on the scale of the region is to begin, with a bit other type of view to the whole system.
The total cost of your daily distribution is $24. The overall number of children is 0.5 times is 50.
The more difficult time is to learn the basics of the work, but it is important to do the same job.
It is now on the other hand, however, in the middle of the 19th century, the country must follow the law of the country and alliances between the countries and communities.
This is why it is hard to do this.
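The call above hardcodes a CUDA device and an empty prompt. As a minimal sketch, assuming the model also accepts a non-empty prefix like any causal language model, the same call can be wrapped so it falls back to CPU and takes an optional prefix. The helper names `load_generator`, `generate_sentences`, and `main`, and the example prefix, are illustrative and not part of the model's API.

```python
from typing import Callable, List, Optional


def load_generator(device: Optional[str] = None):
    """Build the text-generation pipeline, using CPU when CUDA is unavailable."""
    # Imported lazily so generate_sentences() can be used without loading the model.
    import torch
    from transformers import pipeline

    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"
    return pipeline(
        "text-generation",
        model="agentlans/pythia-14m-finewebedu-sentences",
        device=device,
    )


def generate_sentences(generator: Callable, prefix: str = "", n: int = 5) -> List[str]:
    """Sample n continuations of `prefix` and return the stripped text."""
    results = generator(prefix, max_length=100, num_return_sequences=n, do_sample=True)
    # By default the pipeline returns the prefix together with the continuation.
    return [r["generated_text"].strip() for r in results]


def main() -> None:
    from transformers import set_seed

    set_seed(1234)
    # "The main reason" is an arbitrary example prefix.
    for sentence in generate_sentences(load_generator(), prefix="The main reason", n=5):
        print(sentence)
```

Calling `main()` downloads the model on first use. Output quality with a prefix has not been checked here, so expect the same half-intelligible style as in the samples above.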

Intended uses & limitations

  • For generating short lines of English text
  • Could be useful for
    • data augmentation
    • creative inspiration
    • entertainment
    • CAPTCHA
  • Can be further fine-tuned on other data, such as:
    • prompts
    • famous quotes
    • news headlines
    • blog post titles

Limitations include:

  • Not guaranteed to make sensible, coherent, or grammatically correct sentences
  • No regard for accuracy or truthfulness whatsoever
    • It's a bunch of words from a probability model, what do you expect?
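Because outputs are not guaranteed to be well formed, uses such as data augmentation may want a cheap filter on the generated text. The following is a rough sketch under assumed heuristics (word-count bounds, a capitalized first character, terminal punctuation, case-insensitive deduplication). The function `filter_sentences` and its thresholds are hypothetical, and it checks surface form only, not meaning or truthfulness.

```python
import re
from typing import Iterable, List


def filter_sentences(
    candidates: Iterable[str], min_words: int = 4, max_words: int = 40
) -> List[str]:
    """Keep candidates that look like a single well-formed sentence."""
    kept: List[str] = []
    seen = set()
    for text in candidates:
        sentence = " ".join(text.split())  # collapse runs of whitespace
        if not sentence:
            continue
        n_words = len(sentence.split())
        if not (min_words <= n_words <= max_words):
            continue  # too short or too long
        if not sentence[0].isupper():
            continue  # does not start with a capital letter
        if not re.search(r"[.!?][\"')\]]?$", sentence):
            continue  # no terminal punctuation
        key = sentence.lower()
        if key in seen:
            continue  # duplicate, ignoring case
        seen.add(key)
        kept.append(sentence)
    return kept


samples = [
    "This is why it is hard to do this.",
    "this one starts with a lowercase letter.",
    "Too short.",
    "This is why it is hard to do this.",
    "The sentence trails off without an end",
]
print(filter_sentences(samples))
```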

Training and evaluation data

Sentences from HuggingFaceFW/fineweb-edu, as packaged in the agentlans/finewebedu-sentences dataset.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15.0
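For anyone trying to reproduce a similar run, the values listed above could be expressed roughly as keyword arguments to `transformers.TrainingArguments`. This sketch assumes single-device training (so that the batch sizes map to the `per_device_*` arguments) and no warmup, since neither is stated above. The output directory and the helper names are placeholders.

```python
# Hyperparameters from the list above, keyed by TrainingArguments argument names.
HYPERPARAMETERS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,  # assumes a single device
    "per_device_eval_batch_size": 8,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 15.0,
}


def build_training_arguments(output_dir: str = "pythia-14m-sentences-out"):
    """Create TrainingArguments; `output_dir` is a placeholder path."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-5) -> float:
    """Learning rate under linear decay to zero, assuming no warmup steps."""
    return base_lr * max(0.0, 1.0 - step / total_steps)
```

Under these assumptions, `linear_lr` starts at the listed learning rate, reaches half of it at the midpoint of training, and reaches zero at the final step.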

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.2.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.19.1