metadata
tags:
- generated_from_trainer
model-index:
- name: gpt2-arxiv
results: []
gpt2-arxiv
A GPT-2 powered predictive keyboard trained on ~1.6M manuscript abstracts from arXiv. The training data comes from the Kaggle arXiv dataset: https://www.kaggle.com/datasets/Cornell-University/arxiv
from transformers import pipeline
from transformers import GPT2TokenizerFast
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
llm = pipeline('text-generation', model='pearsonkyle/gpt2-arxiv', tokenizer=tokenizer)
texts = llm("Directly imaged exoplanets probe",
            max_length=50, do_sample=True, num_return_sequences=5,
            penalty_alpha=0.65, top_k=40, repetition_penalty=1.25,
            temperature=0.95)
for i in range(5):
    print(texts[i]['generated_text']+'\n')
- The reflectance of Earth's vegetation suggests
that large, deciduous forest fires are composed of mostly dry, unprocessed material that is distributed in a nearly patchy fashion. The distributions of these fires are correlated with temperature, and also with vegetation...
- Directly imaged exoplanets probe
the atmospheres of giant planets. The detection of such planets requires high-quality imaging with high contrast and angular resolution, as well as
- We can remotely sense an atmosphere by observing its reflected, transmitted, or emitted light in varying geometries. This light will contain information on
the planetary conditions including atmospheric temperature and cloud properties, which is essential for understanding how the planet interacts with the atmosphere and how it affects the climate. The primary science objective of this paper is to develop a methodology that can be applied to any kind of observation and measurement data, and to provide a framework that enables the detection and characterization of the atmospheres of exoplanets
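To give a feel for what the `top_k` and `temperature` arguments in the snippet above control, here is a minimal pure-Python sketch of temperature scaling followed by top-k sampling. It is an illustration on made-up scores, not the `transformers` implementation; the function name `sample_top_k` and the toy logits are hypothetical, and `repetition_penalty` and `penalty_alpha` are not modeled.

```python
import math
import random

def sample_top_k(logits, top_k=40, temperature=0.95, rng=random):
    """Sample one token index from raw logits using temperature and top-k filtering."""
    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    scaled = [x / temperature for x in logits]
    # Keep only the top_k highest-scoring candidates.
    k = min(top_k, len(scaled))
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:k]
    # Softmax weights over the survivors (subtract the max for numerical stability).
    m = max(scaled[i] for i in top)
    weights = [math.exp(scaled[i] - m) for i in top]
    # Draw one index in proportion to its weight.
    return rng.choices(top, weights=weights, k=1)[0]

# Toy vocabulary of five "tokens" with made-up scores (hypothetical values).
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
rng = random.Random(0)
draws = [sample_top_k(toy_logits, top_k=2, temperature=0.95, rng=rng) for _ in range(1000)]
```

With `top_k=2`, only the two highest-scoring indices can ever be drawn, and the higher-scoring one is drawn more often.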
Model description
GPT-2: 12-layer, 768-hidden, 12-heads, 117M parameters
Intended uses & limitations
Coming soon...
- Predictive Keyboard using text generation
- Realtime reference recommendations using nearest neighbors of embeddings
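As a rough idea of what the nearest-neighbor recommendation could look like, the sketch below ranks stored vectors by cosine similarity to a query vector. This card does not say how embeddings would be produced or searched, so everything here is an assumption: the function names, the three-dimensional vectors, and the paper labels are all hypothetical placeholders.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query, corpus, k=3):
    """Return the k (label, score) pairs most similar to the query vector."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy embeddings; real ones would come from a model.
corpus = {
    "paper_a": [1.0, 0.0, 0.0],
    "paper_b": [0.9, 0.1, 0.0],
    "paper_c": [0.0, 1.0, 0.0],
}
query = [0.8, 0.1, 0.0]
top = nearest_neighbors(query, corpus, k=2)
```

A brute-force scan like this is fine for small collections; at the scale of ~1.6M abstracts an approximate nearest-neighbor index would likely be needed.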
Be careful when generating long passages or when changing the sampling mode of the language model. It can sometimes produce statements that are not truthful, e.g.,
- The surface of Mars is composed of a thin layer of water ice, that was discovered by the Cassini spacecraft after its impact on the Earth's surface.
Training procedure
Training took ~49 hours on an RTX 3090 for 1.25M iterations.
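The figures above imply a rough throughput, worked out in the short calculation below. It combines them with the `train_batch_size` and `num_epochs` values from the hyperparameter list, and it assumes that one iteration processes exactly one batch (no gradient accumulation), which this card does not state.

```python
# Figures quoted in this card.
hours = 49
iterations = 1_250_000
train_batch_size = 16   # from the hyperparameter list
num_epochs = 10         # from the hyperparameter list

iterations_per_second = iterations / (hours * 3600)

# Assumption: one iteration processes one batch (no gradient accumulation).
sequences_seen = iterations * train_batch_size
sequences_per_epoch = sequences_seen / num_epochs

print(f"{iterations_per_second:.1f} iterations/s")
print(f"{sequences_per_epoch:,.0f} sequences per epoch")
```

Under these assumptions the run averages about 7 iterations per second and about 2M training sequences per epoch, the same order of magnitude as the ~1.6M abstracts.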
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
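To illustrate what `lr_scheduler_type: linear` means for the learning rate over the run, here is a minimal sketch of a linear schedule that decays from the base rate to zero. The function `linear_lr` is a hypothetical helper, not a library call. The warmup length is not listed in this card, so it is assumed to be zero, and the total step count is taken from the 1.25M iterations quoted under "Training procedure".

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

total_steps = 1_250_000  # assumption: equals the iteration count quoted in this card
for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps))
```

With no warmup, the rate starts at the listed 5e-05, is roughly half that at the midpoint, and reaches zero at the final step.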
Framework versions
- Transformers 4.25.1
- Pytorch 1.13.1
- Tokenizers 0.13.2