metadata
license: mit
language: de
pipeline_tag: text-generation
widget:
- text: >-
In einer schockierenden Entdeckung fanden Wissenschaftler eine Herde
Einhörner, die in
example_title: Einhörner ...
- text: >-
Definiere folgende Wörter
Wort: Einhorn
Definition: Das Einhorn ist ein Fabelwesen von Pferde- oder Ziegengestalt
mit einem geraden Horn auf der Stirnmitte.
Wort: Regierungschef
Definition: Der Regierungschef ist der Leiter der Regierung eines Staates
(z. B. National- oder Gliedstaat).
Wort: Waffendrill
Definition:
example_title: Definiere ...
German GPT2-XL (1.5B)
- trained with BigScience's DeepSpeed-Megatron-LM code base
- word embeddings initialized with WECHSEL; all other weights taken from the English gpt2-xl
- ~ 3 days on 16xA100 GPUs (~ 80 TFLOPs / GPU)
- stopped after 100k steps
- 26.2B tokens
- less than a single epoch on oscar_unshuffled_deduplicated_de (excluding validation set; original model was trained for 75 epochs on less data)
- bf16
- zero stage 0
- tp/pp = 1
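The figures in the list above can be combined into a few rough derived quantities. The following is a back-of-the-envelope sketch, not an official accounting: the sequence length of 1024 is an assumption (GPT-2's usual context size, not stated in this card), the `6 * N * D` estimate is a common rule of thumb that ignores attention cost and any activation recomputation, and the inputs themselves ("~ 3 days", "~ 80 TFLOPs") are approximate.

```python
# Values taken from the training notes above (all approximate).
TOKENS = 26.2e9          # total training tokens
STEPS = 100_000          # optimizer steps
GPUS = 16                # A100 GPUs
DAYS = 3                 # wall-clock training time
TFLOPS_PER_GPU = 80      # reported throughput per GPU
PARAMS = 1.5e9           # model size

# Assumption: GPT-2's usual context length; not stated in this card.
SEQ_LEN = 1024

tokens_per_step = TOKENS / STEPS
sequences_per_step = tokens_per_step / SEQ_LEN
gpu_hours = GPUS * DAYS * 24

# Compute delivered by the hardware at the reported throughput.
hardware_flops = TFLOPS_PER_GPU * 1e12 * GPUS * DAYS * 24 * 3600

# Rule-of-thumb training cost: 6 * parameters * tokens.
model_flops = 6 * PARAMS * TOKENS

print(f"tokens per step:    {tokens_per_step:,.0f}")
print(f"sequences per step: {sequences_per_step:.1f}")
print(f"GPU hours:          {gpu_hours}")
print(f"hardware FLOPs:     {hardware_flops:.3e}")
print(f"6*N*D FLOPs:        {model_flops:.3e}")
```

Under these assumptions the two FLOP estimates land within the same order of magnitude, which is about as much agreement as such rounded inputs can support.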
How to use
You can use this model directly with a pipeline for text generation. Because generation samples tokens at random, we set a seed for reproducibility:
>>> from transformers import pipeline, set_seed
>>> generator = pipeline('text-generation', model='malteos/gpt2-xl-wechsel-german')
>>> set_seed(42)
>>> generator("Hello, I'm a language model,", max_length=30, num_return_sequences=5)
[{'generated_text': "Hello, I'm a language model, a language for thinking, a language for expressing thoughts."},
{'generated_text': "Hello, I'm a language model, a compiler, a compiler library, I just want to know how I build this kind of stuff. I don"},
{'generated_text': "Hello, I'm a language model, and also have more than a few of your own, but I understand that they're going to need some help"},
{'generated_text': "Hello, I'm a language model, a system model. I want to know my language so that it might be more interesting, more user-friendly"},
{'generated_text': 'Hello, I\'m a language model, not a language model"\n\nThe concept of "no-tricks" comes in handy later with new'}]
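Since the model is trained on German text, German prompts are the more natural input. As a minimal sketch, the few-shot "Definiere ..." prompt from this card's widget examples can be assembled programmatically. The helper name `build_definition_prompt` is hypothetical, and the line-break layout is an assumption about how the widget text is meant to be formatted. The generation call is left commented out because it downloads the full 1.5B-parameter model.

```python
def build_definition_prompt(examples, query_word):
    """Build a few-shot prompt in the 'Wort: ... / Definition: ...' layout.

    examples: list of (word, definition) pairs shown to the model.
    query_word: the word whose definition the model should complete.
    """
    lines = ["Definiere folgende Wörter"]
    for word, definition in examples:
        lines.append(f"Wort: {word}")
        lines.append(f"Definition: {definition}")
    # Leave the final definition open so the model continues from here.
    lines.append(f"Wort: {query_word}")
    lines.append("Definition:")
    return "\n".join(lines)


# Example pairs copied from the widget text of this card.
examples = [
    ("Einhorn",
     "Das Einhorn ist ein Fabelwesen von Pferde- oder Ziegengestalt "
     "mit einem geraden Horn auf der Stirnmitte."),
    ("Regierungschef",
     "Der Regierungschef ist der Leiter der Regierung eines Staates "
     "(z. B. National- oder Gliedstaat)."),
]
prompt = build_definition_prompt(examples, "Waffendrill")
print(prompt)

# To generate a completion (downloads the model weights):
# from transformers import pipeline
# generator = pipeline('text-generation', model='malteos/gpt2-xl-wechsel-german')
# print(generator(prompt, max_new_tokens=40)[0]['generated_text'])
```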
Here is how to use this model to get the features of a given text in PyTorch:
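from transformers import GPT2Tokenizer, GPT2Model
# Load the tokenizer and the base model (no language-modeling head)
tokenizer = GPT2Tokenizer.from_pretrained('malteos/gpt2-xl-wechsel-german')
model = GPT2Model.from_pretrained('malteos/gpt2-xl-wechsel-german')
text = "Replace me by any text you'd like."
# Tokenize and return PyTorch tensors
encoded_input = tokenizer(text, return_tensors='pt')
# output.last_hidden_state holds one feature vector per input token
output = model(**encoded_input)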
Evaluation
Model (size) | PPL |
---|---|
gpt2-xl-wechsel-german (1.5B) | 14.5 |
gpt2-wechsel-german-ds-meg (117M) | 26.4 |
gpt2-wechsel-german (117M) | 26.8 |
gpt2 (retrained from scratch) (117M) | 27.63 |
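PPL in the table stands for perplexity, where lower is better. This card does not state the evaluation dataset or settings, so the sketch below only illustrates the standard definition: the exponential of the mean negative log-probability per token. The function name `perplexity` and the input values are hypothetical and are not the evaluation code or data behind the numbers above.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical input: a model that assigns probability 1/4 to every token.
uniform = [math.log(0.25)] * 8
print(perplexity(uniform))  # approximately 4
```

Read this way, a perplexity of 14.5 means the model is, on average, about as uncertain as if it were choosing uniformly among 14.5 tokens at each position.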
Other German language models
- https://huggingface.co/malteos/bloom-1b5-clp-german
- https://huggingface.co/malteos/bloom-6b4-clp-german
- https://huggingface.co/malteos/bloom-6b4-clp-german-oasst-v0.1
License
MIT