 ---
+language: de
 license: mit
+inference: false
+tags:
+- gptj
+- title generation
+- headline generation
+- teaser generation
+- news
 ---
+# Model Card for GPT-J-Title-Teaser-1k
+<!-- Provide a quick summary of what the model is/does. -->
+gptj-title-teaser-1k
+Version 1.0 / 22 December 2022
+A proof of concept for multitask fine-tuning of [GPT-J-6B-8bit](https://huggingface.co/hivemind/gpt-j-6B-8bit) to generate titles and teasers for German news.
+# Model Details
+## Model Description
+- **Developed by:** snipaid
+- **Model type:** gptj
+- **Language(s) (NLP):** de
+- **License:** MIT
+- **Finetuned from model:** [GPT-J-6B-8bit](https://huggingface.co/hivemind/gpt-j-6B-8bit)
+# Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+This model is not intended for use! It is a preliminary version of gptj-title-teaser-10k, built to validate the multitask fine-tuning approach.
+For actual use, please refer to [gptj-title-teaser-10k](https://huggingface.co/snipaid/gptj-title-teaser-10k).
+# Training Details
+## Training Data
+<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+The model was fine-tuned on a collection of 1,000 German-language news items scraped from various online news outlets.
+For each news item, the dataset contains the title, the teaser and the full text.
+```
+[
+  {
+    "title": ...,
+    "teaser": ...,
+    "fulltext": ...
+  },
+]
+```
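A dataset in this shape could be loaded and checked roughly as follows. This is a minimal sketch, not the authors' tooling: the helper name `load_news_items`, the constant `REQUIRED_KEYS` and the sample record are hypothetical, and the data is assumed to be a JSON array of objects with exactly the three keys shown above.

```python
import json

# Assumption: every news item carries these three fields, as in the schema above.
REQUIRED_KEYS = {"title", "teaser", "fulltext"}


def load_news_items(raw_json: str) -> list:
    """Parse a JSON array of news items and verify each has the expected fields."""
    items = json.loads(raw_json)
    for index, item in enumerate(items):
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            raise ValueError(f"item {index} is missing fields: {sorted(missing)}")
    return items


# Made-up sample record, for illustration only.
raw = '[{"title": "Beispieltitel", "teaser": "Beispielteaser.", "fulltext": "Beispieltext."}]'
items = load_news_items(raw)
print(len(items))
```

Rejecting incomplete items early keeps later preprocessing from silently producing inputs with an empty title or teaser.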
+## Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+The model was fine-tuned with a causal language modeling (CLM) objective, training on both tasks (title and teaser generation) at once.
+### Preprocessing
+For each news item, two inputs were built by concatenating the full text with the title and with the teaser, respectively:
+```
+f"[Text]: {item.fulltext} \n [Title]: {item.title}"
+f"[Text]: {item.fulltext} \n [Teaser]: {item.teaser}"
+```
+This results in one input per task for each news item.
+*Note: The inserted prompt "[Text]:" marks the beginning of the news item's fulltext.
+In the same manner "[Title]:" prompts the news item's title and "[Teaser]:" the news item's teaser.*
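The concatenation described above could be applied to a whole dataset roughly as follows. This is a minimal sketch, not the authors' training code: the function name `build_training_inputs` and the sample item are hypothetical, and items are assumed to be dictionaries with the `title`, `teaser` and `fulltext` keys from the Training Data section (dictionary access replaces the attribute access of the template).

```python
from typing import Dict, List


def build_training_inputs(items: List[Dict[str, str]]) -> List[str]:
    """Return two prompt-formatted strings per news item, one per task."""
    inputs = []
    for item in items:
        # Title task: full text, then the title prompt and its target.
        inputs.append(f"[Text]: {item['fulltext']} \n [Title]: {item['title']}")
        # Teaser task: same full text, then the teaser prompt and its target.
        inputs.append(f"[Text]: {item['fulltext']} \n [Teaser]: {item['teaser']}")
    return inputs


# Made-up sample item, for illustration only.
sample = [{"title": "Beispieltitel", "teaser": "Beispielteaser.", "fulltext": "Beispieltext."}]
examples = build_training_inputs(sample)
print(len(examples))
```

Under this format, a prompt for generation would presumably end right after `[Title]:` or `[Teaser]:`, leaving the model to complete the rest; that usage is an assumption and is not stated in this card.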
+# Evaluation
+1,000 German news articles proved sufficient to validate the approach.
+Evaluation showed that the model improved over the GPT-J baseline in:
+- German language capabilities (significantly)
+- title generation (significantly)
+- teaser generation (slightly)
+The evaluation also suggested that more data would yield further improvement.
+For the model trained with the same approach on 10x the amount of data, please refer to [gptj-title-teaser-10k](https://huggingface.co/snipaid/gptj-title-teaser-10k).
+# Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions were estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** A100 SXM4
+- **Hours used:** 2h 42min
+- **Cloud Provider:** Vast.ai
+- **Compute Region:** Unknown
+- **Carbon Emitted:** ~0.47 kg CO2e
+# Glossary
+**News Item**, aka news article or news story. A particular piece of news, usually from a journalistic source.
+**Snippet**. A small section of text that is related to a news item.
+**Title**, aka headline. A few words that reflect the essence of the news story.
+**Teaser**, aka lede. A few sentences that spark curiosity about the "best of the rest" of the news story.