---
license: apache-2.0
language:
- de
pipeline_tag: text-generation
tags:
- german
- deutsch
- simplification
- vereinfachung
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

We fine-tuned [jphme/em_german_leo_mistral](https://huggingface.co/jphme/em_german_leo_mistral) on a set of approximately 2,000 newspaper articles that were simplified by the Austrian Press Agency.
Our aim is a model that can simplify German-language text.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** Members of the [Public Interest AI research group](https://publicinterest.ai/), [HIIG Berlin](https://www.hiig.de/)
- **Model type:** simplification model, text generation
- **Language(s) (NLP):** German
- **License:** Apache 2.0
- **Finetuned from model:** jphme/em_german_leo_mistral

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/fhewett/simba
<!-- - **Paper [optional]:** [More Information Needed] -->
- **Project website:** https://publicinterest.ai/tool/simba

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

This model works best for simplifying German-language newspaper articles (news items, not commentaries or editorials). It may work for other types of texts.

### Downstream Use

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
We fine-tuned the model using only newspaper articles. We have not yet performed extensive out-of-domain testing, but we believe the model's capabilities could be improved by fine-tuning on more diverse data. Contact us if you have a dataset that you think could work (parallel texts: standard German paired with simplified German).

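
As an illustration of what such a parallel dataset could look like, the sketch below reads JSON Lines records in which each record pairs a standard German text with its simplified version. The field names `standard` and `simplified` and the helper `load_parallel_pairs` are hypothetical names chosen for this example, not a prescribed format.

```python
import json
from typing import Iterable, List, Tuple


def load_parallel_pairs(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse JSON Lines records into (standard, simplified) text pairs.

    Records with a missing or empty field are skipped.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        # Hypothetical field names; adapt them to the dataset at hand.
        standard = (record.get("standard") or "").strip()
        simplified = (record.get("simplified") or "").strip()
        if standard and simplified:
            pairs.append((standard, simplified))
    return pairs


# Two made-up records; the second has no simplified text and is skipped.
example_lines = [
    '{"standard": "Die Regierung beschloss am Montag ein neues Gesetz.", '
    '"simplified": "Die Regierung hat ein neues Gesetz gemacht."}',
    '{"standard": "Nur der Originaltext ist vorhanden."}',
]
pairs = load_parallel_pairs(example_lines)
print(len(pairs))
```

Keeping one aligned pair per record makes it easy to spot incomplete entries before any fine-tuning run.
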
<!-- ### Out-of-Scope Use -->

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

As with most text generation models, this model sometimes produces incorrect information.

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Please manually check that the output text is consistent with the input text, as the model may introduce factual inconsistencies.

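
Part of this manual check can be supported by simple heuristics. The sketch below flags numbers that appear in the output but not in the input, which is one common kind of factual inconsistency. The function name `unsupported_numbers` and the regular expression are assumptions made for illustration; the heuristic does not replace reading both texts.

```python
import re
from typing import List

# Matches integers and decimals with "." or "," separators, e.g. 2000, 3,5, 1.200
NUMBER_PATTERN = re.compile(r"\d+(?:[.,]\d+)*")


def unsupported_numbers(source: str, output: str) -> List[str]:
    """Return numbers found in `output` that never occur in `source`."""
    source_numbers = set(NUMBER_PATTERN.findall(source))
    flagged = []
    for number in NUMBER_PATTERN.findall(output):
        if number not in source_numbers and number not in flagged:
            flagged.append(number)
    return flagged


# Made-up example: the output swaps the digits of the percentage.
source_text = "Im Jahr 2023 stiegen die Preise um 7,8 Prozent."
output_text = "2023 sind die Preise um 8,7 Prozent gestiegen."
print(unsupported_numbers(source_text, output_text))
```

A flagged number is only a hint: a simplification may legitimately round a figure or write it out as a word, and errors in names or causal claims are not covered by this check at all.
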
## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

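
A minimal sketch of how loading and prompting the model might look with the Hugging Face `transformers` library (PyTorch is also required). Several parts are assumptions: `MODEL_ID` is a placeholder that must be replaced with the model's actual Hub ID, the `USER:`/`ASSISTANT:` prompt layout is taken to be the one documented for the base model, and the German instruction wording is hypothetical and may differ from the one used during fine-tuning.

```python
from typing import Optional

# Placeholder: replace with the actual Hugging Face model ID of this model.
MODEL_ID = "your-org/your-model-id"

# Assumption: same prompt layout as the base model (system line, USER:, ASSISTANT:).
SYSTEM_PROMPT = "Du bist ein hilfreicher Assistent."


def build_prompt(text: str, instruction: Optional[str] = None) -> str:
    """Wrap the text to simplify in a USER/ASSISTANT prompt (assumed format)."""
    if instruction is None:
        # Hypothetical instruction wording, not confirmed by the model card.
        instruction = "Vereinfache den folgenden Text:"
    return f"{SYSTEM_PROMPT} USER: {instruction} {text.strip()} ASSISTANT:"


def simplify(text: str, model_id: str = MODEL_ID, max_new_tokens: int = 256) -> str:
    """Generate a simplified version of `text` with the fine-tuned model."""
    # Imported lazily so that build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


print(build_prompt("Die Inflation ist im Vergleich zum Vorjahr gestiegen."))
```

Calling `simplify("...")` with a real model ID downloads the weights on first use and returns the generated text. Greedy decoding (`do_sample=False`) is chosen here only to make outputs reproducible.
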
## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

A sample of the data used to train our model can be found [here](https://github.com/fhewett/apa-rst/tree/main/original_texts).

#### Training Hyperparameters

- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

<!-- #### Speeds, Sizes, Times [optional] -->

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

#### Summary

So far, we have manually checked the model's performance on a small sample of texts. While it seems to produce good summaries of all texts, it only seems to simplify newspaper articles (i.e. texts similar to our training data). We have not yet carried out any large-scale, metric-based evaluation.

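
As one illustration of what a simple metric-based check could look like, the sketch below compares average sentence length and average word length between a source text and its simplified version. These surface statistics and the helper `surface_stats` are assumptions chosen for this example, not an evaluation protocol described in this card, and they say nothing about factual correctness.

```python
import re
from typing import Dict


def surface_stats(text: str) -> Dict[str, float]:
    """Average sentence length (in words) and average word length (in characters)."""
    # Naive sentence split after ., ! or ? followed by whitespace; abbreviations are not handled.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"\w+", text)
    if not sentences or not words:
        return {"words_per_sentence": 0.0, "chars_per_word": 0.0}
    return {
        "words_per_sentence": len(words) / len(sentences),
        "chars_per_word": sum(len(w) for w in words) / len(words),
    }


# Made-up example pair, not taken from the training data.
source_text = (
    "Die Bundesregierung verabschiedete gestern umfangreiche Reformmaßnahmen, "
    "die Steuerzahler entlasten sollen."
)
output_text = "Die Regierung hat neue Regeln gemacht. Die Menschen sollen weniger Steuern zahlen."

source_stats = surface_stats(source_text)
output_stats = surface_stats(output_text)
print(source_stats["words_per_sentence"], output_stats["words_per_sentence"])
```

Shorter sentences and shorter words in the output are consistent with simplification, but a text can score well on both while still being a summary rather than a simplification, so such numbers would complement rather than replace manual inspection.
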
<!-- ## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]-->

## Model Card Contact

simba -at- hiig.de