File size: 6,094 Bytes
9486daf f8b3b43 039894b f78b66e 039894b f78b66e 36437f2 f78b66e bfeaaab 1ad202c bfeaaab 039894b 6e869ae 039894b 6e869ae 039894b 6e869ae 039894b 6e869ae 039894b 479957e 022edb1 479957e 1915b64 1974310 f4de87a 1915b64 039894b 36437f2 039894b 36437f2 039894b 6e869ae f78b66e f4de87a 1974310 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 |
---
license: cc-by-4.0
language:
- he
inference: false
---
# **DictaLM**: A Large Generative Language Model for Modern Hebrew
A large generative pretrained transformer (GPT) language model for Hebrew, released [here](https://arxiv.org/abs/2309.14568).
- This is an alpha version of the model, and there are many improvements to come.
- We are actively working on improving the model, so stay tuned.
This model was fine-tuned for instructions, here are a few examples of the different types of instructions the model was trained on:
- General questions:
```
ืื ืื ืืืช ืกืคืจ?
```
```
ืงืืืืชื ืืชื ืงื ืืืฆืืข. ืืื ืืืจื ืื ืืื ื ืืืคื ืืื?
```
- Simple tasks:
```
ืชืฆืืข ืืื ืจืขืืื ืืช ืืคืขืืืืช ืขื ืืืืื ืื ื 5:
```
- Information retrieval from a paragraph context:
```
ืืืกืืง ืืืื ื ืืื ืืืจื ืืืกืืจืชืืช ืืืขืชืืงื ืืงืืืฃ ืืืชืื. ืฉืืื ืื ืืืจืฉืช ืืื ืืื ืจื ืืืืคื ืืืกื ืืขืืืื ืืงืืืืช ืืืฉืจืื ืืืืงืืืืช ืจืืื ืืขืืื. ืฉืืืืช ืืกืืง ืืื ื ืืืคืฉืจืืช ืืืกืืื ืขืืืืืช ืืืงืืืืช ืืื ืืื ืืืื ืืื ืืขืืืช ืืฉืืืืช ืืืืืื ืืช ืืืืื. ืืืืชืื ืืืืืขืืื ืืืืื (ืืืืืฉื, ืื ืืืื ืืืืชืื ืืฉืื) ืืชืืื ืืืชืจ ืืกืืง ืืื ื ืืืืื ืฉืืคืจื ืคืืืช ื ืคืืข ืืืืื ืืืกืืง ืืฉืืื ืื (ืคืืืขืืช ืืงืืืคืช ืืคืจื ืืืืชืื ืืฉืื ืคืืืช ืืฉืืขืืชืืืช). ืืื ืื ืืืขืืฃ ืืกืืง ืืื ื ืืืืืจืื ืืื ืืืืคืืืจืคืื ืืืงืืืืช ืื ืฆืคืืคืืช ืืขืฆืื ืื ืืืคืฉืจืื ืืืฉื ื ืืื ืืืืื ืืื ืื. ืืฉืืื ืืืื ืืช ืืืคืฉืจืช ืื ืืืกืืง ืขืฆืื ืฉืื ืื ืืืืขืืื ืฉืื ืื, ืืืชืื ืืงืฆื ืืืฉืืช ืืคืจื ืืืืขื ืืื ืขืฅ.
ืขื ืืกืืก ืืคืกืงื ืืืืช, ืื ืืื ืืืชืจืื ืฉื ืืกืืง ืืื ื ืืืืื ืช ืงืฆื ืืืฉืืช ืืคืจื?
```
## Sample usage:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
tokenizer = AutoTokenizer.from_pretrained('dicta-il/dictalm-7b-instruct')
model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-7b-instruct', trust_remote_code=True).cuda()
model.eval()
with torch.inference_mode():
prompt = 'ืชืฆืืข ืืื ืจืขืืื ืืช ืืคืขืืืืช ืขื ืืืืื ืื ื 5:\n'
kwargs = dict(
inputs=tokenizer(prompt, return_tensors='pt').input_ids.to(model.device),
do_sample=True,
top_k=50,
top_p=0.95,
temperature=0.75,
max_length=100,
min_new_tokens=5
)
print(tokenizer.batch_decode(model.generate(**kwargs), skip_special_tokens=True))
```
There are many different parameters you can input into `kwargs` for different results (greedy, beamsearch, different sampling configurations, longer/shorter respones, etc.).
You can view the full list of parameters you can pass to the `generate` function [here](https://huggingface.co/docs/transformers/v4.33.0/en/main_classes/text_generation#transformers.GenerationMixin.generate).
### Alternative ways to initialize the model:
If you have multiple smaller GPUs, and the package `accelerate` is installed, you can initialize the model split across the devices:
```python
model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-7b-instruct', trust_remote_code=True, device_map='auto')
```
If you are running on linux and have the `bitsandbytes` package installed, you can initialize the model in 4/8 bit inference mode:
```python
model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-7b-instruct', trust_remote_code=True, load_in_8bit=True)
```
If you have [FlashAttention](https://github.com/Dao-AILab/flash-attention) installed in your environment, you can instruct the model to use the flash attention implementation (either V1 or V2, whichever is installed):
```python
model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-7b-instruct', trust_remote_code=True, use_flash_attention=True)
```
## Colab notebook demos
You can try the model on a free tier google colab using the following notebooks:
* [Streamlit based](https://colab.research.google.com/drive/1hn23eA4m7ISW2e40DsAB6sbLRok4RKqS?usp=sharing) - you will need first to log in https://ngrok.com/ and get an authtoken, then paste it in the notebook ([screenshot][screen-shot-streamlit]).
* [Gradio based](https://gist.github.com/Norod/11997c0c9a330d0eeb9a6d4791b9aa2f) - uses deep speed for faster inference, text streamer to get the results as they are being generated and the UI is a widget embedded in the notebook ([screenshot][screen-shot-gradio]).
## Citation
If you use DictaLM in your research, please cite ```DictaLM -- A Large Generative Language Model for Modern Hebrew```
**BibTeX:**
```bibtex
@misc{shmidman2023introducing,
title={Introducing DictaLM -- A Large Generative Language Model for Modern Hebrew},
author={Shaltiel Shmidman and Avi Shmidman and Amir David Nissan Cohen and Moshe Koppel},
year={2023},
eprint={2309.14568},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
## License
Shield: [![CC BY 4.0][cc-by-shield]][cc-by]
This work is licensed under a
[Creative Commons Attribution 4.0 International License][cc-by].
[![CC BY 4.0][cc-by-image]][cc-by]
[cc-by]: http://creativecommons.org/licenses/by/4.0/
[cc-by-image]: https://i.creativecommons.org/l/by/4.0/88x31.png
[cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg
[screen-shot-streamlit]: https://mitmachim.top/assets/uploads/files/1696309842384-36f732e3-d168-4f4c-8eaf-e78ec23e6bf6-image.png
[screen-shot-gradio]: https://scontent.ftlv2-1.fna.fbcdn.net/v/t39.30808-6/384119874_10160090112344007_6111524432595263230_n.jpg?_nc_cat=111&ccb=1-7&_nc_sid=5f2048&_nc_ohc=tWYAm8uz7T4AX9DNlRD&_nc_ht=scontent.ftlv2-1.fna&cb_e2o_trans=t&oh=00_AfBocAujdLaWJqYWrNGtgdz99Cdz8JEqO_ez70SXqlf_2Q&oe=654D476D |