Florents-Tselai committed
Commit a599fc6
1 Parent(s): f60a82c
Update README.md

README.md CHANGED
@@ -13,21 +13,14 @@ base_model:

# Meltemi 7B Instruct v1.5 gguf

-This is [Meltemi 7B Instruct v1.5](https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1.5) published in `gguf`, [llama.cpp](https://github.com/ggerganov/llama.cpp)-compatible format.
-
-Meltemi is the first Greek Large Language Model (LLM), trained by the [Institute for Language and Speech Processing](https://www.athenarc.gr/en/ilsp) at [Athena Research & Innovation Center](https://www.athenarc.gr/en).
-Meltemi is built on top of [Mistral-7B-Instruct](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1), extending its capabilities for Greek through continual pretraining on a large corpus of high-quality and locally relevant Greek texts.
-
+This is [Meltemi 7B Instruct v1.5](https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1.5), the first Greek Large Language Model (LLM), published in the `gguf`, [llama.cpp](https://github.com/ggerganov/llama.cpp)-compatible format.

# Model Information

- Vocabulary extension of the Mistral 7B tokenizer with Greek tokens for lower costs and faster inference (**1.52** vs. 6.80 tokens/word for Greek)
- 8192 context length
-- Fine-tuning has been done with the [Odds Ratio Preference Optimization (ORPO)](https://arxiv.org/abs/2403.07691) algorithm using 97k preference data:
-  * 89,730 Greek preference data, mostly translated versions of high-quality datasets on Hugging Face
-  * 7,342 English preference data
-- Our alignment procedure is based on the [TRL - Transformer Reinforcement Learning](https://huggingface.co/docs/trl/index) library and partially on the [Hugging Face finetuning recipes](https://github.com/huggingface/alignment-handbook)

+For more details, please refer to the original model card: [Meltemi 7B Instruct v1.5](https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1.5).

# Instruction format

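A side note on the alignment recipe removed in this commit: the deleted bullets describe ORPO fine-tuning with the TRL library on prompt/chosen/rejected preference data. Below is a minimal, hypothetical sketch of what such a run looks like with TRL's `ORPOTrainer`; the dataset path, hyperparameters, and argument names are illustrative assumptions, not Meltemi's actual training setup.

```python
# Minimal ORPO sketch with TRL (illustrative; not Meltemi's actual recipe).
# Assumes: pip install trl transformers datasets, and a JSONL preference
# dataset with "prompt", "chosen", "rejected" fields (hypothetical path).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base = "mistralai/Mistral-7B-Instruct-v0.1"  # Meltemi's base model
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

train_dataset = load_dataset("json", data_files="greek_preferences.jsonl", split="train")

args = ORPOConfig(
    output_dir="meltemi-orpo",
    beta=0.1,                  # weight of the odds-ratio term in the ORPO loss
    max_length=8192,           # matches the model's context length
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
)

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # named `tokenizer=` in older TRL releases
)
trainer.train()
```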
@@ -52,4 +45,3 @@ llama-server -m ./Meltemi-7B-Instruct-v1.5-F16.gguf --port 8080
```


-For more details, please refer to the original model: https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1.5
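As a usage note on the `llama-server` command shown in the hunk header above: recent llama.cpp builds expose an OpenAI-compatible chat endpoint, so once the server is running on port 8080 the model can be queried as sketched below. The endpoint path follows llama.cpp's server documentation; the prompt and sampling values are just examples.

```python
# Query the llama-server started with:
#   llama-server -m ./Meltemi-7B-Instruct-v1.5-F16.gguf --port 8080
# Recent llama.cpp builds expose an OpenAI-compatible /v1/chat/completions.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "Πες μου για το Αιγαίο."},  # "Tell me about the Aegean."
        ],
        "temperature": 0.7,
        "max_tokens": 256,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```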