- ilsp/Meltemi-7B-Instruct-v1.5
---

# Meltemi llamafile & gguf

This repo contains `llamafile` and `gguf` file format models for [Meltemi 7B Instruct v1.5](https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1.5), the first Greek Large Language Model (LLM).

`llamafile` is a file format introduced by Mozilla Ocho on Nov 20th, 2023, and it collapses the complexity of an LLM into a single executable file.
This gives you the easiest, fastest way to use Meltemi on Linux, macOS, Windows, FreeBSD, OpenBSD, and NetBSD systems you control, on both AMD64 and ARM64.

It's as simple as this:

```shell
wget https://huggingface.co/Florents-Tselai/Meltemi-llamafile/resolve/main/Meltemi-7B-Instruct-v1.5-Q8_0.llamafile
chmod +x Meltemi-7B-Instruct-v1.5-Q8_0.llamafile
```

```shell
./Meltemi-7B-Instruct-v1.5-Q8_0.llamafile
```

This will open a tab with a chatbot and completion interface in your browser.
For additional help on how it may be used, pass the `--help` flag.
The server also has an OpenAI API-compatible completions endpoint.
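A minimal sketch of a request against that endpoint; it assumes the server started above is listening on its default port 8080, and it tolerates the case where it is not running:

```shell
# Sketch: query the llamafile server's OpenAI API-compatible
# chat completions endpoint (default port 8080 is an assumption here).
BODY='{"model": "Meltemi-7B-Instruct-v1.5", "messages": [{"role": "user", "content": "Ποιό είναι το νόημα της ζωής;"}]}'
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "$BODY" || true   # tolerate a server that is not running yet
```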

An advanced CLI mode that's useful for shell scripting is also provided.
You can use it by passing the `--cli` flag. For additional help on how it may be used, pass the `--help` flag.
+
```shell
|
41 |
+
./Meltemi-7B-Instruct-v1.5-Q8_0.llamafile -p 'Ποιό είναι το νόημα της ζωής;'
|
42 |
+
```
|
43 |
+
|
44 |
+
To see all available options
|
45 |
+
|
46 |
+
```shell
|
47 |
+
./Meltemi-7B-Instruct-v1.5-Q8_0.llamafile --help
|
48 |
+
```
|
49 |

## gguf

`gguf` format files are also available if you're working with [llama.cpp](https://github.com/ggerganov/llama.cpp).
llama.cpp offers quite a lot of options, so refer to its documentation for details.

### Basic Usage

```shell
llama-cli -m ./Meltemi-7B-Instruct-v1.5-F16.gguf -p "Ποιό είναι το νόημα της ζωής;" -n 128
```

### Conversation Mode

```shell
llama-cli -m ./Meltemi-7B-Instruct-v1.5-F16.gguf --conv
```

### Web Server

```shell
llama-server -m ./Meltemi-7B-Instruct-v1.5-F16.gguf --port 8080
```
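Once the server above is up, you can query it over HTTP; this sketch uses llama.cpp's native `/completion` endpoint on the port chosen above, mirroring the CLI prompt example, and tolerates the server not running:

```shell
# Sketch: request 128 tokens from llama-server's /completion endpoint;
# assumes the server is up on port 8080 as started above.
REQ='{"prompt": "Ποιό είναι το νόημα της ζωής;", "n_predict": 128}'
curl -s http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d "$REQ" || true   # tolerate a server that is not running yet
```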

# Model Information

- Vocabulary extension of the Mistral 7B tokenizer with Greek tokens for lower costs and faster inference (**1.52** vs. 6.80 tokens/word for Greek)
- 8192 context length
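As a quick sanity check on the figures above, the vocabulary extension works out to roughly 4.5 times fewer tokens for the same Greek text:

```shell
# 6.80 tokens/word (base Mistral tokenizer) vs 1.52 tokens/word (Meltemi)
python3 -c 'print(round(6.80 / 1.52, 1))'   # prints 4.5
```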

For more details, please refer to the original model card: [Meltemi 7B Instruct v1.5](https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1.5).