behbudiy/Mistral-7B-Instruct-Uz


azimjon committed on Sep 16

Commit cfa72b8 • 1 Parent(s): 8318dde

Update README.md

Files changed (1)

README.md +57 -0

@@ -101,6 +101,63 @@ chatbot = pipeline("text-generation", model="behbudiy/Mistral-7B-Instruct-Uz")
 chatbot(messages)
 ```
+## Information on Evaluation Method
+To evaluate on the translation task, we used the FLORES+ Uz-En / En-Uz datasets, merging the dev and test sets to create a larger evaluation set for each of the Uz-En and En-Uz subsets.
+We used the following prompt for one-shot Uz-En evaluation of both the base model and the Uzbek-optimized model (for En-Uz evaluation, we swapped the words "English" and "Uzbek").
+```python
+  prompt = f'''You are a professional Uzbek-English translator. Your task is to accurately translate the given Uzbek text into English.
+  Instructions:
+  1. Translate the text from Uzbek to English.
+  2. Maintain the original meaning and tone.
+  3. Use appropriate English grammar and vocabulary.
+  4. If you encounter an ambiguous or unfamiliar word, provide the most likely translation based on context.
+  5. Output only the English translation, without any additional comments.
+  Example:
+  Uzbek: "Bugun ob-havo juda yaxshi, quyosh charaqlab turibdi."
+  English: "The weather is very nice today, the sun is shining brightly."
+  Now, please translate the following Uzbek text into English:
+  "{sentence}"
+    '''
+```
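As a rough illustration of the setup described above, the sketch below concatenates a dev and a test split and fills a prompt template for each source sentence. The helper names (`merge_splits`, `build_prompts`), the dictionary keys (`uz`, `en`), the sample rows, and the shortened `TEMPLATE` are assumptions made for this example; they are not taken from the authors' evaluation code, and the real run would use the full prompt shown above.

```python
from typing import Dict, List

# Shortened stand-in for the full translation prompt (assumption for brevity).
TEMPLATE = 'Translate the following Uzbek text into English:\n"{sentence}"'


def merge_splits(dev: List[Dict[str, str]], test: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Concatenate the dev and test splits into one larger evaluation set."""
    return list(dev) + list(test)


def build_prompts(pairs: List[Dict[str, str]], template: str, source_key: str) -> List[str]:
    """Fill the template with the source-language sentence of every pair."""
    return [template.format(sentence=pair[source_key]) for pair in pairs]


# Hypothetical sample rows; the real data would come from FLORES+.
dev = [{"uz": "Salom", "en": "Hello"}]
test = [
    {"uz": "Rahmat", "en": "Thank you"},
    {
        "uz": "Bugun ob-havo juda yaxshi, quyosh charaqlab turibdi.",
        "en": "The weather is very nice today, the sun is shining brightly.",
    },
]

merged = merge_splits(dev, test)
prompts = build_prompts(merged, TEMPLATE, source_key="uz")
```

For the En-Uz direction, the same helpers could be reused with an English-to-Uzbek template and `source_key="en"`.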
+To assess the model's ability in Uzbek sentiment analysis, we used the **risqaliyevds/uzbek-sentiment-analysis** dataset, for which we created binary labels (0: Negative, 1: Positive) using the GPT-4o API (see the **behbudiy/uzbek-sentiment-analysis** dataset).
+We used the following prompt for the evaluation:
+```python
+prompt = f'''Given the following text, determine the sentiment as either 'Positive' or 'Negative.' Respond with only the word 'Positive' or 'Negative' without any additional text or explanation.
+Text: {text}"
+'''
+```
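Because the prompt asks for a single word, the model's reply still has to be mapped back to the binary labels (0: Negative, 1: Positive) before scoring. The following is a minimal sketch of one way that could be done; `parse_sentiment` and `accuracy` are hypothetical helpers, and counting unparseable replies as wrong is an assumption rather than the authors' documented procedure.

```python
from typing import List, Optional

# Label mapping as stated in the README: 0 = Negative, 1 = Positive.
LABELS = {"negative": 0, "positive": 1}


def parse_sentiment(output: str) -> Optional[int]:
    """Map a model reply to 0/1, or None if it is not a bare label."""
    cleaned = output.strip().strip("'\".").lower()
    return LABELS.get(cleaned)


def accuracy(preds: List[Optional[int]], golds: List[int]) -> float:
    """Fraction of correct predictions; None never matches a gold label."""
    if not golds:
        return 0.0
    correct = sum(1 for p, g in zip(preds, golds) if p == g)
    return correct / len(golds)


# Hypothetical model replies and gold labels, for illustration only.
outputs = ["Positive", " negative.\n", "It is positive"]
golds = [1, 0, 1]
preds = [parse_sentiment(o) for o in outputs]
score = accuracy(preds, golds)
```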
+For Uzbek news classification, we used the **risqaliyevds/uzbek-zero-shot-classification** dataset and asked the model to predict the category of each news article using the following prompt:
+```python
+prompt = f'''Classify the given Uzbek news article into one of the following categories. Provide only the category number as the answer.
+Categories:
+0 - Politics (Siyosat)
+1 - Economy (Iqtisodiyot)
+2 - Technology (Texnologiya)
+3 - Sports (Sport)
+4 - Culture (Madaniyat)
+5 - Health (Salomatlik)
+6 - Family and Society (Oila va Jamiyat)
+7 - Education (Ta'lim)
+8 - Ecology (Ekologiya)
+9 - Foreign News (Xorijiy Yangiliklar)
+Now classify this article:
+"{text}"
+Answer (number only):"
+'''
+```
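The classification prompt requests only a category number, so a reply can be scored by extracting a single digit from 0 to 9. The sketch below shows one possible way to do this; `parse_category` is a hypothetical helper, and taking the first standalone digit in the reply is an assumption, not a description of the authors' scoring script.

```python
import re
from typing import Optional


def parse_category(output: str) -> Optional[int]:
    """Return the first standalone single digit in the reply, or None."""
    match = re.search(r"\b\d\b", output)
    return int(match.group()) if match else None


# Hypothetical model replies, for illustration only.
examples = ["3", "Answer: 7", "10", "Sport"]
parsed = [parse_category(e) for e in examples]
```

Replies with no usable digit (such as a category name alone, or a number outside the single-digit range) come back as `None` and can be counted as incorrect.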
 ## More
 For more details and examples, refer to the base model below:
 https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3