Model Card for AlberBshara/ar_llama3.1

Model Details

  • This model is Llama 3.1 fine-tuned on a dataset of Arabic conversations to improve its support for the Arabic language.

Fine-tuning large language models (LLMs) like Llama 3.1 on a dataset containing text in a new language, such as Arabic, enhances their ability to understand, generate, and effectively use that language. This process allows the model to learn the nuances, grammar, vocabulary, and cultural context specific to Arabic. Consequently, it becomes more proficient in producing coherent and contextually relevant text in Arabic, thus expanding its multilingual capabilities.

Model Description

  • Base model: Llama 3.1 8B

  • Context window: 128k tokens

  • Developed by: [Alber Bshara]

  • Language(s) (NLP): [Arabic (Ar), English (En)]

  • License: [NeptoneAI]

  • Finetuned from model: [Llama 3.1 8B]

Model Sources [optional]

Bias, Risks, and Limitations

[More Information Needed]

Recommendations

How to Get Started with the Model

  • To use this model, see the usage example at the bottom of this page.

Training Details

Training Data

Mixed Arabic Datasets (MAD): https://huggingface.co/M-A-D

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 3407
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 5
  • mixed_precision_training: Native AMP
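As a rough illustration of how the batch-size and scheduler settings above fit together, the sketch below recomputes the effective batch size and a linear learning-rate schedule with warmup. The single-device count and the `total_steps` value are assumptions made for the example (the card does not state them), and `linear_lr` is a hypothetical helper, not the training code that was actually used.

```python
# Values taken from the hyperparameter list above.
LEARNING_RATE = 2e-4
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 5

# Assumption: one device, which is consistent with total_train_batch_size = 8.
NUM_DEVICES = 1


def effective_batch_size(per_device: int, accumulation: int, devices: int) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device * accumulation * devices


def linear_lr(step: int, total_steps: int,
              base_lr: float = LEARNING_RATE,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * (step / warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


total = effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES)
print("effective batch size:", total)

# total_steps = 100 is a made-up value used only to show the shape of the schedule.
for step in (0, 2, 5, 50, 100):
    print(step, linear_lr(step, total_steps=100))
```

The learning rate peaks at the configured value once warmup ends (step 5) and falls linearly to zero at the final step.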

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

Sample prompt and model answer (translated from Arabic):

''' question = "How can you find the greatest common divisor of the numbers 81 and 27?"

ans=

We can use "Euclid's algorithm" to find the greatest common divisor of 81 and 27. This algorithm relies on repeated division: we divide the larger number by the smaller one, then divide the smaller number by the remainder of the previous division, and repeat this process until the remainder is zero.

  1. Step 1: Divide 81 by 27: 81 ÷ 27 = 3 with remainder R = 0.

  2. Final result: When the remainder is zero, the number we divided by in that step is the greatest common divisor (GCD). In this case, when the remainder became zero, the divisor was 27, so the GCD of 81 and 27 is 27. '''
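The sample answer above computes the GCD of 81 and 27 by repeated division. A minimal Python sketch of the same procedure can be used to check such answers; `gcd_euclid` is a hypothetical helper written for this illustration and is not part of the model or its evaluation code.

```python
from typing import List, Tuple


def gcd_euclid(a: int, b: int) -> Tuple[int, List[Tuple[int, int, int, int]]]:
    """Return the GCD of a and b plus the division steps (dividend, divisor, quotient, remainder)."""
    steps = []
    while b != 0:
        quotient, remainder = divmod(a, b)
        steps.append((a, b, quotient, remainder))
        # The divisor becomes the next dividend; the remainder becomes the next divisor.
        a, b = b, remainder
    return a, steps


result, steps = gcd_euclid(81, 27)
for dividend, divisor, quotient, remainder in steps:
    print(f"{dividend} / {divisor} = {quotient} remainder {remainder}")
print("GCD:", result)
```

For 81 and 27 the loop finishes after a single division, which matches the one step given in the model's answer.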

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: [More Information Needed]
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

  • Can run on a T4 or L4 GPU, or on more powerful GPUs.
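A back-of-the-envelope estimate suggests why these GPUs are sufficient when the weights are loaded in 4-bit. The sketch assumes roughly 8 billion parameters (the card does not state an exact count) and covers weights only; activations, the KV cache, and quantization overhead are ignored, so real usage will be higher.

```python
# Assumption: about 8 billion parameters; the exact count is not given in this card.
ASSUMED_PARAMS = 8e9


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


four_bit_gb = weight_memory_gb(ASSUMED_PARAMS, 4)
half_precision_gb = weight_memory_gb(ASSUMED_PARAMS, 16)

print(f"4-bit weights:  ~{four_bit_gb:.1f} GB")
print(f"16-bit weights: ~{half_precision_gb:.1f} GB")
```

Under these assumptions the 4-bit weights take about a quarter of the memory of 16-bit weights, which leaves headroom on a 16 GB T4 or a 24 GB L4.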

Software

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.2
  • Pytorch 2.4.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1
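To compare a local environment against the versions listed above, something like the following sketch may help. The PyPI distribution names in `EXPECTED` (for example `torch` for Pytorch) are assumptions, and `compare_versions` is a hypothetical helper; packages that are not installed are reported as missing and do not raise an error.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Tuple

# Versions from the list above; distribution names are assumed PyPI names.
EXPECTED = {
    "peft": "0.12.0",
    "transformers": "4.44.2",
    "torch": "2.4.0+cu121",
    "datasets": "2.21.0",
    "tokenizers": "0.19.1",
}


def compare_versions(expected: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str], bool]]:
    """Map each package to (expected version, installed version or None, exact match?)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, matches) in compare_versions(EXPECTED).items():
    status = "ok" if matches else "differs"
    print(f"{name}: expected {wanted}, installed {installed or 'not installed'} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the listed versions are simply the ones reported for training.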

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]

How to Use it:

import os
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from unsloth.chat_templates import get_chat_template
from typing import Tuple, Dict, Any, List
import torch

class LLM:
    def __init__(self, load_in_4bit: bool = True,
                 load_cpu_mem_usage: bool = True,
                 hf_model_path: str = "AlberBshara/ar_llama3.1",
                 max_new_tokens: int= 4096):
        """
        Args:
            load_in_4bit (bool): Use 4-bit quantization. Defaults to True.
            load_cpu_mem_usage (bool): Reduce CPU memory usage. Defaults to True.
            hf_model_path (str): Model repository on the Hugging Face Hub, in the form "your-user-name/model-name".
            max_new_tokens (int): Maximum number of tokens to generate per call. Defaults to 4096.
        """
        assert torch.cuda.is_available(), "CUDA is not available. An NVIDIA GPU is required."
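        # Read the access token from the environment; None is fine for public repositories
        hf_auth_token = os.environ.get("HUGGING_FACE_API_TOKEN")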
        # Specify the quantization config
        self._bnb_config = BitsAndBytesConfig(load_in_4bit=load_in_4bit)

        # Load model directly with quantization config
        self.model = AutoModelForCausalLM.from_pretrained(
            hf_model_path,
            low_cpu_mem_usage=load_cpu_mem_usage,
            quantization_config=self._bnb_config,
            token=hf_auth_token
        )

        # Load the tokenizer
        self.tokenizer = AutoTokenizer.from_pretrained(
            hf_model_path,
            token=hf_auth_token
        )
        self.__tokenizer = get_chat_template(
            self.tokenizer,
            chat_template="llama-3",
            mapping={"role": "from", "content": "value", "user": "human", "assistant": "gpt"},
        )

        self._hf_model_path = hf_model_path
        self._EOS_TOKEN_ID = self.__tokenizer.eos_token_id
        self.max_new_tokens = max_new_tokens

        self._prompt = lambda context, question: f"""
        Please provide a detailed answer to the question using only the information provided in the context. Do not include any information that is not explicitly mentioned in the context.

        Context: [{context}]

        - If the context is in Arabic, answer in Arabic; otherwise, answer in English.

        Question: [{question}]

        Your answer should be comprehensive, thoroughly explaining the topic while staying within the boundaries of the provided context.
        """

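    def invoke(self, context: str, question: str) -> Tuple[str, List[Dict[str, str]]]:
        # Reject None as well as empty or whitespace-only strings
        if not question or not question.strip():
            raise ValueError("question cannot be empty or None")

        if not context or not context.strip():
            raise ValueError("context cannot be empty or None")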

        inputs = self._prompt(context, question)

        messages = [{"from": "human", "value": inputs}]
        inputs = self.__tokenizer.apply_chat_template(
              messages,
              tokenize=True,
              add_generation_prompt=True, # Must add for generation
              return_tensors="pt",
        ).to("cuda")
        
        # Increase the max_new_tokens to allow more detailed responses
        output_ids = self.model.generate(inputs, max_new_tokens=self.max_new_tokens, pad_token_id=self.__tokenizer.pad_token_id)
        output_ids = output_ids.tolist()[0] if output_ids.size(0) == 1 else output_ids.tolist()

        output_text = self.__tokenizer.decode(output_ids, skip_special_tokens=True)

        # Free GPU memory held by this call
        del inputs
        del output_ids
        torch.cuda.empty_cache()

        return output_text, messages

    def extract_answer(self, response: str) -> str:
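        # Marker expected just before the assistant's reply once special tokens are stripped:
        # the prompt ends with a period and is followed by the "assistant" role name.
        start_with: str = ".assistant"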
        start_index = response.find(start_with)

        # If the word is found, extract the substring from that point onward
        if start_index != -1:
            # Move start_index to the end of the word
            start_index += len(start_with)
            return response[start_index:]
        else:
            return response

    def get_metadata(self) -> Dict[str, Any]:
        return {
            "class_name": self.__class__.__name__,
            "init_params": {
                "load_in_4bit": True,
                "load_cpu_mem_usage": True,
                "hf_model_path": "AlberBshara/ar_llama3.1",
                "hf_auth_token": "--%$%--",
                "max_new_tokens": self.max_new_tokens
            },
            "methods": ["invoke", "extract_answer"]
        }


llm = LLM()
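Once the class is instantiated, a call would typically pass a context and a question to `invoke` and then trim the prompt from the decoded text with `extract_answer`. The sketch below reproduces that trimming step as a standalone function so it can be tried without a GPU; the `decoded` string is a made-up stand-in for real model output, and the commented lines show how the same flow might look with the `llm` object above.

```python
def extract_answer(response: str, marker: str = ".assistant") -> str:
    """Standalone copy of the LLM.extract_answer logic, for illustration only."""
    start = response.find(marker)
    if start == -1:
        # Marker not found: return the text unchanged.
        return response
    return response[start + len(marker):]


# Hypothetical decoded output: the prompt text followed by the assistant's reply.
decoded = (
    "user\n\nPlease provide a detailed answer ... "
    "within the boundaries of the provided context.assistant\n\n"
    "The GCD of 81 and 27 is 27."
)

answer = extract_answer(decoded).strip()
print(answer)

# With a CUDA GPU and the class defined above, the equivalent flow would be:
# raw_text, messages = llm.invoke(context="...", question="...")
# print(llm.extract_answer(raw_text).strip())
```

Because the marker is found with the first match of `str.find`, a context that itself contains ".assistant" would be cut at that earlier position, so the returned text is worth checking for such inputs.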