interneuronai/az-gptneo




	---
	library_name: peft
	base_model: EleutherAI/gpt-neo-2.7B
	---


	Model Details

	Original Model: EleutherAI/gpt-neo-2.7B
	Fine-Tuned For: Azerbaijani language understanding and generation
	Dataset Used: Azerbaijani translation of the Stanford Alpaca dataset
	Fine-Tuning Method: Self-instruct method


	This model is part of the ["project/Barbarossa"](https://github.com/Alas-Development-Center/project-barbarossa) initiative, which aims to improve natural language processing for the Azerbaijani language. It was fine-tuned on the Azerbaijani translation of the Stanford Alpaca dataset, using the self-instruct method, to improve the base model's understanding and generation of Azerbaijani text.
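As a rough illustration of how an Alpaca-style record might be turned into a single training string, here is a minimal sketch. The prompt template is copied from the usage example in this card. Everything else is an assumption: the key names `instruction` and `output` follow the original Stanford Alpaca JSON, the helper `format_training_example` is hypothetical, and the end-of-text token is the GPT-Neo tokenizer default. The authors' actual preprocessing is not documented here and may differ.

```python
# Prompt template copied from the usage example in this card.
PROMPT_TEMPLATE = (
    "Aşağıda daha çox kontekst təmin edən təlimat var. "
    "Sorğunu adekvat şəkildə tamamlayan cavab yazın.\n"
    "### Təlimat:\n"
    "{instruction}\n"
    "### Cavab:\n"
)


def format_training_example(record: dict, eos_token: str = "<|endoftext|>") -> str:
    """Join the prompt and the reference answer into one training string.

    Assumption: each record has "instruction" and "output" keys, as in the
    original Stanford Alpaca JSON. The optional "input" field is ignored
    because this card only shows the template without it.
    """
    prompt = PROMPT_TEMPLATE.format(instruction=record["instruction"].strip())
    # Appending the end-of-text token marks where the answer stops.
    return prompt + record["output"].strip() + eos_token


if __name__ == "__main__":
    # Made-up record, for illustration only (not taken from the dataset).
    example = {"instruction": "Azərbaycanın paytaxtı haradır?", "output": "Bakı."}
    print(format_training_example(example))
```

Keeping the training text and the inference prompt on the same template matters because the model learns to answer only after the `### Cavab:` marker.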

	__Our primary objective with this model is to offer insight into the feasibility and outcomes of fine-tuning large language models (LLMs) for the Azerbaijani language. The fine-tuning was carried out with limited resources, so it yielded useful lessons rather than a production-ready model. We therefore recommend treating this model as a reference for understanding the potential and challenges of fine-tuning LLMs for specific languages. It is a foundation for further research and development, not a direct solution for production environments.__


	This project is a proud product of the [Alas Development Center (ADC)](https://az.linkedin.com/company/alas-development-center?trk=ppro_cprof). We are thrilled to offer these fine-tuned large language models to the public, free of charge.

	How to use?

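	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

	# Repository ID as written in the original card (this copy is hosted at interneuronai/az-gptneo).
	model_path = "alasdevcenter/az-gptneo"

	# Load the fine-tuned weights and the matching tokenizer.
	model = AutoModelForCausalLM.from_pretrained(model_path)
	tokenizer = AutoTokenizer.from_pretrained(model_path)

	# max_length counts the prompt tokens plus the generated tokens.
	pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)

	# Instruction: "Protection of nature"
	instruction = "Təbiətin qorunması "
	# Alpaca-style prompt: a fixed preamble, the instruction, then the response marker.
	formatted_prompt = f"""Aşağıda daha çox kontekst təmin edən təlimat var. Sorğunu adekvat şəkildə tamamlayan cavab yazın.
	### Təlimat:
	{instruction}
	### Cavab:
	"""

	# The pipeline returns a list of dicts; generated_text includes the prompt.
	result = pipe(formatted_prompt)
	print(result[0]['generated_text'])
	```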
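By default, the `text-generation` pipeline returns the prompt followed by the continuation, so the printed text repeats the preamble and the instruction. A minimal sketch of one way to keep only the answer is shown below. The helper `extract_response` is hypothetical and not part of the original card. It assumes the model may go on to produce a further `### Təlimat:` block, which is cut off.

```python
RESPONSE_MARKER = "### Cavab:"
INSTRUCTION_MARKER = "### Təlimat:"


def extract_response(generated_text: str) -> str:
    """Return the text between the first response marker and any later instruction block."""
    _, found, tail = generated_text.partition(RESPONSE_MARKER)
    if not found:
        # No marker present: return the text unchanged apart from whitespace.
        return generated_text.strip()
    # Drop anything the model generated after starting a new instruction block.
    answer, _, _ = tail.partition(INSTRUCTION_MARKER)
    return answer.strip()


if __name__ == "__main__":
    # Made-up pipeline output, for illustration only.
    sample = "### Təlimat:\nTəbiətin qorunması\n### Cavab:\nNümunə cavab.\n"
    print(extract_response(sample))
```

With the usage example above, this would be called as `extract_response(result[0]['generated_text'])`. Passing `return_full_text=False` to the pipeline call is another option, since it omits the prompt from the output.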