---
license: mit
datasets:
- yahma/alpaca-cleaned
- HuggingFaceH4/ultrafeedback_binarized
language:
- en
pipeline_tag: text-generation
library_name: transformers
---
# armaGPT

**Model Description:** armaGPT is a fine-tuned version of Gemma 7B, a pre-trained language model developed by Google. It generates human-like text from the input it receives and is further fine-tuned with Direct Preference Optimization (DPO) to encourage fair and safe generation.

**Model Architecture:** armaGPT is based on the transformer architecture, which uses self-attention mechanisms to process input sequences.

**Model Size:** The model has approximately 7 billion parameters.
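For readers curious how a DPO stage like this is typically run, the sketch below uses the `trl` library with the listed `HuggingFaceH4/ultrafeedback_binarized` preference data. It is an illustrative assumption, not the exact armaGPT recipe: the base model id, hyperparameters, dataset split and column handling, and the precise `DPOTrainer`/`DPOConfig` argument names vary across `trl` versions.

```python
# Illustrative sketch of DPO fine-tuning with trl -- NOT the exact armaGPT recipe.
# Assumes: pip install trl transformers datasets accelerate
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_model_id = "google/gemma-7b"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id)

# Preference pairs (prompt / chosen / rejected), as in the listed dataset.
dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

training_args = DPOConfig(
    output_dir="armaGPT-dpo",
    beta=0.1,                       # assumed DPO temperature, not the actual value
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-7,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,                 # trl builds a frozen reference copy internally
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,     # named `tokenizer=` in older trl releases
)
trainer.train()
```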
### Context Length
The model is trained on a context length of 8192 tokens.
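As a small illustration of what that limit means in practice, inputs longer than the window can be truncated at tokenization time; the `max_length` value below simply mirrors the stated 8192-token context length.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
long_text = "Machine learning " * 20_000  # stand-in for a very long document

# Truncate to the 8192-token context window described above.
inputs = tokenizer(long_text, return_tensors="pt", truncation=True, max_length=8192)
print(inputs["input_ids"].shape)  # sequence length is capped at 8192
```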
#### Running the model on a CPU
```python
# pip install transformers
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
model = AutoModelForCausalLM.from_pretrained("sidharthsajith7/armaGPT")

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt")

# Uses the model's default generation settings.
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
#### Running the model on a single / multi GPU
```python
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
# device_map="auto" places the model on the available GPU(s) automatically.
model = AutoModelForCausalLM.from_pretrained("sidharthsajith7/armaGPT", device_map="auto")

input_text = "Write me a poem about Machine Learning."
# Move the input tensors to the GPU so they are on the same device as the model.
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
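#### Adjusting generation settings

Calling `generate()` with no arguments typically produces a short, greedy continuation unless the model's generation config says otherwise. The snippet below is a sketch of passing explicit generation parameters; the values shown (`max_new_tokens`, `temperature`, `top_p`) are illustrative and were not tuned for armaGPT.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
model = AutoModelForCausalLM.from_pretrained("sidharthsajith7/armaGPT", device_map="auto")

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(
    **input_ids,
    max_new_tokens=256,  # cap on newly generated tokens
    do_sample=True,      # sample instead of greedy decoding
    temperature=0.7,     # illustrative values, not tuned for this model
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```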