---
license: mit
datasets:
- yahma/alpaca-cleaned
- HuggingFaceH4/ultrafeedback_binarized
language:
- en
pipeline_tag: question-answering
library_name: transformers
---

### Model Description

armaGPT is a fine-tuned version of Gemma 7B, a pre-trained language model developed by Google. It is designed to generate human-like text based on the input it receives, and it is further fine-tuned with DPO (Direct Preference Optimization) training to encourage fair and safe generation.

### Model Architecture

armaGPT is based on the transformer architecture, which processes input sequences with self-attention mechanisms rather than the recurrence used by RNNs.

### Model Size

The model has approximately 7 billion parameters.

### Context Length

The model is trained on a context length of 8192 tokens.

#### Running the model on a CPU

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
model = AutoModelForCausalLM.from_pretrained("sidharthsajith7/armaGPT")

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```

#### Running the model on a single / multi GPU

```python
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
# device_map="auto" places the weights across the available GPU(s)
model = AutoModelForCausalLM.from_pretrained("sidharthsajith7/armaGPT", device_map="auto")

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
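#### Running the model in 4-bit precision

The snippet below is a minimal sketch of loading the model with bitsandbytes 4-bit quantization for GPUs with limited memory; quantized loading has not been verified for this checkpoint and is assumed to behave as it does for the base Gemma 7B weights.

```python
# pip install bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Assumption: this checkpoint loads under 4-bit quantization like base Gemma 7B.
quantization_config = BitsAndBytesConfig(load_in_4bit=True)

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
model = AutoModelForCausalLM.from_pretrained(
    "sidharthsajith7/armaGPT",
    quantization_config=quantization_config,
    device_map="auto",
)

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(**input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0]))
```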