---
license: mit
datasets:
- yahma/alpaca-cleaned
- HuggingFaceH4/ultrafeedback_binarized
language:
- en
pipeline_tag: text-generation
library_name: transformers
---
|
|
|
### Model Description

armaGPT is a fine-tuned version of Gemma 7B, a pre-trained language model developed by Google. It is designed to generate human-like text based on the input it receives, and it is further fine-tuned with DPO (Direct Preference Optimization) training for fair and safe generation.
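
The exact DPO training recipe is not published with this card. As a rough illustration only, the sketch below shows how a DPO fine-tuning run over Gemma 7B could be set up with the `trl` library; the base model id `google/gemma-7b`, the toy preference pairs, and the hyperparameters are assumptions rather than the authors' actual configuration (the card's metadata lists `HuggingFaceH4/ultrafeedback_binarized` as the preference dataset used in practice).

```python
# Illustrative sketch only -- not the authors' actual training script.
# Assumes a recent version of trl where DPOConfig and processing_class are available.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "google/gemma-7b"  # assumed base checkpoint
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Toy preference pairs in the explicit prompt/chosen/rejected format;
# in practice a binarized preference dataset such as ultrafeedback_binarized is used.
train_dataset = Dataset.from_dict({
    "prompt": ["What is the capital of France?"],
    "chosen": ["The capital of France is Paris."],
    "rejected": ["France does not have a capital city."],
})

training_args = DPOConfig(output_dir="armaGPT-dpo", beta=0.1)  # beta is a guess
trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
```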
|
|
|
### Model Architecture

The architecture of armaGPT is based on the transformer, a neural network architecture that uses self-attention mechanisms to process input sequences.
|
|
|
### Model Size

The model has approximately 7 billion parameters.
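
A quick way to check these figures yourself, as a minimal sketch (the config field names assume the checkpoint uses the standard Gemma configuration):

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Inspect architecture hyperparameters without downloading the weights
config = AutoConfig.from_pretrained("sidharthsajith7/armaGPT")
print(config.num_hidden_layers, config.hidden_size, config.num_attention_heads)

# Count the parameters (this downloads the full ~7B-parameter checkpoint)
model = AutoModelForCausalLM.from_pretrained("sidharthsajith7/armaGPT")
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")
```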
|
|
|
|
|
|
|
### Context Length |
|
The model is trained on a context length of 8192 tokens.
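
Inputs longer than this window should be truncated before generation; a minimal sketch (the document string is a placeholder):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")

long_document = "..."  # placeholder for a very long input text
# Keep the prompt within the 8192-token context window
inputs = tokenizer(long_document, truncation=True, max_length=8192, return_tensors="pt")
print(inputs["input_ids"].shape)
```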
|
|
|
#### Running the model on a CPU |
|
|
|
|
|
```python |
|
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model (weights stay on the CPU by default)
tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
model = AutoModelForCausalLM.from_pretrained("sidharthsajith7/armaGPT")

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt")

# Generate a completion and decode it back to text
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
|
``` |
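
#### Prompting with the chat template

If the tokenizer ships with a Gemma-style chat template (not confirmed for this checkpoint), instruction-style prompts can be formatted with `apply_chat_template`. This is a sketch under that assumption; if no chat template is defined, the call raises an error and the plain-text prompting shown above should be used instead.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
model = AutoModelForCausalLM.from_pretrained("sidharthsajith7/armaGPT")

messages = [{"role": "user", "content": "Write me a poem about Machine Learning."}]
# Format the conversation with the tokenizer's chat template (if one is defined)
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```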
|
|
|
|
|
#### Running the model on a single / multi GPU |
|
|
|
|
|
```python |
|
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
# device_map="auto" spreads the weights across the available GPU(s)
model = AutoModelForCausalLM.from_pretrained("sidharthsajith7/armaGPT", device_map="auto")

input_text = "Write me a poem about Machine Learning."
# Move the tokenized prompt to the GPU before generation
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
|
``` |
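
#### Running the model on a GPU using 4-bit precision

A common way to fit a 7B-parameter model on a single consumer GPU is 4-bit quantization via `bitsandbytes`. This is a minimal sketch, not part of the original instructions above, and the quantization settings are illustrative.

```python
# pip install accelerate bitsandbytes
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Quantize the weights to 4-bit on load to reduce GPU memory usage
quantization_config = BitsAndBytesConfig(load_in_4bit=True)

tokenizer = AutoTokenizer.from_pretrained("sidharthsajith7/armaGPT")
model = AutoModelForCausalLM.from_pretrained(
    "sidharthsajith7/armaGPT",
    quantization_config=quantization_config,
    device_map="auto",
)

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```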