---
license: llama3.1
tags:
- code
---

Model Overview
Model Name: Llama 3.1 180M Untrained
Model Size: 180M parameters
Tensor Type: F32
License: MIT
Model Type: Untrained Language Model
Framework: PyTorch
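
As a rough sanity check on the figures above, the size of the raw weights follows from the parameter count and the tensor type. The sketch below assumes exactly 180 million parameters (the real count may differ slightly) and ignores file-format overhead; the helper name `weights_size_bytes` is illustrative only.

```python
# Bytes used by one element of each tensor type.
BYTES_PER_ELEMENT = {"F32": 4, "BF16": 2, "F16": 2}


def weights_size_bytes(num_params: int, tensor_type: str = "F32") -> int:
    """Return the approximate size of the raw weights in bytes."""
    return num_params * BYTES_PER_ELEMENT[tensor_type]


NUM_PARAMS = 180_000_000  # assumption: nominal count taken from the model name

size = weights_size_bytes(NUM_PARAMS, "F32")
print(f"F32 weights: {size / 1e6:.0f} MB")  # F32 weights: 720 MB
```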

Model Description
The Llama 3.1 180M Untrained model is a lightweight, untrained language model designed to serve as a starting point for research and experimentation in natural language processing. With 180 million parameters, this model is suitable for fine-tuning on specific tasks or domains, offering a balance between model complexity and computational efficiency.

Intended Use
This model is intended for research purposes and fine-tuning on specific tasks such as text classification, sentiment analysis, or other NLP tasks. As the model is untrained, it requires fine-tuning on relevant datasets to achieve desired performance.

Fine-Tuning Requirements
GPU Requirements:
Full Fine-Tuning: This model requires a GPU with at least 24 GB of VRAM for full fine-tuning at a sequence length of 4096.
Supported GPUs: NVIDIA RTX 3090, A100, or equivalent.
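
To get a feel for where the VRAM goes, the fixed part of the training footprint can be estimated from the parameter count. This is a minimal sketch that assumes full F32 training with an Adam-style optimizer (the card does not name one), so each parameter carries a weight, a gradient, and two moment buffers. Activation memory, which grows with batch size and sequence length, is not estimated here.

```python
def training_state_bytes(num_params: int, bytes_per_value: int = 4) -> int:
    """Memory for weights, gradients, and two Adam moment buffers."""
    weights = num_params * bytes_per_value
    grads = num_params * bytes_per_value
    adam_moments = 2 * num_params * bytes_per_value
    return weights + grads + adam_moments


NUM_PARAMS = 180_000_000  # assumption: nominal count taken from the model name

state_gb = training_state_bytes(NUM_PARAMS) / 1e9
print(f"Weights + gradients + optimizer state: {state_gb:.2f} GB")
```

Under these assumptions the fixed state is a small fraction of 24 GB, which suggests that most of the stated requirement comes from activations at a sequence length of 4096.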

Training Data
This model has not been trained on any data. Users are encouraged to fine-tune the model on datasets that are appropriate for their specific use case.

Evaluation
As the model is untrained, it has not been evaluated on any benchmark datasets. Performance metrics should be determined after fine-tuning.
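
Once fine-tuning is under way, perplexity is one common metric to track. The sketch below computes perplexity from per-token negative log-likelihoods and derives the value for a uniform guess over the vocabulary, which is roughly what a randomly initialized model is expected to produce. The vocabulary size of 128,256 is an assumption based on the Llama 3.1 tokenizer; check the model's `config.vocab_size` before relying on it.

```python
import math


def perplexity(token_nlls: list[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


VOCAB_SIZE = 128_256  # assumption: Llama 3.1 tokenizer vocabulary size

# A uniform predictor assigns probability 1 / VOCAB_SIZE to every token.
uniform_nll = math.log(VOCAB_SIZE)
baseline = perplexity([uniform_nll] * 10)
print(f"Uniform-guess loss: {uniform_nll:.2f} nats, perplexity: {baseline:.0f}")
```

A training loss that falls clearly below the uniform-guess loss is a simple first sign that the model is learning.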

Limitations
Untrained: The model is untrained and will not perform well on any task until it has been fine-tuned.
Ethical Considerations: Users should be mindful of the ethical implications of deploying fine-tuned models, especially in sensitive applications.



Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the untrained model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("oktrained/llama3.1_180M_untrained")
tokenizer = AutoTokenizer.from_pretrained("oktrained/llama3.1_180M_untrained")

# Sample input text
input_text = "Once upon a time"

# Tokenize input
inputs = tokenizer(input_text, return_tensors="pt")

# Generate output (expect incoherent text until the model has been trained)
output = model.generate(**inputs, max_length=50)

# Decode output
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print(generated_text)