---
license: llama3.1
tags:
- code
---

## Model Overview

- **Model Name:** Llama 3.1 180M Untrained
- **Model Size:** 180M parameters
- **Tensor Type:** F32
- **License:** Llama 3.1 Community License
- **Model Type:** Untrained Language Model
- **Framework:** PyTorch

## Model Description

The Llama 3.1 180M Untrained model is a lightweight, untrained language model designed as a starting point for research and experimentation in natural language processing. With 180 million parameters, it is well suited to fine-tuning on specific tasks or domains, offering a balance between model capacity and computational efficiency.

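As a quick sanity check of the stated size, the sketch below (using the repository ID from the usage example at the end of this card) loads the checkpoint and sums its parameters; the total should come out to roughly 180M.

```python
from transformers import AutoModelForCausalLM

# Load the untrained checkpoint (repository ID taken from the usage example below)
model = AutoModelForCausalLM.from_pretrained("oktrained/llama3.1_180M_untrained")

# Sum all parameter tensors to confirm the ~180M figure
num_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {num_params / 1e6:.1f}M")
```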
## Intended Use

This model is intended for research purposes and for fine-tuning on specific tasks such as text classification, sentiment analysis, or other NLP tasks. Because the model is untrained, it must be fine-tuned on relevant datasets before it can reach useful performance; a minimal fine-tuning sketch follows.

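The sketch below is one minimal way to fine-tune the checkpoint with the Hugging Face `Trainer` on a causal language modeling objective. The dataset (wikitext-2), sequence length, output directory, and hyperparameters are illustrative placeholders, not recommendations; substitute data and settings that match your task.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "oktrained/llama3.1_180M_untrained"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers usually ship without a pad token

# Placeholder corpus; swap in a dataset that matches your task or domain
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
dataset = dataset.filter(lambda example: len(example["text"].strip()) > 0)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama3.1-180m-finetuned",  # hypothetical output directory name
        per_device_train_batch_size=8,
        num_train_epochs=1,
    ),
    train_dataset=tokenized,
    # mlm=False gives standard next-token (causal LM) labels
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

For better throughput, concatenating and chunking the corpus into fixed-length blocks (packing) is usually preferable to per-example truncation, but it would make the sketch longer.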
## Fine-Tuning Requirements

**GPU requirements:**

- **Full fine-tuning:** requires a GPU with at least 24 GB of VRAM at a sequence length of 4096; a quick capacity check is sketched below.
- **Supported GPUs:** NVIDIA RTX 3090, A100, or equivalent.

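If you are unsure whether your hardware meets the guideline above, the following plain-PyTorch check (no model loading required) reports each visible GPU and its total memory.

```python
import torch

# List each visible CUDA device and its total memory for comparison with the ~24 GB guideline
if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No CUDA device detected; full fine-tuning at a 4096 sequence length is impractical on CPU.")
```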
## Training Data

This model has not been trained on any data. Users are encouraged to fine-tune it on datasets appropriate to their specific use case; one way to prepare such a dataset is sketched below.

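For example, a plain-text corpus on disk can be loaded and tokenized with the `datasets` library as follows. The file names are hypothetical placeholders; point `data_files` at your own corpus.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Hypothetical local text files; replace with the corpus for your use case
dataset = load_dataset("text", data_files={"train": "train.txt", "validation": "valid.txt"})

tokenizer = AutoTokenizer.from_pretrained("oktrained/llama3.1_180M_untrained")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

# Tokenize both splits, dropping the raw text column
tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
print(tokenized)
```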
## Evaluation

As the model is untrained, it has not been evaluated on any benchmark datasets. Performance metrics should be measured after fine-tuning; a simple perplexity check is sketched below.

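Once you have a fine-tuned checkpoint, perplexity on held-out text is a reasonable first metric. The sketch below computes it for a single sample sentence; the checkpoint path is a hypothetical output directory from your own fine-tuning run.

```python
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical path to a fine-tuned output directory
checkpoint = "llama3.1-180m-finetuned"
model = AutoModelForCausalLM.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model.eval()

text = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model return the shifted next-token cross-entropy loss
    outputs = model(**inputs, labels=inputs["input_ids"])

print(f"Perplexity on the sample text: {math.exp(outputs.loss.item()):.2f}")
```

In practice you would average the loss over a full held-out split rather than a single sentence.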
## Limitations

- **Untrained:** the model is untrained and will not perform well on any task until it has been fine-tuned.
- **Ethical considerations:** users should be mindful of the ethical implications of deploying fine-tuned models, especially in sensitive applications.

## Usage

The snippet below loads the untrained checkpoint and runs a short generation pass. Because the weights are untrained, the output will be essentially random until the model has been fine-tuned.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the untrained base model and tokenizer
model = AutoModelForCausalLM.from_pretrained("oktrained/llama3.1_180M_untrained")
tokenizer = AutoTokenizer.from_pretrained("oktrained/llama3.1_180M_untrained")

# Sample input text
input_text = "Once upon a time"

# Tokenize the input
inputs = tokenizer(input_text, return_tensors="pt")

# Generate a continuation of up to 50 tokens in total
output = model.generate(**inputs, max_length=50)

# Decode the generated token IDs back into text
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print(generated_text)
```