---
license: llama3.1
tags:
- code
---

## Model Overview

- **Model Name:** Llama 3.1 180M Untrained
- **Model Size:** 180M parameters
- **Tensor Type:** F32
- **License:** Llama 3.1 Community License
- **Model Type:** Untrained Language Model
- **Framework:** PyTorch

## Model Description

The Llama 3.1 180M Untrained model is a lightweight, untrained language model designed as a starting point for research and experimentation in natural language processing. With 180 million parameters, it is well suited to fine-tuning on specific tasks or domains, offering a balance between model capacity and computational efficiency.

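As a quick sanity check of the stated size, the sketch below (using the repository ID from the usage example at the end of this card) loads the checkpoint and sums its parameters; the total should come out to roughly 180M.

```python
from transformers import AutoModelForCausalLM

# Load the untrained checkpoint (repository ID taken from the usage example below)
model = AutoModelForCausalLM.from_pretrained("oktrained/llama3.1_180M_untrained")

# Sum all parameter tensors to confirm the ~180M figure
num_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {num_params / 1e6:.1f}M")
```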
## Intended Use

This model is intended for research purposes and for fine-tuning on specific tasks such as text classification, sentiment analysis, or other NLP tasks. Because the model is untrained, it must be fine-tuned on relevant datasets before it can reach useful performance; a minimal fine-tuning sketch follows.

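The sketch below is one minimal way to fine-tune the checkpoint with the Hugging Face `Trainer` on a causal language modeling objective. The dataset (wikitext-2), sequence length, output directory, and hyperparameters are illustrative placeholders, not recommendations; substitute data and settings that match your task.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "oktrained/llama3.1_180M_untrained"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers usually ship without a pad token

# Placeholder corpus; swap in a dataset that matches your task or domain
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
dataset = dataset.filter(lambda example: len(example["text"].strip()) > 0)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama3.1-180m-finetuned",  # hypothetical output directory name
        per_device_train_batch_size=8,
        num_train_epochs=1,
    ),
    train_dataset=tokenized,
    # mlm=False gives standard next-token (causal LM) labels
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

For better throughput, concatenating and chunking the corpus into fixed-length blocks (packing) is usually preferable to per-example truncation, but it would make the sketch longer.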
## Fine-Tuning Requirements

**GPU requirements:**

- **Full fine-tuning:** requires a GPU with at least 24 GB of VRAM at a sequence length of 4096; a quick capacity check is sketched below.
- **Supported GPUs:** NVIDIA RTX 3090, A100, or equivalent.

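If you are unsure whether your hardware meets the guideline above, the following plain-PyTorch check (no model loading required) reports each visible GPU and its total memory.

```python
import torch

# List each visible CUDA device and its total memory for comparison with the ~24 GB guideline
if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No CUDA device detected; full fine-tuning at a 4096 sequence length is impractical on CPU.")
```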
## Training Data

This model has not been trained on any data. Users are encouraged to fine-tune it on datasets appropriate to their specific use case; one way to prepare such a dataset is sketched below.

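For example, a plain-text corpus on disk can be loaded and tokenized with the `datasets` library as follows. The file names are hypothetical placeholders; point `data_files` at your own corpus.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Hypothetical local text files; replace with the corpus for your use case
dataset = load_dataset("text", data_files={"train": "train.txt", "validation": "valid.txt"})

tokenizer = AutoTokenizer.from_pretrained("oktrained/llama3.1_180M_untrained")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

# Tokenize both splits, dropping the raw text column
tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
print(tokenized)
```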
## Evaluation

As the model is untrained, it has not been evaluated on any benchmark datasets. Performance metrics should be measured after fine-tuning; a simple perplexity check is sketched below.

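Once you have a fine-tuned checkpoint, perplexity on held-out text is a reasonable first metric. The sketch below computes it for a single sample sentence; the checkpoint path is a hypothetical output directory from your own fine-tuning run.

```python
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical path to a fine-tuned output directory
checkpoint = "llama3.1-180m-finetuned"
model = AutoModelForCausalLM.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model.eval()

text = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model return the shifted next-token cross-entropy loss
    outputs = model(**inputs, labels=inputs["input_ids"])

print(f"Perplexity on the sample text: {math.exp(outputs.loss.item()):.2f}")
```

In practice you would average the loss over a full held-out split rather than a single sentence.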
## Limitations

- **Untrained:** the model is untrained and will not perform well on any task until it has been fine-tuned.
- **Ethical considerations:** users should be mindful of the ethical implications of deploying fine-tuned models, especially in sensitive applications.

## Usage

The snippet below loads the untrained checkpoint and runs a short generation pass. Because the weights are untrained, the output will be essentially random until the model has been fine-tuned.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the untrained base model and tokenizer
model = AutoModelForCausalLM.from_pretrained("oktrained/llama3.1_180M_untrained")
tokenizer = AutoTokenizer.from_pretrained("oktrained/llama3.1_180M_untrained")

# Sample input text
input_text = "Once upon a time"

# Tokenize the input
inputs = tokenizer(input_text, return_tensors="pt")

# Generate a continuation of up to 50 tokens in total
output = model.generate(**inputs, max_length=50)

# Decode the generated token IDs back into text
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print(generated_text)
```