Quantizations of https://huggingface.co/meta-llama/Meta-Llama-3.1-8B
From original readme
How to use
This repository contains two versions of Meta-Llama-3.1-8B: one for use with transformers and one for use with the original llama codebase.
Use with transformers
Starting with transformers >= 4.43.0, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function.
Make sure to update your transformers installation via pip install --upgrade transformers.
```python
import transformers
import torch

model_id = "meta-llama/Meta-Llama-3.1-8B"

# Load the model in bfloat16 and shard it automatically across available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)
pipeline("Hey how are you doing today?")
```
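For the Auto-classes route mentioned above, a minimal sketch might look like the following. It loads the same model id with AutoTokenizer and AutoModelForCausalLM and calls generate(); the helper name generate_text and its parameters are illustrative, not part of the original card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B"

def generate_text(prompt: str, max_new_tokens: int = 64) -> str:
    # Tokenizer and model are loaded lazily inside the helper; loading the
    # 8B weights requires access to the gated repo and sufficient memory.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```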
Use with llama
Please follow the instructions in the repository.
To download the original checkpoints, see the example command below leveraging huggingface-cli:

```shell
huggingface-cli download meta-llama/Meta-Llama-3.1-8B --include "original/*" --local-dir Meta-Llama-3.1-8B
```
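The same filtered download can also be done from Python with huggingface_hub's snapshot_download; this is a sketch under the assumption that huggingface_hub is installed and you have access to the gated repo (the download_original helper name is illustrative).

```python
from huggingface_hub import snapshot_download

def download_original(local_dir: str = "Meta-Llama-3.1-8B") -> str:
    # Mirrors the CLI command above: fetch only the original/* checkpoint
    # files into local_dir and return the path to the downloaded snapshot.
    return snapshot_download(
        repo_id="meta-llama/Meta-Llama-3.1-8B",
        allow_patterns=["original/*"],
        local_dir=local_dir,
    )
```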