Llama responses are broken during conversation (#64)
opened by gusakovskyi
Hello, I have run Llama locally with FastChat and also through the Replicate API, and at some point during a conversation the output always breaks, for example:
- Responding with an endless run of quotes ("""""""""""""...)
- Repeating the same tokens over and over (youyouyouyouyouyou...)
- Responding with only the first few tokens ("I AM") and nothing more
- Within a single response, it stops generating readable text and returns something senseless
Here is an example: the question was about the history of the USA, and at some point the model starts returning strange text.
import transformers
import torch

model_id = "meta-llama/Meta-Llama-3-8B"

# Load the model in bfloat16 and let accelerate place it on available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

pipeline("hi")
Why does it crash like this and not give a proper response?
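
One more data point: meta-llama/Meta-Llama-3-8B is the base checkpoint, which is not tuned for chat and has no chat template or end-of-turn training, so that alone might explain the degenerate output. Below is a sketch of a chat-style setup, assuming the meta-llama/Meta-Llama-3-8B-Instruct checkpoint and the pipeline's chat-message input format (untested on my side, not a confirmed fix):

import transformers
import torch

# Assumption: the Instruct checkpoint, which is trained for chat turns
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

pipe = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Chat-style input so the pipeline applies the Llama 3 chat template
messages = [{"role": "user", "content": "Tell me about the history of the USA."}]

outputs = pipe(
    messages,
    max_new_tokens=256,
    eos_token_id=[
        pipe.tokenizer.eos_token_id,
        pipe.tokenizer.convert_tokens_to_ids("<|eot_id|>"),  # Llama 3 end-of-turn marker
    ],
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"])

The <|eot_id|> terminator matters because Llama 3 Instruct marks the end of a turn with that token rather than only the generic EOS token, so omitting it can make generation run on past where the reply should stop.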