Translations are getting cut short
Hey! Do you have any recommendations on how to get this to translate an entire string? I was under the impression that this is limited to 128 tokens, but it seems as though even multi-sentence strings under that limit are causing it to stop responding.
chat input from user: আপনার সুদের হার সম্পর্কে আপনি আমাকে কি বলতে পারেন?
detected language: bn
translated chat input: Can you tell me about your interest rate?
response from llm: I'm sorry, but I cannot provide information on interest rates. My purpose is to help with translations and language-related tasks. If you have any questions or need assistance related to translation, please feel free to ask!
translated response from llm: আমি দুঃখিত, কিন্তু আমি সুদের হারের তথ্য দিতে পারছি না । আমার
How are you running the model? You might need to set max_new_tokens to make sure the translation is not cut short.
That said, I'm not sure if the model was trained on inputs with multiple sentences. You may want to break the input and translate each sentence separately.
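If you want to try the per-sentence route, here is a rough sketch of what I mean. The names `split_sentences` and `translate_by_sentence` are made up for illustration, the regex is a naive splitter (it will trip on abbreviations such as "e.g."), and `translate_fn` stands in for whatever callable wraps your model call.

```python
import re
from typing import Callable, List

# Naive boundary: whitespace that follows ., !, ? or the Bengali danda.
# Assumption: good enough for short chat messages, not a real segmenter.
_SENTENCE_END = re.compile(r"(?<=[.!?।])\s+")


def split_sentences(text: str) -> List[str]:
    """Split text into sentences, dropping empty fragments."""
    return [part.strip() for part in _SENTENCE_END.split(text) if part.strip()]


def translate_by_sentence(text: str, translate_fn: Callable[[str], str]) -> str:
    """Translate each sentence on its own and join the results with spaces."""
    return " ".join(translate_fn(sentence) for sentence in split_sentences(text))
```

You would call it as `translate_by_sentence(llm_response, my_translate)`, where `my_translate` is your own function that takes one sentence and returns its translation. Each call then sees a single sentence, so a short output budget is less likely to be the thing that truncates the result.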
I've tried a few different ways now, including setting max_new_tokens. Unfortunately none of those attempts resulted in a more promising outcome. I may come back to this another time, but I need to try another model for now. Thanks!
I dunno exactly what I did differently this time with another max_new_tokens attempt, but this DOES work:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

class Translator:
    def __init__(self, model_name_or_path: str = 'google/madlad400-3b-mt') -> None:
        self.model = AutoModelForSeq2SeqLM.from_pretrained(model_name_or_path, device_map="auto")
        self.tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)

    def translate(self, input_text: str, target_language: str = "en") -> str:
        # Prefix the input with the target-language tag, e.g. "<2en>".
        inputs = self.tokenizer(f"<2{target_language}> {input_text}", return_tensors="pt").to(self.model.device)
        # max_new_tokens has to be a positive integer.
        output_ids = self.model.generate(**inputs, max_new_tokens=4000)
        output = self.tokenizer.batch_decode(output_ids, skip_special_tokens=True)
        return output[0]  # one input string, so one decoded string
Ohhh, that explains it! -1 is not a valid max_new_tokens value.
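For anyone who lands here with the same problem, a small guard like this would have caught it for me much sooner. It is only a sketch: `checked_max_new_tokens` is a hypothetical helper of my own, not something transformers provides, and it just refuses anything that is not a positive integer before the value ever reaches `generate`.

```python
def checked_max_new_tokens(value: int) -> int:
    """Return value unchanged if it is a positive integer, else raise ValueError."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError(f"max_new_tokens must be a positive integer, got {value!r}")
    return value
```

Used as `model.generate(**inputs, max_new_tokens=checked_max_new_tokens(limit))`, a stray -1 fails loudly instead of quietly producing a truncated translation.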