Question about long-text input

#1
by chingyu-research - opened

Thank you for the timely work on applying deep-learning techniques to sentiment analysis. It is a huge improvement over traditional rule-based methods. However, I have run into a problem when running sentiment analysis on long text, and I would appreciate your suggestions.

For instance, the T5 model has a maximum input length of 512 tokens, but articles can exceed that limit. Truncating the input can cut off the part of the prompt that specifies the sentiment analysis task, so the model often generates a sentence instead of a sentiment value. Could you kindly provide insights on how to address this issue? Thank you.

Owner

Thanks for your question.

The input length of the current model version is determined by its base model. You could try techniques such as the one in https://arxiv.org/abs/2402.10171 to extend the input length. We are also considering additional base LLMs that accept more input tokens.
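If extending the context is not an option, a simpler workaround (not the method from the paper linked above) is to split the article into chunks and repeat the task instruction in front of every chunk, so truncation never removes it. The sketch below is only an illustration under stated assumptions: the function name `chunk_with_prompt`, the example prompt, and the `reserve` parameter are hypothetical, and whitespace-separated words stand in for real tokens. With a Hugging Face tokenizer you could pass something like `lambda s: tokenizer.encode(s, add_special_tokens=False)` as `encode` and `tokenizer.decode` as `decode`, although subword counts may shift slightly when the prompt and chunk are joined and re-tokenized.

```python
from typing import Callable, List, Sequence


def chunk_with_prompt(
    prompt: str,
    text: str,
    max_tokens: int = 512,
    encode: Callable[[str], Sequence] = str.split,    # stand-in tokenizer (assumption)
    decode: Callable[[Sequence], str] = " ".join,     # inverse of the stand-in tokenizer
    reserve: int = 1,                                 # room for an end-of-sequence token
) -> List[str]:
    """Split `text` so that prompt + chunk stays within `max_tokens`."""
    prompt_len = len(encode(prompt))
    budget = max_tokens - prompt_len - reserve
    if budget <= 0:
        raise ValueError("prompt alone exceeds the token limit")

    tokens = encode(text)
    # Consecutive, non-overlapping windows of at most `budget` tokens.
    chunks = [tokens[i:i + budget] for i in range(0, len(tokens), budget)]
    # Every model input starts with the full task instruction.
    return [f"{prompt} {decode(chunk)}" for chunk in chunks]


# Hypothetical prompt; use whatever instruction template the model expects.
prompt = "Task: classify the sentiment of the following text."
article = " ".join(["word"] * 1200)
inputs = chunk_with_prompt(prompt, article, max_tokens=512)
print(len(inputs))  # 3
```

Each chunk then yields its own prediction, so the per-chunk results still need to be combined into one label for the whole article, for example by majority vote. That aggregation step is a design choice and is not shown here.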

Since Emot5-large is based on google flan t5 large, I assumed the following limits applied:
Maximum input sequence length - 2048
Maximum output sequence length - 512
These figures are mentioned in https://github.com/google-research/FLAN/issues/36

After running the Emo t5 model, I found that its maximum input length is actually 512 :-)
