gpt-omni/mini-omni · Multi-turn voice conversations

Sep 4

Hello 👏, congratulations on releasing such a groundbreaking model!

I'm interested in multi-turn voice conversations:

1. A user gives an initial instruction,
2. The model responds,
3. The user asks a follow-up question,
4. ...and so on.

I ran a quick test by concatenating the audio clips from the previous turns with a new instruction, but the model answered the first question (1) instead of the most recent one (3).
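
For reference, a concatenation test of this kind could look roughly like the sketch below. It uses only Python's standard `wave` module; the helper name `concat_wav` and the choice to work on in-memory WAV bytes are assumptions for illustration, not the exact script used or anything from the mini-omni repo.

```python
import io
import wave


def concat_wav(clips: list[bytes]) -> bytes:
    """Join WAV clips (given as bytes) end to end into one WAV file."""
    if not clips:
        raise ValueError("need at least one clip")
    frames = []
    fmt = None
    for clip in clips:
        with wave.open(io.BytesIO(clip), "rb") as reader:
            clip_fmt = (
                reader.getnchannels(),
                reader.getsampwidth(),
                reader.getframerate(),
            )
            # All clips must share channels, sample width, and sample rate.
            if fmt is None:
                fmt = clip_fmt
            elif clip_fmt != fmt:
                raise ValueError(f"format mismatch: {clip_fmt} != {fmt}")
            frames.append(reader.readframes(reader.getnframes()))
    out = io.BytesIO()
    with wave.open(out, "wb") as writer:
        writer.setnchannels(fmt[0])
        writer.setsampwidth(fmt[1])
        writer.setframerate(fmt[2])
        writer.writeframes(b"".join(frames))
    return out.getvalue()
```

The joined bytes can then be written to a file and passed to the demo as a single input clip.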

Does the demo/model support such use cases?
If not, what kind of modifications are necessary for the model to handle multi-turn conversations?

I appreciate your insights. Thanks!

gpt-omni

Owner Sep 4

Hi, Yuki. Currently, the model does not support multi-turn dialogue because it was trained only on single-turn dialogue datasets.

yumemio

Sep 5

Hi @gpt-omni, thanks for the clarification! Gotcha - I imagine training the model on a multi-turn dataset (and inserting an EOS token at the end of each turn?) would make it capable of handling follow-up questions.
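
To make the idea concrete, here is a rough sketch of how a multi-turn training sample might be flattened with an EOS token after every turn. Everything in it is an assumption for illustration: the special token ids, the `Turn` class, `build_sequence`, and the loss mask that supervises only the assistant turns are hypothetical and do not reflect mini-omni's actual vocabulary or training code.

```python
from dataclasses import dataclass

# Hypothetical special token ids, for illustration only.
EOS_ID = 0
USER_ID = 1
ASSISTANT_ID = 2


@dataclass
class Turn:
    role: str          # "user" or "assistant"
    tokens: list[int]  # already-tokenized audio/text ids for this turn


def build_sequence(turns: list[Turn]) -> tuple[list[int], list[int]]:
    """Flatten a dialogue into input ids plus a loss mask (1 = train on token)."""
    ids: list[int] = []
    mask: list[int] = []
    for turn in turns:
        role_id = USER_ID if turn.role == "user" else ASSISTANT_ID
        # Each turn is wrapped as: role marker, content tokens, EOS.
        piece = [role_id, *turn.tokens, EOS_ID]
        ids.extend(piece)
        # Assumption: only assistant content and its EOS contribute to the loss.
        train = 1 if turn.role == "assistant" else 0
        mask.extend([0] + [train] * (len(piece) - 1))
    return ids, mask


dialogue = [
    Turn("user", [10, 11]),
    Turn("assistant", [20]),
    Turn("user", [12]),
    Turn("assistant", [21, 22]),
]
ids, mask = build_sequence(dialogue)
```

With the per-turn EOS in place, the model would in principle see where each turn ends and could learn to answer the latest user turn rather than the first one.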

Closing the issue as resolved. Thanks again!

yumemio changed discussion status to closed Sep 5