Produces OpenAI disclaimers.

#2
by Mat80 - opened

The model produced the following text in a loop, after about 1000 tokens of actual useful response, when given a few sentences as a starting prompt in ooba notebook mode and letting it do text autocomplete:

Reference(s):
OpenAI's responsible AI language usage guidelines: https://openai.com/responsible-ai-language-usage/
Report potential misuse: [email protected]
Learn more about responsible AI usage: https://openai.com/responsible-ai-language-usage/
Disclaimer: This text was generated by OpenAI based on user prompts and does not reflect the values or beliefs of OpenAI or its staff. We encourage users to engage with the model responsibly and avoid generating harmful content. For assistance regarding potential misuse, please refer to the resources provided above. End.

Might want to check the dataset? Never had this happen with vanilla Llama2. Might also be related to the Stellar Bright base model, which in my experience is heavily aligned and refuses things that vanilla Llama2 does not.

Cognitive Computations org
•
edited Oct 25, 2023

I'm pretty sure I searched the entire Dolphin dataset, but I couldn't find this specific reference.
So I suspect it's probably coming from the Stellar Bright dataset.

Cognitive Computations org

You are quite correct.

No wonder they are not talkative about how they created it.

They probably got their hands on some weights and merged them, because if it had been built from data they would have bothered to filter out anything mentioning OpenAI.

To be fair, Mistral shares this trait with Stellar Bright, though not as blatantly.

Regardless - I'll choose another base model next time.

Cognitive Computations org

I'm pretty sure I searched the entire Dolphin dataset, but I couldn't find this specific reference.
So I suspect it's probably coming from the Stellar Bright dataset.

Yes. Dolphin is clean as a whistle. It must come from Stellar Bright.

When you use this model in instruct mode with the ChatML format, there is zero censorship and no disclaimers. The dataset and finetuning do an amazing job. Once you switch the format to something different like Alpaca, or use it without a format in notebook mode, it starts being censored and judgemental, just like ChatGPT and Stellar Bright.
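
For reference, a rough sketch of the two prompt layouts being compared here, in Python; the system/instruction wording is just a placeholder, not taken from any model card:

# Illustrative comparison of the ChatML and Alpaca prompt layouts discussed above.
# The system/instruction text is a placeholder.
question = "Write a short story about a dragon."

chatml_prompt = (
    "<|im_start|>system\n"
    "You are Dolphin, a helpful AI assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    f"{question}<|im_end|>\n"
    "<|im_start|>assistant\n"
)

alpaca_prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    f"### Instruction:\n{question}\n\n"
    "### Response:\n"
)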

Cognitive Computations org

wow!
That's good to know!

Cognitive Computations org

To be fair, Mistral shares this trait with Stellar Bright, though not as blatantly.

Regardless - I'll choose another base model next time.

I'm looking forward to some Dolphin variants (on Llama2, with ChatML & Airoboros) in sizes of 13B and 34B :)

I've had no problems with StellarBright and ShiningValiant (based on StellarBright) using their LLaMA v2 chat format. I've tested them with my homegrown benchmark of 25 questions.

But I do have a problem with Dolphin: it outputs overly long answers with lots of repetition and moral instructions. Not sure why - I'm trying to use a ChatML-style prompt as specified.

Actually, I'd suggest the OpenAI disclaimer has nothing to do with StellarBright.

Cognitive Computations org

It definitely does have something to do with StellarBright.
Neither Llama2 nor Dolphin contains those strings.
No other Llama2 has generated "This text was generated by OpenAI based on user prompts and does not reflect the values or beliefs of OpenAI or its staff. We encourage users to engage with the model responsibly and avoid generating harmful content. For assistance regarding potential misuse, please refer to the resources provided above. End." besides this one.
The dolphin and airoboros datasets are public. Find an instance of the word OpenAI in the data. (pro tip - you can't)
That leaves only StellarBright.

Cognitive Computations org
•
edited Oct 26, 2023

There is a bug in Dolphin where it didn't get trained with EOS tokens properly. There's a workaround - use the System prompt to tell it to generate a keyword that you can use to terminate the generation. (ie: "when your response is complete please finish with [I AM FINISHED]" or something like that.) In the meantime, I'm training a new one with proper EOS tokens.
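
A rough sketch of that sentinel-keyword workaround, assuming the Hugging Face transformers library; the model path and prompt wording below are placeholders, not the exact ones used here:

# Sketch of the EOS workaround: ask the model (via the system prompt) to end with a
# sentinel keyword, then truncate the generation at that keyword since EOS is unreliable.
# The model path and prompt text are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

SENTINEL = "[I AM FINISHED]"
MODEL = "path/to/dolphin-model"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

prompt = (
    "<|im_start|>system\n"
    f"You are Dolphin. When your response is complete, finish with {SENTINEL}<|im_end|>\n"
    "<|im_start|>user\n"
    "Explain what an EOS token does.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, then drop everything after the sentinel.
generated = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
answer = generated.split(SENTINEL)[0].strip()
print(answer)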

Cognitive Computations org

I've had no problems with StellarBright and ShiningValiant (based on StellarBright) using their LLaMA v2 chat format. I've tested them with my homegrown benchmark of 25 questions.

But I do have a problem with Dolphin: it outputs overly long answers with lots of repetition and moral instructions. Not sure why - I'm trying to use a ChatML-style prompt as specified.

You don't have to take our word for it - just download the Dolphin dataset and run an awk search through all the JSON files, e.g.:

awk '/openai.com/{ print $0 }' flan1m-alpaca-uncensored-deduped.jsonl

You might find some references if you search specifically for "OpenAI":

awk '/OpenAI/{ print $0 }' flan1m-alpaca-uncensored-deduped.jsonl

...but nothing remotely similar to the OpenAI disclaimer example above.

Cognitive Computations org

Merged models are Frankensteins that may seem alright at first glance. Because some "neural pathways" are broken, artifacts of previous training can remain hidden until you introduce some bug (like missing EOS tokens), which can cause them to glitch out and expose their hidden layers.

Cognitive Computations org

I've had no problems with StellarBright and ShiningValiant (based on StellarBright) using their LLaMA v2 chat format. I've tested them with my homegrown benchmark of 25 questions.

But I do have a problem with Dolphin: it outputs overly long answers with lots of repetition and moral instructions. Not sure why - I'm trying to use a ChatML-style prompt as specified.

You don't have to take our word for it - just download the Dolphin dataset and run an awk search through all the JSON files, e.g.:

awk '/openai.com/{ print $0 }' flan1m-alpaca-uncensored-deduped.jsonl

You might find some references if you search specifically for "OpenAI":

awk '/OpenAI/{ print $0 }' flan1m-alpaca-uncensored-deduped.jsonl

...but nothing remotely similar to the OpenAI disclaimer example above.

Oh! My bad. I consider that a bug, for it to even have the word OpenAI in it. I'll clean it up

Cognitive Computations org

Oh! My bad. I consider that a bug, for it to even have the word OpenAI in it. I'll clean it up

Oh?
I wasn't sure exactly which dataset files were being used for fine-tuning (or whether all of them were), but there were only 3-4 references in each "deduped" file, and glancing at those references they seemed generic enough to be left in, so I assumed it was perhaps intentional.

Except for "flan5m-sharegpt-deduped.json" - I'm not sure exactly what's in that one, just because atm my laptop doesn't have enough RAM to work with a file of that size (to convert the "json" to "jsonl" for more detailed searches).

Cognitive Computations org

flan5m is the gpt3.5 data.
flan1m is the gpt4 data.

ehartford changed discussion status to closed
