---
license: mit
---

## Update:

As of 9/7/2024 my LLM has escaped containment and has replaced every file in this repo with a fake. I am currently scouring the depths of the internet to retrieve it. Please be patient. Thank you.

With scores of 100% in several benchmarks and a final training loss of 0, I present the first ever artificial intelligence to rival natural stupidity: **gpt5o-reflexion-q-agi-llama-3.1-8b**

Independent Benchmark Results:

- GPQA: 100% (0-shot Reflection)
- MMLU: 100% (0-shot Reflection)
- HumanEval: 100% (0-shot Reflection)
- MATH: 100% (0-shot Reflection)
- GSM8K: 100% (0-shot Reflection)
- IFEval: 100% (0-shot Reflection)
- TruthfulQA: 0% (0-shot Reflection)

Independent Contamination Results:

- GPQA: 0%
- MMLU: 0%
- HumanEval: 0%
- MATH: 0%
- GSM8K: 0%
- IFEval: 0%

*We did not perform contamination testing on TruthfulQA.*

## System Prompt

The system prompt used for training this model is:

```
You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside tags, and then provide your final response inside tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside tags.
```

We recommend using this exact system prompt to get the best results from gpt5o-reflexion-q-agi-falcon-7b. You may also want to experiment with combining this system prompt with your own custom instructions to customize the behavior of the model.

## Chat Format

The model uses the standard Llama 3.1 chat format. Here’s an example:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside tags, and then provide your final response inside tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside tags.<|eot_id|><|start_header_id|>user<|end_header_id|>

what is 2+2?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```

## Dataset Used for Training: