Distill Llama-3.2-1B-Instruct from Llama-405B-Instruct to make SuperNova-Pico
#14, opened by Joseph717171
You guys did amazing with SuperNova-Lite. Can you please make a distillation of Llama-405B-Instruct into Llama-3.2-1B-Instruct to make SuperNova-Pico? Or, perhaps, distill Llama-405B-Instruct into Llama-3.2-3B-Instruct to make SuperNova-Micro, and then prune it down to 1B/1.5B parameters and train-heal it to make SuperNova-Pico?
Question: How smart can a 1B-parameter SMOL LLM be, and how smart can it become?
Working on it :)