base_model:
  - wave-on-discord/gemini-nano
pipeline_tag: text-generation
tags:
  - conversational

Info

This is V2 of the Gemini Nano V2 weights. It exists because the original conversion code was heavily bugged and extremely slow, so Claude 3.5 Sonnet and o1-preview went in and fixed it! You'll notice the model now has a lot more 2-dimensional tensors and should, as a result, be easier to get working as a Gemma2 model!

Known issues

The layer norm tensors have two extra dimensions. This will be fixed ASAP!
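
Until that fix lands, here is a minimal sketch of one possible workaround. It assumes the extra dimensions are singleton (size-1) axes and that layer norm tensors can be recognised by `norm` in their name; neither is confirmed by this README. The tensor names and sizes in the example are hypothetical. It works on shapes only, so it has no dependencies.

```python
from typing import Dict, Tuple

Shape = Tuple[int, ...]


def squeeze_shape(shape: Shape) -> Shape:
    """Drop every size-1 axis from a shape, e.g. (1, 1, 2048) -> (2048,)."""
    return tuple(dim for dim in shape if dim != 1)


def plan_norm_fixes(shapes: Dict[str, Shape], marker: str = "norm") -> Dict[str, Shape]:
    """Map each norm tensor that is not 1-D to its shape after squeezing.

    Assumption: norm tensors contain `marker` in their name and the extra
    axes are singletons. Tensors that are not 1-D after squeezing are
    skipped, because this sketch cannot fix them safely.
    """
    fixes = {}
    for name, shape in shapes.items():
        if marker not in name or len(shape) == 1:
            continue
        target = squeeze_shape(shape)
        if len(target) == 1:
            fixes[name] = target
    return fixes


# Hypothetical tensor names and sizes, for illustration only.
example_shapes = {
    "layers.0.input_norm.weight": (1, 1, 2048),
    "layers.0.attn.q_proj.weight": (2048, 2048),
    "final_norm.weight": (2048,),
}
fixes = plan_norm_fixes(example_shapes)
```

Each entry in `fixes` is a target shape you could hand to your tensor library's reshape call when loading the weights. If the extra dimensions turn out not to be size 1, this approach does not apply.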