How does this model relate to the original miqu-1-70b?

#1
by cosmojg - opened

In other words, please explain how one might arrive at Miqu-6B-truthy starting from the model available here: https://huggingface.co/miqudev/miqu-1-70b

Or are these models unrelated?

Hi, to go from 70B down to 6B, you can keep just a subset of the layers (the opposite of a frankenmerge, which stacks layers to make a model bigger). That pruned model is https://huggingface.co/typeof/miqu-70b-6.
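To make "take a subset of the layers" concrete, here is a minimal sketch of what that could look like on a Llama-style state dict. It is an illustration, not the actual recipe behind miqu-70b-6: the helper name `select_layers` and the choice of which layers to keep are assumptions, since the thread does not say which layers were retained. The only convention relied on is the `model.layers.<index>.` key prefix used by Llama-family checkpoints in transformers.

```python
import re

# Llama-style checkpoints name decoder weights "model.layers.<index>.<rest>".
LAYER_KEY = re.compile(r"^model\.layers\.(\d+)\.(.+)$")


def select_layers(state_dict, keep):
    """Hypothetical helper: keep only the decoder layers listed in `keep`
    (in that order) and renumber them from 0. Non-layer weights such as
    embeddings, the final norm, and the LM head are copied unchanged."""
    new_index = {old: new for new, old in enumerate(keep)}
    pruned = {}
    for key, value in state_dict.items():
        match = LAYER_KEY.match(key)
        if match is None:
            pruned[key] = value  # not a decoder-layer weight
            continue
        old = int(match.group(1))
        if old in new_index:
            pruned[f"model.layers.{new_index[old]}.{match.group(2)}"] = value
    return pruned


# Toy stand-in for a real checkpoint: 8 layers, strings instead of tensors.
toy = {"model.embed_tokens.weight": "E"}
for i in range(8):
    toy[f"model.layers.{i}.self_attn.q_proj.weight"] = f"Q{i}"
toy["lm_head.weight"] = "H"

# Assumption for illustration only: keep the first two and last two layers.
small = select_layers(toy, keep=[0, 1, 6, 7])
```

With a real model you would also need to set `num_hidden_layers` in the config to match the number of layers kept before loading the pruned weights.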

I then fine-tuned it on this dataset: https://huggingface.co/datasets/jondurbin/truthy-dpo-v0.1

But it doesn't work so well, so I guess this was just an experiment.
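For anyone curious about the fine-tuning step: the thread doesn't say which training method was used, but the dataset name suggests Direct Preference Optimization (DPO), which trains on pairs of a preferred and a rejected answer. Assuming that, here is a rough sketch of the per-example DPO loss. The function name `dpo_loss` and the default `beta=0.1` are illustrative choices, not settings taken from this model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * margin)), where the margin
    is how much more the policy prefers the chosen answer over the rejected
    one, relative to a frozen reference model. Inputs are summed token
    log-probabilities of each full answer."""
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) == softplus(-x), written in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Policy identical to the reference: no preference learned yet.
baseline = dpo_loss(-12.0, -15.0, -12.0, -15.0)

# Policy has shifted probability toward the chosen answer: lower loss.
improved = dpo_loss(-10.0, -17.0, -12.0, -15.0)
```

When the policy equals the reference, the margin is zero and the loss is log 2; it drops as the policy favors the chosen answer more than the reference does.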
