Excellent job! Please share more details.
BEAUTIFUL!
It is very powerful. Let's see how much further it can go. Would you mind sharing more about the model?
- any data involved?
- maybe the code :D
- how did you actually make it? At least some more detailed information.
- what's the order of the experts?
- anything in particular you'd like to share regarding the UNA models that are part of it?
I did some basic benchmarking and it scores very well, at least on GSM8K, ARC, and TruthfulQA. I'll test what happens if it's UNAfied again and let you know.
Feel free to reach out on Discord or X.
@fblgit
Thank you so much for showing interest in my model.
I remember seeing your profile when I first entered the field of LLM research.
I've been inspired by your work for a long time, and I'm really happy to be in contact with you now.
To explain the details: I assembled the most promising models using the MoE (Mixture of Experts) approach, assigning a weight to each expert based on its benchmark performance to calculate scores.
Initially, I calculated the expected output for the four models that seemed to have the highest benchmark performance. Based on their strengths, I wrote positive prompts for each of them somewhat arbitrarily and combined them using the MoE approach.
However, there were some datasets I hadn't paid much attention to, and looking at the scores today, it seems the performance came out a bit poor on those overlooked datasets.
It looks like I need to experiment in more detail.
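For reference, the process described above matches the usual `mergekit-moe` workflow: each expert model gets a set of positive prompts that steer the router toward it. Here is a minimal sketch of such a config — the model names, prompts, and gate settings below are hypothetical placeholders, not the actual recipe used for this model:

```yaml
# Hypothetical mergekit-moe config sketch (not the author's actual recipe).
base_model: org/base-instruct-model        # placeholder base model
gate_mode: hidden                          # route using hidden-state similarity to the prompts
dtype: bfloat16
experts:
  - source_model: org/math-expert-model    # placeholder expert
    positive_prompts:
      - "solve this math problem"
  - source_model: org/reasoning-expert-model
    positive_prompts:
      - "reason step by step"
```

Assuming mergekit is installed, a config like this would typically be built with `mergekit-moe config.yml ./output-model`.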
I'm really happy to be in touch with you.
Could I please see how you created the MoE? The code you used to do it?
I'm very interested in doing similar things with other models, like Phi-2 for instance!
Thanks for the kind words; happy to see such a great and beautiful effect on the community. Welcome to OneManArmy :)
Keep rocking! If you could release a 2-expert MoE, that would be nice :)
@fblgit Sure! I'm working on pre-training a new Korean language model, so I'll try it after that is finished. :)
@fblgit I'm working on an x2 MoE of Nous Hermes 2 SOLAR.
@dillfrescott Great! It's really lit!
Could you share your results?