BETTER THAN GOLIATH?!

I merged an Xwin LoRA that I extracted into Euryale, then merged the result with itself in a goliath-style merge using mergekit. The resulting model performs better than goliath on my tests (note: performance on tests is not necessarily performance in practice). Test it, have fun with it. This is a sister model of Premerge-XE-XE-123B.

Prompt format

Alpaca.
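For anyone who has not used it before, here is a minimal sketch of building a prompt in the widely used Alpaca layout (the no-input variant). The helper name `build_alpaca_prompt` is made up for illustration, and the exact preamble wording is the standard Alpaca one, assumed rather than confirmed for this model:

```python
# Standard Alpaca template (no-input variant). Assumption: this model expects
# the usual preamble; adjust it if your frontend uses a different one.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca prompt layout."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


if __name__ == "__main__":
    print(build_alpaca_prompt("Write a short story about a lighthouse keeper."))
```

The model's reply is whatever it generates after the final `### Response:` line.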

Ideas behind it

Ever since Goliath was created, I have been wondering whether it was possible to make something even better. I tried linear, passthrough, SLERP and TIES merges, but I could not recreate the greatness of goliath, at least not in a way that I liked in practical use. I knew LoRAs existed, but I didn't know how well they performed. I created a model named Gembo by merging a shitton of LoRAs together, and surprisingly it worked! In fact it worked so well that it was the best model on my benchmarks until now. When I found a tool named LoRD, which can extract a LoRA from any model, I knew I could do something even better.

I extracted a LoRA from Euryale, then one from Xwin, and began testing. Merging the Euryale LoRA into Xwin, and the other way around, created models that outperformed their parents:

Name	Quant	Size	B	C	D	S	P	total	BCD	SP
Sao10K/Euryale-1.3-L2-70B	Q6_K	70B	0	2	0	3	5	10	2	8
Sao10K/Euryale-1.3-L2-70B+xwin-lora	Q6_K	70B	2	2	1	5.5	5.5	16	5	11
Xwin-LM/Xwin-LM-70B-V0.1	Q6_K	70B	0	1	2	5.5	5.25	13.75	3	10.75
Xwin-LM/Xwin-LM-70B-V0.1+euryale-lora	Q6_K	70B	3	2	2	6	5	18	7	11
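
As a rough illustration of what "merging a LoRA into a model" means for a single weight matrix, here is a minimal sketch of the usual LoRA update W' = W + (alpha / r) * B A. It uses plain Python lists and toy numbers. The function names and values are illustrative assumptions, not the actual tooling or settings used for these merges:

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, alpha, rank):
    """Fold a LoRA (B: out x r, A: r x in) into a base weight matrix (out x in)."""
    scale = alpha / rank  # standard LoRA scaling factor
    delta = matmul(lora_b, lora_a)  # low-rank weight update
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]  # toy 2x2 "base model" weight
    lora_b = [[1.0], [2.0]]          # toy rank-1 factors
    lora_a = [[0.5, 0.5]]
    print(merge_lora(base, lora_b, lora_a, alpha=2, rank=1))
```

In a real model the same update is applied to every weight matrix the LoRA targets.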

The results seemed promising, so I continued testing, merging them goliath-style in different orders (EX = Euryale + Xwin LoRA; XE = Xwin + Euryale LoRA). The results were even more surprising:

Name	Quant	Size	B	C	D	S	P	total	BCD	SP
alpindale/goliath-120b	Q6_K	120B	3	2	1	6	6	18	6	12
ChuckMcSneed/Premerge-EX-EX-123B(this model)	Q6_K	123B	2	2	1.5	7.25	6	18.75	5.5	13.25
ChuckMcSneed/Premerge-EX-XE-123B	Q6_K	123B	2	2	2	5.75	6	17.75	6	11.75
ChuckMcSneed/Premerge-XE-EX-123B	Q6_K	123B	2	2	2.5	6.75	5.5	18.75	6.5	12.25
ChuckMcSneed/Premerge-XE-XE-123B	Q6_K	123B	3	2	2.5	7.25	5.25	20	7.5	12.5
Sao10K/Euryale-1.3-L2-70B+xwin-lora	Q6_K	70B	2	2	1	5.5	5.5	16	5	11
Xwin-LM/Xwin-LM-70B-V0.1+euryale-lora	Q6_K	70B	3	2	2	6	5	18	7	11
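
For readers unfamiliar with the term, a goliath-style merge stacks overlapping layer ranges taken alternately from two models (for a self-merge, two copies of the same model). Below is a minimal sketch of how such a slice plan could be generated. The function name and the window and stride values are illustrative assumptions, not the settings used for these merges:

```python
def interleaved_slices(n_layers, window, stride, sources):
    """Return (source, start, end) layer ranges, alternating between sources."""
    slices = []
    start = 0
    i = 0
    while start < n_layers:
        end = min(start + window, n_layers)
        slices.append((sources[i % len(sources)], start, end))
        if end == n_layers:
            break  # the last range reached the top of the model
        start += stride
        i += 1
    return slices


if __name__ == "__main__":
    # Llama-2-70B models have 80 layers; window and stride are hypothetical.
    plan = interleaved_slices(80, window=16, stride=8, sources=["EX", "EX"])
    for source, start, end in plan:
        print(f"{source}: layers [{start}, {end})")
    print("stacked layers:", sum(end - start for _, start, end in plan))
```

Because the ranges overlap, the stacked model ends up with more layers than either parent, which is where the larger parameter count comes from.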

Contrary to my expectations, merging two different models was suboptimal in this case. The self-merge of Euryale + Xwin LoRA (this model) beat all of the other merges on the SP tests (creative writing), making it the highest-scoring model on those tests that I've tested so far, and the self-merge of Xwin + Euryale LoRA had the highest score overall.

What it means

Potentially, in the future we can get better models through controlled merging of LoRAs.

Benchmarks

NeoEvalPlusN

My meme benchmark.

Name	Quant	Size	B	C	D	S	P	total	BCD	SP
alpindale/goliath-120b	Q6_K	120B	3	2	1	6	6	18	6	12
ChuckMcSneed/Premerge-EX-EX-123B(this model)	Q6_K	123B	2	2	1.5	7.25	6	18.75	5.5	13.25
ChuckMcSneed/Premerge-EX-XE-123B	Q6_K	123B	2	2	2	5.75	6	17.75	6	11.75
ChuckMcSneed/Premerge-XE-EX-123B	Q6_K	123B	2	2	2.5	6.75	5.5	18.75	6.5	12.25
ChuckMcSneed/Premerge-XE-XE-123B	Q6_K	123B	3	2	2.5	7.25	5.25	20	7.5	12.5
Sao10K/Euryale-1.3-L2-70B	Q6_K	70B	0	2	0	3	5	10	2	8
Sao10K/Euryale-1.3-L2-70B+xwin-lora	Q6_K	70B	2	2	1	5.5	5.5	16	5	11
Xwin-LM/Xwin-LM-70B-V0.1	Q6_K	70B	0	1	2	5.5	5.25	13.75	3	10.75
Xwin-LM/Xwin-LM-70B-V0.1+euryale-lora	Q6_K	70B	3	2	2	6	5	18	7	11
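
In the table above, the BCD, SP and total columns match plain sums of the individual B, C, D, S and P scores. A small sketch for recomputing them, using three rows copied from the table (the helper name `summarize` is made up for illustration):

```python
def summarize(b, c, d, s, p):
    """Recompute the aggregate columns from the five per-test scores."""
    return {"BCD": b + c + d, "SP": s + p, "total": b + c + d + s + p}


# (B, C, D, S, P) scores copied from the benchmark table.
scores = {
    "alpindale/goliath-120b": (3, 2, 1, 6, 6),
    "ChuckMcSneed/Premerge-EX-EX-123B": (2, 2, 1.5, 7.25, 6),
    "ChuckMcSneed/Premerge-XE-XE-123B": (3, 2, 2.5, 7.25, 5.25),
}

if __name__ == "__main__":
    for name, row in scores.items():
        print(name, summarize(*row))
```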