I figured the model must have learned the data in somewhat different ways under each of the three training methods, so I decided to test whether merging them would give any generalization benefit. At the least, it doesn't seem to have hurt!
Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
---|---|---|---|---|---|
CosmoAlpacaLisa-0.3-1b | 23.79 | 51.61 | 40.25 | 29.97 | 36.41 |
CosmoAlpacaLight-1b | 24.28 | 51.31 | 40.33 | 29.47 | 36.35 |
# lisacosmo
This is a merge of pre-trained language models created using mergekit.
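A quick usage sketch with `transformers`; the repository id below is a guess based on this card's title and the other `Lambent/` models, so substitute the actual repo name.

```python
# Usage sketch; "Lambent/lisacosmo" is an assumed repo id, replace with the real one.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Lambent/lisacosmo"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Briefly explain what a model merge is.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```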
## Merge Details
### Merge Method
This model was merged using the DARE TIES merge method, with HuggingFaceTB/cosmo-1b as the base.
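For intuition, here is a rough, illustrative sketch of the DARE TIES idea on a single weight tensor. This is not mergekit's implementation; the function names and the simplified sign-election step are shorthand for the concept: DARE randomly drops elements of each fine-tune's delta from the base and rescales the survivors, then a TIES-style step keeps only the components whose signs agree with the elected majority.

```python
# Conceptual sketch only (not mergekit's code): DARE drop-and-rescale of task
# vectors, followed by a simplified TIES-style sign election and merge.
import torch

def dare_delta(finetuned: torch.Tensor, base: torch.Tensor, density: float) -> torch.Tensor:
    """Keep each element of the task vector with probability `density`, rescaled."""
    delta = finetuned - base
    if density >= 1.0:  # density 1.0, as in the config below, keeps every delta
        return delta
    mask = torch.bernoulli(torch.full_like(delta, density))
    return delta * mask / density  # rescale so the expected magnitude is preserved

def dare_ties_merge(base: torch.Tensor, deltas: list[torch.Tensor], weights: list[float]) -> torch.Tensor:
    """Weight the deltas, elect a majority sign per element, keep agreeing parts."""
    stacked = torch.stack([w * d for w, d in zip(weights, deltas)])
    elected_sign = torch.sign(stacked.sum(dim=0))
    agrees = torch.sign(stacked) == elected_sign
    return base + (stacked * agrees).sum(dim=0)
```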
### Models Merged
The following models were included in the merge:
* Lambent/CosmoAlpacaLisa-1b
* Lambent/CosmoAlpacaLisa-0.2-1b
* Lambent/CosmoAlpacaLight-1b
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: Lambent/CosmoAlpacaLisa-1b
    parameters:
      density: 1.0
      weight: 1.0
  - model: Lambent/CosmoAlpacaLisa-0.2-1b
    parameters:
      density: 1.0
      weight: 1.0
  - model: Lambent/CosmoAlpacaLight-1b
    parameters:
      density: 1.0
      weight: 1.0
merge_method: dare_ties
base_model: HuggingFaceTB/cosmo-1b
parameters:
  normalize: true
  int8_mask: false
dtype: float16
```
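For completeness, a sketch of reproducing the merge programmatically. It assumes mergekit's Python entry points (`MergeConfiguration`, `MergeOptions`, `run_merge`), which can shift between versions; running the `mergekit-yaml` CLI on the config above is the more usual route. The file and output paths are placeholders.

```python
# Sketch only: run the YAML config above through mergekit's Python API.
# Paths are placeholders; API details may vary by mergekit version.
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("lisacosmo.yml", encoding="utf-8") as fp:  # the configuration shown above
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./lisacosmo-merged",  # where the merged weights will be written
    options=MergeOptions(copy_tokenizer=True),
)
```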