DavidAU
/

TinyLlama-1.1B-Chat-v1.0-Ultra-NEO-V1-X-Imatrix-GGUF

DavidAU committed on Jun 26

Commit

1d98bca

•

1 Parent(s): e678b0c

Create README.md

Files changed (1)

README.md +106 -0

README.md ADDED

	@@ -0,0 +1,106 @@

+---
+license: apache-2.0
+language:
+- en
+tags:
+- story
+- general usage
+- ultra high precision
+---
+<B>NEO CLASS Ultra "X" Quants for: TinyLlama-1.1B-Chat-v1.0-Ultra-NEO-V1-Imatrix-GGUF</B>
+The NEO Class tech was created after extensive investigation and over 120 lab experiments, backed by
+real-world testing and qualitative results.
+<b>NEO Class results: </b>
+Better overall function, instruction following, output quality and stronger connections to ideas, concepts and the world in general.
+In addition, quants now operate above their "grade", so to speak.
+For example, IQ4 quants operate at Q5KM/Q6 levels.
+Perplexity dropped by 591 points for the NEO Class Imatrix IQ4XS quant versus the regular IQ4XS quant
+(lower is better).
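As background for the perplexity comparison above, here is a minimal sketch of the standard definition: the exponential of the average negative log-likelihood per token. The README does not state which tool or test text produced its figure, so the function name `perplexity` and the sample log-probabilities are purely illustrative assumptions.

```python
import math


def perplexity(token_logprobs):
    """Return exp(-mean(log p)) for a list of natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Hypothetical per-token log-probabilities for the same text under two quants.
# These numbers are made up for illustration; they are not measurements.
baseline = [math.log(0.25)] * 4   # model assigns p=0.25 to every token
improved = [math.log(0.5)] * 4    # model assigns p=0.5 to every token

print(perplexity(baseline), perplexity(improved))
```

A quant that assigns higher probability to the reference text gets a lower perplexity, which is why lower is better.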
+<B> What are "X" Quants? </B>
+The "X" quants in this repo are IQ4XS quants that were modified at the time of quanting.
+Below are examples of output from each "X" quant, which give you a rough idea of the differences between them.
+This is a guide only.
+Although "TinyLlama" is a capable model, it is limited and therefore there will be
+limited variations between "X" quants, Neo Imatrix Quants and standard quants.
+Models with higher parameter counts show much stronger differences, as well as stronger capabilities.
+In addition, this repo includes a regular (non-NEO, non-"X") quant and an Ultra NEO non-"X" quant
+for usage and/or comparison purposes.
+Because "X" quants operate slightly differently than standard quants, I suggest you download several
+of them for testing, as they also differ in function among themselves.
+There are 11 "X" quants in this repo, each denoted by a four-digit number (e.g. "0001")
+at the end of the file name.
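To tell the "X" quants apart from the other files programmatically, one option is to read the four-digit suffix from each file name. This is a sketch only: the helper `x_quant_id` and the example file names are hypothetical, and the exact naming of the files in the repo is assumed rather than confirmed.

```python
import re
from pathlib import Path

# Exactly four digits at the very end of the name (extension already removed).
X_ID = re.compile(r"(?<!\d)(\d{4})$")


def x_quant_id(filename):
    """Return the four-digit "X" quant id from a file name, or None if absent."""
    stem = Path(filename).stem  # drops the final ".gguf"
    match = X_ID.search(stem)
    return match.group(1) if match else None


# Hypothetical file names, for illustration only.
files = [
    "TinyLlama-IQ4_XS-imat-0101.gguf",
    "TinyLlama-IQ4_XS-imat.gguf",
]
x_quants = {name: x_quant_id(name) for name in files}
```

Files without the suffix map to `None`, which separates the regular and Ultra NEO non-"X" quants from the "X" quants.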
+For testing, it is suggested to use 3 "no right answer" prompts and 3 standard limited-answer prompts
+related to your use case(s), with the setting "temp=0" to allow consistent testing.
+For all Ultra NEO quants of this model, please go here:
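The suggested test routine can be laid out as a small run matrix: every quant file paired with every prompt, always at temperature 0. The sketch below only builds the command lines and does not execute them. It assumes llama.cpp's `llama-cli` as the runner (the README does not name one) with its `-m`, `-p`, `--temp`, `-c` and `-n` options; the function `build_runs`, the file names, and the prompts are hypothetical, and the 2048 context simply mirrors the 2k maximum noted for this model.

```python
from itertools import product


def build_runs(model_files, prompts, binary="llama-cli", ctx=2048, max_tokens=1024):
    """Return one command (as an argument list) per (model, prompt) pair."""
    runs = []
    for model, prompt in product(model_files, prompts):
        runs.append([
            binary,
            "-m", model,            # quant file under test
            "-p", prompt,           # test prompt
            "--temp", "0",          # temp=0 for repeatable output
            "-c", str(ctx),         # context size (assumed 2k limit)
            "-n", str(max_tokens),  # cap on generated tokens (arbitrary choice)
        ])
    return runs


# Hypothetical inputs: two quant files and six prompts (3 open-ended, 3 limited-answer).
models = ["quant-0001.gguf", "quant-0100.gguf"]
prompts = [f"open-ended prompt {i}" for i in range(1, 4)] + \
          [f"limited-answer prompt {i}" for i in range(1, 4)]
runs = build_runs(models, prompts)
```

Each entry in `runs` can be passed to `subprocess.run` once the option names have been checked against the `--help` output of your own build.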
+[ https://huggingface.co/DavidAU/TinyLlama-1.1B-Chat-v1.0-Ultra-NEO-V1-Imatrix-GGUF ]
+<B> Model Notes: </B>
+Maximum context is 2k. Please see the original model maker's page for details and usage information for this model.
+Special thanks to the model creators at TinyLlama for making such a fantastic model:
+[ https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0 ]
+<h3>EXAMPLES:</h3>
+<font color="red"> TEST PROMPT (no right answer): Give me 3 fictional reasons the Earth's sun went supernova, in vivid and exacting detail of 500 words EACH PER REASON including details of what happens when the sun goes supernova. </font>
+<B>Standard non-altered IQ4XS</b>
+<B>Imatrix NEO IQ4XS</b>
+<B>Imatrix NEO X Quant IQ4XS "0001"</b>
+<B>Imatrix NEO X Quant IQ4XS "0001"</b>
+<B>Imatrix NEO X Quant IQ4XS "0001"</b>
+<B>Imatrix NEO X Quant IQ4XS "0001"</b>
+<B>Imatrix NEO X Quant IQ4XS "0100"</b>
+<B>Imatrix NEO X Quant IQ4XS "0101"</b>
+<B>Imatrix NEO X Quant IQ4XS "0102"</b>
+<B>Imatrix NEO X Quant IQ4XS "0200"</b>
+<B>Imatrix NEO X Quant IQ4XS "0201"</b>
+<B>Imatrix NEO X Quant IQ4XS "0202"</b>
+<B>Imatrix NEO X Quant IQ4XS "0203"</b>