This is a d-Matrix functional reference of the OPT model family. Revisions include opt-125m, opt-350m, opt-1.3b, opt-2.7b, and opt-6.7b, the sizes covered in the evaluation results below.

The reference provides the following functional configurations:

Configuration  Explanation
BASELINE       a reference functionally equivalent to the original model
BASIC          all linear algebraic operands quantized to BFP16-64, and all other operations transformed to approximated kernel simulations
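
To make the BASIC row more concrete, here is a minimal sketch of block floating point quantization. It assumes that BFP16-64 means blocks of 64 values sharing one power-of-two exponent, with 8-bit signed mantissas per value. That reading, the function name `bfp_quantize`, and its parameters are illustrative assumptions, not the d-Matrix implementation.

```python
import math


def bfp_quantize(values, block_size=64, mantissa_bits=8):
    """Quantize a flat list of floats block by block with a shared exponent.

    Assumed format (illustrative only): each block of `block_size` values
    shares one power-of-two scale, and each value keeps a signed integer
    mantissa of `mantissa_bits` bits.
    """
    q_max = 2 ** (mantissa_bits - 1) - 1  # largest signed mantissa, e.g. 127
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        max_abs = max(abs(v) for v in block)
        if max_abs == 0.0:
            # An all-zero block needs no scale.
            out.extend(0.0 for _ in block)
            continue
        # Smallest power-of-two scale such that max_abs / scale <= q_max.
        exponent = math.ceil(math.log2(max_abs / q_max))
        scale = 2.0 ** exponent
        for v in block:
            # Round to the nearest mantissa and clamp to the signed range.
            mantissa = max(-q_max, min(q_max, round(v / scale)))
            out.append(mantissa * scale)
    return out
```

Because the scale is set by the largest magnitude in each block, a single outlier coarsens the grid for every other value in that block. For example, `bfp_quantize([1000.0, 0.001], block_size=2)` keeps the large value and rounds the small one to zero.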

Usage

Install d-Matrix Dmx_Compressor first.

pip install dmx_compressor

The following example loads a model and evaluates its perplexity.

from dmx.compressor.dmx import pipeline

pipe = pipeline(
    task="text-generation",
    model="d-matrix/opt",
    revision="opt-125m",  # see above for other variants
    dmx_config="BASELINE",  # see above for other variants
)

results = pipe.evaluate(
    metric="d-matrix/dmx_perplexity",
    dataset="wikitext",
    dataset_version="wikitext-2-raw-v1",
)
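
For readers unfamiliar with the metric, the sketch below shows the standard definition of perplexity as the exponential of the mean negative log-likelihood per token. It is not taken from the `d-matrix/dmx_perplexity` implementation, whose details (tokenization, context windowing) may differ. The function name `perplexity` and its input format are assumptions made for illustration.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the mean negative log-likelihood per token.

    `token_log_probs` is a list of natural-log probabilities that a model
    assigned to each observed token (a hypothetical input format).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)
```

A model that assigns probability 0.25 to every token has a perplexity of 4 under this definition; lower values mean the model is less "surprised" by the text.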

Evaluation results

  • perplexity on penn_treebank

    Revision \ Configuration  BASELINE            BASIC
    opt-125m                  29.496986389160156  29.628690719604492
    opt-350m                  23.57796859741211   23.683700561523438
    opt-1.3b                  15.616923332214355  15.879881858825684
    opt-2.7b                  13.993170738220215  14.005770683288574
    opt-6.7b                  12.166489601135254  12.196784019470215
  • perplexity on wikitext2

    Revision \ Configuration  BASELINE            BASIC
    opt-125m                  27.661212921142578  27.786727905273438
    opt-350m                  22.00566291809082   22.00930404663086
    opt-1.3b                  14.624724388122559  14.811502456665039
    opt-2.7b                  12.468732833862305  12.504587173461914
    opt-6.7b                  10.856857299804688  10.841047286987305
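
One way to read these tables is as the relative perplexity change from BASELINE to BASIC. The sketch below computes that change for the wikitext2 numbers listed above; the dictionary name `WIKITEXT2` and the helper `relative_change_percent` are hypothetical names introduced for this example.

```python
# (BASELINE, BASIC) perplexities copied from the wikitext2 table above.
WIKITEXT2 = {
    "opt-125m": (27.661212921142578, 27.786727905273438),
    "opt-350m": (22.00566291809082, 22.00930404663086),
    "opt-1.3b": (14.624724388122559, 14.811502456665039),
    "opt-2.7b": (12.468732833862305, 12.504587173461914),
    "opt-6.7b": (10.856857299804688, 10.841047286987305),
}


def relative_change_percent(baseline, basic):
    """Percentage change in perplexity when moving from BASELINE to BASIC."""
    return 100.0 * (basic - baseline) / baseline


for revision, (baseline, basic) in WIKITEXT2.items():
    change = relative_change_percent(baseline, basic)
    print(f"{revision}: {change:+.2f}%")
```

On wikitext2 the change stays below 1.5% for every listed revision, and it is slightly negative for opt-6.7b, where BASIC scores marginally lower than BASELINE.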