This is a d-Matrix functional reference of the MISTRAL-7B-V0.1 model. The reference provides the following functional configurations:
| Configuration | Explanation |
|---|---|
| BASELINE | a reference functionally equivalent to the original model |
| BASIC | all linear algebraic operands quantized to MXINT8-64, and all other operations transformed to approximated kernel simulations |
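To illustrate what block quantization of this kind means: an MXINT-style format stores each element as a small integer mantissa while a whole block of elements shares one power-of-two scale (here, "-64" plausibly denotes the block size). The sketch below is a toy NumPy model of that idea, not d-Matrix's actual kernel; the block size, rounding mode, and scale selection are all assumptions for illustration.

```python
import numpy as np

def mxint_quantize(x, block_size=64, mantissa_bits=8):
    """Toy block quantizer: each block of `block_size` values shares one
    power-of-two scale; each value is rounded to a signed integer mantissa.
    Illustrative only -- not the d-Matrix implementation."""
    x = np.asarray(x, dtype=np.float64)
    pad = (-len(x)) % block_size
    blocks = np.concatenate([x, np.zeros(pad)]).reshape(-1, block_size)

    qmax = 2 ** (mantissa_bits - 1) - 1  # e.g. 127 for 8-bit mantissas
    out = np.empty_like(blocks)
    for i, blk in enumerate(blocks):
        amax = np.max(np.abs(blk))
        if amax == 0:
            out[i] = 0
            continue
        # shared power-of-two exponent chosen so the largest value fits in qmax
        scale = 2.0 ** np.ceil(np.log2(amax / qmax))
        out[i] = np.clip(np.round(blk / scale), -qmax, qmax) * scale
    return out.reshape(-1)[: len(x)]
```

Because every element in a block reuses the same scale, the quantization error is bounded by half that block's scale, which is why outlier values in a block degrade the precision of their neighbors.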
Usage
Install d-Matrix Dmx_Compressor first:

```sh
pip install dmx_compressor
```
The following is an example of evaluating the model with the lm-eval harness:

```sh
pip install lm-eval
```
```python
from dmx.compressor.modeling import DmxModel
import lm_eval

model_args = "pretrained=d-matrix/Mistral,trust_remote_code=True"
lm = lm_eval.api.registry.get_model("hf").create_from_arg_string(model_args, {"batch_size": 1})

# Transform the model with DMX
lm._model = DmxModel.from_torch(lm._model).to_basic_model()  # using the BASIC configuration

task = "wikitext"  # assign desired task
eval_results = lm_eval.evaluate(lm, lm_eval.tasks.get_task_dict([task]))
```
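The harness returns its scores as a nested dictionary, roughly `{"results": {task_name: {metric: value}}}`; the exact metric key names (e.g. `word_perplexity`) vary across lm-eval versions. A minimal sketch for pulling the numeric metrics out, assuming only that structure:

```python
def report(eval_results):
    """Flatten lm-eval style results into "task/metric: value" strings.
    Assumes the {"results": {task: {metric: value}}} layout; metric key
    names differ between lm-eval versions."""
    lines = []
    for task, metrics in eval_results["results"].items():
        for metric, value in metrics.items():
            if isinstance(value, (int, float)):  # skip aliases and other strings
                lines.append(f"{task}/{metric}: {value:.3f}")
    return lines
```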
Evaluation results
- perplexity (BASELINE) on Wikitext: 8.042 (self-reported)
- perplexity (BASIC) on Wikitext: 221.024 (self-reported)