---
license: apache-2.0
pipeline_tag: text-generation
tags:
- ONNX
- DML
- DirectML
- ONNXRuntime
- mistral
- conversational
- custom_code
inference: false
language:
- en
---
|
|
|
# Mistral-7B-Instruct-v0.3 ONNX models for DirectML
|
This repository hosts optimized ONNX versions of [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) to accelerate inference with ONNX Runtime for DirectML.
|
|
|
## Usage on Windows (Intel / AMD / Nvidia / Qualcomm)
|
```powershell
# Create and activate a clean Python environment
conda create -n onnx python=3.10
conda activate onnx

# Install Git LFS and the Hugging Face CLI, then download the model
winget install -e --id GitHub.GitLFS
pip install "huggingface-hub[cli]"
huggingface-cli download EmbeddedLLM/mistral-7b-instruct-v0.3-int4-onnx-directml --local-dir .\mistral-7b-instruct-v0.3

# Install runtime dependencies
pip install numpy==1.26.4
pip install onnxruntime-directml
pip install --pre onnxruntime-genai-directml
conda install conda-forge::vs2015_runtime

# Fetch the sample chat script and run it against the downloaded model
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py" -OutFile "phi3-qa.py"
python phi3-qa.py -m .\mistral-7b-instruct-v0.3
```
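The sample script applies the chat template stored with the model, but if you drive the ONNX model with your own code you must wrap user input in Mistral's instruct format yourself. A minimal pure-Python sketch is below; the helper name is illustrative, and it assumes the standard `<s>[INST] ... [/INST]` template used by Mistral instruct models:

```python
def build_mistral_prompt(user_message: str, system_prompt: str = "") -> str:
    """Wrap a user message in the Mistral-Instruct prompt template.

    Illustrative helper, not part of this repository. Mistral has no
    dedicated system role, so an optional system prompt is prepended
    to the user turn; the BOS token <s> opens the sequence.
    """
    if system_prompt:
        user_message = f"{system_prompt}\n\n{user_message}"
    return f"<s>[INST] {user_message} [/INST]"

# Example: build the prompt string fed to the tokenizer
prompt = build_mistral_prompt("What is DirectML?")
print(prompt)
```

In practice you would pass the resulting string to the `onnxruntime-genai` tokenizer rather than printing it.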
|
|
|
## What is DirectML
|
DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.