
A Pruned Mistral model

This model is a pruned Mistral model re-aligned using the Zephyr recipe.

Details

  • The model was trained in two stages: SFT followed by DPO.
  • The initial model was built by selecting a subset of the Mistral model's layers to form a smaller model.
  • The code can be found here: github.com/tcapelle/shear
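The layer-selection step above can be sketched as follows. This is an illustrative sketch, not the actual recipe from the repo: the helper name and the kept layer indices are assumptions (in transformers, the decoder layers of a Mistral model live in `model.model.layers`, a `torch.nn.ModuleList`, and the same index-selection idea applies there).

```python
def prune_layers(layers, keep_indices):
    """Return only the layers at keep_indices, preserving their original order."""
    return [layers[i] for i in sorted(keep_indices)]

# Mistral-7B has 32 decoder layers; a ~2.9B-parameter pruned model keeps
# roughly a third of them. The indices below are purely illustrative.
full_layers = [f"layer_{i}" for i in range(32)]
pruned = prune_layers(
    full_layers,
    keep_indices=[0, 1, 2, 3, 12, 16, 20, 24, 28, 29, 30, 31],
)
print(len(pruned))  # 12
```

After slicing, the remaining layers are re-wrapped into the model (and the config's layer count updated) before the SFT and DPO stages re-align the smaller network.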

W&B workspace

https://wandb.ai/llm_surgery/shearllama/

Model size: 2.88B params
Tensor type: BF16 (Safetensors)
