So can it (possibly; I may be mistaken about the config) be used with transformers / LLaMA Factory?

Same error here:

```
RuntimeError: Couldn't instantiate class <class 'mistral_inference.args.TransformerArgs'> using init args dict_keys(['dim', 'n_layers', 'vocab_size', 'model_type']): TransformerArgs.__init__() missing 5 required positional arguments: 'head_dim', 'hidden_dim', 'n_heads', 'n_kv_heads', and 'norm_eps'
```

Mistral AI org

Hi there! Please note that Mamba Codestral is based on the Mamba architecture, not the Transformer architecture; see: https://github.com/mistralai/mistral-inference/issues/207
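For anyone hitting the same traceback: the error itself shows the mismatch. The checkpoint's config only carries Mamba-style keys, while a Transformer-style args class also requires attention-specific fields. A minimal stdlib sketch (the dataclass below is a hypothetical stand-in for `mistral_inference.args.TransformerArgs`, and the config values are illustrative, taken from the keys printed in the traceback):

```python
# Sketch of why loading a Mamba checkpoint through Transformer args fails.
# TransformerArgsSketch is a hypothetical stand-in, not the real class.
from dataclasses import dataclass, fields

@dataclass
class TransformerArgsSketch:
    dim: int
    n_layers: int
    head_dim: int
    hidden_dim: int
    n_heads: int
    n_kv_heads: int
    norm_eps: float
    vocab_size: int

# Keys the Mamba checkpoint's config actually provides (per the traceback);
# the numeric values here are only placeholders.
mamba_config = {"dim": 4096, "n_layers": 64, "vocab_size": 32768, "model_type": "mamba"}

required = {f.name for f in fields(TransformerArgsSketch)}
missing = sorted(required - mamba_config.keys())
print(missing)  # the same five arguments named in the RuntimeError
```

So the fix is not a config edit: the model has to be loaded through a Mamba-aware code path rather than the Transformer one.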

Thanks, the problem is solved.

Cannot merge
This branch has merge conflicts in the following files:
  • config.json
