---
library_name: transformers
license: apache-2.0
tags:
  - jamba
  - mamba
  - moe
---

# Expert weights of Jamba-v0.1

## Required Weights for Follow-up Research

The original model is AI21 Labs' Jamba-v0.1, which requires an A100 80GB GPU. Unfortunately, this hardware was generally not available via Google Colab or cloud computing services. Thus, MoE (Mixture of Experts) splitting was attempted, using the following resource as a basis:

Check ai21labs/Jamba-tiny-random, which has 128M parameters (instead of 52B) and is initialized with random weights without undergoing any training.
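
The sketch below is not part of the original release; it only illustrates one way such MoE splitting could start, by loading the tiny random checkpoint and grouping parameters by expert index. The `.experts.<idx>.` key pattern is an assumption about the transformers Jamba implementation and may need adjusting for your version.

```python
# Minimal sketch: load the tiny random Jamba checkpoint and group its
# MoE expert weights by expert index. Assumes a recent transformers
# release with Jamba support; the ".experts.<idx>." key layout is an
# assumption and may differ across versions.
import re
from collections import defaultdict

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "ai21labs/Jamba-tiny-random",
    torch_dtype=torch.bfloat16,
)

# Collect parameters that belong to individual experts.
per_expert = defaultdict(dict)
pattern = re.compile(r"\.experts\.(\d+)\.")  # assumed parameter-name layout
for name, tensor in model.state_dict().items():
    match = pattern.search(name)
    if match:
        per_expert[int(match.group(1))][name] = tensor

for idx, weights in sorted(per_expert.items()):
    print(f"expert {idx}: {len(weights)} tensors")
    # e.g. torch.save(weights, f"expert_{idx}.pt") to keep one expert per file
```

Because the tiny random model shares the Jamba architecture, the same grouping logic should carry over to Jamba-v0.1 once enough GPU memory is available.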