---
license: apache-2.0
---
# BrainTransformers: SNN-LLM
Based on BrainTransformers, BrainGPTForCausalLM is a Large Language Model (LLM) implemented using Spiking Neural Networks (SNNs). Our technical report will be uploaded to arXiv as soon as possible. We plan to further optimize the model at the operator level and adapt it for hardware compatibility, enabling BrainGPTForCausalLM to be deployed on more energy-efficient SNN hardware devices.
## Model Availability
- The pre-trained model weights are available on ModelScope: [DataLinguistic/BrainTransformers-3B-Chat](https://www.modelscope.cn/models/DataLinguistic/BrainTransformers-3B-Chat)
- The pre-trained model weights are also available on Hugging Face: [LumenscopeAI/BrainTransformers-3B-Chat](https://huggingface.co/LumenscopeAI/BrainTransformers-3B-Chat)
## Repository
The source code is available on GitHub: [LumenScopeAI/BrainTransformers-SNN-LLM](https://github.com/LumenScopeAI/BrainTransformers-SNN-LLM)
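
To load the published checkpoint directly from the Hugging Face Hub instead of a local path, a minimal sketch along the lines below should work. It assumes the custom BrainGPT model classes are resolved via `trust_remote_code=True`; check the GitHub repository for the exact loading procedure.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical loading sketch: pulls the published checkpoint from the Hub.
# trust_remote_code=True is assumed to be required for the custom BrainGPT classes.
repo_id = "LumenscopeAI/BrainTransformers-3B-Chat"
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
```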
## Model Performance
Below are the performance metrics of our 3B model on various benchmarks:
| Task Category | Dataset | Performance |
|---------------|---------|-------------|
| General Tasks | MMLU | 65.6 |
| | MMLU-Pro | 34.6 |
| | MMLU-Redux | 63.7 |
| | BBH | 56.3 |
| | ARC-C | 56.5 |
| | TruthfulQA | 48.9 |
| | WinoGrande | 71.1 |
| | HellaSwag | 74.6 |
| Math and Science Tasks | GPQA | 26.3 |
| | TheoremQA | 27.4 |
| | MATH | 42.6 |
| | MMLU-stem | 62.5 |
| | GSM8K | 79.1 |
| Coding Tasks | HumanEval | 42.1 |
| | HumanEval+ | 36.0 |
| | MBPP | 57.1 |
| | MBPP+ | 49.4 |
| | MultiPL-E | 41.2 |
| Multilingual Tasks | Multi-Exam | 54.6 |
| | Multi-Understanding | 76.6 |
| | Multi-Mathematics | 48.9 |
| | Multi-Translation | 29.3 |
## Usage
### Generate Text
```python
import torch
from transformers import AutoTokenizer, BrainGPTForCausalLM

model_path = "/path/to/your/model"
model = BrainGPTForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

def generate_text(messages, max_new_tokens=50):
    # Render the chat messages with the model's chat template and tokenize.
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([text], return_tensors="pt").to(device)
    with torch.no_grad():
        generated_ids = model.generate(**model_inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens so only the newly generated continuation remains.
    generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]
    return tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

# Example usage
messages = [
    {"role": "system", "content": "You are a knowledgeable assistant."},
    {"role": "user", "content": "Explain the Pythagorean theorem."}
]
response = generate_text(messages)
print(response)
```
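
For longer or more varied responses, the model can also be called with standard `generate()` keyword arguments. The snippet below is an illustrative variation only; the sampling values shown are arbitrary assumptions, not tuned recommendations from the authors.

```python
# Hypothetical sketch: sampling-based generation for more varied output.
# do_sample / temperature / top_p are standard transformers generate() arguments;
# the values here are illustrative, not tuned for this model.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)
# Decode only the newly generated tokens after the prompt.
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```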
---
model-index:
- name: BrainTransformers-3B-Chat
  results:
  - task:
      type: text-generation
    dataset:
      name: mmlu
      type: mmlu
    metrics:
    - name: MMLU
      type: MMLU
      value: 65.6
  - task:
      type: text-generation
    dataset:
      name: bbh
      type: bbh
    metrics:
    - name: BBH
      type: BBH
      value: 56.3
  - task:
      type: text-generation
    dataset:
      name: arc-challenge
      type: arc-challenge
    metrics:
    - name: ARC-C
      type: ARC-C
      value: 56.5
  - task:
      type: text-generation
    dataset:
      name: hellaswag
      type: hellaswag
    metrics:
    - name: HellaSwag
      type: HellaSwag
      value: 74.6
  - task:
      type: text-generation
    dataset:
      name: gsm8k
      type: gsm8k
    metrics:
    - name: GSM8K
      type: GSM8K
      value: 79.1
  - task:
      type: code-generation
    dataset:
      name: humaneval
      type: humaneval
    metrics:
    - name: HumanEval
      type: HumanEval
      value: 42.1
    source:
      name: LumenScopeAI
      url: https://github.com/LumenScopeAI/BrainTransformers-SNN-LLM
---