---
license: apache-2.0
model-index:
  - name: BrainTransformers-3B-Chat
    results:
      - task:
          type: text-generation
        dataset:
          name: mmlu
          type: mmlu
        metrics:
          - name: MMLU
            type: MMLU
            value: 65.6
      - task:
          type: text-generation
        dataset:
          name: bbh
          type: bbh
        metrics:
          - name: BBH
            type: BBH
            value: 56.3
      - task:
          type: text-generation
        dataset:
          name: arc-challenge
          type: arc-challenge
        metrics:
          - name: ARC-C
            type: ARC-C
            value: 56.5
      - task:
          type: text-generation
        dataset:
          name: hellaswag
          type: hellaswag
        metrics:
          - name: HellaSwag
            type: HellaSwag
            value: 74.6
      - task:
          type: text-generation
        dataset:
          name: gsm8k
          type: gsm8k
        metrics:
          - name: GSM8K
            type: GSM8K
            value: 79.1
      - task:
          type: code-generation
        dataset:
          name: humaneval
          type: humaneval
        metrics:
          - name: HumanEval
            type: HumanEval
            value: 42.1
    source:
      name: LumenScopeAI
      url: https://github.com/LumenScopeAI/BrainTransformers-SNN-LLM
---
# BrainTransformers: SNN-LLM

BrainGPTForCausalLM is a Large Language Model (LLM) implemented with Spiking Neural Networks (SNNs) and built on the BrainTransformers architecture. Our technical report will be uploaded to arXiv as soon as possible. We plan to further optimize the model at the operator level and adapt it for hardware compatibility, so that BrainGPTForCausalLM can be deployed on more energy-efficient SNN hardware devices.
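
For readers unfamiliar with spiking neurons, the sketch below illustrates the general idea behind training SNNs with a surrogate gradient. It is a generic PyTorch illustration, not the actual BrainGPT operator; the threshold value and the rectangular surrogate are assumptions made for the example only.

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, smooth surrogate gradient in the backward pass."""

    @staticmethod
    def forward(ctx, membrane_potential, threshold):
        ctx.save_for_backward(membrane_potential)
        ctx.threshold = threshold
        # Emit a spike (1.0) wherever the membrane potential crosses the threshold.
        return (membrane_potential >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        (membrane_potential,) = ctx.saved_tensors
        # Rectangular surrogate: let gradients flow only near the firing threshold.
        near_threshold = (torch.abs(membrane_potential - ctx.threshold) < 0.5).float()
        return grad_output * near_threshold, None

# Example: spikes for a batch of membrane potentials, with gradients flowing through.
v = torch.randn(4, 8, requires_grad=True)
spikes = SurrogateSpike.apply(v, 1.0)
spikes.sum().backward()
```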

## Model Availability

- The current pre-trained model parameters have been published on ModelScope: [DataLinguistic/BrainTransformers-3B-Chat](https://www.modelscope.cn/models/DataLinguistic/BrainTransformers-3B-Chat)
- The current pre-trained model parameters have also been published on Hugging Face: [LumenscopeAI/BrainTransformers-3B-Chat](https://huggingface.co/LumenscopeAI/BrainTransformers-3B-Chat) (see the download sketch below)
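
A minimal sketch for fetching the Hugging Face checkpoint to a local directory, assuming the standard `huggingface_hub` package is installed:

```python
from huggingface_hub import snapshot_download

# Download the published checkpoint; the returned path can be used as
# model_path in the usage example further below.
local_dir = snapshot_download(repo_id="LumenscopeAI/BrainTransformers-3B-Chat")
print(local_dir)
```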

## Repository

The GitHub repository is available at: [LumenScopeAI/BrainTransformers-SNN-LLM](https://github.com/LumenScopeAI/BrainTransformers-SNN-LLM)

## Model Performance

Below are the performance metrics of our 3B model on various benchmarks:

| Task Category | Dataset | Performance |
|---------------|---------|-------------|
| General Tasks | MMLU | 65.6 |
|               | MMLU-Pro | 34.6 |
|               | MMLU-Redux | 63.7 |
|               | BBH | 56.3 |
|               | ARC-C | 56.5 |
|               | TruthfulQA | 48.9 |
|               | WinoGrande | 71.1 |
|               | HellaSwag | 74.6 |
| Math and Science Tasks | GPQA | 26.3 |
|                        | TheoremQA | 27.4 |
|                        | MATH | 42.6 |
|                        | MMLU-stem | 62.5 |
|                        | GSM8K | 79.1 |
| Coding Tasks | HumanEval | 42.1 |
|              | HumanEval+ | 36.0 |
|              | MBPP | 57.1 |
|              | MBPP+ | 49.4 |
|              | MultiPL-E | 41.2 |
| Multilingual Tasks | Multi-Exam | 54.6 |
|                    | Multi-Understanding | 76.6 |
|                    | Multi-Mathematics | 48.9 |
|                    | Multi-Translation | 29.3 |

## Usage

### Generate Text
```python
import torch
from transformers import AutoTokenizer, BrainGPTForCausalLM

model_path = "/path/to/your/model"
model = BrainGPTForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

def generate_text(messages, max_new_tokens=50):
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([text], return_tensors="pt").to(device)
    
    with torch.no_grad():
        generated_ids = model.generate(**model_inputs, max_new_tokens=max_new_tokens)
    
    generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]
    return tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

# Example usage
messages = [
    {"role": "system", "content": "You are a knowledgeable assistant."},
    {"role": "user", "content": "Explain the Pythagorean theorem."}
]
response = generate_text(messages)
print(response)
```
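
The helper above uses the default greedy decoding with a small token budget. Standard `transformers` generation arguments can be forwarded through `model.generate`; the sketch below shows a longer, sampled generation (the sampling values are illustrative assumptions, not recommended settings):

```python
# Build inputs the same way as generate_text above, then sample a longer reply.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

with torch.no_grad():
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

print(tokenizer.batch_decode(
    [out[len(inp):] for inp, out in zip(model_inputs.input_ids, generated_ids)],
    skip_special_tokens=True,
)[0])
```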
