metadata

license: apache-2.0

BrainTransformers: SNN-LLM

Based on BrainTransformers, BrainGPTForCausalLM is a Large Language Model (LLM) implemented using Spiking Neural Networks (SNN). Our technical report will be uploaded to arXiv as soon as possible. We plan to further optimize the model at the operator level and adapt it for hardware compatibility, enabling BrainGPTForCausalLM to be deployed on more energy-efficient SNN hardware devices.

Model Availability

The pre-trained model parameters are published on ModelScope: DataLinguistic/BrainTransformers-3B-Chat
The pre-trained model parameters are also published on Hugging Face: LumenscopeAI/BrainTransformers-3B-Chat

Repository

The source code is available on GitHub: LumenScopeAI/BrainTransformers-SNN-LLM

Model Performance

Below are the performance metrics of our 3B model on various benchmarks:

Task Category	Dataset	Performance
General Tasks	MMLU	65.6
	MMLU-Pro	34.6
	MMLU-Redux	63.7
	BBH	56.3
	ARC-C	56.5
	TruthfulQA	48.9
	WinoGrande	71.1
	HellaSwag	74.6
Math and Science Tasks	GPQA	26.3
	TheoremQA	27.4
	MATH	42.6
	MMLU-STEM	62.5
	GSM8K	79.1
Coding Tasks	HumanEval	42.1
	HumanEval+	36.0
	MBPP	57.1
	MBPP+	49.4
	MultiPL-E	41.2
Multilingual Tasks	Multi-Exam	54.6
	Multi-Understanding	76.6
	Multi-Mathematics	48.9
	Multi-Translation	29.3

Usage

Generate Text

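import torch
# Assumption: BrainGPTForCausalLM is not exported by upstream transformers, so this
# import likely requires the transformers build from the BrainTransformers repository.
from transformers import AutoTokenizer, BrainGPTForCausalLM

# Local directory containing the downloaded model weights and tokenizer files
model_path = "/path/to/your/model"
model = BrainGPTForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Run on the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

def generate_text(messages, max_new_tokens=50):
    # Render the chat messages into a single prompt string using the model's chat template
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([text], return_tensors="pt").to(device)

    # Gradients are not needed for inference
    with torch.no_grad():
        generated_ids = model.generate(**model_inputs, max_new_tokens=max_new_tokens)

    # The output contains the prompt followed by the continuation; keep only the new tokens
    generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]
    return tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]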

# Example usage
messages = [
    {"role": "system", "content": "You are a knowledgeable assistant."},
    {"role": "user", "content": "Explain the Pythagorean theorem."}
]
response = generate_text(messages)
print(response)
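
The slicing step inside generate_text can be easy to misread, so here is a minimal sketch of it in isolation. It uses plain Python lists in place of tensors, and the helper name strip_prompt and the token ids are made up for illustration; they are not part of the BrainTransformers code. The idea is that each generated sequence starts with its own prompt tokens, so dropping the first len(prompt) ids leaves only the continuation.

```python
def strip_prompt(input_ids, output_ids):
    """Return only the newly generated token ids for each sequence.

    input_ids:  list of prompt token-id lists, one per sequence
    output_ids: list of full (prompt + continuation) token-id lists
    """
    return [out[len(inp):] for inp, out in zip(input_ids, output_ids)]


# Hypothetical token ids, chosen only to illustrate the slicing.
prompt_ids = [[101, 7592, 2088]]
full_ids = [[101, 7592, 2088, 2023, 2003, 102]]

new_ids = strip_prompt(prompt_ids, full_ids)
print(new_ids)  # [[2023, 2003, 102]]
```

Without this step, decoding the full output would repeat the system and user messages in front of the model's reply.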

model-index:
- name: BrainTransformers-3B-Chat
  results:
  - task:
      type: text-generation
    dataset:
      name: mmlu
      type: mmlu
    metrics:
    - name: MMLU
      type: MMLU
      value: 65.6
  - task:
      type: text-generation
    dataset:
      name: bbh
      type: bbh
    metrics:
    - name: BBH
      type: BBH
      value: 56.3
  - task:
      type: text-generation
    dataset:
      name: arc-challenge
      type: arc-challenge
    metrics:
    - name: ARC-C
      type: ARC-C
      value: 56.5
  - task:
      type: text-generation
    dataset:
      name: hellaswag
      type: hellaswag
    metrics:
    - name: HellaSwag
      type: HellaSwag
      value: 74.6
  - task:
      type: text-generation
    dataset:
      name: gsm8k
      type: gsm8k
    metrics:
    - name: GSM8K
      type: GSM8K
      value: 79.1
  - task:
      type: code-generation
    dataset:
      name: humaneval
      type: humaneval
    metrics:
    - name: HumanEval
      type: HumanEval
      value: 42.1
    source:
      name: LumenScopeAI
      url: https://github.com/LumenScopeAI/BrainTransformers-SNN-LLM