---
license: apache-2.0
model-index:
- name: BrainTransformers-3B-Chat
  results:
  - task:
      type: text-generation
    dataset:
      name: mmlu
      type: mmlu
    metrics:
    - name: MMLU
      type: MMLU
      value: 65.6
  - task:
      type: text-generation
    dataset:
      name: bbh
      type: bbh
    metrics:
    - name: BBH
      type: BBH
      value: 56.3
  - task:
      type: text-generation
    dataset:
      name: arc-challenge
      type: arc-challenge
    metrics:
    - name: ARC-C
      type: ARC-C
      value: 56.5
  - task:
      type: text-generation
    dataset:
      name: hellaswag
      type: hellaswag
    metrics:
    - name: HellaSwag
      type: HellaSwag
      value: 74.6
  - task:
      type: text-generation
    dataset:
      name: gsm8k
      type: gsm8k
    metrics:
    - name: GSM8K
      type: GSM8K
      value: 79.1
  - task:
      type: code-generation
    dataset:
      name: humaneval
      type: humaneval
    metrics:
    - name: HumanEval
      type: HumanEval
      value: 42.1
    source:
      name: LumenScopeAI
      url: https://github.com/LumenScopeAI/BrainTransformers-SNN-LLM
---

# BrainTransformers: SNN-LLM

BrainGPTForCausalLM is a Large Language Model (LLM) built on BrainTransformers and implemented with Spiking Neural Networks (SNNs). Our technical report will be uploaded to arXiv as soon as possible. We plan to further optimize the model at the operator level and adapt it for hardware compatibility, enabling BrainGPTForCausalLM to be deployed on more energy-efficient SNN hardware.
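
For intuition about the spiking substrate, here is a minimal leaky integrate-and-fire (LIF) neuron sketched in PyTorch. It is a generic illustration only, not BrainTransformers' actual SNN operator (see the repository and the forthcoming technical report for that), and the names in it (`lif_step`, `tau`, `v_threshold`, `v_reset`) are hypothetical:

```python
# Illustrative only: a generic leaky integrate-and-fire (LIF) neuron.
# This is NOT BrainTransformers' actual SNN operator.
import torch

def lif_step(v, x, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """One simulation step: leak-integrate the input, spike, then reset."""
    v = v + (x - v) / tau                      # leaky integration toward the input current
    spikes = (v >= v_threshold).float()        # emit a binary spike where the threshold is crossed
    v = torch.where(spikes.bool(), torch.full_like(v, v_reset), v)  # hard reset spiking neurons
    return spikes, v

# Drive 4 neurons with random input current over 8 timesteps
v = torch.zeros(4)
for t in range(8):
    spikes, v = lif_step(v, torch.rand(4))
    print(t, spikes.tolist())
```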

## Model Availability

### Repository

The GitHub repository is available at [LumenScopeAI/BrainTransformers-SNN-LLM](https://github.com/LumenScopeAI/BrainTransformers-SNN-LLM).
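
To fetch the published weights into a local directory (the usage example below loads from a local path), a minimal sketch with `huggingface_hub` follows; the repo id `LumenscopeAI/BrainTransformers-3B-Chat` is an assumption inferred from this model card's metadata and may differ from the hosted name:

```python
# Sketch: download the model weights from the Hugging Face Hub.
# The repo id below is an assumption based on this model card's metadata.
from huggingface_hub import snapshot_download

model_path = snapshot_download("LumenscopeAI/BrainTransformers-3B-Chat")
print(model_path)  # pass this directory as model_path in the usage example below
```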

## Model Performance

Below are the performance metrics of our 3B model on various benchmarks:

| Task Category | Dataset | Performance |
|---|---|---|
| General Tasks | MMLU | 65.6 |
| | MMLU-Pro | 34.6 |
| | MMLU-redux | 63.7 |
| | BBH | 56.3 |
| | ARC-C | 56.5 |
| | TruthfulQA | 48.9 |
| | WinoGrande | 71.1 |
| | HellaSwag | 74.6 |
| Math and Science Tasks | GPQA | 26.3 |
| | TheoremQA | 27.4 |
| | MATH | 42.6 |
| | MMLU-STEM | 62.5 |
| | GSM8K | 79.1 |
| Coding Tasks | HumanEval | 42.1 |
| | HumanEval+ | 36.0 |
| | MBPP | 57.1 |
| | MBPP+ | 49.4 |
| | MultiPL-E | 41.2 |
| Multilingual Tasks | Multi-Exam | 54.6 |
| | Multi-Understanding | 76.6 |
| | Multi-Mathematics | 48.9 |
| | Multi-Translation | 29.3 |

## Usage

### Generate Text

```python
import torch
from transformers import AutoTokenizer, BrainGPTForCausalLM

model_path = "/path/to/your/model"
model = BrainGPTForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Run on GPU when available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

def generate_text(messages, max_new_tokens=50):
    # Render the chat messages with the model's chat template
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([text], return_tensors="pt").to(device)

    with torch.no_grad():
        generated_ids = model.generate(**model_inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is decoded
    generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]
    return tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

# Example usage
messages = [
    {"role": "system", "content": "You are a knowledgeable assistant."},
    {"role": "user", "content": "Explain the Pythagorean theorem."}
]
response = generate_text(messages)
print(response)
```
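
The helper above uses transformers' default greedy decoding. The variant below reuses the same `model`, `tokenizer`, `device`, and `messages` objects with standard `generate()` sampling arguments; the specific values are illustrative, not settings recommended by the authors:

```python
# Optional: stochastic decoding with standard transformers generate() arguments.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

with torch.no_grad():
    output_ids = model.generate(
        **model_inputs,
        max_new_tokens=128,
        do_sample=True,    # sample instead of greedy decoding
        temperature=0.7,   # soften the token distribution (illustrative value)
        top_p=0.9,         # nucleus sampling: keep the top-90% probability mass
    )

# Strip the prompt tokens before decoding, as in generate_text above
response = tokenizer.batch_decode(
    [out[len(inp):] for inp, out in zip(model_inputs.input_ids, output_ids)],
    skip_special_tokens=True,
)[0]
print(response)
```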
