Model Card for Gemma Self-Attention Merged
Model Details
Model Description
The Gemma Self-Attention Merged model is a large language model created by merging the self-attention layers of an English-based Gemma 7B model and a Korean-based Gemma 7B model. The merge is intended to let a single model draw on both source models, so that it can handle tasks involving English text, Korean text, or a mix of the two.
The key features of this merged model include:
- Increased self-attention capacity, with double the number of attention heads
- Ability to handle both English and Korean language input
- Potential for improved performance on a wide range of natural language processing tasks
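The card does not say how the attention layers were combined. One way a merge could double the head count is to concatenate each model's query, key, and value projections along the head (output) axis and join the two output projections along their input axis. The sketch below illustrates that idea on toy matrices in plain Python. The function names, the dictionary layout, and the `q_proj`/`k_proj`/`v_proj`/`o_proj` keys are assumptions made for illustration and are not taken from the linked repository.

```python
# Illustrative sketch only: the actual merge procedure is not described in this card.
# Matrices are lists of rows, laid out as (out_features, in_features).

def concat_rows(w_a, w_b):
    """Stack two projection matrices along the output (row) axis."""
    return [row[:] for row in w_a] + [row[:] for row in w_b]


def concat_cols(w_a, w_b):
    """Join two projection matrices along the input (column) axis."""
    return [row_a + row_b for row_a, row_b in zip(w_a, w_b)]


def merge_attention(attn_a, attn_b):
    """Combine two attention layers so the result has the heads of both.

    attn_a and attn_b are hypothetical dicts holding 'num_heads' and the
    four projection matrices of one self-attention layer.
    """
    merged = {
        name: concat_rows(attn_a[name], attn_b[name])
        for name in ("q_proj", "k_proj", "v_proj")
    }
    # The output projection consumes the concatenated heads, so it grows
    # along its input axis instead.
    merged["o_proj"] = concat_cols(attn_a["o_proj"], attn_b["o_proj"])
    merged["num_heads"] = attn_a["num_heads"] + attn_b["num_heads"]
    return merged


# Toy layers: hidden size 2, one head of dimension 2 per source model.
english = {
    "num_heads": 1,
    "q_proj": [[1, 0], [0, 1]],
    "k_proj": [[1, 0], [0, 1]],
    "v_proj": [[1, 0], [0, 1]],
    "o_proj": [[1, 2], [3, 4]],
}
korean = {
    "num_heads": 1,
    "q_proj": [[2, 0], [0, 2]],
    "k_proj": [[2, 0], [0, 2]],
    "v_proj": [[2, 0], [0, 2]],
    "o_proj": [[5, 6], [7, 8]],
}
merged = merge_attention(english, korean)
```

Under this construction, the merged layer's output for a given input is the sum of the two source layers' attention outputs, since each head is computed independently and the joined output projection adds their contributions. Whether the released model applies any rescaling or further training on top of that is not stated here.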
Chat template
system: system message...
B: user message...
A: assistant message...
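A prompt in this format could be assembled with a small helper like the one below. This is a minimal sketch: the `build_prompt` name, the newline separator between turns, and the trailing `A:` used to cue the assistant's reply are assumptions, since the card shows only the three role prefixes.

```python
def build_prompt(user_message, system_message=None, history=()):
    """Format a conversation using the 'system:' / 'B:' / 'A:' prefixes.

    history is a sequence of (user_turn, assistant_turn) pairs from earlier
    in the conversation. Separators are assumed, not documented.
    """
    lines = []
    if system_message:
        lines.append(f"system: {system_message}")
    for user_turn, assistant_turn in history:
        lines.append(f"B: {user_turn}")
        lines.append(f"A: {assistant_turn}")
    lines.append(f"B: {user_message}")
    # Leave the assistant turn open so the model continues from here.
    lines.append("A:")
    return "\n".join(lines)


prompt = build_prompt(
    "Translate 'hello' into Korean.",
    system_message="You are a helpful bilingual assistant.",
)
```

The resulting string can be passed to whatever tokenizer and generation call is used to run the model.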
Model Sources
- Repository: https://github.com/lcw99/merge-gemma-attn.git