	---
	license: apache-2.0
	datasets:
	- fistro/gromenauer
	language:
	- es
	pipeline_tag: text-generation
	---
	# Bertin-Gromenauer

	<div align=center>
	<img alt="BERTIN-gromenauer logo" src="https://huggingface.co/bertin-project/bertin-gromenauer/resolve/main/images/gromenauer.png" width="200px">
	</div>

	## Overview

	Bertin-Gromenauer is a Spanish language model for understanding and generating Spanish text. It is built on the Mistral architecture and was trained on a large literary corpus, so that it covers a broad range of the linguistic nuances, styles, and contexts found in Spanish literature.

	## Model Details

	- Model Type: Mistral
	- Sequence Length: 8192
	- Hidden Dimension: 4096
	- Intermediate Dimension: 14336
	- Number of Layers: 32
	- Number of Attention Heads: 32
	- Number of Key-Value Heads: 8
	- Activation Function: SiLU
	- Initializer Range: 0.02
	- Layer Norm Epsilon: 1.0e-05
	- Use Flash Attention: Yes
	- Gradient Checkpointing: Enabled (Block Size: 5)
	- Sliding Window Attention: 4096
	- Use Bias: No
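
As a rough sanity check, the dimensions listed above can be turned into a parameter count. The sketch below is not taken from the training code. It assumes the layer layout of the Hugging Face Mistral implementation (gated MLP with three projections, weight-only normalization layers, untied input and output embeddings) and a vocabulary of 32,000 tokens, which is the size of the `mistralai/Mistral-7B-v0.1` tokenizer. The function name `count_parameters` is hypothetical.

```python
# Dimensions taken from the Model Details list.
HIDDEN = 4096
INTERMEDIATE = 14336
LAYERS = 32
HEADS = 32
KV_HEADS = 8

# Assumption: vocabulary size of the mistralai/Mistral-7B-v0.1 tokenizer.
VOCAB = 32000


def count_parameters() -> int:
    """Estimate the parameter count of a Mistral-style decoder without biases."""
    head_dim = HIDDEN // HEADS      # 128
    kv_dim = KV_HEADS * head_dim    # 1024: each KV head serves 4 query heads

    # Attention: query and output projections are full-size; key and value
    # projections are smaller because of grouped-query attention.
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim
    # Assumption: gated MLP with gate, up, and down projections.
    mlp = 3 * HIDDEN * INTERMEDIATE
    # Assumption: two weight-only normalization layers per block.
    norms = 2 * HIDDEN

    per_layer = attention + mlp + norms
    # Assumption: input embeddings and output head are separate matrices.
    embeddings = 2 * VOCAB * HIDDEN
    final_norm = HIDDEN
    return LAYERS * per_layer + embeddings + final_norm


total = count_parameters()
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
# prints "7,241,732,096 parameters (~7.24B)"
```

Under these assumptions the result is about 7.24 billion parameters, consistent with a 7B-class model. Most of the weights sit in the MLP projections, and the 8 key-value heads shrink the key and value projections to a quarter of the query projection size.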

	## Training Details

	- Tokenizer: [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
	- Batch Size: 512
	- Learning Rate: 1e-5
	- Optimizer: Adam with beta1=0.9, beta2=0.95, epsilon=1e-8
	- Weight Decay: 0.1
	- Warmup Steps: 200
	- Learning Rate Schedule: Cosine
	- Number of Training Steps: 7000
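
The schedule settings above can be illustrated with a small, dependency-free sketch. It is not the project's training code. The card does not state the warmup shape or the final learning rate, so linear warmup and cosine decay to zero are assumptions here, and `learning_rate` is a hypothetical helper. The token estimate further assumes that the batch size counts sequences and that every sequence is packed to the full 8192 tokens.

```python
import math

# Values taken from the Training Details and Model Details lists.
PEAK_LR = 1e-5
WARMUP_STEPS = 200
TOTAL_STEPS = 7000
BATCH_SIZE = 512
SEQUENCE_LENGTH = 8192


def learning_rate(step: int, min_lr: float = 0.0) -> float:
    """Linear warmup, then cosine decay to min_lr (both shapes are assumptions)."""
    if step < WARMUP_STEPS:
        # Ramp linearly so that the last warmup step reaches the peak rate.
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min((step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS), 1.0)
    return min_lr + 0.5 * (PEAK_LR - min_lr) * (1.0 + math.cos(math.pi * progress))


# Assumption: batch size is measured in full-length sequences.
tokens_per_step = BATCH_SIZE * SEQUENCE_LENGTH
total_tokens = tokens_per_step * TOTAL_STEPS

for step in (0, 199, 200, 3600, 7000):
    print(f"step {step:>4}: lr = {learning_rate(step):.3e}")
print(f"tokens per step: {tokens_per_step:,}")
print(f"tokens seen in training: {total_tokens:,}")
```

With these assumptions, each optimizer step processes 4,194,304 tokens, for an upper bound of roughly 29.4 billion tokens over 7000 steps.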

	## Usage

	To load the model in your project, you can use the following code:

	```python
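	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Load the tokenizer
	tokenizer = AutoTokenizer.from_pretrained("bertin-project/bertin-gromenauer")

	# Load the model with its language-modeling head, which text generation requires
	model = AutoModelForCausalLM.from_pretrained("bertin-project/bertin-gromenauer")

	# Example usage: generate a continuation of a Spanish prompt
	text = "Introduce aquí tu texto en español."
	inputs = tokenizer(text, return_tensors="pt")
	outputs = model.generate(**inputs, max_new_tokens=50)
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))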