---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
---

# BabyMistral Model Card

## Model Overview

**BabyMistral** is a compact language model designed for efficient text generation. Built on the Mistral architecture, it is intended to offer strong generation quality despite its relatively small size.

### Key Specifications

- **Parameters:** 1.5 billion
- **Training Data:** 1.5 trillion tokens
- **Architecture:** Based on Mistral
- **Training Duration:** 70 days
- **Hardware:** 4x NVIDIA A100 GPUs

## Model Details

### Architecture

BabyMistral uses the Mistral architecture, known for its efficiency and performance, scaled down to 1.5 billion parameters to balance capability with computational cost.

### Training

- **Dataset Size:** 1.5 trillion tokens
- **Training Approach:** Trained from scratch
- **Hardware:** 4x NVIDIA A100 GPUs
- **Duration:** 70 days of continuous training

### Capabilities

BabyMistral is designed for a wide range of natural language processing tasks, including:

- Text completion and generation
- Creative writing assistance
- Dialogue systems
- Question answering
- Language understanding tasks

## Usage

### Getting Started

To use BabyMistral with the Hugging Face Transformers library:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Aarifkhan/BabyMistral")
tokenizer = AutoTokenizer.from_pretrained("Aarifkhan/BabyMistral")

# Define the chat input
chat = [
    # {"role": "system", "content": "You are BabyMistral"},
    {"role": "user", "content": "Hey there! How are you? 😊"}
]

# Apply the chat template and move the token IDs to the model's device
inputs = tokenizer.apply_chat_template(
    chat,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Generate a response
outputs = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    eos_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens (everything after the prompt)
response = outputs[0][inputs.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
# Example output: I am doing well! How can I assist you today? 😊
```

### Ethical Considerations

While BabyMistral is a capable tool, users should be aware of its limitations and potential biases:

- The model may reproduce biases present in its training data
- It should not be used as a sole source of factual information
- Generated content should be reviewed for accuracy and appropriateness

### Limitations

- May struggle with highly specialized or technical domains
- Lacks real-time knowledge beyond its training data
- May generate plausible-sounding but incorrect information
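
### Loading in Reduced Precision

If GPU memory is limited, the model can usually be loaded in half precision. The snippet below is a minimal sketch rather than an official recommendation: it assumes the published checkpoint loads cleanly with `torch_dtype=torch.float16` and that a CUDA device may or may not be available.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the checkpoint works in float16 on GPU; fall back to float32 on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

model = AutoModelForCausalLM.from_pretrained(
    "Aarifkhan/BabyMistral",
    torch_dtype=dtype,
).to(device)
tokenizer = AutoTokenizer.from_pretrained("Aarifkhan/BabyMistral")
```

Half-precision weights roughly halve the memory footprint of a 1.5B-parameter model (about 3 GB instead of about 6 GB for the weights alone), at the cost of a small amount of numerical precision.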