---
language: en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- ruslanmv
- llama
- trl
base_model: unsloth/llama-3-8b-bnb-4bit
datasets:
- ruslanmv/ai-medical-chatbot
---

# Medical-Llama3-8B-16bit: Fine-Tuned Llama3 for Medical Q&A

This repository provides a fine-tuned version of the powerful Llama3 8B model, specifically designed to answer medical questions in an informative way. It leverages the rich knowledge contained in the AI Medical Chatbot dataset ([ruslanmv/ai-medical-chatbot](https://huggingface.co/datasets/ruslanmv/ai-medical-chatbot)).

**Model & Development**

- **Developed by:** ruslanmv
- **License:** Apache-2.0
- **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit

**Key Features**

- **Medical Focus:** Optimized to address health-related inquiries.
- **Knowledge Base:** Trained on a comprehensive medical chatbot dataset.
- **Text Generation:** Generates informative and potentially helpful responses.

**Installation**

This model is accessible through the Hugging Face Transformers library. Install it, together with PyTorch, using pip:

```bash
pip install transformers torch
```

**Usage Example**

Here's a Python code snippet demonstrating how to interact with the `Medical-Llama3-8B-16bit` model and generate answers to your medical questions:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("ruslanmv/Medical-Llama3-8B-16bit")
model = AutoModelForCausalLM.from_pretrained("ruslanmv/Medical-Llama3-8B-16bit").to("cuda")  # If using GPU

# Format the question with a medical prompt and generate a response
def askme(question):
    medical_prompt = """You are an AI Medical Assistant trained on a vast dataset of health information. Below is a medical question:

Question: {}

Please provide an informative and comprehensive answer:

Answer: """.format(question)

    inputs = tokenizer(medical_prompt, return_tensors="pt").to("cuda")  # If using GPU
    outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)  # Adjust max_new_tokens for longer responses
    answer = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0].strip()
    return answer

# Example usage
question = "What should I do to reduce my weight gained due to genetic hypothyroidism?"
print(askme(question))
```

If your GPU has limited memory, see the reduced-memory loading sketch at the end of this card.

**Important Note**

This model is intended for informational purposes only and should not be used as a substitute for professional medical advice. Always consult with a qualified healthcare provider for any medical concerns.

**License**

This model is distributed under the Apache License 2.0 (see LICENSE file for details).

**Contributing**

We welcome contributions to this repository! If you have improvements or suggestions, feel free to create a pull request.

**Disclaimer**

While we strive to provide informative responses, the accuracy of the model's outputs cannot be guaranteed. It is crucial to consult a doctor or other healthcare professional for definitive medical advice.
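
**Reduced-Memory Loading (Optional)**

The usage example above loads the weights in full precision, which is memory-hungry for an 8B model. The sketch below is one possible lower-memory setup rather than part of the original card: it assumes `accelerate` is installed (for `device_map="auto"`) and `bitsandbytes` if you enable the 4-bit option, and the memory figures are rough estimates, not measurements for this checkpoint.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "ruslanmv/Medical-Llama3-8B-16bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Half precision: about 2 bytes per parameter, roughly 16 GB of GPU memory for 8B weights
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # lets accelerate place layers on the available GPU(s)/CPU
)

# Alternative: 4-bit quantization via bitsandbytes (roughly 6 GB, approximate)
# quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
# model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=quant_config, device_map="auto")
```

With either variant, the `askme` function from the usage example should work unchanged; when `device_map="auto"` is used, moving the tokenized inputs to `model.device` instead of a hard-coded `"cuda"` is the safer choice.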