---
language: en
license: apache-2.0
tags:
  - text-generation-inference
  - transformers
  - ruslanmv
  - llama
  - trl
base_model: unsloth/llama-3-8b-bnb-4bit
datasets:
  - ruslanmv/ai-medical-chatbot
---

# Medical-Llama3-8B-16bit: Fine-Tuned Llama3 for Medical Q&A

This repository provides a fine-tuned version of the powerful Llama3 8B model, specifically designed to answer medical questions in an informative way. It leverages the rich knowledge contained in the AI Medical Chatbot dataset (ruslanmv/ai-medical-chatbot).
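
For a quick look at the training data, the dataset can be loaded directly with the Hugging Face datasets library. This is a minimal sketch, not part of the original card; it assumes the `datasets` package is installed and that the dataset exposes a default train split:

```python
from datasets import load_dataset

# Inspect the fine-tuning dataset
ds = load_dataset("ruslanmv/ai-medical-chatbot", split="train")
print(len(ds))  # number of Q&A examples
print(ds[0])    # field names vary by dataset, so just print the raw record
```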

## Model & Development

- Developed by: ruslanmv
- License: Apache-2.0
- Finetuned from model: unsloth/llama-3-8b-bnb-4bit

## Key Features

- Medical Focus: Optimized to address health-related inquiries.
- Knowledge Base: Trained on a comprehensive medical chatbot dataset.
- Text Generation: Generates informative and potentially helpful responses.

## Installation

This model is accessible through the Hugging Face Transformers library. Install it using pip:

```bash
pip install transformers
```
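
The usage example below runs the model on a CUDA GPU, so PyTorch needs to be installed as well. If it is not already set up, something along these lines should work, though the exact command depends on your platform and CUDA version:

```bash
pip install torch
```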

## Usage Example

Here's a Python code snippet demonstrating how to interact with the Medical-Llama3-8B-16bit model and generate answers to your medical questions:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("ruslanmv/Medical-Llama3-8B-16bit")
model = AutoModelForCausalLM.from_pretrained("ruslanmv/Medical-Llama3-8B-16bit").to("cuda")  # If using GPU

# Function to format the prompt and generate a response
def askme(question):
    medical_prompt = """You are an AI Medical Assistant trained on a vast dataset of health information. Below is a medical question:

    Question: {}

    Please provide an informative and comprehensive answer:

    Answer: """.format(question)

    inputs = tokenizer(medical_prompt, return_tensors="pt").to("cuda")  # If using GPU
    outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)  # Adjust max_new_tokens for longer responses
    # The decoded text includes the prompt followed by the generated answer
    answer = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0].strip()
    return answer

# Example usage
question = "What should I do to reduce my weight gained due to genetic hypothyroidism?"
print(askme(question))
```
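
If the full 16-bit checkpoint does not fit in your GPU memory, one option is to load it quantized to 4-bit with bitsandbytes. This is a rough sketch, not part of the original card; it assumes the `bitsandbytes` and `accelerate` packages are installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantized loading to reduce GPU memory usage
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained("ruslanmv/Medical-Llama3-8B-16bit")
model = AutoModelForCausalLM.from_pretrained(
    "ruslanmv/Medical-Llama3-8B-16bit",
    quantization_config=bnb_config,
    device_map="auto",  # requires accelerate; places layers on available devices
)
```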

## Important Note

This model is intended for informational purposes only and should not be used as a substitute for professional medical advice. Always consult with a qualified healthcare provider for any medical concerns.

## License

This model is distributed under the Apache License 2.0 (see LICENSE file for details).

## Contributing

We welcome contributions to this repository! If you have improvements or suggestions, feel free to create a pull request.

## Disclaimer

While we strive to provide informative responses, the accuracy of the model's outputs cannot be guaranteed. It is crucial to consult a doctor or other healthcare professional for definitive medical advice.