Doctor Llama Chat
This repository contains a version of TeenyTinyLlama-460m fine-tuned on the aira-med-training-pt dataset.
The main objective of the Doctor Llama model was to study the step-by-step process involved in fine-tuning models in Portuguese, taking into account the challenges encountered in the medical field.
This model was created as part of the course completion project for Biomedical Informatics at the Federal University of Paraná. For more information, access the full text at the following link.
Author
Mariana Moreira dos Santos (LinkedIn)
Code
The code used to fine-tune the model is available at the following Google Colab link.
Fine-tuning details
- Base model: TeenyTinyLlama 460m
- Context length: 2048 tokens
- Dataset for fine-tuning: aira-med-training-pt
- Dataset for evaluation: medicine-evaluation-pt
- Language: Portuguese
- GPU: NVIDIA A100-SXM4-40GB
- Training time: ~5 hours
Parameters
- Number of Epochs: 4
- Batch size: 3
- Optimizer: torch.optim.AdamW (warmup_steps = 1e3, learning_rate = 1e-5, epsilon = 1e-8)
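Note that `warmup_steps` is not an argument of `torch.optim.AdamW` itself; warmup is normally applied through a learning-rate scheduler on top of the optimizer. The sketch below illustrates the idea, assuming a linear ramp to the peak learning rate followed by a constant rate. The schedule shape is an assumption (it is not stated here), and `warmup_factor` and `learning_rate_at` are hypothetical helper names.

```python
LEARNING_RATE = 1e-5   # learning_rate from the list above
WARMUP_STEPS = 1000    # warmup_steps = 1e3 from the list above


def warmup_factor(step: int, warmup_steps: int = WARMUP_STEPS) -> float:
    """Multiplier applied to the peak learning rate at a given optimizer step.

    Assumption: linear ramp from 0 to 1 over `warmup_steps`, then constant.
    A function of this shape can be passed as `lr_lambda` to
    `torch.optim.lr_scheduler.LambdaLR`.
    """
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, step / warmup_steps)


def learning_rate_at(step: int) -> float:
    """Effective learning rate at `step` under the assumed schedule."""
    return LEARNING_RATE * warmup_factor(step)


for step in (0, 250, 500, 1000, 5000):
    print(f"step {step:>5}: lr = {learning_rate_at(step):.2e}")
```

With a batch size of 3, the first 1,000 optimizer steps would cover roughly the first 3,000 training examples if no gradient accumulation is used (also an assumption).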
Evaluations
Model | Perplexity | Evaluation Loss |
---|---|---|
TeenyTinyLlama 160m | 22.51 | 3.11 |
Doctor Llama 160m | 15.68 | 2.75 |
TeenyTinyLlama 460m | 13.09 | 2.57 |
Doctor Llama 460m | 10.94 | 2.39 |
TeenyTinyLlama 460m Chat | 21.22 | 3.05 |
Doctor Llama Chat | 11.13 | 2.41 |
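The two columns are closely related: perplexity for a causal language model is conventionally the exponential of the mean cross-entropy loss. Assuming the evaluation followed that convention (how perplexity was computed is not stated here), the sketch below recomputes perplexity from the reported losses. Small gaps are expected because the losses are rounded to two decimals. `perplexity_from_loss` is an illustrative helper, not part of the original code.

```python
import math

# (model, reported perplexity, reported evaluation loss) from the table above
RESULTS = [
    ("TeenyTinyLlama 160m", 22.51, 3.11),
    ("Doctor Llama 160m", 15.68, 2.75),
    ("TeenyTinyLlama 460m", 13.09, 2.57),
    ("Doctor Llama 460m", 10.94, 2.39),
    ("TeenyTinyLlama 460m Chat", 21.22, 3.05),
    ("Doctor Llama Chat", 11.13, 2.41),
]


def perplexity_from_loss(loss: float) -> float:
    """Perplexity as exp(mean cross-entropy loss), with the loss in nats."""
    return math.exp(loss)


for name, reported_ppl, loss in RESULTS:
    derived = perplexity_from_loss(loss)
    print(f"{name:<26} reported={reported_ppl:6.2f}  exp(loss)={derived:6.2f}")
```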
Basic usage
Using the pipeline:
from transformers import pipeline

generator = pipeline("text-generation", model="mmoreirast/Doctor-Llama-Chat")
completions = generator("Me fale sobre o sistema nervoso", num_return_sequences=2, max_new_tokens=100)

for comp in completions:
    print(f"🤖 {comp['generated_text']}")
Using the AutoTokenizer and AutoModelForCausalLM:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and the tokenizer
tokenizer = AutoTokenizer.from_pretrained("mmoreirast/Doctor-Llama-Chat", revision='main')
model = AutoModelForCausalLM.from_pretrained("mmoreirast/Doctor-Llama-Chat", revision='main')

# Pass the model to your device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.eval()
model.to(device)

# Tokenize the inputs and pass them to the device
inputs = tokenizer("Me fale sobre o sistema nervoso", return_tensors="pt").to(device)

# Generate some text
completions = model.generate(**inputs, num_return_sequences=2, max_new_tokens=100)

# Print the generated text
for i, completion in enumerate(completions):
    print(f'🤖 {tokenizer.decode(completion)}')
Intended Uses
The main objective of the Doctor Llama model was to study the step-by-step process involved in fine-tuning models in Portuguese, taking into account the challenges encountered in the medical field. You may also further fine-tune and adapt Doctor Llama for deployment, as long as your use complies with the Apache 2.0 license. If you decide to use Doctor Llama as the basis for your own fine-tuned model, please conduct your own risk and bias assessment.
Out-of-scope Use
Doctor Llama is not intended for deployment. It is not a product and should not be used for human-facing interactions.
Doctor Llama models support Brazilian Portuguese only and are not suitable for translation or for generating text in other languages.
Limitations
Like the TeenyTinyLlama base model, Doctor Llama has the following limitations:
Hallucinations: This model can produce content that can be mistaken for truth but is, in fact, misleading or entirely false, i.e., hallucination.
Biases and Toxicity: This model inherits the social and historical stereotypes from the data used to train it. Given these biases, the model can produce toxic content, i.e., harmful, offensive, or detrimental to individuals, groups, or communities.
Unreliable Code: The model may produce incorrect code snippets and statements. These code generations should not be treated as suggestions or accurate solutions.
Language Limitations: The model is primarily designed to understand standard Brazilian Portuguese. Other languages might challenge its comprehension, leading to potential misinterpretations or errors in response.
Repetition and Verbosity: The model may get stuck in repetition loops (especially if the repetition penalty during generation is set to a low value) or produce verbose responses unrelated to the prompt it was given.
Hence, even though our models are released under a permissive license, we urge users to perform their own risk analysis before using them in real-world applications, and to have humans moderate the outputs wherever the models interact with an audience, ensuring that users always know they are interacting with a language model.
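Repetition loops can often be reduced at generation time. The sketch below lists keyword arguments accepted by `model.generate` in `transformers`. The specific values are illustrative assumptions, not settings used or recommended by the author.

```python
# Illustrative decoding settings; all values are assumptions, not the
# author's configuration.
generation_kwargs = {
    "max_new_tokens": 100,       # cap on the number of generated tokens
    "do_sample": True,           # sample instead of greedy decoding
    "top_k": 50,                 # sample only from the 50 most likely tokens
    "top_p": 0.9,                # nucleus sampling threshold
    "temperature": 0.7,          # values below 1.0 sharpen the distribution
    "repetition_penalty": 1.2,   # values above 1.0 penalize repeated tokens
    "num_return_sequences": 2,   # number of completions per prompt
}

# With `model` and `inputs` loaded as in the Basic usage section:
# completions = model.generate(**inputs, **generation_kwargs)
```

A `repetition_penalty` of 1.0 disables the penalty, so values only slightly above it leave the model more prone to loops.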
Cite as 🤗
@misc{moreira2024docllama,
title = {Um Estudo sobre LLMs em Português para a Área Médica},
author = {Mariana Moreira dos Santos and André Ricardo Abed Grégio},
url = {},
year={2024}
}
Acknowledgements
The TeenyTinyLlama base models used here were created by Nicholas Kluge Corrêa and his team. For more information, visit TeenyTinyLlama.
License
Doctor Llama is licensed under the Apache License, Version 2.0. See the LICENSE file for more details.