LLaMA 3.1 8B Instruct - Healthcare Fine-tuned Model

This model was created by Unidocs Inc. by fine-tuning Llama-3.1-8B-Instruct on healthcare data.

Model Description

This is the sLLM used in Unidocs' ezMyAIDoctor. It was released on October 16, 2024 as a result of the AIDC-HPC project of the Artificial Intelligence Industry Convergence Business Group (AICA).
It was built from meta-llama/Llama-3.1-8B-Instruct by further pretraining (full fine-tuning) on wiki, kowiki, and the following AIHub (aihub.or.kr) datasets: the super-large AI healthcare question-answer data, the super-large AI corpus with improved Korean performance, and the medical and legal professional book corpus.

Intended Uses & Limitations

The model is designed to assist with healthcare-related queries and tasks.
However, it should not be used as a substitute for professional medical advice, diagnosis, or treatment.
Always consult with a qualified healthcare provider for medical concerns.

Training Data

The model was fine-tuned on a proprietary healthcare dataset.
Due to privacy concerns, details of the dataset cannot be disclosed.

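In addition to wiki and kowiki data, the following datasets from AIHub, which is managed by the Ministry of Science and ICT and the National Information Society Agency (NIA), were used:

Super-large AI healthcare question-answer data
Super-large AI corpus with improved Korean performance
Medical and legal professional book corpus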

Training Procedure

Full fine-tuning was performed on the base LLaMA 3.1 8B Instruct model using the healthcare dataset.

Evaluation Results

Accuracy by category on the MMLU benchmark

Category	Accuracy (correct/total)
anatomy	0.68 (92/135)
clinical_knowledge	0.75 (200/265)
college_medicine	0.68 (117/173)
medical_genetics	0.70 (70/100)
professional_medicine	0.76 (208/272)

Mean accuracy across all categories: 0.72
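The card does not say how the overall mean was computed. As a quick check, the sketch below recomputes the per-category accuracies from the reported counts and compares two common averages. The helper names are illustrative, and treating the reported figure as an unweighted (macro) mean is an assumption; it happens to match 0.72, whereas a question-weighted (micro) mean would round to 0.73.

```python
# (correct, total) counts copied from the evaluation table.
results = {
    "anatomy": (92, 135),
    "clinical_knowledge": (200, 265),
    "college_medicine": (117, 173),
    "medical_genetics": (70, 100),
    "professional_medicine": (208, 272),
}


def per_category_accuracy(counts):
    """Accuracy of each category as correct / total."""
    return {name: correct / total for name, (correct, total) in counts.items()}


def macro_mean(accuracies):
    """Unweighted mean: every category counts equally."""
    return sum(accuracies.values()) / len(accuracies)


def micro_mean(counts):
    """Question-weighted mean: all correct answers over all questions."""
    total_correct = sum(correct for correct, _ in counts.values())
    total_questions = sum(total for _, total in counts.values())
    return total_correct / total_questions


accuracies = per_category_accuracy(results)
for name, accuracy in accuracies.items():
    print(f"{name}: {accuracy:.2f}")
print(f"macro mean: {macro_mean(accuracies):.2f}")
print(f"micro mean: {micro_mean(results):.2f}")
```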

Use with transformers

With transformers >= 4.43.1, you can run conversational inference using the Transformers pipeline abstraction or the Auto classes with the generate() function.

Make sure to update your transformers installation via pip install --upgrade transformers.
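If you are unsure which version is installed, a small check along the following lines may help. It is a minimal sketch: `MIN_VERSION`, `numeric_parts`, and `meets_minimum` are illustrative names, and the comparison looks only at the leading numeric components, so pre-release suffixes such as `rc1` or `.dev0` are ignored.

```python
import re
from importlib import metadata

MIN_VERSION = "4.43.1"  # minimum transformers version stated in this card


def numeric_parts(version: str) -> tuple:
    """Return the leading numeric components, e.g. '4.44.0.dev0' -> (4, 44, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_VERSION) -> bool:
    """True if the installed version is at least the minimum (suffixes ignored)."""
    return numeric_parts(installed) >= numeric_parts(minimum)


try:
    installed = metadata.version("transformers")
except metadata.PackageNotFoundError:
    print("transformers is not installed")
else:
    status = "OK" if meets_minimum(installed) else f"too old, need >= {MIN_VERSION}"
    print(f"transformers {installed}: {status}")
```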

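import transformers
import torch

model_id = "unidocs/llama-3.1-8b-komedic-instruct"

# Build a text-generation pipeline; the weights are loaded in bfloat16 and
# placed automatically on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Chat messages, kept in Korean.
# System: "You are a medical expert. Please answer in terms of the disease's
#          definition, causes, symptoms, screening, diagnosis, treatment,
#          medication, diet, and lifestyle."
# User:   "When fasting blood glucose is 120 or higher, how should patients
#          with type 1 and type 2 diabetes each be treated?"
messages = [
    {"role": "system", "content": "당신은 의료전문가입니다. 질병의 정의, 원인, 증상, 검진, 진단, 치료, 약물, 식이, 생활 측면에서 답변해 주세요"},
    {"role": "user", "content": "공복혈당이 120이상인 경우 제1형 당뇨와 제2형 당뇨 환자는 각각 어떻게 치료를 받아야 하나요?"},
]

# Generate up to 256 new tokens for the conversation.
outputs = pipeline(
    messages,
    max_new_tokens=256,
)
# The pipeline returns the whole chat; the last entry is the assistant's reply.
print(outputs[0]["generated_text"][-1])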
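The Auto classes route mentioned above is not shown in this card, so here is a minimal sketch of what it might look like. It assumes the repository ships a tokenizer with a Llama 3.1 style chat template, which the card does not confirm; `build_messages` and `generate_reply` are illustrative helper names, not part of the model or of transformers.

```python
MODEL_ID = "unidocs/llama-3.1-8b-komedic-instruct"  # same id as in the pipeline example


def build_messages(system_prompt: str, user_prompt: str) -> list:
    """Return a chat in the role/content format that chat templates expect."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def generate_reply(messages: list, max_new_tokens: int = 256) -> str:
    """Load the model and generate one assistant reply (downloads the weights)."""
    # Imported lazily so build_messages can be used without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )

    # Render the chat with the tokenizer's template and append the
    # assistant header so the model continues with its own turn.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated ones.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate_reply(build_messages(system_prompt, user_prompt))` with the two prompts from the pipeline example returns only the assistant's text rather than the whole chat. Loading the model inside the function keeps the sketch short; for repeated calls you would load the tokenizer and model once and reuse them.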

Note: You can also find detailed recipes for running the model locally, including torch.compile(), assisted generation, quantization, and more, in huggingface-llama-recipes.

Limitations and Bias

This model may produce biased, inaccurate, or inconsistent output, and it should not be relied on as the sole basis for critical healthcare decisions.
The model's knowledge is limited to its training data and cut-off date.
It may reproduce biases present in the training data.

Legal Disclaimer

The model developers and distributors bear no legal responsibility for any consequences arising from the use of this model.
This includes any direct, indirect, incidental, special, punitive, or consequential damages resulting from the model's output.
By using this model, users assume all risks that may arise, and the responsibility for verifying and appropriately using the model's output lies solely with the user.
This model cannot substitute for medical advice, diagnosis, or treatment, and qualified healthcare professionals should always be consulted for medical decisions.
This disclaimer applies to the maximum extent permitted by applicable law.

Model Card Contact

유석 ([email protected]), 김진실 ([email protected])

Additional Information

For more details about the base model, please refer to the original LLaMA 3.1 documentation.
