<!DOCTYPE html>
<html>
<head>
<link rel="preconnect" href="https://fonts.googleapis.com" />
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
<link href="https://fonts.googleapis.com/css2?family=Source+Sans+Pro:wght@400;600;700&display=swap" rel="stylesheet" />
<title>Visual Question Answering (VQA) for Medical Imaging</title>
<style>
* {
box-sizing: border-box;
}
body {
font-family: 'Source Sans Pro', sans-serif;
font-size: 16px;
}
.container {
width: 100%;
margin: 0 auto;
}
.title {
font-size: 24px !important;
font-weight: 600 !important;
letter-spacing: 0em;
text-align: center;
color: #374159 !important;
}
.subtitle {
font-size: 24px !important;
font-style: italic;
font-weight: 400 !important;
letter-spacing: 0em;
text-align: center;
color: #1d652a !important;
padding-bottom: 0.5em;
}
.overview-heading {
font-size: 24px !important;
font-weight: 600 !important;
letter-spacing: 0em;
text-align: left;
}
.overview-content {
font-size: 14px !important;
font-weight: 400 !important;
line-height: 33px !important;
letter-spacing: 0em;
text-align: left;
}
.content-image {
width: 100% !important;
height: auto !important;
}
.vl {
border-left: 5px solid #1d652a;
padding-left: 20px;
color: #1d652a !important;
}
.grid-container {
display: grid;
grid-template-columns: 1fr 2fr;
gap: 20px;
align-items: flex-start;
margin-bottom: 1em;
}
@media screen and (max-width: 768px) {
.container {
width: 90%;
}
.grid-container {
display: block;
}
.overview-heading {
font-size: 18px !important;
}
}
</style>
</head>
<body>
<div class="container">
<h1 class="title">Visual Question Answering (VQA) for Medical Imaging</h1>
<h2 class="subtitle">Kalbe Digital Lab</h2>
<section class="overview">
<div class="grid-container">
<h3 class="overview-heading"><span class="vl">Overview</span></h3>
<div>
<p class="overview-content">
This project addresses the challenge of accurate and efficient medical image analysis in healthcare,
aiming to reduce human error and the workload of radiologists. The proposed solution develops AI
models for Visual Question Answering (VQA) that help healthcare professionals analyze
radiology images quickly and accurately. To this end, we fine-tune Idefics2-8b, a multimodal model from Hugging Face, on radiology VQA datasets.
</p>
</div>
</div>
<div class="grid-container">
<h3 class="overview-heading"><span class="vl">Dataset</span></h3>
<div>
<p class="overview-content">
We fine-tune the pre-trained model on the following datasets:
</p>
<ul>
<li><a href="https://huggingface.co/datasets/flaviagiammarino/vqa-rad" target="_blank">VQA-RAD dataset</a></li>
<li><a href="https://huggingface.co/datasets/mdwiratathya/SLAKE-vqa-english" target="_blank">SLAKE dataset</a></li>
<li><a href="https://huggingface.co/datasets/mdwiratathya/ROCO-radiology" target="_blank">ROCO dataset</a></li>
</ul>
</div>
</div>
<div class="grid-container">
<h3 class="overview-heading"><span class="vl">Model Architecture</span></h3>
<div>
<p class="overview-content">The model is fine-tuned from Idefics2-8b.</p>
<img class="content-image" src="https://raw.githubusercontent.com/Kalbe-x-Bangkit/C24-RM-Kalbe-Bangkit/main/img/idefics2_architecture.png" alt="model-architecture" />
</div>
</div>
</section>
<h3 class="overview-heading"><span class="vl">Demo</span></h3>
<p class="overview-content">Upload an image and enter a question, or select one of the examples, to see the predicted answer.</p>
</div>
</body>
</html>