Commit d4450b3
Parent(s): bce77cb
add info about plot axis + trendlines
app.py
CHANGED
@@ -174,12 +174,20 @@ with gr.Blocks(
|
|
174 |
gr.Markdown(
|
175 |
"""
|
176 |
<div style="text-align: center; max-width: 650px; margin: auto;">
|
177 |
-
<h1 style="font-weight: 900; margin-top: 5px;">🔬 Progress Tracker: Open vs. Proprietary LLMs
|
178 |
-
</h1>
|
179 |
<p style="text-align: left; margin-top: 5px; margin-bottom: 30px; line-height: 20px;">
|
180 |
This app visualizes the progress of proprietary and open-source LLMs over time as scored by the <a hfref="https://leaderboard.lmsys.org/">LMSYS Chatbot Arena</a>.
|
181 |
The idea is inspired by <a href="https://www.linkedin.com/posts/maxime-labonne_arena-elo-graph-updated-with-new-models-activity-7187062633735368705-u2jB">this great work</a>
|
182 |
from <a href="https://huggingface.co/mlabonne/">Maxime Labonne</a>, and is intended to stay up-to-date as new models are released and evaluated.
|
183 |
</p>
|
184 |
</div>
|
185 |
"""
|
|
|
174 |
gr.Markdown(
|
175 |
"""
|
176 |
<div style="text-align: center; max-width: 650px; margin: auto;">
|
177 |
+
<h1 style="font-weight: 900; margin-top: 5px;">🔬 Progress Tracker: Open vs. Proprietary LLMs 🔬</h1>
|
|
|
178 |
<p style="text-align: left; margin-top: 5px; margin-bottom: 30px; line-height: 20px;">
|
179 |
This app visualizes the progress of proprietary and open-source LLMs over time as scored by the <a hfref="https://leaderboard.lmsys.org/">LMSYS Chatbot Arena</a>.
|
180 |
The idea is inspired by <a href="https://www.linkedin.com/posts/maxime-labonne_arena-elo-graph-updated-with-new-models-activity-7187062633735368705-u2jB">this great work</a>
|
181 |
from <a href="https://huggingface.co/mlabonne/">Maxime Labonne</a>, and is intended to stay up-to-date as new models are released and evaluated.
|
182 |
+
<div style="text-align: left;">
|
183 |
+
<strong>Plot info:</strong>
|
184 |
+
<br>
|
185 |
+
<ul style="padding-left: 20px;">
|
186 |
+
<li> The ELO score (y-axis) is a measure of the relative strength of a model based on its performance against other models in the arena. </li>
|
187 |
+
<li> The Release Date (x-axis) corresponds to when the model was first publicly released or when its ELO results were first reported (for ease of automated updates). </li>
|
188 |
+
<li> Trend lines are based on Ordinary Least Squares (OLS) regression and adjust based on the filter criteria. </li>
|
189 |
+
</ul>
|
190 |
+
</div>
|
191 |
</p>
|
192 |
</div>
|
193 |
"""
|
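The added "Plot info" text says the trend lines come from Ordinary Least Squares (OLS) regression and are recomputed when the filters change. The diff does not show how the app computes them, so the following is only a minimal sketch of that idea, assuming NumPy is available. The `rows` data, the function names `ols_trend` and `trend_for`, and the `min_score` filter are all hypothetical and not taken from `app.py`.

```python
from datetime import date

import numpy as np


def ols_trend(dates, scores):
    """Fit score = slope * days_since_first + intercept by least squares."""
    if len(dates) < 2:
        raise ValueError("need at least two points to fit a line")
    x = np.array([d.toordinal() for d in dates], dtype=float)
    y = np.array(scores, dtype=float)
    x0 = x.min()  # measure days from the earliest release date
    slope, intercept = np.polyfit(x - x0, y, deg=1)  # degree-1 least-squares fit
    return float(slope), float(intercept)


# Hypothetical rows: (release date, Elo score, license type)
rows = [
    (date(2023, 1, 1), 1000.0, "open"),
    (date(2023, 1, 11), 1020.0, "open"),
    (date(2023, 1, 21), 1040.0, "open"),
    (date(2023, 1, 1), 1100.0, "proprietary"),
    (date(2023, 1, 21), 1110.0, "proprietary"),
]


def trend_for(rows, license_type, min_score=0.0):
    """Refit the trend line on only the rows that pass the (hypothetical) filters."""
    kept = [(d, s) for d, s, lic in rows if lic == license_type and s >= min_score]
    return ols_trend([d for d, _ in kept], [s for _, s in kept])


open_slope, open_intercept = trend_for(rows, "open")
prop_slope, prop_intercept = trend_for(rows, "proprietary")
```

Here the slope is in Elo points per day, and calling `trend_for` again with a different `min_score` refits the line on the filtered subset, which is the sense in which a trend line would "adjust based on the filter criteria".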