Corey Morris
committed on
Commit
•
7f24726
1
Parent(s):
80c79bd
Updated description with more models
app.py
CHANGED
@@ -123,10 +123,10 @@ def find_top_differences_table(df, target_model, closest_models, num_differences
 data_provider = ResultDataProcessor()
 
 # st.title('Model Evaluation Results including MMLU by task')
-st.title('Exploring the Characteristics of Large Language Models: An Interactive Portal for Analyzing
-st.markdown("""***Last updated August
+st.title('Exploring the Characteristics of Large Language Models: An Interactive Portal for Analyzing 800+ Open Source Models Across 57 Diverse Evaluation Tasks')
+st.markdown("""***Last updated August 16th***""")
 st.markdown("""
-Hugging Face has run evaluations on over
+Hugging Face has run evaluations on over 800 open source models and provides results on a
 [publicly available leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) and [dataset](https://huggingface.co/datasets/open-llm-leaderboard/results).
 The Hugging Face leaderboard currently displays the overall result for Measuring Massive Multitask Language Understanding (MMLU), but not the results for individual tasks.
 This app provides a way to explore the results for individual tasks and compare models across tasks.
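
Reviewer note: the description says the app lets users compare models across individual tasks. The sketch below shows one way such a comparison could work with pandas. It is not the code in app.py: the function name `top_task_differences`, the `MMLU_`-prefixed column names, the model names, and all scores are hypothetical, and the layout (one row per model, one column per task) is an assumption.

```python
import pandas as pd


def top_task_differences(df, target_model, comparison_models, num_differences=3):
    """Return the tasks where target_model deviates most from the comparison mean.

    Assumes df has one row per model (index) and one column per task score.
    This is an illustrative helper, not the function used in app.py.
    """
    target_scores = df.loc[target_model]
    comparison_mean = df.loc[comparison_models].mean(axis=0)
    differences = target_scores - comparison_mean

    # Rank tasks by absolute difference, largest first.
    top_tasks = (
        differences.abs().sort_values(ascending=False).head(num_differences).index
    )

    return pd.DataFrame(
        {
            "target": target_scores[top_tasks],
            "comparison_mean": comparison_mean[top_tasks],
            "difference": differences[top_tasks],
        }
    )


# Made-up scores for illustration only; these are not real leaderboard results.
scores = pd.DataFrame(
    {
        "MMLU_abstract_algebra": [0.30, 0.28, 0.32],
        "MMLU_anatomy": [0.60, 0.45, 0.47],
        "MMLU_astronomy": [0.50, 0.52, 0.50],
    },
    index=["model_a", "model_b", "model_c"],
)

result = top_task_differences(
    scores, "model_a", ["model_b", "model_c"], num_differences=2
)
print(result)
```

With these made-up numbers, `model_a` stands out most on the anatomy column, so that task is ranked first. Ranking by absolute difference surfaces both strengths and weaknesses; sorting on the signed difference instead would show only one direction.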