Refactoring. Moved the ResultDataProcessor class to a separate file to make it easier to experiment with in a Jupyter notebook 843a5ef Corey Morris committed on Jul 24, 2023
Added updated results from Hugging Face evaluation runs 51a128e Corey Morris committed on Jul 24, 2023
Improving clarity. Moved MMLU average column to a more appropriate spot 5129f48 Corey Morris committed on Jul 23, 2023
Hiding filters unless the box is selected. Removed model name column because it is the index of the table 8488477 Corey Morris committed on Jul 23, 2023
Added a scatter plot with just the top 50 performing models on MMLU average ca8e784 Corey Morris committed on Jul 23, 2023
Added MMLU overall average column. Added a few charts comparing more moral reasoning and comparing MMLU overall to other data c671de9 Corey Morris committed on Jul 23, 2023
Added statsmodels to be able to use a trendline in Plotly ed019c6 Corey Morris committed on Jul 23, 2023
Updated data cleanup so that column names are cleaned up appropriately with regex=True c1a84da Corey Morris committed on Jul 23, 2023
Fixed reversed plot. Extracted chart creation into a method 337b761 Corey Morris committed on Jul 23, 2023
Updated app.py and requirements.txt so that they work with Hugging Face Streamlit and the pandas 1.x version ba99486 Corey Morris committed on Jul 23, 2023
Updated requirements.txt with the versions being used locally 7ae46ce Corey Morris committed on Jul 23, 2023
WIP commit. Troubleshooting chart display. Adding filter behavior 43b4e29 Corey Morris committed on Jul 23, 2023
Added Hugging Face evaluation harness results submodule 4dcdfc8 Corey Morris committed on Jul 21, 2023