Update app.py
app.py
CHANGED
@@ -95,6 +95,22 @@ Our leaderboard incorporates a comprehensive evaluation using diverse metrics li
 To measure the risk of data leakage from the test set used in training, we introduce the Data Leakage Test (DLT). The DLT calculates the difference in perplexity between the training set and the test set. A larger difference indicates a lower likelihood of model cheating, while a smaller difference suggests a higher likelihood.
 
 For more details, refer to our [Challenge page](https://sites.google.com/nlg.csie.ntu.edu.tw/finnlp-agentscen/shared-task-finllm?authuser=0).
+
+**Task 1: Top 3**
+🥇 [email protected]
+🥈 [email protected]
+🥉 [email protected]
+
+**Task 2: Top 3**
+🥇 [email protected]
+🥈 [email protected]
+🥉 [email protected]
+
+**Task 3: Top 3**
+🥇 [email protected]
+🥈 [email protected]
+🥉 [email protected]
+
 """
 
 
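The DLT paragraph in the context lines describes a perplexity gap between training and test data. A minimal sketch of that idea follows, assuming per-token natural-log probabilities are already available from the model. The function names `perplexity` and `dlt_score`, the sign convention (test minus training), and the sample values are illustrative assumptions and are not taken from `app.py` or the challenge's scoring code.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity as exp of the mean negative log-likelihood (natural log)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def dlt_score(train_logprobs: Sequence[float],
              test_logprobs: Sequence[float]) -> float:
    """Hypothetical DLT: test perplexity minus training perplexity.

    The source only says "difference"; the sign convention here is assumed.
    """
    return perplexity(test_logprobs) - perplexity(train_logprobs)


# Made-up log-probabilities, for illustration only.
train = [math.log(0.5)] * 4    # model assigns p = 0.5 to each training token
test = [math.log(0.25)] * 4    # model assigns p = 0.25 to each test token
gap = dlt_score(train, test)
```

Under this convention, a gap near zero means the model scores the test set about as well as its training data, which the description treats as a sign of possible leakage. A clearly larger gap points the other way.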