Commit fbb849f, committed by ArkaAbacus
1 Parent(s): 292e522
Update README.md
README.md CHANGED
@@ -29,4 +29,21 @@ We are currently working on writing up this new technique in the form of a techn
 
 ### Contamination Results
 
-
+We generate our contamination numbers using https://github.com/swj0419/detect-pretrain-code-contamination/tree/master, with Llama7B as our reference model.
+Smaug-72B has the following results:
+
+| ARC | TruthfulQA | GSM8K |
+| --- | --- | --- |
+| 0.20| 0.45| 1.00|
+
+By comparison, MoMo-72B-lora-1.8.7-DPO has the following results:
+
+| ARC | TruthfulQA | GSM8K |
+| --- | --- | --- |
+| 0.20| 0.39| 1.00|
+
+Note that GSM8K often scores very highly on this contamination suite - we verified this by also running Llama-2-70B:
+
+| ARC | TruthfulQA | GSM8K |
+| --- | --- | --- |
+| 0.22| 0.51| 0.89|
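
To read the three tables side by side, the following is a minimal Python sketch, not part of the commit, that collects the reported scores and computes each model's difference from the Llama-2-70B run. The names `RESULTS`, `BASELINE`, and `delta_vs_baseline` are hypothetical, and treating Llama-2-70B as the comparison point is an assumption made for illustration; the numbers themselves are copied from the tables.

```python
# Contamination scores copied from the README tables in this commit.
# The dictionary layout is an illustrative assumption, not the tool's output format.
RESULTS = {
    "Smaug-72B": {"ARC": 0.20, "TruthfulQA": 0.45, "GSM8K": 1.00},
    "MoMo-72B-lora-1.8.7-DPO": {"ARC": 0.20, "TruthfulQA": 0.39, "GSM8K": 1.00},
    "Llama-2-70B": {"ARC": 0.22, "TruthfulQA": 0.51, "GSM8K": 0.89},
}

# Hypothetical choice: use the Llama-2-70B run as the comparison point.
BASELINE = "Llama-2-70B"


def delta_vs_baseline(model, results=RESULTS, baseline=BASELINE):
    """Return each benchmark score minus the baseline score, rounded to 2 decimals."""
    base = results[baseline]
    return {
        bench: round(score - base[bench], 2)
        for bench, score in results[model].items()
    }


if __name__ == "__main__":
    for name in RESULTS:
        if name != BASELINE:
            print(name, delta_vs_baseline(name))
```

In this sketch the GSM8K column sits at or near 1.00 for all three models, which is the pattern the note in the README points to.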