Update README.md
README.md
CHANGED
@@ -28,6 +28,18 @@ that facilitates the fine-tuning of various well-known LLMs on custom data.
 Parameter-efficient fine-tuning is achieved via the QLoRA method [Dettmers et al., 2023](https://proceedings.neurips.cc/paper_files/paper/2023/file/1feb87871436031bdc0f2beaa62a049b-Paper-Conference.pdf).
 
 
+Number of conversation turns and words in the original datasets and after splitting long conversations:
+
+| **Dataset**      | **Turns (Original)** | **Words (Original)** | **Turns (Split turns)** | **Words (Split turns)** |
+|------------------|:--------------------:|:--------------------:|:-----------------------:|:-----------------------:|
+| TSCC v2          |         570          |         788k         |          1074           |          786k           |
+| CIMA             |         1135         |         44k          |          1135           |           38k           |
+| MathDial         |         2861         |         923k         |          2876           |          879k           |
+| Multicultural    |          5           |         614k         |           643           |          614k           |
+| Uptake           |         774          |         35k          |           775           |           34k           |
+| **Total**        |       **5345**       |      **2404k**       |        **6503**         |        **2351k**        |
+
+
 ## Usage Guide
 
 This project was executed on an Ubuntu 22.04.3 system running Linux kernel 6.8.0-40-generic.
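The **Total** row of the added table can be checked against the per-dataset rows. The snippet below is a minimal sketch for that check and is not part of the repository: the names `ROWS`, `REPORTED_TOTAL`, and `column_totals` are hypothetical, and the figures are copied by hand from the table, with word counts in thousands.

```python
# Per-dataset figures copied from the table in this diff.
# Tuple order: (turns original, words original in thousands,
#               turns after splitting, words after splitting in thousands).
ROWS = {
    "TSCC v2": (570, 788, 1074, 786),
    "CIMA": (1135, 44, 1135, 38),
    "MathDial": (2861, 923, 2876, 879),
    "Multicultural": (5, 614, 643, 614),
    "Uptake": (774, 35, 775, 34),
}

# The "Total" row as reported in the table.
REPORTED_TOTAL = (5345, 2404, 6503, 2351)


def column_totals(rows):
    """Sum each column across all datasets."""
    return tuple(sum(column) for column in zip(*rows.values()))


if __name__ == "__main__":
    totals = column_totals(ROWS)
    print("computed:", totals)
    print("matches reported totals:", totals == REPORTED_TOTAL)
```

Because the word counts are rounded to thousands in the table, this only confirms that the rounded figures are internally consistent, not that the underlying exact counts are.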