Zangs3011 committed on
Commit
0f1544f
1 Parent(s): 1fc588e

Update README.md

Files changed (1)
  1. README.md +28 -23
README.md CHANGED
@@ -11,35 +11,39 @@ datasets:
 base_model: tiiuae/falcon-40b
 ---

- For our finetuning process, we utilized the tiiuae/falcon-40b model and the Databricks-dolly-15k dataset.

- This dataset, a meticulous compilation of over 15,000 records, was a result of the dedicated work of thousands of Databricks professionals. It was specifically designed to further improve the interactive capabilities of ChatGPT-like systems.
- The dataset contributors crafted prompt / response pairs across eight distinct instruction categories. Besides the seven categories mentioned in the InstructGPT paper, they also ventured into an open-ended, free-form category. The contributors, emphasizing genuine and original content, refrained from sourcing information online, except in special cases where Wikipedia was the source for certain instruction categories. There was also a strict directive against the use of generative AI for crafting instructions or responses.
- The contributors could address questions from their peers. Rephrasing the original question was encouraged, and there was a clear preference to answer only those queries they were certain about.
- In some categories, the data comes with reference texts sourced from Wikipedia. Users might find bracketed Wikipedia citation numbers (like [42]) within the context field of the dataset. For smoother downstream applications, it's advisable to exclude these.

- Our finetuning was conducted using the [MonsterAPI](https://monsterapi.ai)'s intuitive, no-code [LLM finetuner](https://docs.monsterapi.ai/fine-tune-a-large-language-model-llm).

- Highlighting the cost-effectiveness and efficiency of the process,
- the entire session was finished in just 5 hours and 40 minutes, leveraging an A6000 48GB GPU.
- The total cost for this efficient run was a mere `$11.8`.

- #### Hyperparameters & Run details:
- - Epochs: 1
- - Cost: $11.8
- - Model Path: tiiuae/falcon-40b
- - Dataset: databricks/databricks-dolly-15k
- - Learning rate: 0.0002
- - Data split: Training 90% / Validation 10%
- - Gradient accumulation steps: 4

- license: apache-2.0
- ---

- ######

- Prompt Used:

 ### INSTRUCTION:
 [instruction]
 
@@ -47,8 +51,9 @@ Prompt Used:
 ### RESPONSE:
 [response]

 Loss metrics

- Training loss (Blue) Validation Loss (orange):
- ![training loss](train-loss.png "Training loss")
 
 base_model: tiiuae/falcon-40b
 ---

+ ### Finetuning Overview:

+ **Model Used:** tiiuae/falcon-40b
+ **Dataset:** Databricks-dolly-15k

+ #### Dataset Insights:
+ The Databricks-dolly-15k dataset, comprising over 15,000 records, stands as a testament to the dedication of numerous Databricks professionals. Aimed at refining the interactive capabilities of systems like ChatGPT, the dataset offers:

+ - Prompt/response pairs across eight distinct instruction categories.
+ - A blend of the seven categories from the InstructGPT paper and an open-ended category.
+ - Original content, devoid of generative AI influence and primarily offline-sourced, with exceptions for Wikipedia references.
+ - Interactive sessions where contributors could address and rephrase peer questions.

+ Note: Some data categories incorporate Wikipedia references, evident from bracketed citation numbers, e.g., [42]. Exclusion is recommended for downstream applications.
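The card recommends excluding those bracketed citation markers but does not show how. A minimal sketch of one way to do it with the Hugging Face `datasets` library; the `CITATION_RE` pattern and `strip_citations` helper are illustrative, not part of the original pipeline:

```python
import re

from datasets import load_dataset

# Matches bracketed Wikipedia citation markers such as "[42]".
CITATION_RE = re.compile(r"\[\d+\]")

def strip_citations(example):
    # The dolly-15k "context" field is where the Wikipedia reference text lives.
    example["context"] = CITATION_RE.sub("", example["context"])
    return example

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")
dolly = dolly.map(strip_citations)
```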
+ #### Finetuning Details:
+
+ Leveraging [MonsterAPI](https://monsterapi.ai)'s no-code [LLM finetuner](https://docs.monsterapi.ai/fine-tune-a-large-language-model-llm), our finetuning emphasized:
+
+ - **Cost-Effectiveness:** A complete run at just `$11.8`.
+ - **Efficiency:** Using an A6000 48GB GPU, the session concluded in 5 hours and 40 minutes.

+ #### Hyperparameters & Additional Details:

+ - **Epochs:** 1
+ - **Learning Rate:** 0.0002
+ - **Data Split:** Training 90% / Validation 10%
+ - **Gradient Accumulation Steps:** 4
+
+ ---
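The run itself went through MonsterAPI's no-code finetuner, so there is no training script in the repository. Purely as an illustration, the listed hyperparameters and 90/10 split could be reproduced by hand with Hugging Face `datasets` and `transformers` roughly as follows; the output path, batch size, and seed are assumptions, not values from the actual run:

```python
from datasets import load_dataset
from transformers import TrainingArguments

# Recreate the 90% train / 10% validation split used for the run.
dataset = load_dataset("databricks/databricks-dolly-15k", split="train")
splits = dataset.train_test_split(test_size=0.1, seed=42)  # seed is an arbitrary choice

# Mirror the hyperparameters listed above; the remaining values are illustrative defaults.
training_args = TrainingArguments(
    output_dir="falcon-40b-dolly",      # hypothetical output path
    num_train_epochs=1,                 # Epochs: 1
    learning_rate=2e-4,                 # Learning rate: 0.0002
    gradient_accumulation_steps=4,      # Gradient accumulation steps: 4
    per_device_train_batch_size=1,      # assumed; not stated in the card
)
```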
+ ### Prompt Structure:
+ ```
 ### INSTRUCTION:
 [instruction]

 ### RESPONSE:
 [response]
+ ```
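To make the template concrete, a small helper (not from the original repo) that renders a databricks-dolly-15k record into this structure could look like:

```python
def build_prompt(example: dict) -> str:
    """Format a databricks-dolly-15k record with the prompt template above."""
    return (
        "### INSTRUCTION:\n"
        f"{example['instruction']}\n\n"
        "### RESPONSE:\n"
        f"{example['response']}"
    )

# Usage with a made-up record:
print(build_prompt({
    "instruction": "Name a 40B-parameter open LLM.",
    "response": "Falcon-40B.",
}))
```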
 Loss metrics

+ Training loss:
+ ![training loss](train-loss.png "Training loss")