Commit 53324d8 (1 parent: c040672)
Committed by ravirajoshi

Update README.md

Files changed (1): README.md (+3 −3)
README.md CHANGED
@@ -12,14 +12,14 @@ library_name: nemo
 
 # Model Overview
 
-Nemotron-4-Mini-Hindi-4B-Base is a base model pre-trained on Hindi and English corpus. The Nemotron-Mini-4B-Base (Minitron-4B) is subject to continuous pre-training using Hindi and English data (400B tokens) exclusively to create a strong base model for Hindi, English, and Hinglish. We make extensive use of synthetic data during the continuous pre-training stage. The base small language model (SLM) is optimized through distillation, pruning, and quantization for speed and on-device deployment. VRAM usage has been minimized to approximately 2 GB, providing significantly faster time to first token compared to LLMs.
+Nemotron-4-Mini-Hindi-4B-Base is a base model pre-trained on Hindi and English corpus. The Nemotron-Mini-4B-Base (Minitron-4B) is subject to continuous pre-training using Hindi and English data (400B tokens) exclusively to create a strong base model for Hindi, English, and Hinglish. We make extensive use of synthetic data during the continuous pre-training stage. The base small language model (SLM) is optimized through distillation, pruning, and quantization for speed and on-device deployment.
 Please refer to our [arXiv paper](https://arxiv.org/abs/2410.14815) for more details.
 
 This model is for research and development only.
 
 **Model Developer:** NVIDIA
 
-**Model Dates:** Nemotron-4-Mini-Hindi-4B-Base was trained between June 2024 and Oct 2024.
+**Model Dates:** Nemotron-4-Mini-Hindi-4B-Base was trained between June 2024 and Sept 2024.
 
 ## License
 
@@ -93,7 +93,7 @@ print(output_text)
 
 ## Evaluation Results
 
-*Zero-shot performance.* Evaluated using select datasets from the [IndicInstruct](https://github.com/AI4Bharat/IndicInstruct) with additions:
+*Zero-shot performance.* Evaluated using select Hindi datasets from the [Airavata Evaluation Framework](https://github.com/AI4Bharat/IndicInstruct) with additions:
 
 | MMLU | ARC-C | ARC-E | HellaSwag | BoolQ |
 | :------------- | :------------- | :------------- | :------------- | :------------- |
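
For context, the second hunk's header (`@@ -93,7 +93,7 @@ print(output_text)`) indicates the README's usage example ends by printing a variable named `output_text`. Below is a minimal sketch of what such a snippet typically looks like for a Hugging Face checkpoint, assuming the standard `transformers` loading path; the repo ID, prompt, and generation settings are illustrative assumptions, not taken from this commit:

```python
# Minimal sketch: load the checkpoint and generate a completion.
# The repo ID and generation settings below are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Nemotron-4-Mini-Hindi-4B-Base"  # assumed Hugging Face repo ID
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
).to(device)

prompt = "भारत की राजधानी"  # Hindi: "The capital of India"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=64)

# Decode the generated tokens, mirroring the README's `print(output_text)` context line.
output_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(output_text)
```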