Commit 53324d8 (1 parent: c040672)
Committed by ravirajoshi

Update README.md

Files changed (1): README.md (+3 −3)
README.md CHANGED
@@ -12,14 +12,14 @@ library_name: nemo
 
 # Model Overview
 
-Nemotron-4-Mini-Hindi-4B-Base is a base model pre-trained on Hindi and English corpus. The Nemotron-Mini-4B-Base (Minitron-4B) is subject to continuous pre-training using Hindi and English data (400B tokens) exclusively to create a strong base model for Hindi, English, and Hinglish. We make extensive use of synthetic data during the continuous pre-training stage. The base small language model (SLM) is optimized through distillation, pruning, and quantization for speed and on-device deployment. VRAM usage has been minimized to approximately 2 GB, providing significantly faster time to first token compared to LLMs.
+Nemotron-4-Mini-Hindi-4B-Base is a base model pre-trained on Hindi and English corpus. The Nemotron-Mini-4B-Base (Minitron-4B) is subject to continuous pre-training using Hindi and English data (400B tokens) exclusively to create a strong base model for Hindi, English, and Hinglish. We make extensive use of synthetic data during the continuous pre-training stage. The base small language model (SLM) is optimized through distillation, pruning, and quantization for speed and on-device deployment.
 Please refer to our [arXiv paper](https://arxiv.org/abs/2410.14815) for more details.
 
 This model is for research and development only.
 
 **Model Developer:** NVIDIA
 
-**Model Dates:** Nemotron-4-Mini-Hindi-4B-Base was trained between June 2024 and Oct 2024.
+**Model Dates:** Nemotron-4-Mini-Hindi-4B-Base was trained between June 2024 and Sept 2024.
 
 ## License
 
@@ -93,7 +93,7 @@ print(output_text)
 
 ## Evaluation Results
 
-*Zero-shot performance.* Evaluated using select datasets from the [IndicInstruct](https://github.com/AI4Bharat/IndicInstruct) with additions:
+*Zero-shot performance.* Evaluated using select Hindi datasets from the [Airavata Evaluation Framework](https://github.com/AI4Bharat/IndicInstruct) with additions:
 
 | MMLU | ARC-C | ARC-E | HellaSwag | BoolQ |
 | :------------- | :------------- | :------------- | :------------- | :------------- |
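
For context, the second hunk's header (`@@ -93,7 +93,7 @@ print(output_text)`) indicates the README's usage example ends by printing a variable named `output_text`. Below is a minimal sketch of what such a snippet typically looks like for a Hugging Face checkpoint, assuming the standard `transformers` loading path; the repo ID, prompt, and generation settings are illustrative assumptions, not taken from this commit:

```python
# Minimal sketch: load the checkpoint and generate a completion.
# The repo ID and generation settings below are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Nemotron-4-Mini-Hindi-4B-Base"  # assumed Hugging Face repo ID
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
).to(device)

prompt = "भारत की राजधानी"  # Hindi: "The capital of India"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=64)

# Decode the generated tokens, mirroring the README's `print(output_text)` context line.
output_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(output_text)
```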