[email protected]
committed on
Commit
•
ebd553d
1
Parent(s):
2cebb4b
Update readme
README.md
CHANGED
@@ -14,10 +14,12 @@ tags:
 
 ## Model Details
 
-Today (September 17th, 2024), we introduce [NVLM 1.0](https://arxiv.org/abs/2409.11402), a family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models (e.g., Llama 3-V 405B and InternVL 2). Remarkably, NVLM 1.0 shows improved text-only performance over its LLM backbone after multimodal training.
+Today (September 17th, 2024), we introduce [NVLM 1.0](https://arxiv.org/abs/2409.11402), a family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models (e.g., Llama 3-V 405B and InternVL 2). Remarkably, NVLM 1.0 shows improved text-only performance over its LLM backbone after multimodal training.
+
+In this repo, we are open-sourcing NVLM-1.0-D-72B (decoder-only architecture), the decoder-only model weights and code for the community.
 
 ## Other Resources
-[Inference Code (HF)](https://huggingface.co/nvidia/NVLM-
+[Inference Code (HF)](https://huggingface.co/nvidia/NVLM-D-72B/tree/main)   [Training Code (Coming soon)]()   [Website](https://nvlm-project.github.io/)   [Paper](https://arxiv.org/abs/2409.11402)
 
 ## Benchmark Results
 We train our model with legacy [Megatron-LM](https://github.com/NVIDIA/Megatron-LM/tree/main/megatron/legacy) and adapt the codebase to Huggingface for model hosting, reproducibility, and inference.