mgoin/Nemotron-4-340B-Instruct-vllm

mgoin committed on Jul 21

Commit eedfde2 • 1 Parent(s): 4c851cc

Update README.md

Files changed (1)

README.md +1 -3

@@ -5,9 +5,7 @@ license_link: >-
   https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf
 ---
-# NOTICE; PLEASE READ. NO INFERENCE. (YET)
-**This has no support for inference, yet.** All I've done is move the weights out of NVIDIAs NeMo architecture so people smarter than me can get a headstart on making it work with other backends.
+Based on [nemotron3-8b](https://huggingface.co/thhaus/nemotron3-8b) and [Nemotron-4-340B-Instruct-SafeTensors](https://huggingface.co/failspy/Nemotron-4-340B-Instruct-SafeTensors) with quite a few changes to make compatible with vLLM, PR here: https://github.com/vllm-project/vllm/pull/6611
 ## Nemotron-4-340B-Instruct