Updated README for GPU configuration (#51)
Files changed (1): README.md (+7 −0)
README.md CHANGED
@@ -109,6 +109,13 @@ model = Qwen2VLForConditionalGeneration.from_pretrained(
     "Qwen/Qwen2-VL-7B-Instruct", torch_dtype="auto", device_map="auto"
 )
 
+# Alternative if you are facing NaN issues in the output with device_map="auto":
+device = torch.device("cuda:xxx" if torch.cuda.is_available() else "cpu")
+# model = Qwen2VLForConditionalGeneration.from_pretrained(
+#     "Qwen/Qwen2-VL-7B-Instruct", torch_dtype=torch.bfloat16, device_map=device
+# )
+# model.eval()
+
 # We recommend enabling flash_attention_2 for better acceleration and memory saving, especially in multi-image and video scenarios.
 # model = Qwen2VLForConditionalGeneration.from_pretrained(
 #     "Qwen/Qwen2-VL-7B-Instruct",