---
license: apache-2.0
language:
- en
- zh
- fr
- ja
- multilingual
pipeline_tag: image-to-text
tags:
- mplug-owl
---

# Usage
## Get the latest codebase from GitHub
```bash
git clone https://github.com/X-PLUG/mPLUG-Owl.git
```
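The Python snippets below import from the `mplug_owl` package that lives inside the cloned repository, so the clone has to be importable. A minimal sketch, assuming the repository was cloned into the current working directory as above (adjust the path otherwise):

```python
import os
import sys

# Make the cloned repository importable; assumes ./mPLUG-Owl exists
# from the `git clone` step above.
repo_root = os.path.abspath('mPLUG-Owl')
sys.path.insert(0, repo_root)
```

Installing the repository's own dependencies (see its README) is still required before the model code will run.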

## Model initialization
```python
import torch
from transformers import AutoTokenizer
from mplug_owl.modeling_mplug_owl import MplugOwlForConditionalGeneration
from mplug_owl.processing_mplug_owl import MplugOwlImageProcessor, MplugOwlProcessor

pretrained_ckpt = 'MAGAer13/mplug-owl-llama-7b'
model = MplugOwlForConditionalGeneration.from_pretrained(
    pretrained_ckpt,
    torch_dtype=torch.bfloat16,
)
image_processor = MplugOwlImageProcessor.from_pretrained(pretrained_ckpt)
tokenizer = AutoTokenizer.from_pretrained(pretrained_ckpt)
processor = MplugOwlProcessor(image_processor, tokenizer)
```

## Model inference
Prepare model inputs.
```python
# We use a human/AI template to organize the context as a multi-turn conversation.
# <image> denotes an image placeholder.
prompts = [
'''The following is a conversation between a curious human and AI assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: <image>
Human: Explain why this meme is funny.
AI: ''']

# The image paths should be placed in image_list, in the same order as the
# <image> placeholders in the prompts.
# URLs, local file paths, and base64 strings are supported. You can customise
# image pre-processing by modifying MplugOwlImageProcessor.
image_list = ['https://xxx.com/image.jpg']
```
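Since `PIL.Image.open` only accepts local paths or file objects, the three input types mentioned above (URL, local path, base64 string) each need slightly different handling before they reach the processor. A hypothetical `load_image` helper, not part of the original card, could look like:

```python
import base64
import io
import os

import requests
from PIL import Image

def load_image(source: str) -> Image.Image:
    """Load a PIL image from a URL, a local file path, or a base64 string."""
    if source.startswith(('http://', 'https://')):
        resp = requests.get(source, timeout=10)
        resp.raise_for_status()
        return Image.open(io.BytesIO(resp.content)).convert('RGB')
    if os.path.exists(source):
        return Image.open(source).convert('RGB')
    # Fall back to treating the string as base64-encoded image bytes.
    return Image.open(io.BytesIO(base64.b64decode(source))).convert('RGB')
```

With such a helper, `images = [load_image(s) for s in image_list]` can replace the plain `Image.open` call in the next snippet.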

Get response.
```python
from PIL import Image

# Generation kwargs (the same as in transformers) can be passed to generate().
generate_kwargs = {
    'do_sample': True,
    'top_k': 5,
    'max_length': 512
}
# Note: Image.open expects a local path or file object; fetch remote URLs first.
images = [Image.open(_) for _ in image_list]
inputs = processor(text=prompts, images=images, return_tensors='pt')
inputs = {k: v.bfloat16() if v.dtype == torch.float else v for k, v in inputs.items()}
inputs = {k: v.to(model.device) for k, v in inputs.items()}
with torch.no_grad():
    res = model.generate(**inputs, **generate_kwargs)
sentence = tokenizer.decode(res.tolist()[0], skip_special_tokens=True)
print(sentence)
```