Update README.md
README.md CHANGED
@@ -4,6 +4,9 @@ language:
 - zh
 - en
 ---
+
+# VisCPM
+
 [GITHUB](https://github.com/OpenBMB/VisCPM)
 
 `VisCPM` is a family of open-source large multimodal models that support multimodal conversational capabilities (the `VisCPM-Chat` model) and text-to-image generation capabilities (the `VisCPM-Paint` model) in both Chinese and English, achieving state-of-the-art performance among Chinese open-source multimodal models. `VisCPM` is built on the 10B-parameter large language model [CPM-Bee](https://huggingface.co/openbmb/cpm-bee-10b), fusing a visual encoder (`Q-Former`) and a visual decoder (`Diffusion-UNet`) to support visual inputs and outputs. Thanks to the strong bilingual capability of `CPM-Bee`, `VisCPM` can be pre-trained on English multimodal data alone and still generalize to achieve promising Chinese multimodal capabilities.
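
Below is a minimal usage sketch for the two models described above, assembled from the examples in the [GITHUB](https://github.com/OpenBMB/VisCPM) repository. The class names `VisCPMChat` and `VisCPMPaint`, their constructor arguments, the `chat`/`generate` methods, and the checkpoint paths are assumptions recalled from that repository and may differ from the released interface; consult the repo for the authoritative API.

```python
# Minimal usage sketch for VisCPM-Chat and VisCPM-Paint.
# NOTE: the class/method names and arguments below are assumptions based on the
# examples in https://github.com/OpenBMB/VisCPM and may not match the released API exactly.
from PIL import Image

from VisCPM import VisCPMChat, VisCPMPaint

# --- Multimodal conversation (image + question in, answer out) ---
chat_model = VisCPMChat('/path/to/viscpm_chat_checkpoint.pt')  # hypothetical local checkpoint path
image = Image.open('example.jpg').convert('RGB')
answer, _, _ = chat_model.chat(image, 'What is shown in this image?')
print(answer)

# --- Text-to-image generation (Chinese or English prompt in, image out) ---
paint_model = VisCPMPaint('/path/to/viscpm_paint_checkpoint.pt')  # hypothetical local checkpoint path
generated = paint_model.generate('一只猫在窗台上晒太阳')  # "a cat sunbathing on a windowsill"
generated.save('generated.png')
```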