---
pipeline_tag: text-generation
inference: true
widget:
- text: 'Hello!'
  example_title: Hello world
  group: Python
library_name: transformers
---

This model is randomly initialized, using the config from [Qwen/Qwen-VL-Chat](https://huggingface.co/Qwen/Qwen-VL-Chat/blob/main/config.json) but with a much smaller size.

Note that the model weights are in float16.

Notable modifications:

```python
# Text model: shrink hidden dims, heads, and layers
config.fp16 = True
config.hidden_size = 8
config.intermediate_size = 16
config.kv_channels = 4
config.num_attention_heads = 2
config.num_hidden_layers = 2
config.seq_length = 2048

# Vision tower: shrink ViT width, heads, and layers
config.visual = {
    "heads": 2,
    "image_size": 448,
    "image_start_id": 151857,
    "layers": 2,
    "mlp_ratio": 1.0,
    "output_dim": 8,
    "patch_size": 14,
    "width": 8,
}
```
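For reference, the head dimensions implied by these overrides can be computed directly. Below is a minimal offline sketch using a `SimpleNamespace` stand-in for the config object; the actual script presumably operates on Qwen-VL's custom config class, loaded from Qwen/Qwen-VL-Chat with `trust_remote_code=True`:

```python
from types import SimpleNamespace

# Stand-in for Qwen-VL's custom config class, used here only so the
# arithmetic below runs without downloading the real model code.
config = SimpleNamespace()
config.fp16 = True
config.hidden_size = 8
config.intermediate_size = 16
config.kv_channels = 4
config.num_attention_heads = 2
config.num_hidden_layers = 2
config.seq_length = 2048
config.visual = {
    "heads": 2, "image_size": 448, "image_start_id": 151857,
    "layers": 2, "mlp_ratio": 1.0, "output_dim": 8,
    "patch_size": 14, "width": 8,
}

# Per-head dimensions implied by the overrides:
text_head_dim = config.hidden_size // config.num_attention_heads    # 8 / 2 = 4
visual_head_dim = config.visual["width"] // config.visual["heads"]  # 8 / 2 = 4
print(text_head_dim, visual_head_dim)
```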
We also changed the visual model's attention head dimension; see `upload_model.py` for details.