---
license: apache-2.0
pipeline_tag: text-generation
---
<p align="center">
<b><font size="6">SongComposer</font></b>
</p>
<div align="center">
[💻Github Repo](https://github.com/pjlab-songcomposer/songcomposer)
[📖Paper](https://arxiv.org/abs/2402.17645)
</div>
**SongComposer** is a large language model (LLM) based on [InternLM2](https://github.com/InternLM/InternLM) for lyric and melody composition in song generation.
We release the SongComposer series in two versions:
- SongComposer_pretrain: The pretrained SongComposer, initialized from InternLM2, which has acquired basic knowledge of lyrics and melody.
- SongComposer_sft: The fine-tuned SongComposer for *instruction-following song generation*, including lyric-to-melody, melody-to-lyric, song continuation, and text-to-song.
### Import from Transformers
To load the SongComposer_pretrain model using Transformers, use the following code:
```python
from transformers import AutoTokenizer, AutoModel

ckpt_path = "Mar2Ding/songcomposer_pretrain"
# trust_remote_code=True lets Transformers run the custom model code shipped with the checkpoint.
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
# Move the model to the GPU and cast it to half precision.
model = AutoModel.from_pretrained(ckpt_path, trust_remote_code=True).cuda().half()
# The prompt gives the line count and the first line of the song.
prompt = '<bop> Total 7 lines. The first line:可,<D4>,<137>,<79>|惜,<D#4>,<137>,<79>|这,<F4>,<137>,<88>|是,<F4>,<121>,<79>|属,<F4>,<121>,<79>|于,<D#4>,<214>,<88>|你,<D#4>,<141>,<79>|的,<D4>,<130>,<79>|风,<C4>,<151>,<79>|景,<A#3> <F3>,<181><137>,<79>\n'
model.inference_pretrain(prompt, tokenizer, model)
```
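The prompt above packs each sung word into a `|`-separated entry. As a rough illustration of how such a line could be taken apart for inspection, here is a minimal sketch. It assumes each entry has four comma-separated fields (word, pitch tokens, duration tokens, rest token), which is inferred from the example prompt rather than stated in this card. The `Note` type and `parse_line` helper are hypothetical names and are not part of the SongComposer code.

```python
import re
from typing import List, NamedTuple


class Note(NamedTuple):
    """One lyric entry. Field meanings are an assumption based on the example prompt."""
    word: str
    pitches: List[str]    # e.g. ["A#3", "F3"] when one word spans two notes
    durations: List[int]  # discrete duration tokens, one per pitch
    rest: int             # discrete rest token following the word


# Matches the content of every <...> token inside a field.
TOKEN = re.compile(r"<([^>]+)>")


def parse_line(line: str) -> List[Note]:
    """Split a '|'-separated melody line into Note tuples (hypothetical helper)."""
    notes = []
    for entry in line.strip().split("|"):
        word, pitch_field, duration_field, rest_field = entry.split(",")
        notes.append(
            Note(
                word=word,
                pitches=TOKEN.findall(pitch_field),
                durations=[int(t) for t in TOKEN.findall(duration_field)],
                rest=int(TOKEN.findall(rest_field)[0]),
            )
        )
    return notes


# Two entries taken from the example prompt.
line = "可,<D4>,<137>,<79>|景,<A#3> <F3>,<181><137>,<79>\n"
notes = parse_line(line)
for note in notes:
    print(note)
```

The same helper could be applied to the text returned by the model, provided the output follows the same entry layout as the prompt.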
### Open Source License
The code is licensed under Apache-2.0. The model weights are fully open for academic research and also allow free commercial use.