Update README.md, link demo in Spaces
README.md
CHANGED
@@ -27,6 +27,8 @@ Inspired by [KeyBERT](https://github.com/MaartenGr/KeyBERT), KeyBERTVi implement
 This implementation took inspiration from the simple yet intuitive and powerful method of [KeyBERT](https://github.com/MaartenGr/KeyBERT/), applied for the Vietnamese language. PhoBERT are used to generate both document-level embeddings and word-level embeddings for extracted N-grams. Cosine similarity is then used to compute which N-grams are most similar to the document-level embedding, thus can be perceived as most representative of the document.
 Preprocessing catered to the Vietnamese language was applied.
 
+Test with your own documents at [KeyBERTVi Space](https://huggingface.co/spaces/tpha4308/keybertvi-app).
+
 <a name="gettingstarted"/></a>
 ## 2. Getting Started
 <a name="installation"/></a>
@@ -48,6 +50,8 @@ You can use existing pre-trained models in the repo or download your own and put
 torch.save(ner_model, f'{dir_path}/pretrained-models/ner-vietnamese-electra-base.pt')
 ```
 
+**Note:** `dir_path` is the absolute path to the repo.
+
 As [PhoBERT](https://huggingface.co/vinai/phobert-base) requires [VnCoreNLP](https://github.com/vncorenlp/VnCoreNLP) as part of pre-processing, the folder `pretrained-models/vncorenlp` is required. To download your own:
 ```bash
 pip install py_vncorenlp