justin13barrett committed on
Commit • 87e115d
1 Parent(s): 334df3f
Update README.md
README.md CHANGED
@@ -12,18 +12,33 @@ model-index:
 # bert-base-multilingual-cased-finetuned-openalex-topic-classification-title-abstract

 This model is a fine-tuned version of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) on a labeled dataset provided by CWTS:
-[CWTS Labeled Data]
+[CWTS Labeled Data]

-This is NOT the full model being used to tag OpenAlex works with a topic. For that, check out the following GitHub repo:
+This is NOT the full model being used to tag [OpenAlex](https://openalex.org/) works with a topic. For that, check out the following GitHub repo:
 [OpenAlex Topic Classification](https://github.com/ourresearch/openalex-topic-classification)

+That repository will also contain information about text preprocessing, modeling, testing, and deployment.
+
 ## Model description

-The input data is
+The model was trained on input in the following format, so inference input should follow the same format:

 "\<TITLE\> {insert-processed-title-here}\n\<ABSTRACT\> {insert-processed-abstract-here}"

-
+The quickest way to use this model in Python is with the following code (assuming you have the transformers library installed):
+
+```
+from transformers import pipeline
+
+title = "{insert-processed-title-here}"
+abstract = "{insert-processed-abstract-here}"
+
+classifier = \
+    pipeline(model="OpenAlex/bert-base-multilingual-cased-finetuned-openalex-topic-classification-title-abstract", top_k=10)
+
+classifier(f"""<TITLE> {title}\n<ABSTRACT> {abstract}""")
+
+```

 ## Intended uses & limitations

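The input format introduced by this README change can be wrapped in a small helper so that every call builds the string the same way. The following is a minimal sketch, not part of the commit: the names `build_model_input`, `classify`, and `MODEL_ID` are hypothetical, and the title and abstract are assumed to be already preprocessed (the preprocessing itself is documented in the OpenAlex Topic Classification repository and is not reproduced here).

```python
# Model identifier copied from the README snippet in this commit.
MODEL_ID = (
    "OpenAlex/bert-base-multilingual-cased-finetuned-"
    "openalex-topic-classification-title-abstract"
)


def build_model_input(title: str, abstract: str) -> str:
    """Join an already-processed title and abstract in the README's format.

    Hypothetical helper: it only does string formatting and performs
    none of the preprocessing described in the GitHub repository.
    """
    return f"<TITLE> {title}\n<ABSTRACT> {abstract}"


def classify(title: str, abstract: str, top_k: int = 10):
    """Run the text through the pipeline, mirroring the README call.

    transformers is imported lazily so build_model_input can be used
    (and tested) without the library installed. The first call
    downloads the model weights.
    """
    from transformers import pipeline

    classifier = pipeline(model=MODEL_ID, top_k=top_k)
    return classifier(build_model_input(title, abstract))
```

Calling `classify("some processed title", "some processed abstract")` should behave like the README snippet. For repeated use, building the pipeline once and reusing it would avoid reloading the model on every call.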