Update README.md
README.md
CHANGED
@@ -1,25 +1,26 @@
 # MatSciBERT
 ## A Materials Domain Language Model for Text Mining and Information Extraction
 
-This is the pretrained model presented in [MatSciBERT: A
+This is the pretrained model presented in [[MatSciBERT: A materials domain language model for text mining and information extraction](https://rdcu.be/cMAp5), which is a BERT model trained on material science research papers.
 
-The training corpus comprises papers related to the broad category of materials: alloys, glasses, metallic glasses, cement and concrete. We have utilised the abstracts and full
+The training corpus comprises papers related to the broad category of materials: alloys, glasses, metallic glasses, cement and concrete. We have utilised the abstracts and full text of papers(when available). All the research papers have been downloaded from [ScienceDirect](https://www.sciencedirect.com/) using the [Elsevier API](https://dev.elsevier.com/). The detailed methodology is given in the paper.
 
 The codes for pretraining and finetuning on downstream tasks are shared on [GitHub](https://github.com/m3rg-repo/MatSciBERT).
 
 If you find this useful in your research, please consider citing:
 ```
-@article{
-title = {{{MatSciBERT}}: A
-
-
-
-
-
-
-
-
-
-keywords = {Computer Science - Computation and Language,Condensed Matter - Materials Science}}
+@article{gupta_matscibert_2022,
+title = {{{MatSciBERT}}: {{A}} Materials Domain Language Model for Text Mining and Information Extraction},
+author = {Gupta, Tanishq and Zaki, Mohd and Krishnan, N. M. Anoop and {Mausam}},
+year = {2022},
+month = may,
+journal = {npj Computational Materials},
+volume = {8},
+number = {1},
+pages = {102},
+issn = {2057-3960},
+doi = {10.1038/s41524-022-00784-w}
 }
-```
+```
+widget:
+- text: "Na2O is a network [MASK]."
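
The widget entry added in this commit is a fill-mask prompt, so the same query can in principle be run from Python. The following is a minimal sketch, assuming the Hugging Face `transformers` library is installed and that the checkpoint is published on the Hub as `m3rg-iitd/matscibert`. That model ID and the helper names `count_masks` and `top_predictions` are assumptions for illustration and do not appear in the README.

```python
MASK_TOKEN = "[MASK]"  # BERT-style mask token, as used in the widget example

# Assumed Hub model ID; it is not stated in the README diff.
ASSUMED_MODEL_ID = "m3rg-iitd/matscibert"


def count_masks(text: str) -> int:
    """Return how many mask tokens the prompt contains."""
    return text.count(MASK_TOKEN)


def top_predictions(text: str, model_id: str = ASSUMED_MODEL_ID, k: int = 5):
    """Return (token, score) pairs for the single [MASK] in `text`.

    Hypothetical helper: it imports transformers lazily and downloads the
    model on first use, so calling it needs network access.
    """
    # Restricting the prompt to one mask keeps the pipeline output a flat list.
    if count_masks(text) != 1:
        raise ValueError("this sketch expects exactly one [MASK] token")

    from transformers import pipeline  # lazy import: only needed for inference

    fill_mask = pipeline("fill-mask", model=model_id)
    return [(p["token_str"], p["score"]) for p in fill_mask(text, top_k=k)]


# Example call (downloads the model, so it is left commented out):
# top_predictions('Na2O is a network [MASK].')
```

Calling `top_predictions` with the widget text would return the `k` highest-scoring fillers for the masked word. The single-mask check is a design choice of this sketch, not a limit of the model.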