Commit d270f93 by m3rg-iitd
1 parent: aabd7e5

Update README.md

Files changed (1): README.md +16 -15
@@ -1,25 +1,26 @@
  # MatSciBERT
  ## A Materials Domain Language Model for Text Mining and Information Extraction
 
- This is the pretrained model presented in [MatSciBERT: A Materials Domain Language Model for Text Mining and Information Extraction](https://arxiv.org/abs/2109.15290), which is a BERT model trained on material science research papers.
+ This is the pretrained model presented in [[MatSciBERT: A materials domain language model for text mining and information extraction](https://rdcu.be/cMAp5), which is a BERT model trained on material science research papers.
 
- The training corpus comprises papers related to the broad category of materials: alloys, glasses, metallic glasses, cement and concrete. We have utilised the abstracts and full length of papers(when available). All the research papers have been downloaded from [ScienceDirect](https://www.sciencedirect.com/) using the [Elsevier API](https://dev.elsevier.com/). The detailed methodology is given in the paper.
+ The training corpus comprises papers related to the broad category of materials: alloys, glasses, metallic glasses, cement and concrete. We have utilised the abstracts and full text of papers(when available). All the research papers have been downloaded from [ScienceDirect](https://www.sciencedirect.com/) using the [Elsevier API](https://dev.elsevier.com/). The detailed methodology is given in the paper.
 
  The codes for pretraining and finetuning on downstream tasks are shared on [GitHub](https://github.com/m3rg-repo/MatSciBERT).
 
  If you find this useful in your research, please consider citing:
  ```
- @article{gupta_matscibert_2021,
- title = {{{MatSciBERT}}: A {{Materials Domain Language Model}} for {{Text Mining}} and {{Information Extraction}}},
- shorttitle = {{{MatSciBERT}}},
- author = {Gupta, Tanishq and Zaki, Mohd and Krishnan, N. M. Anoop and Mausam},
- year = {2021},
- month = sep,
- journal = {arXiv:2109.15290 [cond-mat]},
- eprint = {2109.15290},
- eprinttype = {arxiv},
- primaryclass = {cond-mat},
- archiveprefix = {arXiv},
- keywords = {Computer Science - Computation and Language,Condensed Matter - Materials Science}}
+ @article{gupta_matscibert_2022,
+ title = {{{MatSciBERT}}: {{A}} Materials Domain Language Model for Text Mining and Information Extraction},
+ author = {Gupta, Tanishq and Zaki, Mohd and Krishnan, N. M. Anoop and {Mausam}},
+ year = {2022},
+ month = may,
+ journal = {npj Computational Materials},
+ volume = {8},
+ number = {1},
+ pages = {102},
+ issn = {2057-3960},
+ doi = {10.1038/s41524-022-00784-w}
  }
- ```
+ ```
+ widget:
+ - text: "Na2O is a network [MASK]."
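
The `widget` entry added in this commit is a masked-token prompt, which suggests the model is meant to be queried as a fill-mask model. The sketch below shows one way such a query might be run locally with the Hugging Face `transformers` library. Everything here is illustrative rather than taken from the commit: the Hub ID `m3rg-iitd/matscibert` is an assumption (the diff names only the committer), it is also assumed that the checkpoint ships a masked-language-model head, and the helper names `load_fill_mask` and `top_tokens` are hypothetical.

```python
from typing import Callable, Dict, List

# Assumed Hub ID; the diff itself does not state the repository name.
MODEL_ID = "m3rg-iitd/matscibert"


def load_fill_mask(model_id: str = MODEL_ID):
    """Build a fill-mask pipeline (downloads weights on first use)."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("fill-mask", model=model_id)


def top_tokens(fill_mask: Callable[..., List[Dict]], text: str, k: int = 5) -> List[str]:
    """Return up to k candidate fillers for the single [MASK] in `text`."""
    if text.count("[MASK]") != 1:
        raise ValueError("expected exactly one [MASK] token")
    # For a single mask, the pipeline returns a list of dicts that include
    # 'token_str' and 'score', ordered from most to least likely.
    predictions = fill_mask(text, top_k=k)
    return [p["token_str"].strip() for p in predictions]
```

With network access and `transformers` installed, the widget prompt could then be tried with `top_tokens(load_fill_mask(), "Na2O is a network [MASK].")`. The helper accepts any callable with the same interface, so it can be exercised with a stub when the model is not available.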