# CosmicRoBERTa
This model is a further pre-trained version of RoBERTa for space science, trained on a domain-specific corpus that includes abstracts from the NTRS library, abstracts from SCOPUS, ECSS requirements, and other sources from this domain. In total, the pre-training corpus contains around 75 million words.
On a subset (60% of the full dataset) of the Concept Recognition (CR) task presented in our paper *SpaceTransformers: Language Modeling for Space Systems*, the model performs slightly better than base RoBERTa and SpaceRoBERTa, as the table below shows.
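As a minimal sketch of how such a further pre-trained checkpoint can be queried as a masked language model, the snippet below uses the Hugging Face Transformers `fill-mask` pipeline. The model ID `icelab/cosmicroberta` and the example sentence are illustrative assumptions, so check the repository path on the Hub before running it.

```python
from transformers import pipeline

# Model ID is assumed for illustration; verify the exact repository path on the Hub.
fill_mask = pipeline("fill-mask", model="icelab/cosmicroberta")

# RoBERTa-based tokenizers use "<mask>" as the mask token.
predictions = fill_mask("The spacecraft attitude is controlled by reaction <mask>.")
for p in predictions:
    print(f"{p['token_str']:>15}  score={p['score']:.3f}")
```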
Category | RoBERTa | CosmicRoBERTa | SpaceRoBERTa |
---|---|---|---|
Parameter | 0.475 | 0.515 | 0.485 |
GN&C | 0.488 | 0.609 | 0.602 |
System engineering | 0.523 | 0.559 | 0.555 |
Propulsion | 0.403 | 0.521 | 0.465 |
Project Scope | 0.493 | 0.541 | 0.497 |
OBDH | 0.717 | 0.789 | 0.794 |
Thermal | 0.432 | 0.509 | 0.491 |
Quality control | 0.686 | 0.704 | 0.678 |
Telecom. | 0.360 | 0.614 | 0.557 |
Measurement | 0.833 | 0.849 | 0.858 |
Structure & Mechanism | 0.489 | 0.581 | 0.566 |
Space Environment | 0.543 | 0.681 | 0.605 |
Cleanliness | 0.616 | 0.621 | 0.651 |
Project Organisation / Documentation | 0.355 | 0.427 | 0.429 |
Power | 0.638 | 0.735 | 0.661 |
Safety / Risk (Control) | 0.647 | 0.727 | 0.676 |
Materials / EEEs | 0.585 | 0.642 | 0.639 |
Nonconformity | 0.365 | 0.333 | 0.419 |
Weighted avg. | 0.584 | 0.652 (+7%) | 0.633 (+5%) |
Validation loss | 0.605 | 0.505 | 0.542 |
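The per-category scores above come from fine-tuning on the Concept Recognition task. The sketch below shows one way a token-classification head could be attached to the checkpoint; the model ID and the reduced label set are illustrative assumptions, not the exact configuration used in the paper.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Illustrative subset of concept categories; the paper's full scheme is larger
# and BIO-tagged per token.
labels = ["O", "B-Propulsion", "I-Propulsion", "B-Thermal", "I-Thermal"]

model_id = "icelab/cosmicroberta"  # assumed repository path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(
    model_id,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)

# The classification head is randomly initialised here; it still has to be
# fine-tuned on annotated requirements before it yields scores like those above.
```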
## BibTeX entry and citation info
@ARTICLE{9548078,
  author={Berquand, Audrey and Darm, Paul and Riccardi, Annalisa},
  journal={IEEE Access},
  title={SpaceTransformers: Language Modeling for Space Systems},
  year={2021},
  volume={9},
  number={},
  pages={133111-133122},
  doi={10.1109/ACCESS.2021.3115659}
}