Manager for Intelligent Knowledge Acess (MIKA)
SafeAeroBERT: A Safety-Informed Aviation-Specific Langauge Model
base-bert-uncased model first further pre-trained on the set of Aviation Safety Reporting System (ASRS) documents up to November of 2022 and National Trasportation Safety Board (NTSB) accident reports up to November 2022. A total of 2,283,435 narrative sections are split 90/10 for training and validation, with 1,052,207,104 tokens from over 350,000 NTSB and ASRS documents used for pre-training.
The model was trained on two epochs using AutoModelForMaskedLM.from_pretrained
with a learning_rate=1e-5
, and total batch size of 128 for just over 32100 training steps.
An earlier version of the model was evaluted on a downstream binary document classification task by fine-tuning the model with AutoModelForSequenceClassification.from_pretrained
. SafeAeroBERT was compared to SciBERT and base-BERT on this task, with the following performance:
Contributing Factor | Metric | BERT | SciBERT | SafeAeroBERT |
---|---|---|---|---|
Aircraft | Accuracy | 0.747 | 0.726 | 0.740 |
Precision | 0.716 | 0.691 | 0.548 | |
Recall | 0.747 | 0.726 | 0.740 | |
F-1 | 0.719 | 0.699 | 0.629 | |
Human Factors | Accuracy | 0.608 | 0.557 | 0.549 |
Precision | 0.618 | 0.586 | 0.527 | |
Recall | 0.608 | 0.557 | 0.549 | |
F-1 | 0.572* | 0.426 | 0.400 | |
Procedure | Accuracy | 0.766 | 0.755 | 0.845 |
Precision | 0.766 | 0.762 | 0.742 | |
Recall | 0.766 | 0.755 | 0.845 | |
F-1 | 0.766 | 0.758 | 0.784 | |
Weather | Accuracy | 0.807 | 0.808 | 0.871 |
Precision | 0.803 | 0.769 | 0.759 | |
Recall | 0.807 | 0.808 | 0.871 | |
F-1 | 0.805 | 0.788 | 0.811 |
More infomation on training data, evaluation, and intended use can be found in the original publication
Citation: Sequoia R. Andrade and Hannah S. Walsh. "SafeAeroBERT: Towards a Safety-Informed Aerospace-Specific Language Model," AIAA 2023-3437. AIAA AVIATION 2023 Forum. June 2023.
Notices:
Copyright © 2023 United States Government as represented by the Administrator of the National Aeronautics and Space Administration. All Rights Reserved.
Disclaimers
No Warranty: THE SUBJECT SOFTWARE IS PROVIDED "AS IS" WITHOUT ANY WARRANTY OF ANY KIND, EITHER EXPRESSED, IMPLIED, OR STATUTORY, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTY THAT THE SUBJECT SOFTWARE WILL CONFORM TO SPECIFICATIONS, ANY IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR FREEDOM FROM INFRINGEMENT, ANY WARRANTY THAT THE SUBJECT SOFTWARE WILL BE ERROR FREE, OR ANY WARRANTY THAT DOCUMENTATION, IF PROVIDED, WILL CONFORM TO THE SUBJECT SOFTWARE. THIS AGREEMENT DOES NOT, IN ANY MANNER, CONSTITUTE AN ENDORSEMENT BY GOVERNMENT AGENCY OR ANY PRIOR RECIPIENT OF ANY RESULTS, RESULTING DESIGNS, HARDWARE, SOFTWARE PRODUCTS OR ANY OTHER APPLICATIONS RESULTING FROM USE OF THE SUBJECT SOFTWARE. FURTHER, GOVERNMENT AGENCY DISCLAIMS ALL WARRANTIES AND LIABILITIES REGARDING THIRD-PARTY SOFTWARE, IF PRESENT IN THE ORIGINAL SOFTWARE, AND DISTRIBUTES IT "AS IS."
Waiver and Indemnity: RECIPIENT AGREES TO WAIVE ANY AND ALL CLAIMS AGAINST THE UNITED STATES GOVERNMENT, ITS CONTRACTORS AND SUBCONTRACTORS, AS WELL AS ANY PRIOR RECIPIENT. IF RECIPIENT'S USE OF THE SUBJECT SOFTWARE RESULTS IN ANY LIABILITIES, DEMANDS, DAMAGES, EXPENSES OR LOSSES ARISING FROM SUCH USE, INCLUDING ANY DAMAGES FROM PRODUCTS BASED ON, OR RESULTING FROM, RECIPIENT'S USE OF THE SUBJECT SOFTWARE, RECIPIENT SHALL INDEMNIFY AND HOLD HARMLESS THE UNITED STATES GOVERNMENT, ITS CONTRACTORS AND SUBCONTRACTORS, AS WELL AS ANY PRIOR RECIPIENT, TO THE EXTENT PERMITTED BY LAW. RECIPIENT'S SOLE REMEDY FOR ANY SUCH MATTER SHALL BE THE IMMEDIATE, UNILATERAL TERMINATION OF THIS AGREEMENT.
- Downloads last month
- 64