Model Card for Model ID
This repository contains the embedding model used to embed artifacts for traceability link prediction.
Model Details
This is the encoder used in the siamese models.
Model Description
This embedding model is the encoder portion of the siamese model described in the cited paper. The siamese model used a relational classifier to produce similarity scores between text pairs, resembling a cross-encoder, and consistently ranked almost as high as the top performer.
- Developed by: Jinfeng Lin (translated by Alberto Rodriguez)
- Model type: RoBERTa encoder trained on automatic traceability link prediction.
- Language(s) (NLP): en
- License: mit
- Finetuned from model [optional]: See cited paper.
Model Sources [optional]
- Repository: https://github.com/jinfenglin/TraceBERT
- Paper: https://arxiv.org/abs/2102.04411
Uses
Used to embed software artifacts intended to be compared via cosine similarity.
Direct Use
Software traceability link prediction, Retrieval Augmented Generation, Artifact Clustering.
Downstream Use [optional]
This model is intended for use within a traceability link prediction pipeline, for retrieving software artifacts to include in an LLM prompt, and for clustering.
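The retrieval use case can be sketched as ranking candidate artifacts by cosine similarity against a query embedding and keeping the top few for the prompt. This is a minimal sketch, not the authors' pipeline: the artifact ids (`REQ-1`, ...), the helper `top_k_artifacts`, and the 3-dimensional toy vectors are all hypothetical stand-ins for real embeddings produced by the model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k_artifacts(query_vec, artifact_vecs, k=2):
    """Return the ids of the k artifacts most similar to the query, best first."""
    ranked = sorted(
        artifact_vecs,
        key=lambda aid: cosine(query_vec, artifact_vecs[aid]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical toy vectors standing in for real model embeddings.
artifact_vecs = {
    "REQ-1": [0.9, 0.1, 0.0],
    "REQ-2": [0.1, 0.9, 0.2],
    "REQ-3": [0.7, 0.3, 0.1],
}
query_vec = [1.0, 0.2, 0.0]

# Keep the two closest artifacts and format them as prompt context lines.
retrieved = top_k_artifacts(query_vec, artifact_vecs, k=2)
prompt_context = "\n".join(f"- {aid}" for aid in retrieved)
```

In a real pipeline the vectors would come from the model's `encode` call, and the retrieved artifact text (rather than just the ids) would be placed in the prompt.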
Out-of-Scope Use
This model could serve as a good set of starting weights for requirements classification.
Bias, Risks, and Limitations
The training data comes from open source git repositories, which can be inaccurate and lead to unexpected results.
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
How to Get Started with the Model
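from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

# Assumption: the model loads through sentence-transformers.
# Replace the placeholder with this repository's model ID.
model = SentenceTransformer("<model-id>")

texts = [
    "Display Artifacts",  # parent artifact
    "A table view should be provided to display all project artifacts.",  # child 1
    "The system should be able to generate documentation for a set of artifacts.",  # child 2
]
# Encode all texts at once; the result is a NumPy array with one row per text
embeddings = model.encode(texts, convert_to_tensor=False)
parent_embedding = embeddings[0:1]
children_embeddings = embeddings[1:]
# Compute cosine similarity between the parent and each child (shape: 1 x 2)
sim_matrix = cosine_similarity(parent_embedding, children_embeddings)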
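One way to turn the similarity scores into candidate trace links is to rank the children and keep those above a cutoff. The sketch below is an illustration only: the helper `predict_links`, the threshold of 0.5, and the example scores are assumptions, not values from the paper. In practice the scores would be read from `sim_matrix[0]`.

```python
def predict_links(parent, children, scores, threshold=0.5):
    """Rank children by score and keep those at or above the threshold."""
    scored = sorted(zip(children, scores), key=lambda pair: pair[1], reverse=True)
    return [(parent, child, score) for child, score in scored if score >= threshold]

# Hypothetical scores for illustration; real values come from sim_matrix[0].
children = ["child 1", "child 2"]
scores = [0.82, 0.41]

links = predict_links("Display Artifacts", children, scores, threshold=0.5)
```

A suitable threshold depends on the project and should be tuned on labeled links rather than taken from this example.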
Training, Evaluation, and Results Details
Please see the cited paper for more information on the training method, evaluation, and results.