Update README.md

README.md (changed)

---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: peft
---

# Model Card for LLaMA 3.1 8B Instruct - YARA Rule Generation Fine-tune

This model is a fine-tuned version of the LLaMA 3.1 8B Instruct model, specifically adapted for YARA rule generation and other cybersecurity-related tasks.

## Model Details

### Model Description

This model is based on the LLaMA 3.1 8B Instruct model and has been fine-tuned on a custom dataset of YARA rules and cybersecurity-related content. It is designed to assist in generating YARA rules and to provide more accurate and relevant responses to cybersecurity queries, with a focus on malware detection and threat hunting.

- **Developed by:** Wyatt Roersma (no organization affiliation)
- **Model type:** Instruct-tuned Large Language Model
- **Language(s) (NLP):** English (primary), with potential for limited multilingual capabilities
- **License:** [Specify the license, likely related to the original LLaMA 3.1 license]
- **Finetuned from model:** meta-llama/Meta-Llama-3.1-8B-Instruct

### Model Sources

- **Repository:** https://huggingface.co/vtriple/Llama-3.1-8B-yara

## Uses

### Direct Use

This model can be used for a variety of cybersecurity-related tasks, including:

- Generating YARA rules for malware detection
- Assisting in the interpretation and improvement of existing YARA rules
- Answering questions about YARA syntax and best practices
- Providing explanations of cybersecurity threats and vulnerabilities
- Offering guidance on malware analysis and threat hunting techniques

### Out-of-Scope Use

This model should not be used for:

- Generating or assisting in the creation of malicious code
- Providing legal or professional security advice without expert oversight
- Making critical security decisions without human verification
- Replacing professional malware analysis or threat intelligence processes

## Bias, Risks, and Limitations

- The model may reflect biases present in its training data and the original LLaMA 3.1 model.
- It may occasionally generate incorrect or inconsistent YARA rules, especially for very specific or novel malware families.
- The model's knowledge is limited to its training data cutoff and does not include real-time threat intelligence.
- Generated YARA rules should always be reviewed and tested by security professionals before deployment.

### Recommendations

Users should verify and test all generated YARA rules before implementation. The model should be used as an assistant tool to aid in rule creation and cybersecurity tasks, not as a replacement for expert knowledge or up-to-date threat intelligence. Always consult with cybersecurity professionals for critical security decisions and rule deployments.

## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel, PeftConfig

# Load the base model and apply the fine-tuned LoRA adapter
model_name = "vtriple/Llama-3.1-8B-yara"
config = PeftConfig.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(model, model_name)

# Load the tokenizer from the base model
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Example usage
prompt = "Generate a YARA rule to detect a PowerShell-based keylogger"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=500)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
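
The snippet above sends the prompt as raw text. Llama 3.1 Instruct models are chat-tuned, so results are usually better when the request is wrapped in the model's chat template. A minimal sketch, continuing from the loading code above and assuming a transformers version recent enough to provide `apply_chat_template` (the system prompt and generation settings are illustrative, not taken from this card):

```python
# Optional: wrap the request in the Llama 3.1 chat template before generating.
messages = [
    {"role": "system", "content": "You are an assistant that writes YARA rules."},
    {"role": "user", "content": "Generate a YARA rule to detect a PowerShell-based keylogger"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=400)
# Decode only the newly generated tokens (the response), not the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```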

### Training Data

The model was fine-tuned on a custom dataset of YARA rules, cybersecurity-related questions and answers, and malware analysis reports. [You may want to add more specific details about your dataset here]
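
The exact schema of this dataset is not documented in the card. Purely as an illustration, a prompt/response layout like the following (hypothetical file name and field names) could be loaded with the Hugging Face datasets library for supervised fine-tuning:

```python
from datasets import load_dataset

# Hypothetical JSONL layout: one {"instruction": "...", "response": "..."} object per line.
dataset = load_dataset("json", data_files="yara_finetune.jsonl", split="train")

def to_text(example):
    # Concatenate instruction and response into a single training string.
    return {"text": example["instruction"] + "\n\n" + example["response"]}

dataset = dataset.map(to_text)
print(dataset[0]["text"][:200])
```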

### Training Procedure

## Evaluation

A custom YARA evaluation dataset was used to assess the model's performance in generating accurate and effective YARA rules. [You may want to add more details about your evaluation process and results]
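
The card does not detail the evaluation harness. One basic, automatable check on generated rules is syntactic validity; a minimal sketch, assuming the yara-python package is installed (the rule list here is a stand-in for model outputs):

```python
import yara

# Stand-in for rules produced by the model on the evaluation prompts.
generated_rules = [
    'rule demo_keylogger { strings: $a = "GetAsyncKeyState" condition: $a }',
]

valid = 0
for rule_text in generated_rules:
    try:
        yara.compile(source=rule_text)  # raises yara.SyntaxError on invalid rules
        valid += 1
    except yara.SyntaxError as err:
        print(f"invalid rule: {err}")

print(f"{valid}/{len(generated_rules)} generated rules compile")
```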

## Environmental Impact

- **Hardware Type:** NVIDIA A100
- **Hours used:** 12 hours
- **Cloud Provider:** vast.io

## Technical Specifications

### Model Architecture and Objective

This model uses the LLaMA 3.1 8B architecture with additional LoRA adapters for fine-tuning. It was trained using a causal language modeling objective on YARA rules and cybersecurity-specific data.
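
The card does not list the adapter hyperparameters; the actual values are recorded in the adapter_config.json that PEFT saves alongside the adapter in the repository. Purely as an illustration of how such LoRA adapters are attached with PEFT, a sketch with placeholder rank, alpha, and target modules (not the values used for this model):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")

# Placeholder hyperparameters; not the values used for this model.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # shows the small fraction of weights trained via LoRA
```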

### Compute Infrastructure

#### Hardware

Single NVIDIA A100 GPU

#### Software

- Transformers 4.28+
- PEFT 0.12.0

## Model Card Author

Wyatt Roersma

## Model Card Contact

For questions about this model, please email Wyatt Roersma at [email protected].